You will need to provide a configuration file; use one of the sample configuration
files as a template ([`scrapyd_k8s.sample-k8s.conf`](./scrapyd_k8s.sample-k8s.conf)
or [`scrapyd_k8s.sample-docker.conf`](./scrapyd_k8s.sample-docker.conf)).

The next section explains how to get this running with Docker, Kubernetes or locally.
Then read on for an example of how to use the API.

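To give an idea of the shape of this file, here is a hypothetical fragment in the spirit of the samples. The section and key names below are illustrative assumptions, not a specification; the sample files linked above are the authoritative reference:

```ini
; Illustrative sketch only -- copy a sample file for real use.
[scrapyd]
bind_address = 0.0.0.0
http_port    = 6800

; One section per project, pointing at the spider's container image.
[project.example]
repository = ghcr.io/q-m/scrapyd-k8s-spider-example
```
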
### Docker

```sh
cp scrapyd_k8s.sample-docker.conf scrapyd_k8s.conf
docker build -t ghcr.io/q-m/scrapyd-k8s:latest .
docker run \
  --rm \
  -v ./scrapyd_k8s.conf:/opt/app/scrapyd_k8s.conf:ro \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v $HOME/.docker/config.json:/root/.docker/config.json:ro \
  -u 0 \
  -p 127.0.0.1:6800:6800 \
  ghcr.io/q-m/scrapyd-k8s:latest
```

You'll be able to talk to localhost on port `6800`.

Make sure to pull the spider image so it is known locally.
In case of the default example spider:

```sh
docker pull ghcr.io/q-m/scrapyd-k8s-spider-example
```

Note that running like this in Docker is not really recommended for production,
as it exposes the Docker socket and runs as root. It may be useful to try
things out.

### Kubernetes

1. Create the spider namespace: `kubectl create namespace scrapyd`
2. Adapt the spider configuration in [`kubernetes.yaml`](./kubernetes.yaml) (`scrapyd_k8s.conf` in configmap)
3. Create the resources: `kubectl create -f kubernetes.yaml`

You'll be able to talk to the `scrapyd-k8s` service on port `6800`.

### Local

For development, or just a quick start, you can also run this application locally.

Requirements:
- Either [Docker](https://www.docker.com/) or [Kubernetes](https://kubernetes.io/) set up and accessible
  (scheduling will require Kubernetes 1.24+)

This will work with either Docker or Kubernetes (provided it is set up).
For example, for Docker:

```sh
cp scrapyd_k8s.sample-docker.conf scrapyd_k8s.conf
python3 app.py
```

You'll be able to talk to localhost on port `6800`.

For Docker, make sure to pull the spider image so it is known locally.
In case of the default example spider:

```sh
docker pull ghcr.io/q-m/scrapyd-k8s-spider-example
```

## Accessing the API

With `scrapyd-k8s` running and set up, you can access it. Here we assume that
it listens on `localhost:6800` (for Kubernetes, you would use
the service name `scrapyd-k8s:6800` instead).

```sh
curl http://localhost:6800/daemonstatus.json
```

> ```json
> {"spiders":0,"status":"ok"}
> ```

```sh
curl http://localhost:6800/listprojects.json
```

> ```json
> {"projects":["example"],"status":"ok"}
> ```

```sh
curl 'http://localhost:6800/listversions.json?project=example'
```

> ```json
> {"status":"ok","versions":["latest"]}
> ```

```sh
curl 'http://localhost:6800/listspiders.json?project=example&_version=latest'
```

> ```json
> {"spiders":["quotes"],"status":"ok"}
> ```

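Query strings like these are simple enough to write by hand, but project, spider or version names containing special characters need URL-encoding. A small sketch of building such a query string safely (plain Python invoked from the shell; this helper is not part of scrapyd-k8s):

```sh
# Build a URL-encoded query string for listspiders.json (illustrative).
qs=$(python3 -c 'from urllib.parse import urlencode; print(urlencode({"project": "example", "_version": "latest"}))')
echo "http://localhost:6800/listspiders.json?$qs"
```

With curl alone, `curl -G --data-urlencode 'project=example' ...` achieves the same encoding.
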
```sh
curl 'http://localhost:6800/schedule.json?project=example&_version=latest&spider=quotes'
```

> ```json
> {"jobid":"e9b81fccbec211eeb3b109f30f136c01","status":"ok"}
> ```

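Following the scrapyd API, `schedule.json` returns a `jobid` that identifies the job in later `listjobs.json` calls. A sketch of pulling it out of a captured response; the inlined JSON stands in for a live request:

```sh
# Stand-in for: response=$(curl -s 'http://localhost:6800/schedule.json?...')
response='{"jobid":"e9b81fccbec211eeb3b109f30f136c01","status":"ok"}'
# Extract the job id with a small Python one-liner.
jobid=$(echo "$response" | python3 -c 'import json,sys; print(json.load(sys.stdin)["jobid"])')
echo "$jobid"
```
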
```sh
curl http://localhost:6800/listjobs.json
```

> ```json
> {
>   "finished": [],
>   "pending": [],
>   "running": [{"id": "e9b81fccbec211eeb3b109f30f136c01", "project": "example", "spider": "quotes", "state": "pending"}],
>   "status": "ok"
> }
> ```

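To act on this programmatically, for example to wait until nothing is running, count the entries in the `running` list. A sketch with the sample response inlined in place of a live call:

```sh
# Stand-in for: response=$(curl -s http://localhost:6800/listjobs.json)
response='{"finished":[],"pending":[],"running":[{"id":"e9b81fccbec211eeb3b109f30f136c01","project":"example","spider":"quotes","state":"pending"}],"status":"ok"}'
# Count currently running jobs.
running=$(echo "$response" | python3 -c 'import json,sys; print(len(json.load(sys.stdin)["running"]))')
echo "$running"
```

In a polling loop, one would re-fetch the response and sleep between iterations until the count reaches zero.
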
To see what the spider has done, look at the container logs:

```sh
docker ps -a
```

> ```
> CONTAINER ID   IMAGE                                           COMMAND                 CREATED    STATUS               NAMES
> 8c514a7ac917   ghcr.io/q-m/scrapyd-k8s-spider-example:latest   "scrapy crawl quotes"   42s ago    Exited (0) 30s ago   scrapyd_example_cb50c27cbec311eeb3b109f30f136c01
> ```

```sh
docker logs 8c514a7ac917
```

> ```
> [scrapy.utils.log] INFO: Scrapy 2.11.0 started (bot: example)
> ...
> [scrapy.core.scraper] DEBUG: Scraped from <200 http://quotes.toscrape.com/>
> {'text': 'The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.', 'author': 'Albert Einstein', 'tags': 'change'}
> ...
> [scrapy.core.engine] INFO: Spider closed (finished)
> ```
## Spider as Docker image