@slotix slotix commented May 12, 2019

Following issue #20, I've built a Docker image for se-scraper.

Build the Docker image:

docker build -t se-scraper .

A ready-to-use image is available at https://hub.docker.com/r/slotix/se-scraper

Run se-scraper:

docker run -it -e HOST=0.0.0.0 -e PORT=3000 -p 3000:3000 slotix/se-scraper

Query the dockerized se-scraper service with curl:

curl -XPOST http://localhost:3000 -H 'Content-Type: application/json' \
-d '{
    "user_agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36",
    "random_user_agent":true,
    "sleep_range":"",
    "search_engine":"baidu",
    "debug":true,
    "verbose":true,
    "keywords":[ "cat",  "mouse" ],
    "keyword_file":"",
    "num_pages":1,
    "headless":true,
    "chrome_flags":[ ],
    "output_file":"examples/results/baidu.json",
    "block_assets":false,
    "custom_func":"",
    "proxy":"",
    "proxy_file":"",
    "test_evasion":false,
    "apply_evasion_techniques":true,
    "log_ip_address":false,
    "log_http_headers":false,
    "puppeteer_cluster_config":{
        "timeout":600000,
        "monitor":false,
        "concurrency":1,
        "maxConcurrency":1
    }
}'
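
The same request can also be sent from a script. Below is a minimal Python sketch, assuming the service accepts the JSON body shown above at the root path on port 3000. The names `DEFAULTS`, `build_payload` and `post_job` are hypothetical helpers, not part of se-scraper, and the response is returned as raw text because its format is not described here.

```python
import copy
import json
import urllib.request

# Default job settings, copied from the curl example above.
DEFAULTS = {
    "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36",
    "random_user_agent": True,
    "sleep_range": "",
    "search_engine": "baidu",
    "debug": True,
    "verbose": True,
    "keywords": ["cat", "mouse"],
    "keyword_file": "",
    "num_pages": 1,
    "headless": True,
    "chrome_flags": [],
    "output_file": "examples/results/baidu.json",
    "block_assets": False,
    "custom_func": "",
    "proxy": "",
    "proxy_file": "",
    "test_evasion": False,
    "apply_evasion_techniques": True,
    "log_ip_address": False,
    "log_http_headers": False,
    "puppeteer_cluster_config": {
        "timeout": 600000,
        "monitor": False,
        "concurrency": 1,
        "maxConcurrency": 1,
    },
}


def build_payload(search_engine, keywords, **overrides):
    """Return a job dict based on DEFAULTS (hypothetical helper)."""
    payload = copy.deepcopy(DEFAULTS)  # never mutate the shared defaults
    payload["search_engine"] = search_engine
    payload["keywords"] = list(keywords)
    payload["output_file"] = f"examples/results/{search_engine}.json"
    for key, value in overrides.items():
        if key not in DEFAULTS:
            # Catch misspelled option names before sending the request.
            raise KeyError(f"unknown option: {key}")
        payload[key] = value
    return payload


def post_job(payload, url="http://localhost:3000"):
    """POST the payload as JSON and return the response body as text.

    The URL is an assumption matching the docker run command above.
    """
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=600) as response:
        return response.read().decode("utf-8")
```

With the container running, `post_job(build_payload("baidu", ["cat", "mouse"]))` should send the same body as the curl command.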

@slotix slotix mentioned this pull request May 14, 2019

slotix commented Aug 14, 2019

Thank you for adding Docker support to the master branch. Closing this pull request.

@slotix slotix closed this Aug 14, 2019

@binyoucai binyoucai left a comment

curl -XPOST http://localhost:3000 -H 'Content-Type: application/json' \
-d '{
"user_agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36",
"random_user_agent":true,
"sleep_range":"",
"search_engine":"baidu",
"debug":true,
"verbose":true,
"keywords":[ "python" ],
"keyword_file":"",
"num_pages":1,
"headless":true,
"chrome_flags":[ ],
"output_file":"examples/results/baidu.json",
"block_assets":false,
"custom_func":"",
"proxy":"http://SUNFOXGJ2CTQQ60:08IjxC3B@http-proxy-t2.dobel.cn:9180",
"proxy_file":"",
"test_evasion":false,
"apply_evasion_techniques":true,
"log_ip_address":false,
"log_http_headers":false,
"puppeteer_cluster_config":{
"timeout":600000,
"monitor":false,
"concurrency":1,
"maxConcurrency":1
}
}'

"output_file":"examples/results/google.json",
"block_assets":false,
"custom_func":"",
"proxy":"http://proxy:24000",

In my test, the request failed when the proxy requires a username and password.
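
One way to narrow this down is to look at how the credentials are embedded in the proxy URL. The sketch below is only a diagnostic aid, assuming the failure is related to the `user:password@host` form; `split_proxy` is a hypothetical helper and the example URLs use placeholder values. It separates the proxy server address from the credentials so that each can be checked on its own. As far as I know, Chromium's `--proxy-server` flag ignores credentials embedded in the URL, and Puppeteer offers `page.authenticate()` for proxies that require a login; whether se-scraper handles authenticated proxies that way is not confirmed in this thread.

```python
from urllib.parse import unquote, urlsplit


def split_proxy(proxy_url):
    """Split a proxy URL into (server, credentials).

    server is "scheme://host[:port]" without any user info.
    credentials is a dict with "username" and "password",
    or None when the URL carries no user info.
    """
    parts = urlsplit(proxy_url)
    server = f"{parts.scheme}://{parts.hostname}"
    if parts.port is not None:
        server += f":{parts.port}"

    credentials = None
    if parts.username is not None:
        credentials = {
            # urlsplit leaves percent-escapes in place, so decode them here.
            "username": unquote(parts.username),
            "password": unquote(parts.password or ""),
        }
    return server, credentials


# Placeholder values, not a real proxy account.
server, credentials = split_proxy("http://user:p%40ss@proxy.example.com:9180")
print(server)
print(credentials)
```

If the request succeeds with only the server part in `"proxy"`, the credentials are the likely cause of the failure.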
