PollScraper¶
A production-ready web scraping utility, built to monitor polling data hosted by the Economist data team.
Artifacts from the latest build and from the latest daily run can be downloaded in the Actions tab.
The build pipeline also runs as a cron job at 17:30 daily, so the daily-run artifacts reflect the most recent poll results.
Setup¶
$ python3.8 -m venv .venv
$ source .venv/bin/activate
$ pip install -r requirements_dev.txt
Run Pipeline¶
$ # For information on pollscraper arguments:
$ pollscraper --help
$ # To scrape polls and calculate trends:
$ pollscraper --url https://cdn-dev.economistdatateam.com/jobs/pds/code-test/index.html --results_dir data --quiet
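For readers who want a feel for what "scrape polls and calculate trends" can involve, here is a minimal standard-library sketch. It is not the package's actual implementation: the table layout, the column names (`Date`, `Candidate A`), the sample HTML, the trailing-mean trend, and the helper names `parse_polls` and `trailing_mean` are all assumptions made for illustration.

```python
from html.parser import HTMLParser
from statistics import mean

# Hypothetical sample page; the real page layout may differ.
SAMPLE_HTML = """
<table>
  <tr><th>Date</th><th>Candidate A</th></tr>
  <tr><td>1/1/2024</td><td>40.0%</td></tr>
  <tr><td>1/2/2024</td><td>42.0%</td></tr>
  <tr><td>1/3/2024</td><td>44.0%</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every table cell, one list per <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._in_cell and self._row:
            self._row[-1] += data.strip()


def parse_polls(html):
    """Return one dict per poll row, keyed by the header row (assumed first)."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


def trailing_mean(values, window):
    """Average each value with up to `window - 1` values before it."""
    return [
        mean(values[max(0, i - window + 1): i + 1])
        for i in range(len(values))
    ]


if __name__ == "__main__":
    polls = parse_polls(SAMPLE_HTML)
    shares = [float(p["Candidate A"].rstrip("%")) for p in polls]
    for poll, trend in zip(polls, trailing_mean(shares, window=2)):
        print(poll["Date"], trend)
```

In a real run the HTML would come from the page passed via `--url` rather than from an inline string, and the trend method may well differ from a simple trailing mean.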
Building documentation¶
$ make servedocs
Deployment¶
$ bumpversion --current-version <current_version> minor # possible: major / minor / patch
$ git push
$ git push --tags
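The choice between major, minor and patch determines which part of the version number is incremented. The sketch below illustrates the usual `major.minor.patch` behaviour, where bumping one part resets the parts to its right to zero. The `bump` helper is hypothetical and only mirrors the default numbering scheme; it is not bumpversion's own code, and a project-specific configuration could change the scheme.

```python
def bump(version, part):
    """Return `version` with `part` incremented and lower parts reset to 0."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part!r}")


if __name__ == "__main__":
    for part in ("major", "minor", "patch"):
        print(part, bump("1.4.2", part))
```

Under this scheme a backwards-incompatible release would use major, a new feature minor, and a bug fix patch.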
- Free software: MIT license
- Documentation: https://pollscraper.readthedocs.io.
TODO¶
- Separation of Concerns - separate CI and CD pipelines
- Add separate badges for each new pipeline
- Parameterize the HTTP requests via Click
- Tidy up documentation, remove stale references such as PyPI
Credits¶
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.