Description
Despite #5263 being resolved, it looks like the data dumps weren't uploaded on July 1st :/
Relevant URL?
- https://github.com/internetarchive/openlibrary/wiki/Generating-Data-Dumps
- https://archive.org/details/ol_exports?sort=-publicdate
Related issues and pull requests:
- Data dump for reading log stats #3989
- 2021-01 datadumps smaller than expected (missing data?) #4621
- 2021-01-31 combined data dump missing data/out of date data #4671
- ol-home0 provisioning needs rsync ferm/input #4723
- Add helper eg /data/ol_dump_editions_latest.txt.gz links for ratings/readinglog #5546
- Manually kick data dump for 2021-09 #5673
- Manually kick data dump for 2021-10 #5719
- Fix cron job for oldump and sitemaps #5892 - Worth reading!
- Datadumps: logger is blocking the cron job #6158
- Fix shebang lines in scripts/oldump.py and scripts/sitemaps/sitemaps.py #6163
Related files:
- docker-compose.production.yml defines cron-jobs Docker container.
- docker/ol-cron-start.sh sets up the cron tasks.
- olsystem: /etc/cron.d/openlibrary.ol_home0 defines the actual job
  - modify and then to reactivate do: crontab /etc/cron.d/openlibrary.ol_home0
  - Also: https://cron.help
  - internetarchive/olsystem#140
- scripts/oldump.sh is the script that gets run.
Proposal & Constraints
- Run manually for now
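
While the dumps are being kicked off by hand, a small check like the one below could help spot a month whose dump never got uploaded. This is only a sketch and not part of the Open Library codebase: the function name `missing_dump_months` is hypothetical, and it assumes you already have the upload dates of the existing dumps (for example, copied from the ol_exports listing) as `datetime.date` values.

```python
from datetime import date


def missing_dump_months(upload_dates, start, end):
    """Return the (year, month) pairs from start to end that have no upload.

    upload_dates: iterable of datetime.date, one per uploaded dump (assumed input).
    start, end:   datetime.date values bounding the months to check, inclusive.
    """
    # Months for which at least one dump was uploaded.
    seen = {(d.year, d.month) for d in upload_dates}

    missing = []
    year, month = start.year, start.month
    # Walk month by month, rolling over into the next year after December.
    while (year, month) <= (end.year, end.month):
        if (year, month) not in seen:
            missing.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return missing


if __name__ == "__main__":
    # Made-up dates for illustration only.
    uploads = [date(2021, 5, 1), date(2021, 6, 1), date(2021, 8, 1)]
    print(missing_dump_months(uploads, date(2021, 5, 1), date(2021, 8, 31)))
```

Any month it reports would be a candidate for a manual run of scripts/oldump.sh.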