obtasks/DIR-3098/commit.md
2019-04-29 14:49:56 +01:00

DIR-3098 Extend OBDFCASCRAPE to be able to upload German NCA data
# Summary
* Added archiving: the contents of the artefact folder are compressed
* Extended archiving to France.
* Added uploading to S3
* Flattened the folder structure and removed timestamping of folders within the archive, making it easier for the ingestion process to handle
* Added an index of the pages scraped to aid ingestion
* Changed filenames to ensure there are no collisions; items are prefixed with either ps_, em_ or ci_.
* Implemented Amazon SQS messaging so that an announcement can be made when a new file is available for ingestion
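
The flow summarised above (flat archive, prefixed filenames, index, S3 upload, SQS announcement) could look roughly like the sketch below. It is not the code from this commit. The function names `build_archive` and `publish`, the `index.json` file name, the JSON message shape and the use of a zip archive are all assumptions made for illustration. The index here is simply the list of archived file names, which may differ from the real index of scraped pages. The S3 and SQS clients are passed in and are expected to behave like `boto3.client("s3")` and `boto3.client("sqs")`.

```python
import json
import zipfile
from pathlib import Path


def build_archive(groups, archive_path):
    """Write a flat zip where every file sits at the root as <prefix><name>.

    `groups` maps a prefix such as "ps_" to the files that should carry it
    (hypothetical input shape). Returns the list of archived names, which is
    also stored in the archive as index.json (assumed file name).
    """
    index = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for prefix, paths in groups.items():
            for path in paths:
                # Drop the folder part entirely; the prefix keeps names unique.
                name = f"{prefix}{Path(path).name}"
                if name in index:
                    raise ValueError(f"filename collision: {name}")
                zf.write(path, arcname=name)
                index.append(name)
        zf.writestr("index.json", json.dumps(index, indent=2))
    return index


def publish(archive_path, bucket, queue_url, s3_client, sqs_client):
    """Upload the archive to S3, then announce it on an SQS queue.

    The message body is an assumed shape, not the format used by the
    real ingestion process.
    """
    key = Path(archive_path).name
    s3_client.upload_file(str(archive_path), bucket, key)
    body = json.dumps({"bucket": bucket, "key": key})
    sqs_client.send_message(QueueUrl=queue_url, MessageBody=body)
    return key
```

With boto3 installed, a call might be `publish("out.zip", "my-bucket", queue_url, boto3.client("s3"), boto3.client("sqs"))`, where the bucket name and queue URL are placeholders. Sending the message only after the upload returns means a consumer should not be told about a file that is not yet in the bucket.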