cc-tracking/tasks/DIN-329 NL Fixes/Note 2019-05-29T11.34.48.md
2019-05-29 16:04:31 +01:00


Note 2019-05-29T11.34.48

As discussed, I've added all the defects linked to the Netherlands scraper to this ticket:

Defect 001 - JSON file data and the main page screenshot are missing for the CI entity ABN AMRO Groenbank BV. Attached screenshots NL_Defect_001 and NL_Defect_001a are linked to this issue.

Defect 002 - Not all 'Category' data is captured in the JSON file for some CI entities: 'Bank' is captured but the remaining categories are ignored. Attached screenshot NL_Defect_002 is linked to this issue.

Defect 003 - JSON file data and main page screenshots are missing for these PI entities:

detail.jsp?id=4366080d9645e911811b005056b60a9d&locale=en_GB

detail.jsp?id=bf85dc049745e911811b005056b60a9d&locale=en_GB

detail.jsp?id=bf85dc049745e911811b005056b60a9d&locale=en_GB

Attached screenshots NL_Defect_003 and NL_Defect_003a are linked to this issue.

The entity we currently have on the NCA register website was also not scraped: detail.jsp?id=8d41e2ab5948e311b55a005056b672cf. Attached screenshot NL_Defect_003b is linked to this issue.

Defect 1

The 'CS' ABN Amro Groenbank B V is indexed but not processed

Defect 2

NIBC Bank N.V. has a category with 2 items, but only one item is logged.

Defect 3

Colliding filenames

Solution

The ID query string value is taken from the href link, and the first 8 characters are extracted as a short hash.

This short hash is added to the filename when it is created.
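The approach above could look roughly like the following. This is a minimal sketch, not the scraper's actual code: the function names `short_hash_from_href` and `build_filename`, the underscore-separated naming pattern, and the `.json` extension are all hypothetical and chosen only for illustration.

```python
from urllib.parse import urlparse, parse_qs


def short_hash_from_href(href: str, length: int = 8) -> str:
    """Return the first `length` characters of the `id` query-string value."""
    query = urlparse(href).query
    ids = parse_qs(query).get("id", [])
    if not ids:
        raise ValueError(f"no 'id' parameter found in href: {href!r}")
    return ids[0][:length]


def build_filename(entity_name: str, href: str, extension: str = "json") -> str:
    """Build a filename that includes the short hash (hypothetical naming scheme)."""
    short_hash = short_hash_from_href(href)
    # Assumption: whitespace in the entity name is replaced with underscores.
    safe_name = "_".join(entity_name.split())
    return f"{safe_name}_{short_hash}.{extension}"


href = "detail.jsp?id=4366080d9645e911811b005056b60a9d&locale=en_GB"
print(short_hash_from_href(href))
print(build_filename("NIBC Bank N.V.", href))
```

Two entities with the same display name then produce different filenames as long as the first 8 characters of their IDs differ. If two IDs ever shared that prefix, the filenames would collide again, so a longer prefix may be needed.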