2019-05-29

martind2000 2019-05-29 16:04:31 +01:00
parent c2b3ecc30c
commit f0de17cd3b
5 changed files with 87 additions and 0 deletions

@ -0,0 +1,20 @@
Note 2019-05-22T10.10.51
========================
This has been crashing randomly on PPE, so it has failed to complete for a while. The logging was not useful: it reported that an error had occurred but did not show what the error was or where it was located.
I have gone through the code, added some additional error catching and triggering for the restart, and ran it locally yesterday. It completed, with the only issue occurring when I was moving the laptop between rooms.
I will check this in and hopefully have it moved to PPE this week.
[2019-05-23T14:46:30.012]
[2019-05-23T14:48:07.532]
[2019-05-23T15:07:37.821]
[2019-05-23T15:09:15.341] [Level { level: 20000, levelStr: 'INFO', colour: 'green' }] (IT) - We didnt transition back correctly, forcing another click..
[2019-05-23T15:10:52.845] [Level { level: 20000, levelStr: 'INFO', colour: 'green' }] (IT) - We didnt transition back correctly, forcing another click..
[2019-

@ -0,0 +1,20 @@
290
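// Wait for the loading shroud to go away. `hidden: true` waits until the element
// is hidden or removed; `visible: false` would only wait for it to exist in the DOM.
await this.page.waitForSelector('div.loading', { 'hidden':true, 'timeout':25000 });
// btnSuccess becomes true once the button no longer appears before the timeout.
let btnSuccess = false;
let breakCount = 0;
do {
  try {
    const elm = await this.page.waitForSelector('button.btn.btn-success', { 'visible':true, 'timeout':45000 });
    await elm.click({ 'delay':Scraper.notARobot() });
  }
  catch (err) {
    // Timed out waiting for the button (or the click failed): stop retrying.
    btnSuccess = true;
  }
  await this._randomWait(this.page, 1, 1, 'preparePSSearch btnSuccess');
  breakCount++;
}
while(!btnSuccess && breakCount < 5);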
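
The retry loop above treats a timeout while waiting for the button as success and gives up after five attempts. A minimal sketch of that bounded-retry idea as a standalone helper is shown here. The names `clickUntilGone`, `waitForButton` and `pause` are hypothetical and not part of the scraper; in the scraper, `waitForButton` would presumably wrap the `this.page.waitForSelector(...)` call.

```javascript
// Hypothetical helper, not taken from the scraper.
// waitForButton: async function that resolves with a clickable element,
//                or rejects when the button does not appear in time.
// pause:         async function that waits between attempts.
async function clickUntilGone(waitForButton, pause, maxAttempts = 5) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    let button;
    try {
      button = await waitForButton();
    } catch (err) {
      // The wait timed out: the button is gone, so stop retrying.
      return { gone: true, attempts: attempt };
    }
    await button.click();
    await pause();
  }
  // The button was still present after maxAttempts clicks.
  return { gone: false, attempts: maxAttempts };
}
```

In this sketch an error thrown by `click()` propagates to the caller instead of counting as success. That is a choice made for the example, not the scraper's behaviour.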

@ -0,0 +1,44 @@
Note 2019-05-29T11.34.48
========================
As discussed, I've added all the defects linked to the Netherlands scraper to this ticket:
Defect 001 - JSON file data and main page screenshot are missing for CI entity: ABN AMRO Groenbank BV
Attached screenshots NL_Defect_001 and NL_Defect_001a are linked to the above issue
Defect 002 - Not all 'Category' data is captured in the JSON file for some CIs. Bank is captured but the rest are ignored.
Attached screenshot NL_Defect_002 is linked to the above issue
Defect 003 - JSON file data and main page screenshots are missing for these PI entities:
detail.jsp?id=4366080d9645e911811b005056b60a9d&locale=en_GB
detail.jsp?id=bf85dc049745e911811b005056b60a9d&locale=en_GB
detail.jsp?id=bf85dc049745e911811b005056b60a9d&locale=en_GB
Attached screenshots NL_Defect_003 and NL_Defect_003a are linked to the above issue
The entity we currently have on the NCA register website was also not scraped:
detail.jsp?id=8d41e2ab5948e311b55a005056b672cf
Attached screenshot NL_Defect_003b is linked to the above issue
Defect 1
---
The 'CS' ABN Amro Groenbank B V is indexed but not processed
Defect 2
---
NIBC Bank N.V. has a category with 2 items, but only one item is logged.
Defect 3
---
Colliding filenames
**Solution**
The ID query string value is taken from the href link, and the first 8 characters are extracted as a short hash.
This short hash is added to the filename when it is created.
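
The short-hash step described above could look roughly like the sketch below. The function names `shortHashFromHref` and `buildFilename`, the placeholder base URL, and the `<name>_<hash>.<extension>` filename layout are all assumptions made for illustration; the scraper's actual naming scheme is not shown in this note.

```javascript
// Hypothetical helpers illustrating the short-hash approach.

// Extract the first 8 characters of the `id` query string value from an href.
function shortHashFromHref(href) {
  // A dummy base URL lets relative hrefs such as "detail.jsp?id=..." be parsed.
  const url = new URL(href, 'http://placeholder.invalid/');
  const id = url.searchParams.get('id');
  if (!id) {
    throw new Error(`No id query parameter in href: ${href}`);
  }
  return id.slice(0, 8);
}

// Build a filename that stays unique when two entities share a name.
// The layout "<name>_<hash>.<extension>" is an assumption.
function buildFilename(entityName, href, extension = 'json') {
  // Replace runs of characters that are awkward in filenames with "_".
  const safeName = entityName.trim().replace(/[^A-Za-z0-9]+/g, '_');
  return `${safeName}_${shortHashFromHref(href)}.${extension}`;
}
```

For example, `shortHashFromHref('detail.jsp?id=4366080d9645e911811b005056b60a9d&locale=en_GB')` returns `'4366080d'`, so two entities with the same name but different IDs end up with different filenames.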

Binary file not shown.

3
tasks/trash/10 Normal file

@ -0,0 +1,3 @@
Note 2019-05-22T10.46.40
========================