========================
DIR-3741 - French NCA Scrapper Tool - Missing Credit Institution Data In S3 Bucket

- Summary: French NCA Scrapper Tool - Missing Credit Institution Data In S3 Bucket
- Id: DIR-3741
- Created at: Fri May 03 12:46:50 BST 2019
- Updated at: Fri May 03 15:48:03 BST 2019

Description:

The scraper failed to extract data for a number of Credit Institutions (CIs) from the French NCA Register.

The following CIs were missing:

- AGENCE FRANCE LOCALE
- Al Khaliji France
- Allianz banque
- Amundi
- Andbank Monaco S.A.M.
- Arkéa banking services
- ARKEA DIRECT BANK
- Arkéa public sector SCF
- Bank Audi France

French NCA Register URL: https://www.regafi.fr/spip.php?page=results&type=advanced&id_secteur=3&lang=en&denomination=&siren=&cib=&bic=&nom=&siren_agent=&num=&cat=01-TBR07&retrait=0

During indexing of Credit Institutions, the scraper has to limit processing to entities that match 'legal entity/ company'. It does so by checking the contents of a specific field in each row.

However, the rows that were failing contained an additional URL, which changed the layout of the row. As a result, the check was being applied to the wrong field.

The check now takes this into account and works with both types of row: with and without the URL.
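
The ticket does not include the scraper's code, so the following is only a minimal sketch of how such a check could handle both row layouts. The row representation (a list of cell strings), the field position `TYPE_FIELD_INDEX`, the helper names, and the sample rows are all assumptions made for illustration, not the tool's actual implementation.

```python
from typing import List

# Hypothetical values: the ticket does not state the real field position
# or how the scraper represents a row.
ENTITY_TYPE = "legal entity/ company"
TYPE_FIELD_INDEX = 2  # assumed position of the entity-type field in a row without a URL


def is_url(cell: str) -> bool:
    """Return True if the cell looks like a link rather than a data field."""
    return cell.startswith(("http://", "https://", "/spip.php"))


def is_legal_entity(row: List[str]) -> bool:
    """Check the entity-type field, ignoring any extra URL cell in the row."""
    # Drop URL cells so the remaining fields line up in both row layouts.
    fields = [cell for cell in row if not is_url(cell)]
    if len(fields) <= TYPE_FIELD_INDEX:
        return False
    return fields[TYPE_FIELD_INDEX].strip().lower() == ENTITY_TYPE


# Made-up rows for illustration only.
plain_row = ["Amundi", "code-placeholder", "Legal entity/ company"]
url_row = ["AGENCE FRANCE LOCALE", "/spip.php?page=af&id=70",
           "code-placeholder", "Legal entity/ company"]
person_row = ["Some Agent", "code-placeholder", "Natural person"]

for row in (plain_row, url_row, person_row):
    print(row[0], is_legal_entity(row))
```

Filtering out the URL cell before indexing, rather than branching on row length, keeps a single code path for both layouts. The real fix may well use a different approach.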

Missing data sample:

```
{
  "link": "/spip.php?type=advanced&id_secteur=3&lang=en&denomination=&siren=&cib=&bic=&nom=&siren_agent=&num=&cat=01-TBR07&retrait=0&pg=2&page=af&id=70",
  "title": "AGENCE FRANCE LOCALE"
},
{
  "link": "/spip.php?type=advanced&id_secteur=3&lang=en&denomination=&siren=&cib=&bic=&nom=&siren_agent=&num=&cat=01-TBR07&retrait=0&pg=3&page=af&id=9024",
  "title": "Al Khaliji France"
},
{
  "link": "/spip.php?type=advanced&id_secteur=3&lang=en&denomination=&siren=&cib=&bic=&nom=&siren_agent=&num=&cat=01-TBR07&retrait=0&pg=3&page=af&id=8909",
  "title": "Allianz banque"
}
```