Indexer failed to handle invalid data type of the dataset contact point information #3612

@t83714

Some data sources' metadata APIs might not be configured to output a valid data type for some metadata fields.

For example, in the record logged below, the `contactPoint` field was output as an integer (548686, likely an ID) instead of a string.

Although the problem should be fixed at the upstream data source (via a correct metadata mapping config), the indexer should still do its best to accommodate the invalid type instead of throwing an error (it could log a warning instead).

Could not parse item: Record(ds-logan-https://www.arcgis.com/home/item.html?id=32757b6fc2a84751ac2ead235d53a839,Yasmin's hoop op een toekomst,Map(source -> {"id":"logan","name":"Logan City Council","type":"project-open-data-dataset","url":"https://data-logancity.opendata.arcgis.com/api/feed/dcat-us/1.1.json"}, dcat-dataset-strings -> {"contactPoint":548686,"description":"<p>Ik ben Yasmin, ik ben 27 jaar oud. Wat een thuis hoorde te zijn was geen thuis want thuis liepen de talibansoldaten door de straat, werd ik geslagen door mijn vader en was er te weinig geld voor eten. Toen ik 17 geworden was dacht mijn vader dat ik misschien wel geld op zou leveren dus zocht mijn vader een man voor mij. Toen ik daarachter kwam vluchtte ik weg, voor hoop op een toekomst waarin ik misschien wel een thuis zou vinden. Mijn moeder kwam achter mijn vluchtplannen kwam gaf ze me wat geld en een gezinsfoto. Ik pakte mijn tas in met de weinige spullen die ik had en ging naar Kandahar met de bus.</p>","issued":"2025-06-25T09:13:16.000Z","keywords":[""],"landingPage":"https://data-logancity.opendata.arcgis.com/apps/32757b6fc2a84751ac2ead235d53a839","modified":"2025-06-28T09:34:49.000Z","publisher":"548686","spatial":"{{extent:computeSpatialProperty}}","temporal":{},"themes":["geospatial"],"title":"Yasmin's hoop op een toekomst"}, dataset-publisher -> {"publisher":{"aspects":{"organization-details":{"email":"{{orgContactEmail}}","name":"548686","title":"548686"},"source":{"id":"logan","name":"Logan City Council","type":"project-open-data-organization","url":"https://data-logancity.opendata.arcgis.com/api/feed/dcat-us/1.1.json"}},"id":"org-logan-548686","name":"548686"}}, dataset-distributions -> {"distributions":[{"aspects":{"dataset-format":{"confidenceLevel":33,"format":"WEB"},"dcat-distribution-strings":{"accessURL":"https://data-logancity.opendata.arcgis.com/apps/32757b6fc2a84751ac2ead235d53a839","format":"Web Page","license":"","mediaType":"text/html","title":"ArcGIS Hub 
Dataset"},"project-open-data-distribution":{"@type":"dcat:Distribution","accessURL":"https://data-logancity.opendata.arcgis.com/apps/32757b6fc2a84751ac2ead235d53a839","format":"Web Page","mediaType":"text/html","title":"ArcGIS Hub Dataset"},"source":{"id":"logan","name":"Logan City Council","type":"project-open-data-distribution","url":"https://data-logancity.opendata.arcgis.com/api/feed/dcat-us/1.1.json"}},"id":"dist-logan-https://www.arcgis.com/home/item.html?id=32757b6fc2a84751ac2ead235d53a839-0","name":"ArcGIS Hub Dataset"},{"aspects":{"dataset-format":{"confidenceLevel":33,"format":"ARCGIS"},"dcat-distribution-strings":{"accessURL":"https://storymaps.arcgis.com/stories/32757b6fc2a84751ac2ead235d53a839","format":"ArcGIS GeoServices REST API","license":"","mediaType":"application/json","title":"ArcGIS GeoService"},"project-open-data-distribution":{"@type":"dcat:Distribution","accessURL":"https://storymaps.arcgis.com/stories/32757b6fc2a84751ac2ead235d53a839","format":"ArcGIS GeoServices REST API","mediaType":"application/json","title":"ArcGIS GeoService"},"source":{"id":"logan","name":"Logan City Council","type":"project-open-data-distribution","url":"https://data-logancity.opendata.arcgis.com/api/feed/dcat-us/1.1.json"}},"id":"dist-logan-https://www.arcgis.com/home/item.html?id=32757b6fc2a84751ac2ead235d53a839-1","name":"ArcGIS GeoService"}]}, dataset-quality-rating -> {"dataset-linked-data-rating":{"score":0,"weighting":1}}),Some(f2ebbc53-0cd3-4864-adc0-a0e5cc57015b),Some(0))
java.lang.RuntimeException: spray.json.DeserializationException: Expected String as JsString, but got 548686
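
The lenient handling asked for above could look roughly like the following. This is a minimal, language-neutral sketch in Python, not the indexer's actual Scala/spray-json code; the function name `lenient_string` and the logger name are hypothetical. It assumes that a scalar in place of a string should be stringified with a warning, and that any other type should be dropped with a warning rather than failing the whole record.

```python
import json
import logging
from typing import Any, Optional

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("indexer-sketch")


def lenient_string(value: Any, field: str, record_id: str) -> Optional[str]:
    """Return `value` as a string where possible, warning instead of raising."""
    if value is None:
        return None
    if isinstance(value, str):
        return value
    # Numbers and booleans (e.g. an ID emitted as an integer) are stringified.
    if isinstance(value, (bool, int, float)):
        logger.warning(
            "Record %s: field %s should be a string but was %r; coercing",
            record_id, field, value,
        )
        return json.dumps(value)
    # Objects, arrays, etc. cannot be mapped sensibly: drop the field only.
    logger.warning(
        "Record %s: field %s has unsupported type %s; ignoring field",
        record_id, field, type(value).__name__,
    )
    return None


# Usage with the two fields from the failing record in the log above.
aspect = json.loads('{"contactPoint": 548686, "publisher": "548686"}')
contact_point = lenient_string(aspect.get("contactPoint"), "contactPoint", "ds-logan-example")
publisher = lenient_string(aspect.get("publisher"), "publisher", "ds-logan-example")
```

With this approach the record is still indexed, and the warning in the log points back to the upstream mapping config that needs fixing.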

We should also make sure the indexer doesn't stop crawling the next batch when a fetched batch yields an empty list only because of parsing errors.
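
One way to express that requirement is to base the stop condition on what was fetched, not on what was successfully parsed. The sketch below is in Python and is only an illustration: the `fetch_page` and `parse` callables, the page-token scheme, and the function name `crawl` are all assumptions, since the issue does not describe how the indexer actually pages through records.

```python
import logging
from typing import Any, Callable, Dict, Iterator, List, Optional, Tuple

logger = logging.getLogger("indexer-sketch")  # hypothetical logger name

RawRecord = Dict[str, Any]
# Assumed page shape: (raw records, token for the next page or None).
Page = Tuple[List[RawRecord], Optional[str]]


def crawl(
    fetch_page: Callable[[Optional[str]], Page],
    parse: Callable[[RawRecord], Any],
) -> Iterator[List[Any]]:
    """Yield one list of parsed records per fetched page."""
    token: Optional[str] = None
    while True:
        raw, next_token = fetch_page(token)
        parsed: List[Any] = []
        for record in raw:
            try:
                parsed.append(parse(record))
            except (ValueError, TypeError, KeyError) as err:
                # A bad record is skipped; it must not end the crawl.
                logger.warning("Could not parse item %r: %s", record, err)
        yield parsed  # may be empty if every record on the page failed
        # Stop only when the source has nothing more, judged on the raw page.
        if not raw or next_token is None:
            break
        token = next_token


# Usage: the first page contains only unparseable records.
pages: Dict[Optional[str], Page] = {
    None: ([{"contactPoint": 1}, {"contactPoint": 2}], "p2"),
    "p2": ([{"contactPoint": "a"}], None),
}


def strict_parse(record: RawRecord) -> str:
    value = record["contactPoint"]
    if not isinstance(value, str):
        raise ValueError(f"Expected string, got {value!r}")
    return value


batches = list(crawl(lambda t: pages[t], strict_parse))
```

Here the first batch comes back empty because both records fail to parse, yet the second page is still fetched and indexed.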
