datastore doesn't report row number for errors on upsert/create #7748

@wardi

CKAN version

all

Describe the bug

datastore_create and datastore_upsert can fail while loading data. They report an error, but do not indicate which row was being processed when the error occurred.

Steps to reproduce

e.g.

ckanapi action package_create name=foo
ckanapi action datastore_create \
  resource:'{"package_id":"foo"}' records:'[{"a":1}, {"a":"two"}]'

raises:

ckan.logic.ValidationError: None - {
  'records': ['invalid input syntax for type integer: "two"']
}

Expected behavior

There should be an indication of which row triggered the failure, to make it easier to find bad data in large updates.

Additional details

There's no established pattern for returning the position of an error within a list in other CKAN API calls, other than doing something like this (for an error in the 5th row):

ckan.logic.ValidationError: None - {
  'records': [
    {},{},{},{},{"a":['invalid input syntax for type integer: "two"']}
  ]
}

But:

  • we don't get back information from postgres on the column name for type errors
  • if upserting hundreds of rows we might need to create lists with hundreds of empty dicts just to indicate a position
  • this is not backwards compatible for clients expecting a list of strings in records
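To make the last two points concrete, here is a minimal sketch of what building the positional format involves. The helper name `positional_error` is hypothetical and not part of CKAN; the column name is passed in by hand, since per the first point it would not be available from the database for type errors.

```python
def positional_error(row_index, column, message):
    """Build the positional error list: one empty dict per preceding row,
    followed by a dict holding the error for the failing row."""
    return [{} for _ in range(row_index)] + [{column: [message]}]


msg = 'invalid input syntax for type integer: "two"'
err = positional_error(4, "a", msg)

# An error in the 5th row needs four placeholder dicts in front of it.
print(len(err))

# Clients that expect every entry in 'records' to be a string would now
# receive dicts instead.
print(all(isinstance(e, str) for e in err))
```

The list length grows with the failing row's position, so an error near the end of a large upsert would return a correspondingly large payload.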

I'm proposing to do this instead (for an error in the 5th row):

ckan.logic.ValidationError: None - {
  'records': ['invalid input syntax for type integer: "two"'],
  'records_row': 4
}

i.e. add a 0-based records_row value when returning an error where we know which row in records triggered the database error.
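As a rough illustration of how a client might use the proposed key, the sketch below looks up the offending record from the records it submitted. The function name `find_failing_record` is hypothetical, and the plain dict stands in for the error payload shown above; `records_row` is only a proposal at this point.

```python
def find_failing_record(records, error_dict):
    """Return (row, record) for the row named by the proposed 0-based
    'records_row' key, or None when the key is absent or out of range."""
    row = error_dict.get("records_row")
    if row is None or not 0 <= row < len(records):
        return None
    return row, records[row]


# The records the client submitted; the 5th one holds the bad value.
records = [{"a": 1}, {"a": 2}, {"a": 3}, {"a": 4}, {"a": "two"}]

# Stand-in for the error payload in the proposed format.
error_dict = {
    "records": ['invalid input syntax for type integer: "two"'],
    "records_row": 4,
}

print(find_failing_record(records, error_dict))

# Older responses without the key still work; the helper returns None.
print(find_failing_record(records, {"records": ["some error"]}))
```

Because `records` stays a list of strings, existing clients that ignore unknown keys would keep working unchanged.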

I'd love to hear your thoughts on this change.
