Closed
Description
CKAN version
all
Describe the bug
datastore_create and datastore_upsert can fail while loading data and report an error, but they do not indicate which row was being processed when the error occurred.
Steps to reproduce
e.g.
ckanapi action package_create name=foo
ckanapi action datastore_create \
resource:'{"package_id":"foo"}' records:'[{"a":1}, {"a":"two"}]'
raises:
ckan.logic.ValidationError: None - {
'records': ['invalid input syntax for type integer: "two"']
}
Expected behavior
There should be an indication of which row triggered the failure, to make it easier to find bad data in large updates.
Additional details
There's no established pattern in other CKAN API calls for returning the position of an error within a list, other than doing something like this (for an error in the 5th row):
ckan.logic.ValidationError: None - {
'records': [
{},{},{},{},{"a":['invalid input syntax for type integer: "two"']}
]
}
But:
- we don't get back information from postgres on the column name for type errors
- if upserting hundreds of rows we might need to create lists with hundreds of empty dicts just to indicate a position
- this is not backwards compatible for clients expecting a list of strings in records
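To make the second and third points concrete, here is a minimal sketch of what building the position-by-padding shape would involve. The helper name padded_error is hypothetical and not part of CKAN, and it assumes the column name is known, which per the first point it usually is not:

```python
def padded_error(row_index, column, message):
    """Build the nested error shape: one empty dict per earlier row,
    then a dict holding the message for the failing row.

    Hypothetical helper for illustration only; not a CKAN function.
    """
    return [{} for _ in range(row_index)] + [{column: [message]}]


# An error in the 500th row (index 499) needs 499 placeholder dicts.
err = padded_error(499, "a", 'invalid input syntax for type integer: "two"')
print(len(err))

# Every entry is now a dict, so a client that expects a list of
# strings in records would no longer find any.
print(all(isinstance(item, dict) for item in err))
```

The size of the error payload grows with the row position rather than with the amount of information being reported.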
I'm proposing to do this instead (for an error in the 5th row):
ckan.logic.ValidationError: None - {
'records': ['invalid input syntax for type integer: "two"'],
'records_row': 4
}
i.e. add a 0-based records_row value when returning an error where we know which row in records triggered the database error.
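As a rough sketch of how a client might use the proposed key, the snippet below looks up the offending record from the submitted list. The helper name failing_record is hypothetical, and it assumes the error dict has the shape proposed above, with records_row absent when the row is not known:

```python
def failing_record(records, error_dict):
    """Return (index, record) for the row named by 'records_row'.

    Returns None when the key is missing, is not an integer, or is out
    of range, so clients keep working against servers that do not
    send it. Hypothetical helper; not part of CKAN or ckanapi.
    """
    row = error_dict.get("records_row")
    if not isinstance(row, int) or not 0 <= row < len(records):
        return None
    return row, records[row]


records = [{"a": 1}, {"a": 2}, {"a": 3}, {"a": 4}, {"a": "two"}]
error_dict = {
    "records": ['invalid input syntax for type integer: "two"'],
    "records_row": 4,
}
print(failing_record(records, error_dict))  # (4, {'a': 'two'})
```

Because records stays a list of strings, existing clients that only read that key are unaffected, and the extra key can simply be ignored.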
I'd love to hear your thoughts on this change.