-
Notifications
You must be signed in to change notification settings - Fork 2.1k
package_update: faster dataset metadata update #8421
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This just needs a few more tests |
ckan/logic/action/update.py
Outdated
if not dataset_changed and original_package.get('resources' | ||
) == data_dict.get('resources'): | ||
# flag no change for caller | ||
context['package_update_unchanged'] = 'before validation' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not in love with this part but it's the simplest possible way I could think to let the activity extension skip creating useless activity entries. I don't like that this is a new Context value that's a really specific case not a general solution.
Alternatives I considered:
- Create a generic way to return extra information in the API envelope via the context that could be used by other actions. This would be useful for API users so they would know if an update actually occurred after any of the package or resource updating actions.
- Create a new interface for the activity plugin that could be passed both the
original_package
and updateddata_dict
frompackage_update
so it doesn't have to callpackage_show
itself or and the generic signal handling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are getting rid of an ugly context var and introducing an ugly context var so it evens out :)
Create a generic way to return extra information in the API envelope via the context that could be used by other actions.
You mean something like context["extra_data"]["package_update_unchanged"] = "before validation"
?
This looks like encouraging sharing info in the context, so perhaps I prefer a specific key that is clear where it comes from. Unless I misunderstood?
This would be useful for API users so they would know if an update actually occurred after any of the package or resource updating actions.
Not sure what you mean here. You mean an update happening e.g. in a chained action?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@amercader I'm thinking something like context['updated_entities']['package'] = [id]
and then expose that updated_entities
in the API response envelope or something so that API users can know when something has been modified as a result of the action they call. Wouldn't need to implement it for everything right away but it would naturally extend to resources, groups, orgs, users etc. and bulk actions that may update many things at once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed at meeting I'll implement this as context['changed_entities']
to potentially encompass creation and deletion in the future, add this value to the response envelope for API calls and document it in the changelog.
Performance measurements added to description |
ckan/logic/action/update.py
Outdated
if not dataset_changed and original_package.get('resources' | ||
) == data_dict.get('resources'): | ||
# flag no change for caller | ||
context['package_update_unchanged'] = 'before validation' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are getting rid of an ugly context var and introducing an ugly context var so it evens out :)
Create a generic way to return extra information in the API envelope via the context that could be used by other actions.
You mean something like context["extra_data"]["package_update_unchanged"] = "before validation"
?
This looks like encouraging sharing info in the context, so perhaps I prefer a specific key that is clear where it comes from. Unless I misunderstood?
This would be useful for API users so they would know if an update actually occurred after any of the package or resource updating actions.
Not sure what you mean here. You mean an update happening e.g. in a chained action?
Fixes #8407
Proposed fixes:
IDatasetForm.resource_validation_dependencies
for declaring dataset fields that may affect resource validation (default: none)metadata_modified
value on datasets - useful for harvesting or mirroring ckan sitesmetadata_modified
when dataset or resource metadata has changedallow_partial_update
context parameter and instead allow normal API users to:resources
users
,datasets
andgroups
Features: