-
Could the resource dictionary then be preserved as a default at the dataset level, so that a given resource could be deleted and replaced without losing the dictionary? Or could we load multiple resources with the same schema and have them adopt the dataset dictionary (we could test first whether the dataset default dictionary is OK for adoption by each resource)? That would simplify datasets where annual data is appended.
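A rough sketch of what that adoption test might look like. Everything here is an assumption for illustration: `can_adopt` is a hypothetical helper, the dictionary is modelled as a list of field dicts keyed by `id`, and "OK for adoption" is taken to mean that every column in the resource has an entry in the dataset default.

```python
def can_adopt(default_dict, resource_columns):
    """Return True if every resource column is described by the dataset default.

    default_dict: list of field dicts, each with an "id" key (assumed shape).
    resource_columns: iterable of column names found in the new resource.
    """
    described = {field["id"] for field in default_dict}
    return set(resource_columns) <= described


# Hypothetical dataset-level default dictionary.
default_dict = [
    {"id": "year", "info": {"label": "Year"}},
    {"id": "total", "info": {"label": "Total"}},
]

matches = can_adopt(default_dict, ["year", "total"])    # same schema
differs = can_adopt(default_dict, ["year", "region"])   # "region" is undescribed
```

A stricter check could also compare column types, but that depends on what the dataset default ends up storing.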
-
Suggested model to support many types of data dictionaries:
-
Data dictionary-related features compared between CSVW, JSON Schema, and Data Package Table Schema:
-
Also noting here as an argument in favour: the datastore schema info is not accessible while the table is locked, which happens when the datapusher does a truncate/load in a separate thread. For loads that take a significant amount of time, this can lead to gateway errors on the front end, because the data dictionary on the resource page waits on a lock that can be held for minutes or more for a large table.
-
@PatLittle mentioned this useful tool: https://github.com/WPRDC/little-lexicographer
-
Since the data dictionary feature was first introduced (#3414), there has been some discomfort with its implementation as JSON-encoded column comments.
Column comments have the benefit of being removed automatically when a column is removed and of being accessible from within the datastore, e.g. via `datastore_search_sql` (this hasn't been widely used AFAIK).
But there are some real drawbacks:
Let's consider moving the column comment data from the datastore database to the main ckan database. Fields can be indexed by resource id and column name so they will retain information if columns are removed or reordered.
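To make the storage idea concrete, here is a minimal sketch of entries keyed by resource id and column name. It is not a proposed schema: the table name `data_dictionary_field`, the helper functions, and the use of in-memory SQLite in place of the main CKAN database are all assumptions for illustration.

```python
import json
import sqlite3

# In-memory SQLite stands in for the main CKAN database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE data_dictionary_field (
        resource_id TEXT NOT NULL,
        column_name TEXT NOT NULL,
        info        TEXT NOT NULL,  -- JSON-encoded, like today's column comments
        PRIMARY KEY (resource_id, column_name)
    )
    """
)


def set_field_info(resource_id, column_name, info):
    """Insert or overwrite the dictionary entry for one column."""
    conn.execute(
        "INSERT OR REPLACE INTO data_dictionary_field VALUES (?, ?, ?)",
        (resource_id, column_name, json.dumps(info)),
    )


def get_field_info(resource_id, column_name):
    """Return the stored entry, or None if the column was never described."""
    row = conn.execute(
        "SELECT info FROM data_dictionary_field"
        " WHERE resource_id = ? AND column_name = ?",
        (resource_id, column_name),
    ).fetchone()
    return json.loads(row[0]) if row else None


# Hypothetical resource id and column.
set_field_info("res-1", "year", {"label": "Year", "notes": "Fiscal year"})
```

Because the key is the column name rather than its position, the entry for `year` survives a reorder, and it also survives the column being dropped and re-created by a reload.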
The `datastore_create` and `datastore_info` APIs will continue to be able to update and read the data dictionary for backwards compatibility, but new endpoints will be added for CRUD operations on data dictionaries that don't rely on the datastore.
We'll need some way to prune or clean up old data dictionary entries that no longer apply, but we have a pattern for this with other CLI commands.
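As a rough illustration of the pruning step, the sketch below only works out which entries are stale, the way a dry-run CLI command might before deleting anything. `stale_entries` and its input shapes are hypothetical, not existing CKAN code.

```python
def stale_entries(stored, live_columns):
    """Return the (resource_id, column_name) keys that no longer apply.

    stored: iterable of (resource_id, column_name) keys that have entries.
    live_columns: dict mapping resource_id -> set of current column names;
        a resource missing from the dict is treated as deleted.
    """
    return [
        (rid, col)
        for rid, col in stored
        if col not in live_columns.get(rid, set())
    ]


# Hypothetical data: res-1 dropped "old_total", res-2 no longer exists.
stored = [("res-1", "year"), ("res-1", "old_total"), ("res-2", "name")]
live = {"res-1": {"year", "total"}}
to_prune = stale_entries(stored, live)
```

Keeping deletion as an explicit, separate step means entries are still retained across a truncate/load until someone decides to clean up.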
Note
I'm avoiding the more complicated issue of merging data dictionaries and the table schemas used in ckanext-validation in this discussion, to focus on this smaller step that improves functionality while maintaining compatibility for current users.