
Fix, document and rewrite tests for dataset relationships (or remove) #4212

@amercader


A feature that has been part of CKAN since its early versions, but that few people are aware of, is dataset (package) relationships. These allow datasets to be related to each other through different kinds of relationships. They can be managed via the API, including v3 (actions). We wrongly assumed that they were covered by the legacy tests, but those tests were not being run (see #4157).

We need to decide whether it's worth keeping as a feature or whether to remove it (and of course it can be implemented separately in an extension).

Current status

Relationships have their own model and look like this:

        {
            "comment": "Test relationship between datasets", 
            "object": "test-dataset-jope", 
            "subject": "test-angola-dataset", 
            "type": "links_to"
        }

There are CRUD actions for them (documented with up-to-date docstrings):

ckan/logic/action/update.py
415:def package_relationship_update(context, data_dict):

ckan/logic/action/create.py
495:def package_relationship_create(context, data_dict):

ckan/logic/action/get.py
905:def package_relationships_list(context, data_dict):

ckan/logic/action/delete.py
237:def package_relationship_delete(context, data_dict):
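
As a rough illustration of how these four actions can be exercised over the v3 API, here is a minimal sketch that only builds the HTTP requests without sending them. The site URL, the API key and the `build_action_request` helper are hypothetical. The `subject`, `object`, `type` and `comment` parameters are taken from the relationship dict above; the `id` parameter for `package_relationships_list` is an assumption that should be checked against the action docstrings.

```python
import json
import urllib.request

# Hypothetical values for illustration; replace with a real site and API key.
CKAN_URL = "http://localhost:5000"
API_KEY = "my-api-key"


def build_action_request(action, data_dict, site_url=CKAN_URL, api_key=API_KEY):
    """Build (but do not send) a POST request for a CKAN v3 action."""
    return urllib.request.Request(
        url="{}/api/3/action/{}".format(site_url, action),
        data=json.dumps(data_dict).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": api_key},
        method="POST",
    )


# Same relationship as in the example dict above.
relationship = {
    "subject": "test-angola-dataset",
    "object": "test-dataset-jope",
    "type": "links_to",
    "comment": "Test relationship between datasets",
}

create_req = build_action_request("package_relationship_create", relationship)
update_req = build_action_request(
    "package_relationship_update", dict(relationship, comment="Updated comment")
)
# Assumption: the list action takes the dataset name or id as "id".
list_req = build_action_request(
    "package_relationships_list", {"id": relationship["subject"]}
)
# Assumption: delete identifies the relationship by subject, object and type.
delete_req = build_action_request(
    "package_relationship_delete",
    {key: relationship[key] for key in ("subject", "object", "type")},
)

# To actually send one of them: urllib.request.urlopen(create_req)
```

Building the requests separately from sending them keeps the sketch runnable without a CKAN instance; against a real site each request would be passed to `urllib.request.urlopen`.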

A quick test shows that all of the above are working. What is not working is the integration with the dataset dict, i.e. they are not returned as part of the package_show output:

        "relationships_as_object": [], 
        "relationships_as_subject": [], 
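
The kind of check a regression test for this integration would need can be sketched as follows. The `relationship_counts` helper and the sample dict are hypothetical; the sketch only relies on the two keys shown above and makes no assumption about the shape of the individual entries.

```python
def relationship_counts(package_dict):
    """Return how many relationships a package_show result exposes on each side."""
    return {
        key: len(package_dict.get(key, []))
        for key in ("relationships_as_subject", "relationships_as_object")
    }


# Shape reported in this issue: both lists are empty even though a
# "links_to" relationship was created with this dataset as the subject.
observed = {
    "name": "test-angola-dataset",
    "relationships_as_object": [],
    "relationships_as_subject": [],
}

counts = relationship_counts(observed)

# Once the bug is fixed, a test would expect at least one entry on the
# subject side for this dataset, so this flag should become False.
bug_present = counts["relationships_as_subject"] == 0
```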

Test-wise, the feature is completely lacking coverage. There were some legacy tests, but like all the stuff in #4157 they were not being run because of the test class name. We need tests that cover all four actions above, plus things like the integration with the dataset dict.

There are also issues related to the purging of datasets with relationships: #2186

What do we need in order to keep this feature?

Personally I'd be happy to keep this in core as long as there is:

  • A fix for relationships appearing in dataset dict
  • Fix purging issues
  • Tests (rewrite these)
  • Documentation. I think on that front we are mostly fine with the action docstrings, but a high-level overview page would be nice

A UI for managing the relationships would of course be nice, but I don't think it's a requirement to keep the feature. It can be built later or separately in an extension.

My proposal is to reach out to the community to see if 1) someone is actually using this (unlikely), 2) this is a useful feature, and 3) someone is willing to help with the tasks above.

If the answer to all three is no, we remove the feature.

@ckan/core does this approach sound good?
