Problem: It's possible to use a graph database as the main data storage or to efficiently solve workflows with graph databases

**Problem hypothesis**

1. **Is it possible to retain lossless graph database harvesting in CKAN?**
2. The SQL database that CKAN currently uses limits the number of possible CKAN applications. For example, the triplestore approach is widely used in Switzerland, so CKAN with an SQL database would have to be heavily customized (if that is possible at all) to fit in.
3. The SQL database doesn't fully support the required standards. Which standards are required, and which parts of them are not supported?


**Problem discovery**
_Gathering evidence here. Who mentioned the problem? How do they solve the problem now? Are they ready to commit or provide feedback after delivery?_

- We know that the data sources for Swiss and EU customers are sometimes graph databases. Can we get examples?
- A similar issue came up in past experience with a German client; it concerned DCAT-AP.
- Harvesting of external catalogs in DCAT format: during this process, the connections (relations) between the catalog entities are lost.
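
To illustrate the last point, here is a minimal sketch of how relations can get lost when a DCAT-like graph is flattened into one record per dataset. Everything in it is an assumption made for illustration: the `example:` identifiers, the `FLAT_FIELDS` mapping and the helper functions are hypothetical and are not taken from any actual CKAN harvester.

```python
# Hypothetical DCAT-like graph as (subject, predicate, object) triples.
TRIPLES = [
    ("example:catalog", "rdf:type", "dcat:Catalog"),
    ("example:catalog", "dcat:dataset", "example:ds1"),
    ("example:catalog", "dcat:dataset", "example:ds2"),
    ("example:ds1", "rdf:type", "dcat:Dataset"),
    ("example:ds1", "dct:title", "Air quality"),
    ("example:ds1", "dct:relation", "example:ds2"),
    ("example:ds2", "rdf:type", "dcat:Dataset"),
    ("example:ds2", "dct:title", "Weather stations"),
]

# Assumed flat target schema: only these predicates have a column to land in.
FLAT_FIELDS = {"dct:title": "title"}


def flatten_dataset(subject, triples):
    """Build a flat record for one dataset, keeping only mapped fields."""
    record = {"id": subject}
    for s, p, o in triples:
        if s == subject and p in FLAT_FIELDS:
            record[FLAT_FIELDS[p]] = o
    return record


def lost_triples(triples):
    """Return triples that no flat record can represent.

    Type triples are treated as implicit here, to keep the sketch short.
    """
    return [t for t in triples if t[1] not in FLAT_FIELDS and t[1] != "rdf:type"]


datasets = [s for s, p, o in TRIPLES if p == "rdf:type" and o == "dcat:Dataset"]
records = [flatten_dataset(s, TRIPLES) for s in datasets]
lost = lost_triples(TRIPLES)
```

In this toy case `lost` contains the catalog membership of both datasets and the `dct:relation` link between them, which is the kind of information a lossless harvest would need to keep.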





**Problem statement**
_Formulate the problem found during discovery stage_



-------------------------------------------------------


**Solution hypothesis**
_Formulate the problem found during discovery stage_

1. We can have 2 copies of the metadata: the original (graph) and a Frictionless one.
2. Can we have a single DB?
3. What if I could choose the format in which I would like to store the data? _Would we have a database for each format? It sounds like it would be easier to get out of sync._

**Solution discovery**
_Log here everything you've found during the discovery_

Two solutions have been discussed and are being researched:
- Lossless harvesting storage with 2 databases (graph and postgres) https://github.com/ckan/ckan/issues/7514
- Generic solution to store triples https://github.com/ckan/ckan/issues/7515

- The top-level entity in DCAT is the catalog; we don't have this structure in CKAN. A 1:1 mapping from DCAT to CKAN is not possible: some metadata fields can't be mapped.
- Currently, 1 package is 1 row in the table. We could store triples with a dedicated plugin, so that we can import the original data and keep it as a copy.
- What if we have a generic solution?
- There was a project in Taiwan involving a triplestore, but it was highly customized -> ⚠️ We need a solution that would work for more people.
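
The idea of keeping one row per package while storing the original triples next to it could look roughly like the sketch below. It uses an in-memory SQLite database as a stand-in for Postgres. The table names (`package`, `package_triple`) and the helper functions are hypothetical; they do not reflect CKAN's actual schema or the approaches proposed in the linked issues.

```python
import sqlite3

# In-memory SQLite stands in for Postgres in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE package (id TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE package_triple (
        package_id TEXT NOT NULL REFERENCES package(id),
        subject    TEXT NOT NULL,
        predicate  TEXT NOT NULL,
        object     TEXT NOT NULL
    );
    """
)


def store_package(conn, package_id, title, triples):
    """Insert the flat package row and an untouched copy of its triples."""
    with conn:  # commit on success, roll back on error
        conn.execute(
            "INSERT INTO package (id, title) VALUES (?, ?)", (package_id, title)
        )
        conn.executemany(
            "INSERT INTO package_triple VALUES (?, ?, ?, ?)",
            [(package_id, s, p, o) for s, p, o in triples],
        )


def original_triples(conn, package_id):
    """Return the stored triples for a package in insertion order."""
    rows = conn.execute(
        "SELECT subject, predicate, object FROM package_triple "
        "WHERE package_id = ? ORDER BY rowid",
        (package_id,),
    )
    return [tuple(row) for row in rows]


# Hypothetical harvested source data.
source = [
    ("example:ds1", "dct:title", "Air quality"),
    ("example:ds1", "dct:relation", "example:ds2"),
]
store_package(conn, "ds1", "Air quality", source)
restored = original_triples(conn, "ds1")
```

The flat `package` row stays usable for existing queries, while `restored` gives back the harvested triples unchanged, including the relation that has no flat column. Whether this lives in the same database or in a separate graph store is exactly the open question above.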

**Validation**
_Why is the solution trustworthy? What makes it strong?_


**Questions to consider:**

Is this change going to break current installations?

Can we provide backwards compatibility?

How easy is it going to be for current implementations to migrate to this new release?

Do current versions of CKAN have the adequate resources/support to migrate to this new version?

Are we going to change the database schema?

Are we going to change the API?

Are we going to deprecate Interfaces?
