-
Notifications
You must be signed in to change notification settings - Fork 85
DB models for custom taxonomies #6421
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The latest updates on your projects. Learn more about Vercel for GitHub. 2 Skipped Deployments
|
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #6421 +/- ##
==========================================
+ Coverage 87.61% 87.65% +0.04%
==========================================
Files 459 465 +6
Lines 29494 29723 +229
Branches 3280 3303 +23
==========================================
+ Hits 25841 26054 +213
- Misses 2948 2960 +12
- Partials 705 709 +4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
@@ -354,133 +346,7 @@ def clean_datasets( | |||
) | |||
|
|||
|
|||
def activate_taxonomy_parents( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for moving all these to a service class! 😌
return validate_and_create_taxonomy(db, model, validation_schema, data) | ||
|
||
return validate_and_create_taxonomy(db, model, validation_schema, data) | ||
# Legacy validation functions removed - functionality moved to TaxonomyService |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need this comment -- it reads like cursor explaining its changes lol
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You right 😄
src/fides/api/models/taxonomy.py
Outdated
|
||
|
||
class Taxonomy(Base, FidesBase): | ||
"""The SQL model for taxonomy resources.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we maybe add some info to the docstring about how the out-of-the-box taxonomies relate to this model? so that new devs (or us in 2 years lol) running into this code don't have to search around for an explanation
src/fides/api/models/taxonomy.py
Outdated
id = Column( | ||
String(255), nullable=False, index=False, default=FidesBase.generate_uuid | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
usually we inherit the id
column from FidesBase
right? is there no issue with re-defining the id column?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also don't we need a name
column?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine for the id
column, I'm just overriding the base definition so it's not treated as a primary key. As for name
, it's inherited from FidesBase
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay I totally missed that on my first read. So what's the primary key of this table?
src/fides/api/models/taxonomy.py
Outdated
__table_args__ = ( | ||
# Simple two-column index for queries | ||
Index("ix_allowed_usage_lookup", "source_taxonomy_key", "target_type"), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you need this index given that you've defined (source_taxonomy_key, target_type)
as a composite PK with index=True
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch! I got rid of the extra index definition
@abstractmethod | ||
def get_model(self, taxonomy_type: str) -> Type: | ||
pass | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should be able to type this more strictly; maybe we can define a TaxonomyType
union type or a base class that other types inherit from?
Actually what I'd really like to know is if we can actually use overloads to do something like the following,(ab?)using the fact that a string literal is actually a type:
@overloads
def get_model(self, taxonomy_type: Literal['data_category']) -> DataCategory:
@overloads
def get_model(self, taxonomy_type: Literal['data_use'])->DataUse:
.... other overloads
so if you call get_model('data_category')
the type checker will actually know the return type is DataCategory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
idk if this would be done here or below in the concrete class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is nice! I didn't know about @overloads
in Python
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I too didn't know they existed but whenever I find myself thinking "In Typescript I could do this way cleaner" I go and look up if there's a way to do it in Python 😆 and there was!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the recommendations, I think it's a lot cleaner now!
src/fides/api/models/taxonomy.py
Outdated
id = Column( | ||
String(255), nullable=False, index=False, default=FidesBase.generate_uuid | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine for the id
column, I'm just overriding the base definition so it's not treated as a primary key. As for name
, it's inherited from FidesBase
.
src/fides/api/models/taxonomy.py
Outdated
|
||
# Create TaxonomyAllowedUsage objects for each applies_to value | ||
allowed_usages = [ | ||
TaxonomyAllowedUsage( # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was getting this error from mypy
src/fides/api/models/taxonomy.py:111: error: Unexpected column "source_taxonomy_key" for model
"TaxonomyAllowedUsage" [misc]
TaxonomyAllowedUsage(
^
src/fides/api/models/taxonomy.py:111: error: Unexpected column "target_type" for model
"TaxonomyAllowedUsage" [misc]
TaxonomyAllowedUsage(
^
But I just found out you could do this instead (I was missing the Column[str]
annotations)
source_taxonomy_key: Column[str] = Column(
String,
ForeignKey("taxonomy.fides_key", ondelete="CASCADE"),
primary_key=True,
index=True,
)
target_type: Column[str] = Column(
String, primary_key=True
) # Can be "system", "dataset", OR a taxonomy key like "data_categories"
src/fides/api/models/taxonomy.py
Outdated
__table_args__ = ( | ||
# Simple two-column index for queries | ||
Index("ix_allowed_usage_lookup", "source_taxonomy_key", "target_type"), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch! I got rid of the extra index definition
@abstractmethod | ||
def get_model(self, taxonomy_type: str) -> Type: | ||
pass | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is nice! I didn't know about @overloads
in Python
return validate_and_create_taxonomy(db, model, validation_schema, data) | ||
|
||
return validate_and_create_taxonomy(db, model, validation_schema, data) | ||
# Legacy validation functions removed - functionality moved to TaxonomyService |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You right 😄
fides
|
Project |
fides
|
Branch Review |
main
|
Run status |
|
Run duration | 00m 54s |
Commit |
|
Committer | Adrian Galvan |
View all properties for this run ↗︎ |
Test results | |
---|---|
|
0
|
|
0
|
|
0
|
|
0
|
|
5
|
View all changes introduced in this branch ↗︎ |
Closes ENG-1073
Description Of Changes
Taxonomy
: defines a taxonomy type and its allowed targets viaapplies_to
TaxonomyAllowedUsage
: declares which target types/taxonomies a taxonomy can apply toTaxonomyElement
: hierarchical elements within a taxonomy (supports parent/child)TaxonomyUsage
: records applications of one taxonomy’s element to another elementdata_categories
,data_uses
,data_subjects
,system_groups
).TaxonomyService
with centralized validation and hierarchy activation/deactivation behavior, and a base handler abstraction with a legacy handler implementation.data_category
,data_use
,data_subject
) through the newTaxonomyService
to unify validation and reactivation behavior.Potential caveats:
applies_to
target that’s currently referenced bytaxonomy_usage
will fail due to FK constraints; usages must be removed first.TaxonomyElement
with existing children is restricted at the DB level; delete or move children first.Code Changes
src/fides/api/models/taxonomy.py
(addsTaxonomy
,TaxonomyAllowedUsage
,TaxonomyElement
,TaxonomyUsage
)src/fides/api/alembic/migrations/versions/3baf42d251a6_add_generic_taxonomy_models.py
src/fides/api/api/v1/endpoints/generic_overrides.py
(create/update for data uses/categories/subjects now useTaxonomyService
; legacy inline validation removed)src/fides/api/service/deps.py
(newget_taxonomy_service
)src/fides/api/db/base.py
(imports new models)src/fides/service/taxonomy/__init__.py
src/fides/service/taxonomy/handlers/base.py
src/fides/service/taxonomy/handlers/legacy_handler.py
src/fides/service/taxonomy/taxonomy_service.py
src/fides/service/taxonomy/utils.py
.fides/db_dataset.yml
(alignments for taxonomy-related filtering)tests/ops/models/test_taxonomy.py
tests/service/test_taxonomy_service.py
Steps to Confirm
docker exec fides bash -c "pytest -q tests/ops/models/test_taxonomy.py"
docker exec fides bash -c "pytest -q tests/service/test_taxonomy_service.py"
/api/v1/data_category
with a new category; expect 201 and generated/validated key behavior./api/v1/data_category
to toggleactive
and verify parent/child cascade semantics via subsequent GETs./api/v1/data_use
and/api/v1/data_subject
.taxonomy
,taxonomy_allowed_usage
,taxonomy_element
,taxonomy_usage
.taxonomy
.applies_to
target that’s in use (should fail).TaxonomyElement
with children (should be restricted).Pre-Merge Checklist
CHANGELOG.md
updateddb-migration
label to the entryhigh-risk
label if you believe the FK constraints or activation/deactivation logic could have performance/regression riskmain
downgrade()
migration is correct and works