Skip to content

Conversation

galvana
Copy link
Contributor

@galvana galvana commented Aug 1, 2025

Description Of Changes

Continuation of #6397.

Code Changes

  • Adding similar missing table checks for SQLConnector.mask_data

Steps to Confirm

  1. Start Fides and the Admin UI
  2. Run the python qa bigquery_table_not_found setup QA scenario
  3. Submit an erasure request for customer-1@example.com
  4. The request should finish, the customer_profile collection should be skipped for the access steps and complete (with 0 results) for the erasure steps
  5. Run the python qa bigquery_table_not_found setup --table customer QA scenario (note the customer param)
  6. Submit an erasure request for customer-1@example.com
  7. Verify the access errors and that the erasure doesn't run

Pre-Merge Checklist

  • Issue requirements met
  • All CI pipelines succeeded
  • CHANGELOG.md updated
    • Add a db-migration This indicates that a change includes a database migration label to the entry if your change includes a DB migration
    • Add a high-risk This issue suggests changes that have a high-probability of breaking existing code label to the entry if your change includes a high-risk change (i.e. potential for performance impact or unexpected regression) that should be flagged
    • Updates unreleased work already in Changelog, no new entry necessary
  • Followup issues:
    • Followup issues created
    • No followup issues
  • Database migrations:
    • Ensure that your downrev is up to date with the latest revision on main
    • Ensure that your downgrade() migration is correct and works
      • If a downgrade migration is not possible for this change, please call this out in the PR description!
    • No migrations
  • Documentation:
    • Documentation complete, PR opened in fidesdocs
    • Documentation issue created in fidesdocs
    • If there are any new client scopes created as part of the pull request, remember to update public-facing documentation that references our scope registry
    • No documentation updates required

Copy link

vercel bot commented Aug 1, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

2 Skipped Deployments
Name Status Preview Comments Updated (UTC)
fides-plus-nightly ⬜️ Ignored (Inspect) Visit Preview Aug 2, 2025 0:25am
fides-privacy-center ⬜️ Ignored (Inspect) Aug 2, 2025 0:25am

@galvana galvana requested a review from thabofletcher August 1, 2025 19:44
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Summary

This PR extends the missing table detection functionality from PR #6397 to erasure operations by adding the same error handling pattern to the SQLConnector.mask_data method that was previously only available in retrieve_data. The changes ensure consistent behavior between access and erasure operations when encountering missing tables in SQL databases, particularly important for BigQuery scenarios.

The core implementation wraps the SQL execution in mask_data with a try-catch block that checks table existence when exceptions occur. For collections without dependencies (leaf collections), it raises TableNotFound to allow graceful skipping. For collections with dependencies, it raises ConnectionException to prevent data integrity issues by failing the operation entirely.

The PR also updates the BigQuery QA scenario to properly handle erase_after dependency references when creating test datasets with missing tables, and adds comprehensive test coverage for both leaf and dependency collection scenarios during erasure operations. This ensures the privacy request execution engine maintains proper error handling and data integrity across both access and erasure phases.

Important Files Changed

Files Modified
Filename Score Overview
src/fides/api/service/connectors/sql_connector.py 4/5 Adds missing table detection to mask_data method with try-catch error handling
tests/ops/service/connectors/test_sql_connector.py 4/5 Comprehensive test coverage for mask_data missing table scenarios
qa/scenarios/bigquery_table_not_found.py 4/5 Updates QA scenario to handle erase_after references in dataset modifications
tests/ops/service/privacy_request/test_bigquery_privacy_requests.py 4/5 Integration tests for BigQuery erasure operations with missing tables

Confidence score: 4/5

  • This PR extends existing functionality with well-tested patterns and maintains consistency across access and erasure operations
  • Score reflects solid implementation following established patterns, but complexity of error handling logic warrants careful review
  • Pay close attention to the error handling logic in sql_connector.py to ensure dependency checking works correctly

Sequence Diagram

sequenceDiagram
    participant User
    participant QAScenario as "QA Scenario"
    participant FidesAPI as "Fides API"
    participant SQLConnector as "SQL Connector"
    participant BigQueryConnector as "BigQuery Connector"
    participant Database as "BigQuery Database"

    User->>QAScenario: "Run QA scenario with table parameter"
    QAScenario->>QAScenario: "Create modified dataset with renamed table"
    QAScenario->>FidesAPI: "Create dataset in Fides"
    QAScenario->>FidesAPI: "Create system and connection"
    QAScenario->>FidesAPI: "Link dataset to connection"
    
    User->>FidesAPI: "Submit erasure request for customer-1@example.com"
    FidesAPI->>SQLConnector: "Execute mask_data for each collection"
    
    loop "For each row to mask"
        SQLConnector->>SQLConnector: "Generate update statement"
        alt "SQL dry run disabled"
            SQLConnector->>Database: "Execute update statement"
            Database-->>SQLConnector: "Exception: Table not found"
            SQLConnector->>SQLConnector: "Get qualified table name"
            SQLConnector->>BigQueryConnector: "Check if table exists"
            BigQueryConnector->>Database: "Query table metadata"
            Database-->>BigQueryConnector: "Table does not exist"
            BigQueryConnector-->>SQLConnector: "Return false"
            
            alt "Collection has outgoing dependencies"
                SQLConnector-->>FidesAPI: "Raise ConnectionException (hard error)"
                FidesAPI-->>User: "Request fails with error"
            else "Collection is leaf node"
                SQLConnector-->>FidesAPI: "Raise TableNotFound (skip collection)"
                FidesAPI->>FidesAPI: "Skip collection and continue"
                FidesAPI-->>User: "Request completes successfully"
            end
        else "SQL dry run enabled"
            SQLConnector->>SQLConnector: "Log SQL statement instead of executing"
        end
    end
Loading

4 files reviewed, no comments

Edit Code Review Bot Settings | Greptile

Copy link
Contributor

@thabofletcher thabofletcher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Tested, including the recent change to verify BIGQUERY_DATASET

Copy link

codecov bot commented Aug 1, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.11%. Comparing base (532c649) to head (daa9da3).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6414      +/-   ##
==========================================
+ Coverage   87.09%   87.11%   +0.01%     
==========================================
  Files         455      455              
  Lines       29248    29254       +6     
  Branches     3242     3243       +1     
==========================================
+ Hits        25475    25485      +10     
+ Misses       3048     3044       -4     
  Partials      725      725              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@galvana galvana merged commit 76eb279 into main Aug 2, 2025
17 checks passed
@galvana galvana deleted the adding-missing-table-detection-for-erasures branch August 2, 2025 00:32
Copy link

cypress bot commented Aug 2, 2025

fides    Run #13202

Run Properties:  status check passed Passed #13202  •  git commit 76eb2795d6: Adding missing table detection for erasures (#6414)
Project fides
Branch Review main
Run status status check passed Passed #13202
Run duration 01m 03s
Commit git commit 76eb2795d6: Adding missing table detection for erasures (#6414)
Committer Adrian Galvan
View all properties for this run ↗︎

Test results
Tests that failed  Failures 0
Tests that were flaky  Flaky 0
Tests that did not run due to a developer annotating a test with .skip  Pending 0
Tests that did not run due to a failure in a mocha hook  Skipped 0
Tests that passed  Passing 5
View all changes introduced in this branch ↗︎

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants