Skip to content

Conversation

tvandort
Copy link
Contributor

@tvandort tvandort commented Aug 11, 2025

Closes ENG-1141

Description Of Changes

Configure S3 role assumption in client creation code so that all uses support role assumption.

Code Changes

  • list your code changes here

Steps to Confirm

  1. list any manual steps for reviewers to confirm the changes

Pre-Merge Checklist

  • Issue requirements met
  • All CI pipelines succeeded
  • CHANGELOG.md updated
    • Add a db-migration This indicates that a change includes a database migration label to the entry if your change includes a DB migration
    • Add a high-risk This issue suggests changes that have a high-probability of breaking existing code label to the entry if your change includes a high-risk change (i.e. potential for performance impact or unexpected regression) that should be flagged
    • Updates unreleased work already in Changelog, no new entry necessary
  • Followup issues:
    • Followup issues created
    • No followup issues
  • Database migrations:
    • Ensure that your downrev is up to date with the latest revision on main
    • Ensure that your downgrade() migration is correct and works
      • If a downgrade migration is not possible for this change, please call this out in the PR description!
    • No migrations
  • Documentation:
    • Documentation complete, PR opened in fidesdocs
    • Documentation issue created in fidesdocs
    • If there are any new client scopes created as part of the pull request, remember to update public-facing documentation that references our scope registry
    • No documentation updates required

Copy link

vercel bot commented Aug 11, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

2 Skipped Deployments
Project Status Preview Comments Updated (UTC)
fides-plus-nightly ⬜️ Ignored Visit Preview Aug 11, 2025 9:13pm
fides-privacy-center ⬜️ Ignored Aug 11, 2025 9:13pm

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Summary

This PR centralizes AWS S3 role assumption configuration by moving the logic from individual call sites into the get_s3_client utility function. Previously, each caller of get_s3_client needed to explicitly pass the aws_s3_assume_role_arn from the configuration. Now, the get_s3_client function itself checks for a configured role ARN in CONFIG.credentials.get('storage', {}).get('aws_s3_assume_role_arn') and uses it as a fallback when no explicit assume_role_arn parameter is provided.

The change affects two key files: src/fides/api/tasks/storage.py where the hardcoded role assumption parameter is removed from the upload_to_s3 function, and src/fides/api/util/aws_util.py where the configuration lookup logic is added. This refactoring ensures consistent role assumption behavior across all S3 operations in the Fides application without requiring code changes at every usage point. The implementation maintains backward compatibility by giving precedence to explicitly passed assume_role_arn parameters over the configured default.

This change fits into the broader Fides architecture by centralizing AWS configuration management, which aligns with the privacy management framework's need for consistent and secure cloud storage operations. The modification supports deployment scenarios where a specific IAM role must be assumed for all S3 operations, allowing configuration through environment variables or the fides.toml config file.

PR Description Notes:

  • The "Code Changes" and "Steps to Confirm" sections are incomplete with placeholder text
  • The pre-merge checklist is entirely unchecked

Important Files Changed

Files Changed
Filename Score Overview
src/fides/api/tasks/storage.py 4/5 Removes hardcoded role assumption parameter from upload_to_s3 function call
src/fides/api/util/aws_util.py 4/5 Adds centralized configuration lookup for AWS S3 role assumption in get_s3_client

Confidence score: 4/5

  • This PR is safe to merge with minimal risk as it centralizes existing functionality without changing core behavior
  • Score reflects clean refactoring with maintained backward compatibility and consistent error handling patterns
  • Pay close attention to aws_util.py to ensure the configuration precedence logic works as intended

Sequence Diagram

sequenceDiagram
    participant User
    participant "upload_to_s3" as Upload
    participant "get_s3_client" as S3Client
    participant "get_aws_session" as AWSSession
    participant "get_assumed_role_session" as AssumeRole
    participant "AWS STS" as STS
    participant "AWS S3" as S3

    User->>Upload: "upload_to_s3(storage_secrets, data, bucket_name, file_key, resp_format, privacy_request, document, auth_method)"
    Upload->>S3Client: "get_s3_client(auth_method, storage_secrets)"
    S3Client->>AWSSession: "get_aws_session(auth_method, storage_secrets, assume_role_arn)"
    
    alt auth_method == SECRET_KEYS
        AWSSession->>AWSSession: "Create session with access keys"
    else auth_method == AUTOMATIC
        AWSSession->>AWSSession: "Create automatic session"
    end
    
    AWSSession->>STS: "get_caller_identity()"
    STS-->>AWSSession: "Valid credentials confirmed"
    
    alt assume_role_arn exists
        AWSSession->>AssumeRole: "get_assumed_role_session(assume_role_arn, sts_client, region_name)"
        AssumeRole->>STS: "assume_role(RoleArn, RoleSessionName)"
        STS-->>AssumeRole: "Temporary credentials"
        AssumeRole->>AssumeRole: "Create session with temp credentials"
        AssumeRole-->>AWSSession: "Assumed role session"
    end
    
    AWSSession-->>S3Client: "AWS session"
    S3Client->>S3Client: "Create S3 client with signature v4 config"
    S3Client-->>Upload: "S3 client"
    
    Upload->>Upload: "write_to_in_memory_buffer(resp_format, data, privacy_request)"
    Upload->>S3: "upload_fileobj(Fileobj, Bucket, Key)"
    S3-->>Upload: "Upload complete"
    
    Upload->>S3: "create_presigned_url_for_s3(s3_client, bucket_name, file_key)"
    S3-->>Upload: "Presigned URL"
    Upload-->>User: "Presigned URL"
Loading

2 files reviewed, no comments

Edit Code Review Bot Settings | Greptile

Copy link

codecov bot commented Aug 11, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.38%. Comparing base (0261f71) to head (e57200d).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #6443   +/-   ##
=======================================
  Coverage   87.38%   87.38%           
=======================================
  Files         457      457           
  Lines       29274    29276    +2     
  Branches     3245     3245           
=======================================
+ Hits        25582    25584    +2     
  Misses       2983     2983           
  Partials      709      709           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@@ -95,10 +97,16 @@ def get_s3_client(
If an `assume_role_arn` is provided, the secrets will be used to
assume that role and return a Session instantiated with that role.
"""

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would callout in the docstring here that we fallback/default to the configured storage credentials.

i don't think we've got any cases yet where we're calling get_s3_client outside of the "storage" functionality, but it's not inconceivable that we'd want e.g. an s3 monitor to call this utility function, and it may not rely on these same credentials.

i don't think that hypothetical should prevent us from making this update, but i do think that we should make it clear to any potential future callers of this very broad utility function that they need to be a bit careful about the assumptions it makes on where it's going to look for the assume_role_arn...

Copy link
Contributor

@adamsachs adamsachs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@adamsachs adamsachs merged commit 15ad474 into main Aug 11, 2025
17 checks passed
@adamsachs adamsachs deleted the ENG-1141 branch August 11, 2025 21:20
@greptile-apps greptile-apps bot mentioned this pull request Aug 11, 2025
16 tasks
Copy link

cypress bot commented Aug 11, 2025

fides    Run #13223

Run Properties:  status check passed Passed #13223  •  git commit 15ad474c58: Configure role assumption in create client. (#6443)
Project fides
Branch Review main
Run status status check passed Passed #13223
Run duration 01m 00s
Commit git commit 15ad474c58: Configure role assumption in create client. (#6443)
Committer Tom Van Dort
View all properties for this run ↗︎

Test results
Tests that failed  Failures 0
Tests that were flaky  Flaky 0
Tests that did not run due to a developer annotating a test with .skip  Pending 0
Tests that did not run due to a failure in a mocha hook  Skipped 0
Tests that passed  Passing 5
View all changes introduced in this branch ↗︎

adamsachs added a commit that referenced this pull request Aug 11, 2025
Co-authored-by: Eliana Rosselli <eliana@ethyca.com>
Co-authored-by: Adam Sachs <adam@ethyca.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants