Skip to content

Conversation

thomasyopes
Copy link
Contributor

@thomasyopes thomasyopes commented Jul 9, 2025

Issues:

Description

Testing

Check each PR.

Release Plan

  • ⚠️ Points to master
  • Merge this

Summary by CodeRabbit

  • New Features
    • Added support for Snowflake integration by introducing new configuration options for Snowflake credentials.
    • Enabled creation of AWS IAM roles and policies to allow secure Snowflake access to S3 data within the analytics platform.

Thomas Yopes and others added 3 commits July 9, 2025 13:26
Ref: ENG-602

Ref: #1040
Signed-off-by: Thomas Yopes <thomasyopes@Mac.attlocal.net>
Ref: ENG-602

Ref: #1040
Signed-off-by: Thomas Yopes <thomasyopes@Mac.attlocal.net>
…owflake-infra

feat(ap): adding snowflake s3 integration role
Copy link

coderabbitai bot commented Jul 9, 2025

Walkthrough

This change extends the analytics platform configuration to include a new snowflake integration section with two string fields. The analytics platform stack is updated to create an AWS IAM policy and role for Snowflake integration, granting specific S3 permissions using configuration-driven values.

Changes

File(s) Change Summary
packages/infra/config/analytics-platform-config.ts Added snowflake property with integrationUserArn and integrationExternalId fields to AnalyticsPlatformConfig interface.
packages/infra/config/example.ts Extended config.analyticsPlatform with a new snowflake object containing integrationUserArn and integrationExternalId.
packages/infra/lib/analytics-platform/analytics-platform-stack.ts Added IAM policy and role for Snowflake integration, using new config fields for principal ARN and external ID; updated stack logic.

Sequence Diagram(s)

sequenceDiagram
    participant Config as analytics-platform-config.ts
    participant Stack as analytics-platform-stack.ts
    participant AWS as AWS IAM/S3
    participant Snowflake as Snowflake

    Config->>Stack: Provide snowflake.integrationUserArn & integrationExternalId
    Stack->>AWS: Create IAM Policy (S3 access for snowflake prefix)
    Stack->>AWS: Create IAM Role (assumable by integrationUserArn with externalId)
    Snowflake->>AWS: Assume Role using integrationUserArn & integrationExternalId
    AWS->>Snowflake: Grant S3 access per policy
Loading

Possibly related PRs

  • feat(ap): adding snowflake s3 integration role #4167: The main PR and the retrieved PR both add the same Snowflake integration IAM role and policy to the analytics platform stack and extend the same configuration interface with identical properties, indicating they are directly related.

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

npm error code ERR_SSL_WRONG_VERSION_NUMBER
npm error errno ERR_SSL_WRONG_VERSION_NUMBER
npm error request to https://10.0.0.28:4873/punycode/-/punycode-2.3.1.tgz failed, reason: C0DCD35BEC7F0000:error:0A00010B:SSL routines:ssl3_get_record:wrong version number:../deps/openssl/openssl/ssl/record/ssl3_record.c:354:
npm error
npm error A complete log of this run can be found in: /.npm/_logs/2025-07-09T23_06_05_486Z-debug-0.log


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3dba7cb and 8f1fe9e.

📒 Files selected for processing (3)
  • packages/infra/config/analytics-platform-config.ts (1 hunks)
  • packages/infra/config/example.ts (1 hunks)
  • packages/infra/lib/analytics-platform/analytics-platform-stack.ts (3 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
`**/*.{js,jsx,ts,tsx}`: Don’t use null inside the app, only on code interacting ...

**/*.{js,jsx,ts,tsx}: Don’t use null inside the app, only on code interacting with external interfaces/services, like DB and HTTP; convert to undefined before sending inwards into the code
Use const whenever possible
Use async/await instead of .then()
Naming: classes, enums: PascalCase
Naming: constants, variables, functions: camelCase
Naming: file names: kebab-case
Naming: Don’t use negative names, like notEnabled, prefer isDisabled
If possible, use decomposing objects for function parameters
Prefer Nullish Coalesce (??) than the OR operator (||) when you want to provide a default value
Avoid creating arrow functions
Use truthy syntax instead of in - i.e., if (data.link) not if ('link' in data)
While handling errors, keep the stack trace around: if you create a new Error (e.g., MetriportError), make sure to pass the original error as the new one’s cause so the stack trace is available upstream.
max column length is 100 chars
multi-line comments use /** */
top-level comments go after the import (save pre-import to basic file header, like license)
move literals to constants declared after imports when possible

📄 Source: CodeRabbit Inference Engine (.cursorrules)

List of files the instruction was applied to:

  • packages/infra/config/example.ts
  • packages/infra/config/analytics-platform-config.ts
  • packages/infra/lib/analytics-platform/analytics-platform-stack.ts
`**/*.{ts,tsx}`: Use types whenever possible

**/*.{ts,tsx}: Use types whenever possible

📄 Source: CodeRabbit Inference Engine (.cursorrules)

List of files the instruction was applied to:

  • packages/infra/config/example.ts
  • packages/infra/config/analytics-platform-config.ts
  • packages/infra/lib/analytics-platform/analytics-platform-stack.ts
`**/*.ts`: - Use the Onion Pattern to organize a package's code in layers - Try ...

**/*.ts: - Use the Onion Pattern to organize a package's code in layers

  • Try to use immutable code and avoid sharing state across different functions, objects, and systems
  • Try to build code that's idempotent whenever possible
  • Prefer functional programming style functions: small, deterministic, 1 input, 1 output
  • Minimize coupling / dependencies
  • Avoid modifying objects received as parameter
  • Only add comments to code to explain why something was done, not how it works
  • Naming
    • classes, enums: PascalCase
    • constants, variables, functions: camelCase
    • file names: kebab-case
    • table and column names: snake_case
    • Use meaningful names, so whoever is reading the code understands what it means
    • Don’t use negative names, like notEnabled, prefer isDisabled
    • For numeric values, if the type doesn’t convey the unit, add the unit to the name
  • Typescript
    • Use types
    • Prefer const instead of let
    • Avoid any and casting from any to other types
    • Type predicates: only applicable to narrow down the type, not to force a complete type conversion
    • Prefer deconstructing parameters for functions instead of multiple parameters that might be of
      the same type
    • Don’t use null inside the app, only on code interacting with external interfaces/services,
      like DB and HTTP; convert to undefined before sending inwards into the code
    • Use async/await instead of .then()
    • Use the strict equality operator ===, don’t use abstract equality operator ==
    • When calling a Promise-returning function asynchronously (i.e., not awaiting), use .catch() to
      handle errors (see processAsyncError and emptyFunction depending on the case)
    • Date and Time
      • Always use buildDayjs() to create dayjs instances
      • Prefer dayjs.duration(...) to create duration consts and keep them as duration
  • Prefer Nullish Coalesce (??) than the OR operator (||) to provide a default value
  • Avoid creating arrow functions
  • Use truthy syntax instead of in - i.e., if (data.link) not if ('link' in data)
  • Error handling
    • Pass the original error as the new one’s cause so the stack trace is persisted
    • Error messages should have a static message - add dynamic data to MetriportError's additionalInfo prop
    • Avoid sending multiple events to Sentry for a single error
  • Global constants and variables
    • Move literals to constants declared after imports when possible (avoid magic numbers)
    • Avoid shared, global objects
  • Avoid using console.log and console.error in packages other than utils, infra and shared,
    and try to use out().log instead
  • Avoid multi-line logs
    • don't send objects as a second parameter to console.log() or out().log()
    • don't create multi-line strings when using JSON.stringify()
  • Use eslint to enforce code style
  • Use prettier to format code
  • max column length is 100 chars
  • multi-line comments use /** */
  • scripts: top-level comments go after the import

⚙️ Source: CodeRabbit Configuration File

List of files the instruction was applied to:

  • packages/infra/config/example.ts
  • packages/infra/config/analytics-platform-config.ts
  • packages/infra/lib/analytics-platform/analytics-platform-stack.ts
🧠 Learnings (4)
📓 Common learnings
Learnt from: leite08
PR: metriport/metriport#0
File: :0-0
Timestamp: 2025-03-05T18:43:30.389Z
Learning: For backmerge PRs at metriport/metriport, only verify two things: (1) that the source branch is `master` and destination branch is `develop`, and (2) that there's a link to at least one PR (usually a "patch" PR) in the description. No need for detailed review comments or updates to the PR description unless there's an issue with these criteria.
Learnt from: leite08
PR: metriport/metriport#3611
File: packages/utils/src/fhir-converter/s3-to-lambda.ts:29-47
Timestamp: 2025-04-05T16:02:02.517Z
Learning: For release PRs (where head branch is `develop` and base branch is `master`), don't suggest code changes as these have already been reviewed in previous PRs. Instead, focus only on checking if the PR description follows the template and practices, and if the PR is pointing to the correct base branch (`master`).
Learnt from: lucasdellabella
PR: metriport/metriport#3907
File: packages/terminology/src/seed/seed-ndc-from-fda.ts:91-113
Timestamp: 2025-05-28T05:28:26.896Z
Learning: On PRs with "release" in the title, only provide very high priority feedback and avoid lower priority suggestions like refactoring or optimization improvements.
Learnt from: lucasdellabella
PR: metriport/metriport#0
File: :0-0
Timestamp: 2025-03-14T13:44:44.651Z
Learning: Avoid adding AI-generated summaries to PRs with "RELEASE" in the title.
Learnt from: leite08
PR: metriport/metriport#3749
File: packages/core/src/command/patient-import/record/create-headers-file.ts:16-27
Timestamp: 2025-04-28T22:57:36.763Z
Learning: Focus improvement suggestions on feature branches rather than release PRs. Release PRs should be treated with higher caution as they're focused on getting tested code into production.
Learnt from: leite08
PR: metriport/metriport#3975
File: packages/core/src/command/consolidated/search/fhir-resource/__tests__/search-consolidated.test.ts:8-9
Timestamp: 2025-06-07T16:31:23.929Z
Learning: User prefers to minimize changes in release PRs to reduce risk, even when acknowledging that suggested improvements would be beneficial. They prioritize stability over minor improvements in release contexts.
Learnt from: thomasyopes
PR: metriport/metriport#3484
File: packages/api/src/routes/internal/medical/organization.ts:7-15
Timestamp: 2025-03-18T23:49:35.819Z
Learning: Typo fixes and other minor code quality improvements should not be addressed in release PRs to minimize risk and maintain stability.
Learnt from: thomasyopes
PR: metriport/metriport#4167
File: packages/infra/lib/analytics-platform/analytics-platform-stack.ts:56-64
Timestamp: 2025-07-09T20:41:08.549Z
Learning: In the analytics platform Snowflake integration (packages/infra/lib/analytics-platform/analytics-platform-stack.ts), the `integrationUserArn` field in the config temporarily contains an AWS account ID rather than an ARN. This is planned to be updated to an actual ARN in a second pass, at which point the IAM Role's assumedBy should be changed from AccountPrincipal to ArnPrincipal.
packages/infra/config/example.ts (4)
Learnt from: thomasyopes
PR: metriport/metriport#4167
File: packages/infra/lib/analytics-platform/analytics-platform-stack.ts:56-64
Timestamp: 2025-07-09T20:41:08.549Z
Learning: In the analytics platform Snowflake integration (packages/infra/lib/analytics-platform/analytics-platform-stack.ts), the `integrationUserArn` field in the config temporarily contains an AWS account ID rather than an ARN. This is planned to be updated to an actual ARN in a second pass, at which point the IAM Role's assumedBy should be changed from AccountPrincipal to ArnPrincipal.
Learnt from: leite08
PR: metriport/metriport#3550
File: packages/api/src/external/commonwell/__tests__/patient.test.ts:5-5
Timestamp: 2025-04-01T19:59:22.396Z
Learning: The file at `packages/api/src/aws/app-config.ts` contains domain feature flags (not AWS AppConfig specific functionality) and is planned to be moved to the domain folder in packages/core in a future PR.
Learnt from: leite08
PR: metriport/metriport#3550
File: packages/api/src/external/commonwell/__tests__/patient.test.ts:5-5
Timestamp: 2025-04-01T20:57:29.282Z
Learning: PR #3574 is the follow-up PR that moves the domain feature flags from packages/api/src/aws/app-config.ts to packages/core/src/command/feature-flags/domain-ffs.ts.
Learnt from: lucasdellabella
PR: metriport/metriport#3948
File: packages/infra/config/example.ts:179-180
Timestamp: 2025-06-03T14:41:10.149Z
Learning: Example configuration files (like example.ts) use placeholder values and don't need to match real environment-specific constants or actual deployment values. They're meant to demonstrate structure and format.
packages/infra/config/analytics-platform-config.ts (1)
Learnt from: thomasyopes
PR: metriport/metriport#4167
File: packages/infra/lib/analytics-platform/analytics-platform-stack.ts:56-64
Timestamp: 2025-07-09T20:41:08.549Z
Learning: In the analytics platform Snowflake integration (packages/infra/lib/analytics-platform/analytics-platform-stack.ts), the `integrationUserArn` field in the config temporarily contains an AWS account ID rather than an ARN. This is planned to be updated to an actual ARN in a second pass, at which point the IAM Role's assumedBy should be changed from AccountPrincipal to ArnPrincipal.
packages/infra/lib/analytics-platform/analytics-platform-stack.ts (5)
Learnt from: thomasyopes
PR: metriport/metriport#4167
File: packages/infra/lib/analytics-platform/analytics-platform-stack.ts:56-64
Timestamp: 2025-07-09T20:41:08.549Z
Learning: In the analytics platform Snowflake integration (packages/infra/lib/analytics-platform/analytics-platform-stack.ts), the `integrationUserArn` field in the config temporarily contains an AWS account ID rather than an ARN. This is planned to be updated to an actual ARN in a second pass, at which point the IAM Role's assumedBy should be changed from AccountPrincipal to ArnPrincipal.
Learnt from: leite08
PR: metriport/metriport#3817
File: .github/workflows/deploy-staging.yml:206-214
Timestamp: 2025-05-12T18:29:48.116Z
Learning: In the Metriport CI/CD workflows, the IHE stack deployment is intentionally triggered by general infrastructure changes (infra-lambdas) rather than having its own dedicated change detection, as it needs to stay in sync with other infrastructure components.
Learnt from: thomasyopes
PR: metriport/metriport#3709
File: packages/infra/lib/ehr-nested-stack.ts:470-486
Timestamp: 2025-04-23T19:33:45.489Z
Learning: The `refreshEhrBundles` Lambda in the EhrNestedStack uses the API to refresh bundles, rather than accessing the S3 bucket directly. This is why it doesn't need the EHR_BUNDLE_BUCKET_NAME environment variable or direct S3 bucket permissions, unlike other similar Lambdas.
Learnt from: leite08
PR: metriport/metriport#3545
File: packages/infra/lib/api-stack.ts:383-391
Timestamp: 2025-03-28T23:03:04.409Z
Learning: When creating a nested stack in AWS CDK, don't include "NestedStack" in the ID parameter since the class name already has that suffix.
Learnt from: thomasyopes
PR: metriport/metriport#3871
File: packages/infra/lib/surescripts-stack.ts:189-189
Timestamp: 2025-05-21T20:57:09.547Z
Learning: When creating AWS S3 buckets with AWS CDK, if the bucketName property is not provided, CDK will generate a default name rather than failing. In the Metriport codebase, properties like bucket names should be made required in type definitions rather than adding runtime validation.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: check-pr / lint-build-test
  • GitHub Check: Analyze (javascript)
  • GitHub Check: ihe_stack / deploy
  • GitHub Check: secrets_cdk_stack / deploy
  • GitHub Check: infra-api-lambdas / deploy
  • GitHub Check: Analyze (javascript)
  • GitHub Check: check-pr / lint-build-test
🔇 Additional comments (5)
packages/infra/config/analytics-platform-config.ts (1)

6-9: LGTM! Clean interface extension for Snowflake integration.

The new snowflake configuration section properly extends the interface with the required fields for AWS IAM integration.

packages/infra/config/example.ts (1)

243-246: LGTM! Example configuration follows proper format.

The placeholder values demonstrate the expected structure for the Snowflake integration configuration.

packages/infra/lib/analytics-platform/analytics-platform-stack.ts (3)

3-3: LGTM! Required import for IAM resources.

The IAM import is needed for the new Snowflake integration infrastructure.


18-18: LGTM! Proper bucket reference extraction.

Extracting the bucket reference into a const improves readability and enables reuse in the IAM policy.


29-66: LGTM! Snowflake S3 integration follows AWS best practices.

The implementation correctly:

  • Scopes S3 permissions to the "snowflake" prefix for security
  • Uses external ID for additional authentication security
  • Grants appropriate S3 actions for Snowflake integration
  • Uses AccountPrincipal (temporary implementation as noted in previous learnings)

The IAM policy and role configuration aligns with Snowflake's documented S3 integration requirements.

✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@thomasyopes thomasyopes added this pull request to the merge queue Jul 10, 2025
Merged via the queue into master with commit f37410a Jul 10, 2025
65 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants