ENG-722 Data transformation on pt level with updates #4245

leite08 · 2025-07-25T01:44:57Z

Dependencies

none

Description

remove DB tx and fix some mappings
remove /fhir-to-csv/transform endpoint
remove cx level processing
update S3 and job table structure to contain pt ID

Testing

Local
- Run 2 pts in parallel
- Rebuild 2 pts in parallel
Branch to staging
- Run all pts of a cx using the bulk-insert-patients.ts script
Staging
- Run all pts of a cx using the bulk-insert-patients.ts script
Sandbox
- none
Production
- Rebuild the cx's whole pt roster

Release Plan

Merge this

Summary by CodeRabbit

New Features
- Enhanced MedicationDispense and MedicationAdministration data extraction with expanded and reordered field mappings.
- Added new identifier mapping for FamilyMemberHistory condition data.
- Introduced patient status tracking during data transformation and Snowflake integration.
- Added new file prefix formatting for output data organization.
Improvements
- Streamlined data transformation to focus on single-patient processing.
- Improved error handling for missing data and S3 bundle downloads.
- Updated Snowflake table naming conventions to include patient IDs and enhanced logging.
- Standardized job ID generation and usage across bulk patient insert operations.
- Increased Lambda function timeout for improved processing reliability.
- Refined Snowflake data loading with explicit stage management and patient status updates.
Chores
- Removed unused dependencies from requirements.
- Cleaned up and simplified internal API routes and related code.
Bug Fixes
- Fixed issues with missing or inconsistent patient ID handling during transformation and database operations.

linear · 2025-07-25T01:44:59Z

ENG-722 Finish v0 of Snowflake integration

coderabbitai · 2025-07-25T01:45:11Z

Walkthrough

This change set removes the FHIR-to-CSV transform API endpoint and its client, refactors the FHIR-to-CSV data transformation pipeline to operate on a single patient per job, and updates configuration mappings for FamilyMemberHistory and Medication resources. It also reworks Snowflake integration to use patient-specific job tables and status tracking, adds logging, and updates utility functions and dependencies.

Changes

File(s) / Path(s)	Change Summary
packages/api/src/routes/internal/analytics-platform/index.ts	Removed the POST route handler and import for `/internal/analytics-platform/fhir-to-csv/transform`.
packages/core/src/command/analytics-platform/api/start-fhir-to-csv-transform.ts	Deleted file: removed the function and type for initiating FHIR-to-CSV transformation via API.
packages/data-transformation/fhir-to-csv/main.py	Refactored to process a single patient per job, removed api_url logic, improved error handling, updated S3 key structure, enforced patient_id requirement, adjusted Snowflake calls, added patient status tracking, and simplified table operations.
packages/data-transformation/fhir-to-csv/src/parseFhir/configurations/config_FamilyMemberHistory_condition.ini	Added `familymemberhistory_id = id` mapping in both `[Struct]` and `[root_paths]` sections.
packages/data-transformation/fhir-to-csv/src/parseFhir/configurations/config_MedicationAdministration.ini	Added mappings for `subject_type`, `context_reference`, `context_type`; reordered `effectiveperiod_start` in `[Struct]` and `[root_paths]`.
packages/data-transformation/fhir-to-csv/src/parseFhir/configurations/config_MedicationDispense.ini	Added extensive new field mappings for MedicationDispense resource in `[Struct]` and `[root_paths]`; no deletions.
packages/data-transformation/fhir-to-csv/src/setupSnowflake/setupSnowflake.py	Added patient_id to key functions, split and refactored table setup, added logging, removed explicit commit, removed `rename_job_tables`, added `set_patient_status` and stage management functions, updated table name logic to include patient_id.
packages/data-transformation/fhir-to-csv/src/utils/database.py	Updated `format_database_name` to use "ANALYTICS2_", updated `format_job_table_name` to include patient_id, added `format_patient_status_table_name`, and changed `format_stage_name` to use cx_id, patient_id, and job_id.
packages/utils/src/snowflake/bulk-insert-patients.ts	Changed jobId generation to one per batch (timestamp-based), removed `uuidv4` import, updated logs to include jobId.
packages/data-transformation/fhir-to-csv/requirements.txt	Removed `numpy==2.3.1` dependency.
packages/infra/lib/analytics-platform/analytics-platform-stack.ts	Increased Lambda function timeout for `fhirToCsvTransform` from 2 to 10 minutes; adjusted dependent timeout for `fhirToCsv` accordingly.
packages/data-transformation/fhir-to-csv/src/utils/file.py	Added new function `create_output_file_prefix` to generate S3 key prefixes for output files based on parameters.
packages/data-transformation/fhir-to-csv/src/parseFhir/parseFhir.py	Removed timestamp from log message format, keeping only log level and message.
packages/core/src/command/analytics-platform/fhir-to-csv/command/fhir-to-csv-transform.ts	Enhanced log message in `startFhirToCsvTransform` to include patient ID and job ID for better traceability.
packages/data-transformation/fhir-to-csv/Dockerfile.lambda	Added `PYTHONUNBUFFERED=1` environment variable to Dockerfile to ensure unbuffered Python output.

Sequence Diagram(s)

sequenceDiagram
    participant SQS
    participant bulk-insert-patients.ts
    participant Lambda (main.py)
    participant S3
    participant Snowflake

    bulk-insert-patients.ts->>SQS: Send message (patientId, jobId, ...)
    SQS->>Lambda (main.py): Trigger with event (patientId, jobId, ...)
    Lambda (main.py)->>S3: Download bundle for patientId
    alt Bundle missing
        Lambda (main.py)->>Lambda (main.py): Raise error, abort
    else Bundle found
        Lambda (main.py)->>Lambda (main.py): Transform bundle to CSV(s)
        Lambda (main.py)->>S3: Upload CSV(s) (key includes patientId)
        Lambda (main.py)->>Snowflake: setup_database(cxId)
        Lambda (main.py)->>Snowflake: set_patient_status(cxId, patientId, "processing")
        Lambda (main.py)->>Snowflake: create_job_tables(cxId, patientId, jobId)
        Lambda (main.py)->>Snowflake: copy_into_job_table(..., patientId, jobId, ...)
        Lambda (main.py)->>Snowflake: append_job_tables(cxId, patientId, jobId, rebuild_flag)
        Lambda (main.py)->>Snowflake: set_patient_status(cxId, patientId, "completed")
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~35 minutes

Possibly related PRs

ENG-602 FhirToCsv lambda calls the transform one directly #4217: Modifies the FhirToCsv lambda to call the transform function directly instead of through the API, related to removal of the FHIR-to-CSV transform API endpoint in this PR.
RELEASE ENG-602 MVP Snowflake #4215: Adds the POST route handler and client function for the FHIR-to-CSV transform API endpoint that this PR removes, indicating a direct code-level reversal or removal of that feature.

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

npm error code ERR_SSL_WRONG_VERSION_NUMBER
npm error errno ERR_SSL_WRONG_VERSION_NUMBER
npm error request to https://10.0.0.28:4873/punycode/-/punycode-2.3.1.tgz failed, reason: C00CAC46737F0000:error:0A00010B:SSL routines:ssl3_get_record:wrong version number:../deps/openssl/openssl/ssl/record/ssl3_record.c:354:
npm error
npm error A complete log of this run can be found in: /.npm/_logs/2025-07-25T22_48_38_454Z-debug-0.log

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e64efcf and 1f3b75b.

📒 Files selected for processing (2)

packages/core/src/command/analytics-platform/fhir-to-csv/command/fhir-to-csv-transform.ts (1 hunks)
packages/data-transformation/fhir-to-csv/Dockerfile.lambda (1 hunks)

✅ Files skipped from review due to trivial changes (1)

packages/data-transformation/fhir-to-csv/Dockerfile.lambda

🧰 Additional context used

📓 Path-based instructions (3)

**/*.{js,jsx,ts,tsx}

📄 CodeRabbit Inference Engine (.cursorrules)

**/*.{js,jsx,ts,tsx}: Don’t use null inside the app, only on code interacting with external interfaces/services, like DB and HTTP; convert to undefined before sending inwards into the code
Use const whenever possible
Use async/await instead of .then()
Naming: classes, enums: PascalCase
Naming: constants, variables, functions: camelCase
Naming: file names: kebab-case
Naming: Don’t use negative names, like notEnabled, prefer isDisabled
If possible, use decomposing objects for function parameters
Prefer Nullish Coalesce (??) than the OR operator (||) when you want to provide a default value
Avoid creating arrow functions
Use truthy syntax instead of in - i.e., if (data.link) not if ('link' in data)
While handling errors, keep the stack trace around: if you create a new Error (e.g., MetriportError), make sure to pass the original error as the new one’s cause so the stack trace is available upstream.
max column length is 100 chars
multi-line comments use /** */
top-level comments go after the import (save pre-import to basic file header, like license)
move literals to constants declared after imports when possible

Files:

packages/core/src/command/analytics-platform/fhir-to-csv/command/fhir-to-csv-transform.ts

**/*.{ts,tsx}

📄 CodeRabbit Inference Engine (.cursorrules)

Use types whenever possible

Files:

packages/core/src/command/analytics-platform/fhir-to-csv/command/fhir-to-csv-transform.ts

**/*.ts

⚙️ CodeRabbit Configuration File

**/*.ts: - Use the Onion Pattern to organize a package's code in layers

Try to use immutable code and avoid sharing state across different functions, objects, and systems

Try to build code that's idempotent whenever possible

Prefer functional programming style functions: small, deterministic, 1 input, 1 output

Minimize coupling / dependencies

Avoid modifying objects received as parameter

Only add comments to code to explain why something was done, not how it works

Naming

classes, enums: PascalCase

constants, variables, functions: camelCase

file names: kebab-case

table and column names: snake_case

Use meaningful names, so whoever is reading the code understands what it means

Don’t use negative names, like notEnabled, prefer isDisabled

For numeric values, if the type doesn’t convey the unit, add the unit to the name

Typescript

Use types

Prefer const instead of let

Avoid any and casting from any to other types

Type predicates: only applicable to narrow down the type, not to force a complete type conversion

Prefer deconstructing parameters for functions instead of multiple parameters that might be of
the same type

Don’t use null inside the app, only on code interacting with external interfaces/services,
like DB and HTTP; convert to undefined before sending inwards into the code

Use async/await instead of .then()

Use the strict equality operator ===, don’t use abstract equality operator ==

When calling a Promise-returning function asynchronously (i.e., not awaiting), use .catch() to
handle errors (see processAsyncError and emptyFunction depending on the case)

Date and Time

Always use buildDayjs() to create dayjs instances

Prefer dayjs.duration(...) to create duration consts and keep them as duration

Prefer Nullish Coalesce (??) than the OR operator (||) to provide a default value

Avoid creating arrow functions

U...

Files:

packages/core/src/command/analytics-platform/fhir-to-csv/command/fhir-to-csv-transform.ts

🧠 Learnings (2)

📓 Common learnings

Learnt from: leite08
PR: metriport/metriport#3814
File: packages/api/src/routes/internal/medical/patient-consolidated.ts:141-174
Timestamp: 2025-05-20T21:26:26.804Z
Learning: The functionality introduced in packages/api/src/routes/internal/medical/patient-consolidated.ts is planned to be refactored in downstream PR #3857, including improvements to error handling and validation.

Learnt from: leite08
PR: metriport/metriport#3412
File: packages/api/src/models/medical/patient-import-mapping.ts:16-41
Timestamp: 2025-04-26T22:17:45.248Z
Learning: The patient_import_mapping table already has a unique constraint on the fields "job_id", "row_number", and "data_pipeline_request_id" defined in the migration file 2025-04-24_02_create-patient_import_mapping-table.ts, which ensures data integrity.

Learnt from: thomasyopes
PR: metriport/metriport#3427
File: packages/core/src/external/ehr/api/sync-patient.ts:16-55
Timestamp: 2025-03-11T20:42:46.516Z
Learning: In the patient synchronization architecture, the flow follows this pattern: (1) `ehr-sync-patient-cloud.ts` sends messages to an SQS queue, (2) the `ehr-sync-patient` Lambda consumes these messages, and (3) the Lambda uses the `syncPatient` function to make the API calls to process the patient data.

Learnt from: thomasyopes
PR: metriport/metriport#3882
File: packages/core/src/external/ehr/command/get-bundle-by-resource-type.ts:31-37
Timestamp: 2025-06-05T14:09:59.683Z
Learning: When EHR handlers in mapping objects are set to `undefined` and there are downstream PRs mentioned in the PR objectives, this typically indicates a staged rollout approach where the core structure is implemented first and specific handler implementations follow in subsequent PRs.

Learnt from: RamilGaripov
PR: metriport/metriport#3676
File: packages/core/src/command/hl7v2-subscriptions/hl7v2-to-fhir-conversion/shared.ts:1-10
Timestamp: 2025-04-16T00:25:25.196Z
Learning: The circular dependency between shared.ts (importing getPatientIdsOrFail) and adt/utils.ts (using unpackPidFieldOrFail) will be addressed in a follow-up PR.

Learnt from: thomasyopes
PR: metriport/metriport#3466
File: packages/api/src/routes/ehr/elation/chart.ts:20-37
Timestamp: 2025-03-17T15:30:34.647Z
Learning: The request body parsing in packages/api/src/routes/ehr/elation/chart.ts is tracked with TODO referencing issue #2170 and will be handled in a separate PR, not in PR #3466.

Learnt from: thomasyopes
PR: metriport/metriport#4165
File: packages/api/src/routes/internal/analytics-platform/index.ts:28-28
Timestamp: 2025-07-18T09:33:29.581Z
Learning: In packages/api/src/routes/internal/analytics-platform/index.ts, the patientId parameter handling is intentionally inconsistent: the `/fhir-to-csv` and `/fhir-to-csv/transform` endpoints require patientId (using getFromQueryOrFail) for single patient processing, while the `/fhir-to-csv/batch` endpoint treats patientId as optional (using getFromQuery) because it's designed to run over all patients when no specific patient ID is provided.

Learnt from: leite08
PR: metriport/metriport#4193
File: packages/carequality-sdk/src/client/carequality-fhir.ts:198-208
Timestamp: 2025-07-16T00:54:56.156Z
Learning: User leite08 accepted the risk of parameter mutation in packages/carequality-sdk/src/client/carequality-fhir.ts when modifying org.active = false directly on input parameters, choosing to proceed despite the violation of the "Avoid modifying objects received as parameter" coding guideline.

Learnt from: leite08
PR: metriport/metriport#4215
File: packages/core/src/command/analytics-platform/config.ts:4-13
Timestamp: 2025-07-20T01:18:28.093Z
Learning: User leite08 prefers that code improvement suggestions (like consolidating duplicated schemas) be made on feature PRs rather than release PRs. Such improvements should be deferred to follow-up PRs during the release cycle.

Learnt from: leite08
PR: metriport/metriport#3648
File: packages/utils/src/shared/patient-create.ts:35-53
Timestamp: 2025-04-10T17:30:30.368Z
Learning: The function `mergePatients` in the patient-create.ts file is intentionally designed to take all properties from the first patient (p1) and only update address and contact fields, as the system supports duplicated patients specifically to allow multiple addresses across different records.

Learnt from: thomasyopes
PR: metriport/metriport#4165
File: packages/data-transformation/fhir-to-csv/src/setupSnowflake/setupSnowflake.py:93-97
Timestamp: 2025-07-19T14:04:34.757Z
Learning: In packages/data-transformation/fhir-to-csv/src/setupSnowflake/setupSnowflake.py, the user thomasyopes accepts SQL injection risk in the append_job_tables function's DELETE query where patient_id is directly interpolated, based on their assessment that they control all inputs to the system.

Learnt from: thomasyopes
PR: metriport/metriport#4165
File: packages/data-transformation/fhir-to-csv/src/setupSnowflake/setupSnowflake.py:0-0
Timestamp: 2025-07-18T09:54:43.598Z
Learning: In packages/data-transformation/fhir-to-csv/src/setupSnowflake/setupSnowflake.py, the user thomasyopes prefers to log and swallow exceptions in the copy_into_temp_table function rather than re-raising them. This allows the data transformation job to continue processing other files even if one file fails to copy to Snowflake, supporting partial success scenarios in batch data processing.

Learnt from: thomasyopes
PR: metriport/metriport#4165
File: packages/data-transformation/fhir-to-csv/main.py:0-0
Timestamp: 2025-07-18T09:53:38.906Z
Learning: In packages/data-transformation/fhir-to-csv/main.py, S3 upload operations should be allowed to hard fail without try-catch blocks. The team prefers immediate failure over error handling and continuation for S3 upload operations in the FHIR-to-CSV transformation process.

Learnt from: thomasyopes
PR: metriport/metriport#4165
File: packages/data-transformation/fhir-to-csv/main.py:0-0
Timestamp: 2025-07-18T09:53:47.448Z
Learning: In packages/data-transformation/fhir-to-csv/main.py, the user thomasyopes prefers to let Snowflake operations (create_temp_tables, copy_into_temp_table, append_temp_tables, rename_temp_tables) hard fail rather than adding explicit try-catch error handling blocks.

Learnt from: thomasyopes
PR: metriport/metriport#4165
File: packages/data-transformation/fhir-to-csv/src/parseFhir/configurations/config_DocumentReference.ini:1-188
Timestamp: 2025-07-17T19:57:25.199Z
Learning: In packages/data-transformation/fhir-to-csv/src/parseFhir/configurations/ directory, configuration files like config_DocumentReference.ini should be kept as-is with their hard-coded field mappings, as confirmed by user thomasyopes. The team prefers to maintain these static configuration files rather than implementing dynamic mapping generation.

Learnt from: RamilGaripov
PR: metriport/metriport#3962
File: packages/api/src/sequelize/migrations/2025-06-17_00_create-cohort-tables.ts:50-56
Timestamp: 2025-06-17T20:03:18.931Z
Learning: In the Metriport codebase, the patient table ID is defined as DataTypes.STRING in the database schema, not UUID. Foreign key references to patient.id should use DataTypes.STRING to maintain consistency.

Learnt from: thomasyopes
PR: metriport/metriport#3970
File: packages/api/src/external/ehr/athenahealth/command/write-back/medication.ts:17-17
Timestamp: 2025-06-06T16:45:31.832Z
Learning: The writeMedicationToChart function in packages/api/src/external/ehr/athenahealth/command/write-back/medication.ts returns a response that is not currently used by any consumers, so changes to its return type are not breaking changes in practice.

Learnt from: leite08
PR: metriport/metriport#4013
File: packages/core/src/fhir-to-cda/cda-templates/components/procedures.ts:104-105
Timestamp: 2025-06-12T22:53:09.606Z
Learning: User leite08 prefers responses in English only.

Learnt from: leite08
PR: metriport/metriport#3991
File: packages/core/src/external/aws/s3.ts:621-630
Timestamp: 2025-06-10T22:20:21.203Z
Learning: User leite08 prefers that all review comments are written in English.

Learnt from: leite08
PR: metriport/metriport#3912
File: packages/core/src/command/consolidated/search/fhir-resource/search-consolidated.ts:74-83
Timestamp: 2025-06-04T00:14:38.041Z
Learning: User leite08 prefers incremental progress in refactoring scenarios, accepting some inconsistency as a stepping stone towards better architecture rather than requiring perfect consistency immediately.

packages/core/src/command/analytics-platform/fhir-to-csv/command/fhir-to-csv-transform.ts (16)

Learnt from: thomasyopes
PR: #4165
File: packages/api/src/routes/internal/analytics-platform/index.ts:28-28
Timestamp: 2025-07-18T09:33:29.581Z
Learning: In packages/api/src/routes/internal/analytics-platform/index.ts, the patientId parameter handling is intentionally inconsistent: the /fhir-to-csv and /fhir-to-csv/transform endpoints require patientId (using getFromQueryOrFail) for single patient processing, while the /fhir-to-csv/batch endpoint treats patientId as optional (using getFromQuery) because it's designed to run over all patients when no specific patient ID is provided.

Learnt from: leite08
PR: #3814
File: packages/api/src/routes/internal/medical/patient-consolidated.ts:141-174
Timestamp: 2025-05-20T21:26:26.804Z
Learning: The functionality introduced in packages/api/src/routes/internal/medical/patient-consolidated.ts is planned to be refactored in downstream PR #3857, including improvements to error handling and validation.

Learnt from: keshavsaharia
PR: #3885
File: packages/core/src/external/sftp/surescripts/client.ts:1-134
Timestamp: 2025-05-28T19:23:20.179Z
Learning: In packages/core/src/external/sftp/surescripts/client.ts, the standalone getPatientLoadFileName function intentionally omits the transmission ID from the filename, which differs from the class method getPatientLoadFileName. This difference in filename generation is expected behavior.

Learnt from: thomasyopes
PR: #3427
File: packages/lambdas/src/ehr-sync-patient.ts:31-33
Timestamp: 2025-03-11T19:12:22.472Z
Learning: In this codebase, a specific logging pattern is used: console.log is used for early logging before context is available, then a prefixed logger (created with prefixedLog) is initialized after parsing data to include contextual information (like ehr, cxId, practiceId, patientId) in all subsequent log messages.

Learnt from: thomasyopes
PR: #3970
File: packages/api/src/external/ehr/athenahealth/command/write-back/medication.ts:17-17
Timestamp: 2025-06-06T16:45:31.832Z
Learning: The writeMedicationToChart function in packages/api/src/external/ehr/athenahealth/command/write-back/medication.ts returns a response that is not currently used by any consumers, so changes to its return type are not breaking changes in practice.

Learnt from: thomasyopes
PR: #4090
File: packages/core/src/command/conversion-fhir/conversion-fhir-direct.ts:18-25
Timestamp: 2025-06-25T01:56:08.732Z
Learning: In packages/core/src/command/conversion-fhir/conversion-fhir-direct.ts, the ConversionFhirDirect class is designed for local testing purposes and does not require comprehensive error handling for HTTP requests, as confirmed by the user thomasyopes.

Learnt from: RamilGaripov
PR: #3676
File: packages/core/src/command/hl7v2-subscriptions/hl7v2-to-fhir-conversion/shared.ts:1-10
Timestamp: 2025-04-16T00:25:25.196Z
Learning: The circular dependency between shared.ts (importing getPatientIdsOrFail) and adt/utils.ts (using unpackPidFieldOrFail) will be addressed in a follow-up PR.

Learnt from: leite08
PR: #3944
File: packages/core/src/command/consolidated/search/fhir-resource/ingest-lexical.ts:53-53
Timestamp: 2025-06-01T02:28:19.913Z
Learning: In packages/core/src/command/consolidated/search/fhir-resource/ingest-lexical.ts, the processErrors function intentionally throws MetriportError to bubble errors up the call stack rather than handling them locally. This is by design - errors from ingestPatientConsolidated should propagate upward rather than being caught at immediate calling locations.

Learnt from: thomasyopes
PR: #4090
File: packages/core/src/command/conversion-fhir/conversion-fhir-cloud.ts:30-30
Timestamp: 2025-06-25T01:55:42.627Z
Learning: In packages/core/src/command/conversion-fhir/conversion-fhir-cloud.ts, the FHIR converter Lambda endpoint is controlled by the team and guaranteed to return valid JSON in the expected Bundle format, so JSON.parse() error handling and type validation are not necessary for this specific endpoint.

Learnt from: RamilGaripov
PR: #3962
File: packages/api/src/sequelize/migrations/2025-06-17_00_create-cohort-tables.ts:50-56
Timestamp: 2025-06-17T20:03:18.931Z
Learning: In the Metriport codebase, the patient table ID is defined as DataTypes.STRING in the database schema, not UUID. Foreign key references to patient.id should use DataTypes.STRING to maintain consistency.

Learnt from: thomasyopes
PR: #4061
File: packages/api/src/external/ehr/shared/job/bundle/create-resource-diff-bundles/run-job.ts:34-47
Timestamp: 2025-06-19T22:44:49.393Z
Learning: In packages/api/src/external/ehr/shared/job/bundle/create-resource-diff-bundles/run-job.ts, the team prefers to keep refreshEhrBundles operations grouped using fire-and-forget pattern rather than awaiting all operations with Promise.allSettled(). This allows individual refresh operations to run independently without blocking the job runner.

Learnt from: RamilGaripov
PR: #3976
File: packages/api/src/command/medical/cohort/create-cohort.ts:12-12
Timestamp: 2025-06-18T21:06:10.803Z
Learning: The codebase uses template literals with variable interpolation in out() function calls as their standard logging pattern, e.g., out(\createCohort - cx: ${cxId}`)`. This is not a violation of their coding guidelines about avoiding multi-line logs.

Learnt from: thomasyopes
PR: #3882
File: packages/core/src/external/ehr/lambdas/get-bundle-by-resource-type/ehr-get-bundle-by-resource-type-cloud.ts:0-0
Timestamp: 2025-05-28T19:20:47.442Z
Learning: In packages/core/src/external/ehr/lambdas/get-bundle-by-resource-type/ehr-get-bundle-by-resource-type-cloud.ts, the EHR get bundle by resource type Lambda endpoint is guaranteed to return valid JSON, so JSON.parse() error handling is not necessary for this specific endpoint.

Learnt from: leite08
PR: #4034
File: packages/core/src/command/patient-import/patient-or-record-failed.ts:19-30
Timestamp: 2025-06-16T17:03:23.069Z
Learning: In patient import error handling functions like setPatientOrRecordFailed, the team prefers Promise.all over Promise.allSettled to ensure atomic all-or-nothing behavior - if either the job update or patient record update fails, the entire operation should fail rather than allowing partial success.

Learnt from: thomasyopes
PR: #3427
File: packages/core/src/external/ehr/api/sync-patient.ts:16-55
Timestamp: 2025-03-11T20:42:46.516Z
Learning: In the patient synchronization architecture, the flow follows this pattern: (1) ehr-sync-patient-cloud.ts sends messages to an SQS queue, (2) the ehr-sync-patient Lambda consumes these messages, and (3) the Lambda uses the syncPatient function to make the API calls to process the patient data.

Learnt from: thomasyopes
PR: #3882
File: packages/core/src/external/ehr/command/get-bundle-by-resource-type/ehr-get-bundle-by-resource-type-cloud.ts:27-49
Timestamp: 2025-05-28T19:22:09.281Z
Learning: In packages/core/src/external/ehr/command/get-bundle-by-resource-type/ehr-get-bundle-by-resource-type-cloud.ts, the EHR get bundle by resource type Lambda endpoint is guaranteed to return valid JSON, so JSON.parse() error handling is not necessary for this specific endpoint.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)

GitHub Check: secrets_cdk_stack / deploy
GitHub Check: ihe_stack / deploy
GitHub Check: location_services_cdk_stack / deploy
GitHub Check: infra-api-lambdas / deploy
GitHub Check: infra-hl7-notification / deploy
GitHub Check: mllp-server / deploy
GitHub Check: api / deploy
GitHub Check: check-pr / lint-build-test
GitHub Check: check-pr / lint-build-test
GitHub Check: check-pr / lint-build-test
GitHub Check: check-pr / lint-build-test
GitHub Check: Analyze (javascript)

🔇 Additional comments (1)

packages/core/src/command/analytics-platform/fhir-to-csv/command/fhir-to-csv-transform.ts (1)

20-20: LGTM! Enhanced logging improves observability.

The addition of patient ID and job ID to the log message provides better traceability for the FHIR to CSV transform process, especially valuable given the shift to patient-level processing mentioned in the PR objectives.

✨ Finishing Touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch eng-722-snowflake-per-pt

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Explain this complex logic.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai explain this code block.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and explain its main purpose.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR.
@coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
@coderabbitai generate unit tests to generate unit tests for this PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

leite08 · 2025-07-25T01:49:50Z

packages/data-transformation/fhir-to-csv/main.py


    output_bucket_and_file_keys_and_table_names = []
    for file in local_output_files:
-        output_file_key = f"{dwh}/{transform_name}/{cx_id}/{job_id}/{file.replace('/', '_')}"
+        output_file_key = f"{dwh}/{transform_name}/{cx_id}/{patient_id}/{job_id}/{file.replace('/', '_')}"


important, added the pt ID on the S3 path

leite08 · 2025-07-25T01:50:18Z

packages/data-transformation/fhir-to-csv/main.py

    else:
-        patient_ids = [patient_id] if patient_id is not None and patient_id != "" else get_patient_ids(api_url, cx_id)


Can't be used on the cx level, needs pt ID

leite08 · 2025-07-25T01:51:05Z

packages/data-transformation/fhir-to-csv/main.py

+        raise ValueError("PATIENT_ID is not set")
+    if not job_id:
+        raise ValueError("JOB_ID is not set")
+    job_id = job_id.replace(":", "-").replace(".", "-")


ISO date to a format that can use used in the DB

leite08 · 2025-07-25T01:52:44Z

...sformation/fhir-to-csv/src/parseFhir/configurations/config_FamilyMemberHistory_condition.ini

@@ -5,6 +5,7 @@ writemode = append
 anchor = condition

 [Struct]
+familymemberhistory_id = id


Allows to join by ID w/ the familymemberhistory table

leite08 · 2025-07-25T01:53:29Z

packages/data-transformation/fhir-to-csv/src/parseNdjsonBundle/parseNdjsonBundle.py

@@ -38,7 +38,7 @@ def parse(input_path: str, outputs_folder: str) -> list[str]:
    # Process each resource type separately
    for resource_type, config_files in config_groups.items():
        # First, filter the input file for just this resource type
-        filtered_input = f'{outputs_folder}/temp_{resource_type.lower()}.ndjson'
+        filtered_input = f'{outputs_folder}/temp_{patient_id}_{resource_type.lower()}.ndjson'


important, so we can remove existing data based on the pt ID

This temp file gets deleted as a heads up. I believe that the "filename" column is the derived from the path + filename, so as long as new file is along the patient ID path, it'll appear in the column.

I believe the file name column is - normalized version of "output_file_path"

Taken to Slack - https://metriport.slack.com/archives/C0616FCPAKZ/p1753410857448749?thread_ts=1752623015.505869&cid=C0616FCPAKZ

Done, reverted.

coderabbitai

Actionable comments posted: 2

🔭 Outside diff range comments (1)

packages/data-transformation/fhir-to-csv/src/setupSnowflake/setupSnowflake.py (1)

86-90: Potential data integrity issue with LIKE pattern matching

The DELETE query uses LIKE '%{patient_id}%' which could unintentionally match other patients' data if one patient_id is a substring of another (e.g., patient "123" would match "1234", "0123", etc.).

Consider using a more precise matching pattern or storing patient_id in a dedicated column for exact matching.

🧹 Nitpick comments (1)

packages/data-transformation/fhir-to-csv/main.py (1)

48-53: Improve exception handling with context

When re-raising exceptions, use from to preserve the exception chain for better debugging.

         except s3_client.exceptions.ClientError as e:
             if e.response['Error']['Code'] == '404':
                 logging.warning(f"Bundle {bundle_key} not found in input bucket {input_bucket}")
-                raise ValueError("Bundle not found") 
+                raise ValueError("Bundle not found") from e
             else:
                 raise e

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3ccd202 and 1c401d5.

📒 Files selected for processing (10)

packages/api/src/routes/internal/analytics-platform/index.ts (0 hunks)
packages/core/src/command/analytics-platform/api/start-fhir-to-csv-transform.ts (0 hunks)
packages/data-transformation/fhir-to-csv/main.py (2 hunks)
packages/data-transformation/fhir-to-csv/src/parseFhir/configurations/config_FamilyMemberHistory_condition.ini (2 hunks)
packages/data-transformation/fhir-to-csv/src/parseFhir/configurations/config_MedicationAdministration.ini (4 hunks)
packages/data-transformation/fhir-to-csv/src/parseFhir/configurations/config_MedicationDispense.ini (1 hunks)
packages/data-transformation/fhir-to-csv/src/parseNdjsonBundle/parseNdjsonBundle.py (2 hunks)
packages/data-transformation/fhir-to-csv/src/setupSnowflake/setupSnowflake.py (3 hunks)
packages/data-transformation/fhir-to-csv/src/utils/database.py (1 hunks)
packages/utils/src/snowflake/bulk-insert-patients.ts (2 hunks)

💤 Files with no reviewable changes (2)

packages/api/src/routes/internal/analytics-platform/index.ts
packages/core/src/command/analytics-platform/api/start-fhir-to-csv-transform.ts

🧰 Additional context used

📓 Path-based instructions (3)

**/*.{js,jsx,ts,tsx}

📄 CodeRabbit Inference Engine (.cursorrules)

**/*.{js,jsx,ts,tsx}: Don’t use null inside the app, only on code interacting with external interfaces/services, like DB and HTTP; convert to undefined before sending inwards into the code
Use const whenever possible
Use async/await instead of .then()
Naming: classes, enums: PascalCase
Naming: constants, variables, functions: camelCase
Naming: file names: kebab-case
Naming: Don’t use negative names, like notEnabled, prefer isDisabled
If possible, use decomposing objects for function parameters
Prefer Nullish Coalesce (??) than the OR operator (||) when you want to provide a default value
Avoid creating arrow functions
Use truthy syntax instead of in - i.e., if (data.link) not if ('link' in data)
While handling errors, keep the stack trace around: if you create a new Error (e.g., MetriportError), make sure to pass the original error as the new one’s cause so the stack trace is available upstream.
max column length is 100 chars
multi-line comments use /** */
top-level comments go after the import (save pre-import to basic file header, like license)
move literals to constants declared after imports when possible

Files:

packages/utils/src/snowflake/bulk-insert-patients.ts

**/*.{ts,tsx}

📄 CodeRabbit Inference Engine (.cursorrules)

Use types whenever possible

Files:

packages/utils/src/snowflake/bulk-insert-patients.ts

**/*.ts

⚙️ CodeRabbit Configuration File

**/*.ts: - Use the Onion Pattern to organize a package's code in layers

Try to use immutable code and avoid sharing state across different functions, objects, and systems

Try to build code that's idempotent whenever possible

Prefer functional programming style functions: small, deterministic, 1 input, 1 output

Minimize coupling / dependencies

Avoid modifying objects received as parameter

Only add comments to code to explain why something was done, not how it works

Naming

classes, enums: PascalCase

constants, variables, functions: camelCase

file names: kebab-case

table and column names: snake_case

Use meaningful names, so whoever is reading the code understands what it means

Don’t use negative names, like notEnabled, prefer isDisabled

For numeric values, if the type doesn’t convey the unit, add the unit to the name

Typescript

Use types

Prefer const instead of let

Avoid any and casting from any to other types

Type predicates: only applicable to narrow down the type, not to force a complete type conversion

Prefer deconstructing parameters for functions instead of multiple parameters that might be of
the same type

Don’t use null inside the app, only on code interacting with external interfaces/services,
like DB and HTTP; convert to undefined before sending inwards into the code

Use async/await instead of .then()

Use the strict equality operator ===, don’t use abstract equality operator ==

When calling a Promise-returning function asynchronously (i.e., not awaiting), use .catch() to
handle errors (see processAsyncError and emptyFunction depending on the case)

Date and Time

Always use buildDayjs() to create dayjs instances

Prefer dayjs.duration(...) to create duration consts and keep them as duration

Prefer Nullish Coalesce (??) than the OR operator (||) to provide a default value

Avoid creating arrow functions

U...

Files:

packages/utils/src/snowflake/bulk-insert-patients.ts

🧠 Learnings (9)

📓 Common learnings

Learnt from: leite08
PR: metriport/metriport#3814
File: packages/api/src/routes/internal/medical/patient-consolidated.ts:141-174
Timestamp: 2025-05-20T21:26:26.804Z
Learning: The functionality introduced in packages/api/src/routes/internal/medical/patient-consolidated.ts is planned to be refactored in downstream PR #3857, including improvements to error handling and validation.

Learnt from: leite08
PR: metriport/metriport#3412
File: packages/api/src/models/medical/patient-import-mapping.ts:16-41
Timestamp: 2025-04-26T22:17:45.248Z
Learning: The patient_import_mapping table already has a unique constraint on the fields "job_id", "row_number", and "data_pipeline_request_id" defined in the migration file 2025-04-24_02_create-patient_import_mapping-table.ts, which ensures data integrity.

Learnt from: thomasyopes
PR: metriport/metriport#3427
File: packages/core/src/external/ehr/api/sync-patient.ts:16-55
Timestamp: 2025-03-11T20:42:46.516Z
Learning: In the patient synchronization architecture, the flow follows this pattern: (1) `ehr-sync-patient-cloud.ts` sends messages to an SQS queue, (2) the `ehr-sync-patient` Lambda consumes these messages, and (3) the Lambda uses the `syncPatient` function to make the API calls to process the patient data.

Learnt from: thomasyopes
PR: metriport/metriport#3882
File: packages/core/src/external/ehr/command/get-bundle-by-resource-type.ts:31-37
Timestamp: 2025-06-05T14:09:59.683Z
Learning: When EHR handlers in mapping objects are set to `undefined` and there are downstream PRs mentioned in the PR objectives, this typically indicates a staged rollout approach where the core structure is implemented first and specific handler implementations follow in subsequent PRs.

Learnt from: RamilGaripov
PR: metriport/metriport#3676
File: packages/core/src/command/hl7v2-subscriptions/hl7v2-to-fhir-conversion/shared.ts:1-10
Timestamp: 2025-04-16T00:25:25.196Z
Learning: The circular dependency between shared.ts (importing getPatientIdsOrFail) and adt/utils.ts (using unpackPidFieldOrFail) will be addressed in a follow-up PR.

Learnt from: thomasyopes
PR: metriport/metriport#3466
File: packages/api/src/routes/ehr/elation/chart.ts:20-37
Timestamp: 2025-03-17T15:30:34.647Z
Learning: The request body parsing in packages/api/src/routes/ehr/elation/chart.ts is tracked with TODO referencing issue #2170 and will be handled in a separate PR, not in PR #3466.

Learnt from: thomasyopes
PR: metriport/metriport#4165
File: packages/api/src/routes/internal/analytics-platform/index.ts:28-28
Timestamp: 2025-07-18T09:33:29.581Z
Learning: In packages/api/src/routes/internal/analytics-platform/index.ts, the patientId parameter handling is intentionally inconsistent: the `/fhir-to-csv` and `/fhir-to-csv/transform` endpoints require patientId (using getFromQueryOrFail) for single patient processing, while the `/fhir-to-csv/batch` endpoint treats patientId as optional (using getFromQuery) because it's designed to run over all patients when no specific patient ID is provided.

Learnt from: leite08
PR: metriport/metriport#4193
File: packages/carequality-sdk/src/client/carequality-fhir.ts:198-208
Timestamp: 2025-07-16T00:54:56.156Z
Learning: User leite08 accepted the risk of parameter mutation in packages/carequality-sdk/src/client/carequality-fhir.ts when modifying org.active = false directly on input parameters, choosing to proceed despite the violation of the "Avoid modifying objects received as parameter" coding guideline.

Learnt from: leite08
PR: metriport/metriport#4215
File: packages/core/src/command/analytics-platform/config.ts:4-13
Timestamp: 2025-07-20T01:18:28.093Z
Learning: User leite08 prefers that code improvement suggestions (like consolidating duplicated schemas) be made on feature PRs rather than release PRs. Such improvements should be deferred to follow-up PRs during the release cycle.

Learnt from: leite08
PR: metriport/metriport#3648
File: packages/utils/src/shared/patient-create.ts:35-53
Timestamp: 2025-04-10T17:30:30.368Z
Learning: The function `mergePatients` in the patient-create.ts file is intentionally designed to take all properties from the first patient (p1) and only update address and contact fields, as the system supports duplicated patients specifically to allow multiple addresses across different records.

Learnt from: thomasyopes
PR: metriport/metriport#4165
File: packages/data-transformation/fhir-to-csv/src/parseFhir/configurations/config_DocumentReference.ini:1-188
Timestamp: 2025-07-17T19:57:25.199Z
Learning: In packages/data-transformation/fhir-to-csv/src/parseFhir/configurations/ directory, configuration files like config_DocumentReference.ini should be kept as-is with their hard-coded field mappings, as confirmed by user thomasyopes. The team prefers to maintain these static configuration files rather than implementing dynamic mapping generation.

Learnt from: thomasyopes
PR: metriport/metriport#4165
File: packages/data-transformation/fhir-to-csv/src/setupSnowflake/setupSnowflake.py:93-97
Timestamp: 2025-07-19T14:04:34.757Z
Learning: In packages/data-transformation/fhir-to-csv/src/setupSnowflake/setupSnowflake.py, the user thomasyopes accepts SQL injection risk in the append_job_tables function's DELETE query where patient_id is directly interpolated, based on their assessment that they control all inputs to the system.

Learnt from: thomasyopes
PR: metriport/metriport#4165
File: packages/data-transformation/fhir-to-csv/main.py:0-0
Timestamp: 2025-07-18T09:53:38.906Z
Learning: In packages/data-transformation/fhir-to-csv/main.py, S3 upload operations should be allowed to hard fail without try-catch blocks. The team prefers immediate failure over error handling and continuation for S3 upload operations in the FHIR-to-CSV transformation process.

Learnt from: leite08
PR: metriport/metriport#3412
File: packages/api/src/models/medical/patient-import-mapping.ts:16-41
Timestamp: 2025-04-26T22:17:45.248Z
Learning: The patient_import_mapping table already has a unique constraint on the fields "job_id", "row_number", and "data_pipeline_request_id" defined in the migration file, which prevents duplicates in the mapping between patient imports and data pipeline requests.

Learnt from: thomasyopes
PR: metriport/metriport#4165
File: packages/data-transformation/fhir-to-csv/main.py:0-0
Timestamp: 2025-07-18T09:53:47.448Z
Learning: In packages/data-transformation/fhir-to-csv/main.py, the user thomasyopes prefers to let Snowflake operations (create_temp_tables, copy_into_temp_table, append_temp_tables, rename_temp_tables) hard fail rather than adding explicit try-catch error handling blocks.

Learnt from: thomasyopes
PR: metriport/metriport#4165
File: packages/data-transformation/fhir-to-csv/src/setupSnowflake/setupSnowflake.py:0-0
Timestamp: 2025-07-18T09:54:43.598Z
Learning: In packages/data-transformation/fhir-to-csv/src/setupSnowflake/setupSnowflake.py, the user thomasyopes prefers to log and swallow exceptions in the copy_into_temp_table function rather than re-raising them. This allows the data transformation job to continue processing other files even if one file fails to copy to Snowflake, supporting partial success scenarios in batch data processing.

Learnt from: keshavsaharia
PR: metriport/metriport#3885
File: packages/utils/src/surescripts/generate-patient-load-file.ts:92-94
Timestamp: 2025-05-30T01:11:35.300Z
Learning: In V0 of the Surescripts implementation (packages/utils/src/surescripts/generate-patient-load-file.ts), there is always only 1 facility ID, so the current facility_ids parsing logic with substring operations is sufficient and additional validation is not necessary.

Learnt from: leite08
PR: metriport/metriport#4013
File: packages/core/src/fhir-to-cda/cda-templates/components/procedures.ts:104-105
Timestamp: 2025-06-12T22:53:09.606Z
Learning: User leite08 prefers responses in English only.

Learnt from: leite08
PR: metriport/metriport#3991
File: packages/core/src/external/aws/s3.ts:621-630
Timestamp: 2025-06-10T22:20:21.203Z
Learning: User leite08 prefers that all review comments are written in English.

Learnt from: leite08
PR: metriport/metriport#3912
File: packages/core/src/command/consolidated/search/fhir-resource/search-consolidated.ts:74-83
Timestamp: 2025-06-04T00:14:38.041Z
Learning: User leite08 prefers incremental progress in refactoring scenarios, accepting some inconsistency as a stepping stone towards better architecture rather than requiring perfect consistency immediately.

packages/data-transformation/fhir-to-csv/src/parseFhir/configurations/config_FamilyMemberHistory_condition.ini (1)

Learnt from: thomasyopes
PR: #4165
File: packages/data-transformation/fhir-to-csv/src/parseFhir/configurations/config_DocumentReference.ini:1-188
Timestamp: 2025-07-17T19:57:25.199Z
Learning: In packages/data-transformation/fhir-to-csv/src/parseFhir/configurations/ directory, configuration files like config_DocumentReference.ini should be kept as-is with their hard-coded field mappings, as confirmed by user thomasyopes. The team prefers to maintain these static configuration files rather than implementing dynamic mapping generation.

packages/data-transformation/fhir-to-csv/src/parseFhir/configurations/config_MedicationDispense.ini (5)

Learnt from: thomasyopes
PR: #4165
File: packages/data-transformation/fhir-to-csv/src/parseFhir/configurations/config_DocumentReference.ini:1-188
Timestamp: 2025-07-17T19:57:25.199Z
Learning: In packages/data-transformation/fhir-to-csv/src/parseFhir/configurations/ directory, configuration files like config_DocumentReference.ini should be kept as-is with their hard-coded field mappings, as confirmed by user thomasyopes. The team prefers to maintain these static configuration files rather than implementing dynamic mapping generation.

Learnt from: keshavsaharia
PR: #4075
File: packages/core/src/external/surescripts/fhir/medication-request.ts:76-83
Timestamp: 2025-06-25T18:42:07.231Z
Learning: In packages/core/src/external/surescripts/fhir/medication-request.ts, the getDispenseNote and getDosageInstruction functions are intentionally identical because the dashboard requires both the note and dosageInstruction fields of MedicationRequest to be populated with detail.directions for proper note display functionality.

Learnt from: thomasyopes
PR: #3970
File: packages/api/src/external/ehr/athenahealth/command/write-back/medication.ts:17-17
Timestamp: 2025-06-06T16:45:31.832Z
Learning: The writeMedicationToChart function in packages/api/src/external/ehr/athenahealth/command/write-back/medication.ts returns a response that is not currently used by any consumers, so changes to its return type are not breaking changes in practice.

Learnt from: keshavsaharia
PR: #4045
File: packages/core/src/external/surescripts/fhir/coverage.ts:0-0
Timestamp: 2025-06-18T18:34:10.489Z
Learning: Coverage resources in Surescripts FHIR conversion are currently excluded from bundles to prevent skewing data lift metrics. The team plans to examine available insurance data thoroughly before including properly structured Coverage resources with mandatory FHIR R4 elements like beneficiary and payor references.

Learnt from: thomasyopes
PR: #3891
File: packages/api/src/routes/internal/ehr/patient.ts:142-142
Timestamp: 2025-06-20T15:35:00.546Z
Learning: In the EHR resource diff contribution system, there are no "invalid" resource types. The system is designed to gracefully handle any resource type string by attempting to find a corresponding bundle, and if no bundle exists for that resource type, it will simply exit early rather than throwing an error.

packages/utils/src/snowflake/bulk-insert-patients.ts (19)

Learnt from: keshavsaharia
PR: #3885
File: packages/core/src/external/sftp/surescripts/client.ts:1-134
Timestamp: 2025-05-28T19:23:20.179Z
Learning: In packages/core/src/external/sftp/surescripts/client.ts, the standalone getPatientLoadFileName function intentionally omits the transmission ID from the filename, which differs from the class method getPatientLoadFileName. This difference in filename generation is expected behavior.

Learnt from: leite08
PR: #3814
File: packages/api/src/routes/internal/medical/patient-consolidated.ts:141-174
Timestamp: 2025-05-20T21:26:26.804Z
Learning: The functionality introduced in packages/api/src/routes/internal/medical/patient-consolidated.ts is planned to be refactored in downstream PR #3857, including improvements to error handling and validation.

Learnt from: RamilGaripov
PR: #3962
File: packages/api/src/sequelize/migrations/2025-06-17_00_create-cohort-tables.ts:50-56
Timestamp: 2025-06-17T20:03:18.931Z
Learning: In the Metriport codebase, the patient table ID is defined as DataTypes.STRING in the database schema, not UUID. Foreign key references to patient.id should use DataTypes.STRING to maintain consistency.

Learnt from: thomasyopes
PR: #4165
File: packages/api/src/routes/internal/analytics-platform/index.ts:28-28
Timestamp: 2025-07-18T09:33:29.581Z
Learning: In packages/api/src/routes/internal/analytics-platform/index.ts, the patientId parameter handling is intentionally inconsistent: the /fhir-to-csv and /fhir-to-csv/transform endpoints require patientId (using getFromQueryOrFail) for single patient processing, while the /fhir-to-csv/batch endpoint treats patientId as optional (using getFromQuery) because it's designed to run over all patients when no specific patient ID is provided.

Learnt from: RamilGaripov
PR: #3676
File: packages/core/src/command/hl7v2-subscriptions/hl7v2-to-fhir-conversion/shared.ts:1-10
Timestamp: 2025-04-16T00:25:25.196Z
Learning: The circular dependency between shared.ts (importing getPatientIdsOrFail) and adt/utils.ts (using unpackPidFieldOrFail) will be addressed in a follow-up PR.

Learnt from: leite08
PR: #3412
File: packages/api/src/models/medical/patient-import-mapping.ts:16-41
Timestamp: 2025-04-26T22:17:45.248Z
Learning: The patient_import_mapping table already has a unique constraint on the fields "job_id", "row_number", and "data_pipeline_request_id" defined in the migration file 2025-04-24_02_create-patient_import_mapping-table.ts, which ensures data integrity.

Learnt from: thomasyopes
PR: #3427
File: packages/core/src/external/ehr/api/sync-patient.ts:16-55
Timestamp: 2025-03-11T20:42:46.516Z
Learning: In the patient synchronization architecture, the flow follows this pattern: (1) ehr-sync-patient-cloud.ts sends messages to an SQS queue, (2) the ehr-sync-patient Lambda consumes these messages, and (3) the Lambda uses the syncPatient function to make the API calls to process the patient data.

Learnt from: leite08
PR: #4034
File: packages/core/src/command/patient-import/patient-or-record-failed.ts:19-30
Timestamp: 2025-06-16T17:03:23.069Z
Learning: In patient import error handling functions like setPatientOrRecordFailed, the team prefers Promise.all over Promise.allSettled to ensure atomic all-or-nothing behavior - if either the job update or patient record update fails, the entire operation should fail rather than allowing partial success.

Learnt from: keshavsaharia
PR: #3885
File: packages/utils/src/surescripts/generate-patient-load-file.ts:92-94
Timestamp: 2025-05-30T01:11:35.300Z
Learning: In V0 of the Surescripts implementation (packages/utils/src/surescripts/generate-patient-load-file.ts), there is always only 1 facility ID, so the current facility_ids parsing logic with substring operations is sufficient and additional validation is not necessary.

Learnt from: leite08
PR: #3953
File: packages/core/src/command/consolidated/search/fhir-resource/tests/search-consolidated-setup.ts:28-30
Timestamp: 2025-06-03T21:02:23.374Z
Learning: The makePatient() function returns PatientWithId type which requires the id field to be present, ensuring patient.id is never undefined.

Learnt from: lucasdellabella
PR: #4098
File: packages/api/src/routes/medical/dtos/tcm-encounter-dto.ts:10-22
Timestamp: 2025-06-27T01:50:14.227Z
Learning: In packages/api/src/routes/medical/dtos/tcm-encounter-dto.ts, the patient fields patientData.firstName, patientData.lastName, and patientData.dob are guaranteed to be non-nullable values, so defensive null/undefined checks are not needed when accessing these fields.

Learnt from: leite08
PR: #3814
File: packages/core/src/command/consolidated/search/fhir-resource/search-semantic.ts:36-36
Timestamp: 2025-05-19T13:53:09.828Z
Learning: In the getConsolidatedPatientData function from @metriport/core/command/consolidated/consolidated-get, only the patient parameter is required. Other parameters like requestId, resources, dateFrom, dateTo, fromDashboard, and forceDataFromFhir are all optional.

Learnt from: RamilGaripov
PR: #3976
File: packages/api/src/routes/medical/patient.ts:541-543
Timestamp: 2025-06-18T21:05:22.256Z
Learning: In packages/api/src/routes/medical/patient.ts, inline schema definitions like cohortIdSchema are acceptable and don't need to be moved to separate schema files when the user prefers to keep them inline.

Learnt from: thomasyopes
PR: #3383
File: packages/api/src/routes/ehr-internal/canvas/internal/patient.ts:10-24
Timestamp: 2025-03-05T21:42:32.929Z
Learning: The processPatientsFromAppointments endpoint in the Canvas API is intentionally designed to be asynchronous (fire-and-forget pattern). The endpoint returns a 200 OK response immediately after initiating the process, without waiting for it to complete. Error handling is handled through the .catch() method with processAsyncError rather than awaiting the promise.

Learnt from: thomasyopes
PR: #3788
File: packages/api/src/external/ehr/shared/command/create-resource-diff-bundles.ts:0-0
Timestamp: 2025-05-08T19:16:10.258Z
Learning: Avoid using Promise.all for parallel execution of functions that access the cxMapping table, as this may lead to database locking issues. Serial execution (using for loops with await) is preferred for such operations even if it might be slower for large datasets.

Learnt from: thomasyopes
PR: #4061
File: packages/api/src/external/ehr/shared/job/bundle/create-resource-diff-bundles/run-job.ts:34-47
Timestamp: 2025-06-19T22:44:49.393Z
Learning: In packages/api/src/external/ehr/shared/job/bundle/create-resource-diff-bundles/run-job.ts, the team prefers to keep refreshEhrBundles operations grouped using fire-and-forget pattern rather than awaiting all operations with Promise.allSettled(). This allows individual refresh operations to run independently without blocking the job runner.

Learnt from: keshavsaharia
PR: #4099
File: packages/utils/src/surescripts/find-largest.ts:84-84
Timestamp: 2025-06-26T06:26:17.643Z
Learning: In utility scripts for Surescripts analysis tools, when orgName parameter is undefined, prefer using date timestamps as fallbacks for directory naming rather than using cxId or other identifiers. This creates unique directories for each run and helps with organization.

Learnt from: thomasyopes
PR: #3427
File: packages/lambdas/src/ehr-sync-patient.ts:31-33
Timestamp: 2025-03-11T19:12:22.472Z
Learning: In this codebase, a specific logging pattern is used: console.log is used for early logging before context is available, then a prefixed logger (created with prefixedLog) is initialized after parsing data to include contextual information (like ehr, cxId, practiceId, patientId) in all subsequent log messages.

Learnt from: thomasyopes
PR: #3997
File: packages/api/src/command/job/patient/status/cancel.ts:30-34
Timestamp: 2025-06-12T23:25:08.139Z
Learning: cancelPatientJob is intentionally non-idempotent: if the current job status is already cancelled, the function must throw a BadRequestError rather than silently succeeding.

packages/data-transformation/fhir-to-csv/src/parseFhir/configurations/config_MedicationAdministration.ini (6)

Learnt from: thomasyopes
PR: #4165
File: packages/data-transformation/fhir-to-csv/src/parseFhir/configurations/config_DocumentReference.ini:1-188
Timestamp: 2025-07-17T19:57:25.199Z
Learning: In packages/data-transformation/fhir-to-csv/src/parseFhir/configurations/ directory, configuration files like config_DocumentReference.ini should be kept as-is with their hard-coded field mappings, as confirmed by user thomasyopes. The team prefers to maintain these static configuration files rather than implementing dynamic mapping generation.

Learnt from: thomasyopes
PR: #3970
File: packages/api/src/external/ehr/athenahealth/command/write-back/medication.ts:17-17
Timestamp: 2025-06-06T16:45:31.832Z
Learning: The writeMedicationToChart function in packages/api/src/external/ehr/athenahealth/command/write-back/medication.ts returns a response that is not currently used by any consumers, so changes to its return type are not breaking changes in practice.

Learnt from: keshavsaharia
PR: #3885
File: packages/core/src/external/surescripts/schema/load.ts:65-71
Timestamp: 2025-05-30T03:42:53.380Z
Learning: In Surescripts schema files like packages/core/src/external/surescripts/schema/load.ts, the key property in field definitions is only used for direct one-to-one field mappings. When multiple fields are derived from the same source property but with different transformations (like extracting date vs. time portions), they should not have key properties as they represent transformations rather than direct mappings.

Learnt from: thomasyopes
PR: #3936
File: packages/core/src/external/ehr/athenahealth/index.ts:472-482
Timestamp: 2025-06-06T15:37:44.571Z
Learning: In AthenaHealth medication creation, even if medication statements have the same start/stop dates, they should not be deduplicated because there may be relevant differences beyond dates (dosage, administration routes, prescriber info, etc.) that make each statement unique and worth creating separately.

Learnt from: thomasyopes
PR: #3981
File: packages/core/src/external/ehr/canvas/index.ts:1376-1379
Timestamp: 2025-06-10T20:14:27.998Z
Learning: Canvas EHR only supports the MedicationStatement statuses "active", "stopped", and "entered-in-error"; it does not accept "completed".

Learnt from: keshavsaharia
PR: #4075
File: packages/core/src/external/surescripts/fhir/medication-request.ts:76-83
Timestamp: 2025-06-25T18:42:07.231Z
Learning: In packages/core/src/external/surescripts/fhir/medication-request.ts, the getDispenseNote and getDosageInstruction functions are intentionally identical because the dashboard requires both the note and dosageInstruction fields of MedicationRequest to be populated with detail.directions for proper note display functionality.

packages/data-transformation/fhir-to-csv/src/parseNdjsonBundle/parseNdjsonBundle.py (7)

Learnt from: thomasyopes
PR: #4165
File: packages/api/src/routes/internal/analytics-platform/index.ts:28-28
Timestamp: 2025-07-18T09:33:29.581Z
Learning: In packages/api/src/routes/internal/analytics-platform/index.ts, the patientId parameter handling is intentionally inconsistent: the /fhir-to-csv and /fhir-to-csv/transform endpoints require patientId (using getFromQueryOrFail) for single patient processing, while the /fhir-to-csv/batch endpoint treats patientId as optional (using getFromQuery) because it's designed to run over all patients when no specific patient ID is provided.

Learnt from: leite08
PR: #3814
File: packages/api/src/routes/internal/medical/patient-consolidated.ts:141-174
Timestamp: 2025-05-20T21:26:26.804Z
Learning: The functionality introduced in packages/api/src/routes/internal/medical/patient-consolidated.ts is planned to be refactored in downstream PR #3857, including improvements to error handling and validation.

Learnt from: thomasyopes
PR: #4165
File: packages/data-transformation/fhir-to-csv/src/parseFhir/parseFhir.py:94-175
Timestamp: 2025-07-17T19:58:54.738Z
Learning: The user thomasyopes prefers to keep the parseFhir.py file in packages/data-transformation/fhir-to-csv/src/parseFhir/ as-is, declining code style improvements suggested by static analysis tools like Ruff.

Learnt from: RamilGaripov
PR: #3962
File: packages/api/src/models/medical/cohort.ts:0-0
Timestamp: 2025-06-17T23:14:46.017Z
Learning: In the Metriport codebase, cohort deletion is handled with application-level validation that prevents deletion of cohorts if there are any patients assigned to them. This is why the cohortId foreign key constraint in patient_cohort table does not have cascade delete behavior - it's intentionally asymmetric. The patientId has cascade delete to clean up cohort assignments when patients are deleted, but cohortId lacks cascade delete because cohorts with patients should not be deletable at all according to business rules.

Learnt from: thomasyopes
PR: #3771
File: packages/api/src/external/ehr/healthie/command/process-patients-from-appointments.ts:121-128
Timestamp: 2025-05-01T16:15:20.823Z
Learning: In the Healthie EHR system, patient IDs are globally unique across practices, making patientId alone sufficient for deduplication without needing to include practiceId.

Learnt from: leite08
PR: #3412
File: packages/api/src/models/medical/patient-import-mapping.ts:16-41
Timestamp: 2025-04-26T22:17:45.248Z
Learning: The patient_import_mapping table already has a unique constraint on the fields "job_id", "row_number", and "data_pipeline_request_id" defined in the migration file 2025-04-24_02_create-patient_import_mapping-table.ts, which ensures data integrity.

Learnt from: thomasyopes
PR: #4165
File: packages/data-transformation/fhir-to-csv/src/parseFhir/configurations/config_DocumentReference.ini:1-188
Timestamp: 2025-07-17T19:57:25.199Z
Learning: In packages/data-transformation/fhir-to-csv/src/parseFhir/configurations/ directory, configuration files like config_DocumentReference.ini should be kept as-is with their hard-coded field mappings, as confirmed by user thomasyopes. The team prefers to maintain these static configuration files rather than implementing dynamic mapping generation.

packages/data-transformation/fhir-to-csv/main.py (18)

Learnt from: thomasyopes
PR: #4165
File: packages/data-transformation/fhir-to-csv/main.py:0-0
Timestamp: 2025-07-18T09:53:38.906Z
Learning: In packages/data-transformation/fhir-to-csv/main.py, S3 upload operations should be allowed to hard fail without try-catch blocks. The team prefers immediate failure over error handling and continuation for S3 upload operations in the FHIR-to-CSV transformation process.

Learnt from: thomasyopes
PR: #4165
File: packages/api/src/routes/internal/analytics-platform/index.ts:28-28
Timestamp: 2025-07-18T09:33:29.581Z
Learning: In packages/api/src/routes/internal/analytics-platform/index.ts, the patientId parameter handling is intentionally inconsistent: the /fhir-to-csv and /fhir-to-csv/transform endpoints require patientId (using getFromQueryOrFail) for single patient processing, while the /fhir-to-csv/batch endpoint treats patientId as optional (using getFromQuery) because it's designed to run over all patients when no specific patient ID is provided.

Learnt from: leite08
PR: #3814
File: packages/api/src/routes/internal/medical/patient-consolidated.ts:141-174
Timestamp: 2025-05-20T21:26:26.804Z
Learning: The functionality introduced in packages/api/src/routes/internal/medical/patient-consolidated.ts is planned to be refactored in downstream PR #3857, including improvements to error handling and validation.

Learnt from: thomasyopes
PR: #4165
File: packages/data-transformation/fhir-to-csv/main.py:0-0
Timestamp: 2025-07-18T09:53:47.448Z
Learning: In packages/data-transformation/fhir-to-csv/main.py, the user thomasyopes prefers to let Snowflake operations (create_temp_tables, copy_into_temp_table, append_temp_tables, rename_temp_tables) hard fail rather than adding explicit try-catch error handling blocks.

Learnt from: thomasyopes
PR: #4165
File: packages/data-transformation/fhir-to-csv/src/setupSnowflake/setupSnowflake.py:93-97
Timestamp: 2025-07-19T14:04:34.757Z
Learning: In packages/data-transformation/fhir-to-csv/src/setupSnowflake/setupSnowflake.py, the user thomasyopes accepts SQL injection risk in the append_job_tables function's DELETE query where patient_id is directly interpolated, based on their assessment that they control all inputs to the system.

Learnt from: thomasyopes
PR: #4165
File: packages/data-transformation/fhir-to-csv/src/setupSnowflake/setupSnowflake.py:0-0
Timestamp: 2025-07-18T09:54:43.598Z
Learning: In packages/data-transformation/fhir-to-csv/src/setupSnowflake/setupSnowflake.py, the user thomasyopes prefers to log and swallow exceptions in the copy_into_temp_table function rather than re-raising them. This allows the data transformation job to continue processing other files even if one file fails to copy to Snowflake, supporting partial success scenarios in batch data processing.

Learnt from: leite08
PR: #3859
File: packages/utils/src/consolidated/check-patient-consolidated.ts:28-32
Timestamp: 2025-05-26T16:24:10.245Z
Learning: For utility scripts in the packages/utils directory, developers are expected to update the main constants at the top of the file (like bundleFilePath, cxId, patientId) rather than implementing command-line argument parsing or environment variable reading. This is the preferred pattern for configuring utility scripts in this codebase.

Learnt from: keshavsaharia
PR: #3885
File: packages/core/src/external/sftp/surescripts/client.ts:1-134
Timestamp: 2025-05-28T19:23:20.179Z
Learning: In packages/core/src/external/sftp/surescripts/client.ts, the standalone getPatientLoadFileName function intentionally omits the transmission ID from the filename, which differs from the class method getPatientLoadFileName. This difference in filename generation is expected behavior.

Learnt from: thomasyopes
PR: #4165
File: packages/data-transformation/fhir-to-csv/src/parseFhir/parseFhir.py:94-175
Timestamp: 2025-07-17T19:58:54.738Z
Learning: The user thomasyopes prefers to keep the parseFhir.py file in packages/data-transformation/fhir-to-csv/src/parseFhir/ as-is, declining code style improvements suggested by static analysis tools like Ruff.

Learnt from: leite08
PR: #3412
File: packages/api/.env.test:12-12
Timestamp: 2025-04-24T01:57:35.097Z
Learning: The PATIENT_IMPORT_BUCKET_NAME environment variable is specific to the API package and doesn't need to be added to other packages' environment configurations.

Learnt from: lucasdellabella
PR: #3855
File: packages/core/src/command/hl7-notification/hl7-notification-webhook-sender-direct.ts:88-104
Timestamp: 2025-05-16T02:10:01.792Z
Learning: When logging file paths or other strings that might contain patient identifiers like patientId or cxId, these should be masked or redacted (e.g., using regex replacement) to avoid logging Protected Health Information (PHI) to CloudWatch or error tracking systems like Sentry, ensuring HIPAA compliance.

Learnt from: leite08
PR: #3612
File: packages/core/src/command/conversion-result/send-multiple-conversion-results.ts:48-62
Timestamp: 2025-04-05T01:03:51.989Z
Learning: Patient IDs in this codebase are UUIDs and don't contain PII, so they're safe to include in logs and error contexts.

Learnt from: RamilGaripov
PR: #3962
File: packages/api/src/sequelize/migrations/2025-06-17_00_create-cohort-tables.ts:50-56
Timestamp: 2025-06-17T20:03:18.931Z
Learning: In the Metriport codebase, the patient table ID is defined as DataTypes.STRING in the database schema, not UUID. Foreign key references to patient.id should use DataTypes.STRING to maintain consistency.

Learnt from: thomasyopes
PR: #3883
File: packages/api/src/external/ehr/shared/job/bundle/create-resource-diff-bundles/start-jobs-across-ehrs.ts:38-46
Timestamp: 2025-05-28T19:42:17.301Z
Learning: Athena patient IDs necessarily contain a "." character - this is a required format for all valid Athena patient IDs.

Learnt from: thomasyopes
PR: #3973
File: packages/core/src/external/ehr/athenahealth/shared.ts:8-16
Timestamp: 2025-06-06T18:17:23.577Z
Learning: For internal functions in the Metriport codebase where inputs are controlled by the development team, input validation for required parameters like cxId and practiceId is not needed and should be avoided to keep functions simple.

Learnt from: RamilGaripov
PR: #3675
File: packages/core/src/command/hl7v2-subscriptions/hl7v2-to-fhir-converter.ts:58-59
Timestamp: 2025-04-23T08:43:00.803Z
Learning: In the Metriport codebase, cxId is expected to be a UUID format (containing only alphanumeric characters and hyphens), which doesn't require URL encoding when used in query parameters.

Learnt from: leite08
PR: #3869
File: packages/core/src/external/fhir/export/string/resources/medication-request.ts:93-94
Timestamp: 2025-05-21T16:49:07.877Z
Learning: Raw ISO date strings in FHIR resources (like authoredOn in MedicationRequest) should be preserved as-is without additional formatting when converting resources to string representations.

Learnt from: thomasyopes
PR: #4165
File: packages/data-transformation/fhir-to-csv/src/setupSnowflake/setupSnowflake.py:54-58
Timestamp: 2025-07-18T09:54:01.790Z
Learning: The user thomasyopes prefers to keep the setupSnowflake.py file in packages/data-transformation/fhir-to-csv/src/setupSnowflake/ as-is, declining suggestions for additional error handling around configuration file parsing.

packages/data-transformation/fhir-to-csv/src/utils/database.py (9)

Learnt from: thomasyopes
PR: #4165
File: packages/data-transformation/fhir-to-csv/src/setupSnowflake/setupSnowflake.py:93-97
Timestamp: 2025-07-19T14:04:34.757Z
Learning: In packages/data-transformation/fhir-to-csv/src/setupSnowflake/setupSnowflake.py, the user thomasyopes accepts SQL injection risk in the append_job_tables function's DELETE query where patient_id is directly interpolated, based on their assessment that they control all inputs to the system.

Learnt from: RamilGaripov
PR: #3962
File: packages/api/src/sequelize/migrations/2025-06-17_00_create-cohort-tables.ts:50-56
Timestamp: 2025-06-17T20:03:18.931Z
Learning: In the Metriport codebase, the patient table ID is defined as DataTypes.STRING in the database schema, not UUID. Foreign key references to patient.id should use DataTypes.STRING to maintain consistency.

Learnt from: thomasyopes
PR: #4165
File: packages/data-transformation/fhir-to-csv/main.py:0-0
Timestamp: 2025-07-18T09:53:47.448Z
Learning: In packages/data-transformation/fhir-to-csv/main.py, the user thomasyopes prefers to let Snowflake operations (create_temp_tables, copy_into_temp_table, append_temp_tables, rename_temp_tables) hard fail rather than adding explicit try-catch error handling blocks.

Learnt from: thomasyopes
PR: #4165
File: packages/api/src/routes/internal/analytics-platform/index.ts:28-28
Timestamp: 2025-07-18T09:33:29.581Z
Learning: In packages/api/src/routes/internal/analytics-platform/index.ts, the patientId parameter handling is intentionally inconsistent: the /fhir-to-csv and /fhir-to-csv/transform endpoints require patientId (using getFromQueryOrFail) for single patient processing, while the /fhir-to-csv/batch endpoint treats patientId as optional (using getFromQuery) because it's designed to run over all patients when no specific patient ID is provided.

Learnt from: leite08
PR: #3412
File: packages/api/src/models/medical/patient-import-mapping.ts:16-41
Timestamp: 2025-04-26T22:17:45.248Z
Learning: The patient_import_mapping table already has a unique constraint on the fields "job_id", "row_number", and "data_pipeline_request_id" defined in the migration file 2025-04-24_02_create-patient_import_mapping-table.ts, which ensures data integrity.

Learnt from: leite08
PR: #3814
File: packages/api/src/routes/internal/medical/patient-consolidated.ts:141-174
Timestamp: 2025-05-20T21:26:26.804Z
Learning: The functionality introduced in packages/api/src/routes/internal/medical/patient-consolidated.ts is planned to be refactored in downstream PR #3857, including improvements to error handling and validation.

Learnt from: thomasyopes
PR: #4165
File: packages/data-transformation/fhir-to-csv/src/setupSnowflake/setupSnowflake.py:0-0
Timestamp: 2025-07-18T09:54:43.598Z
Learning: In packages/data-transformation/fhir-to-csv/src/setupSnowflake/setupSnowflake.py, the user thomasyopes prefers to log and swallow exceptions in the copy_into_temp_table function rather than re-raising them. This allows the data transformation job to continue processing other files even if one file fails to copy to Snowflake, supporting partial success scenarios in batch data processing.

Learnt from: leite08
PR: #3412
File: packages/api/src/models/medical/patient-import-mapping.ts:16-41
Timestamp: 2025-04-26T22:17:45.248Z
Learning: The patient_import_mapping table already has a unique constraint on the fields "job_id", "row_number", and "data_pipeline_request_id" defined in the migration file, which prevents duplicates in the mapping between patient imports and data pipeline requests.

Learnt from: thomasyopes
PR: #4165
File: packages/data-transformation/fhir-to-csv/src/parseFhir/parseFhir.py:94-175
Timestamp: 2025-07-17T19:58:54.738Z
Learning: The user thomasyopes prefers to keep the parseFhir.py file in packages/data-transformation/fhir-to-csv/src/parseFhir/ as-is, declining code style improvements suggested by static analysis tools like Ruff.

packages/data-transformation/fhir-to-csv/src/setupSnowflake/setupSnowflake.py (10)