-
Notifications
You must be signed in to change notification settings - Fork 598
fix: add pipeline_name to PipelineRun and change logging default #675
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Please make sure all the checkboxes are checked:
|
WalkthroughThis pull request introduces several modifications across the repository. A period was added to the pull request template for grammatical correctness. The add function’s import source in the API was updated, and the function logic was simplified by removing detailed file handling. The deprecated add_v2 file was removed. Pipeline run logging functions were updated to include a new Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant RunTasks
participant Logger
Client->>RunTasks: Initiates tasks with pipeline_name
RunTasks->>Logger: log_pipeline_run_start(pipeline_id, pipeline_name, dataset_id, data)
alt Tasks succeed
RunTasks->>Logger: log_pipeline_run_complete(pipeline_run_id, pipeline_id, pipeline_name, dataset_id, data)
else Tasks fail
RunTasks->>Logger: log_pipeline_run_error(pipeline_run_id, pipeline_id, pipeline_name, dataset_id, data, error)
end
Possibly related PRs
Poem
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
cognee/shared/logging_utils.py (1)
46-48
: Environment-based log level configurationThis is a good enhancement that adds support for configuring log levels via the environment variable
LOG_LEVEL
. This follows the twelve-factor app methodology for configuration.There's a minor typo in the variable name
loggin_name_mapping
(should belogging_name_mapping
).- loggin_name_mapping = logging.getLevelNamesMapping() + logging_name_mapping = logging.getLevelNamesMapping() - log_level = log_level if log_level else loggin_name_mapping[os.getenv("LOG_LEVEL", "INFO")] + log_level = log_level if log_level else logging_name_mapping[os.getenv("LOG_LEVEL", "INFO")]cognee/api/v1/add/add.py (2)
24-34
: Consider a more robust approach for the first_run logic.Attaching a static attribute to the function works but can be error-prone if the process restarts or the function is reloaded in certain contexts. A module-level or configuration-based flag might be more transparent.
46-47
: Consider using logger instead of print statements.Although printing the pipeline status can be helpful when debugging locally, using the logger would keep output consistent and more flexible for production environments.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
poetry.lock
is excluded by!**/*.lock
,!**/*.lock
pyproject.toml
is excluded by!**/*.toml
📒 Files selected for processing (11)
.github/pull_request_template.md
(1 hunks)cognee/api/v1/add/__init__.py
(1 hunks)cognee/api/v1/add/add.py
(1 hunks)cognee/api/v1/add/add_v2.py
(0 hunks)cognee/modules/pipelines/models/PipelineRun.py
(1 hunks)cognee/modules/pipelines/operations/log_pipeline_run_complete.py
(2 hunks)cognee/modules/pipelines/operations/log_pipeline_run_error.py
(2 hunks)cognee/modules/pipelines/operations/log_pipeline_run_start.py
(2 hunks)cognee/modules/pipelines/operations/run_tasks.py
(2 hunks)cognee/shared/logging_utils.py
(4 hunks)cognee/tasks/ingestion/ingest_data.py
(2 hunks)
💤 Files with no reviewable changes (1)
- cognee/api/v1/add/add_v2.py
🧰 Additional context used
🧬 Code Definitions (4)
cognee/api/v1/add/__init__.py (1)
cognee/api/v1/add/add.py (1)
add
(15-47)
cognee/modules/pipelines/operations/log_pipeline_run_error.py (1)
cognee/modules/pipelines/models/PipelineRun.py (2)
PipelineRun
(14-26)PipelineRunStatus
(8-11)
cognee/modules/pipelines/operations/run_tasks.py (3)
cognee/modules/pipelines/operations/log_pipeline_run_start.py (1)
log_pipeline_run_start
(8-35)cognee/modules/pipelines/operations/log_pipeline_run_complete.py (1)
log_pipeline_run_complete
(8-35)cognee/modules/pipelines/operations/log_pipeline_run_error.py (1)
log_pipeline_run_error
(8-41)
cognee/api/v1/add/add.py (2)
cognee/modules/pipelines/operations/run_tasks.py (1)
run_tasks
(272-297)cognee/tasks/ingestion/ingest_data.py (1)
ingest_data
(17-153)
🪛 Ruff (0.8.2)
cognee/api/v1/add/__init__.py
1-1: .add.add
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: docker-compose-test
🔇 Additional comments (28)
cognee/shared/logging_utils.py (5)
17-17
: Improved flexibility in log level configurationThe default value for
level
has been changed fromINFO
toNone
, which will allow for more dynamic log level determination based on environment variables or other configuration sources.
22-22
: Documentation update to reflect parameter changeThe documentation has been updated to accurately reflect that the default value for the
level
parameter is nowNone
.
35-35
: Improved flexibility in setup_logging configurationChanging the default value of
log_level
fromINFO
toNone
allows for more flexible configuration of logging levels through environment variables.
39-39
: Documentation update to reflect parameter changeThe documentation has been updated to accurately reflect the new default behavior of the
log_level
parameter.
141-151
: Suppression of SQLAlchemy warnings for higher log levelsGood addition of conditional warning filters to suppress specific SQLAlchemy warnings when the log level is set higher than WARNING. This helps reduce noise in logs when running in production environments where these warnings might not be relevant.
.github/pull_request_template.md (1)
7-7
: Grammar fix in DCO AffirmationAdded a period at the end of the sentence for proper punctuation.
cognee/modules/pipelines/models/PipelineRun.py (1)
23-23
: Added pipeline_name column to PipelineRun modelThe addition of the
pipeline_name
column is in line with the PR objectives and will enhance the logging and tracking capabilities of pipeline runs by providing a human-readable identifier.cognee/api/v1/add/__init__.py (1)
1-1
:✅ Verification successful
Updated import source for add function
The import has been changed from
add_v2
toadd
, which aligns with the removal of theadd_v2.py
file mentioned in the PR summary. This is a good change that clarifies the correct source of the function.Note: The static analysis tool flagged this import as unused, but this is likely a false positive as
__init__.py
files typically re-export symbols to make them available at the package level.
🏁 Script executed:
#!/bin/bash # Check if the add function is imported elsewhere in the codebase rg -l "from cognee.api.v1.add import add" --type pyLength of output: 97
Import Update Verified: Approved
The updated import source in
cognee/api/v1/add/__init__.py
has been validated. Theadd
function is correctly re-exported and its usage has been confirmed incognee/api/v1/add/routers/get_add_router.py
. This change properly replaces the removedadd_v2
and clarifies the correct source of the function. The false positive flag from the static analysis tool can be safely ignored.🧰 Tools
🪛 Ruff (0.8.2)
1-1:
.add.add
imported but unused; consider removing, adding to__all__
, or using a redundant alias(F401)
cognee/tasks/ingestion/ingest_data.py (2)
21-21
: Pipeline name change reflects functionality more accuratelyChanging the pipeline name from "file_load_from_filesystem" to "metadata_extraction_pipeline" better represents the actual purpose of this pipeline, which primarily extracts metadata from files rather than just loading them.
122-122
: Improved code readabilityAdding this blank line improves code readability by properly separating the permission-giving operations from the return statement.
cognee/modules/pipelines/operations/log_pipeline_run_start.py (2)
8-8
: Function signature updated to include pipeline_name parameterThe addition of the
pipeline_name
parameter aligns with the PR objective of adding pipeline name tracking to the system. This change complements the database model updates mentioned in the PR summary.
20-20
: Pipeline name properly stored in PipelineRun modelThe
pipeline_name
parameter is correctly passed to the PipelineRun constructor, ensuring the pipeline name gets stored in the database.cognee/modules/pipelines/operations/log_pipeline_run_error.py (3)
1-1
: Removed unused importGood cleanup by removing the unused
uuid4
import, maintaining code cleanliness.
9-15
: Updated function signature with pipeline_name parameterThe error logging function has been properly updated to include the
pipeline_name
parameter, maintaining consistency with the other pipeline logging functions.
25-25
: Pipeline name properly included in error logsThe
pipeline_name
is correctly passed to the PipelineRun constructor, ensuring error logs contain the pipeline name for better traceability.cognee/modules/pipelines/operations/log_pipeline_run_complete.py (3)
1-1
: Removed unused importRemoving the unused
uuid4
import improves code cleanliness.
9-9
: Added pipeline_name parameter for completion loggingThe completion logging function has been properly updated to include the
pipeline_name
parameter, completing the consistent implementation across all pipeline logging functions.
20-20
: Pipeline name included in completion logsThe
pipeline_name
is correctly passed to the PipelineRun constructor for completion logs, ensuring consistent tracking throughout the pipeline lifecycle.cognee/modules/pipelines/operations/run_tasks.py (3)
280-280
: Add pipeline_name logging looks good.This line ensures that the pipeline name is logged from the start of the process, which improves traceability. No issues found with this addition.
289-291
: Verify the argument passed to run_tasks_with_telemetry.Although these lines correctly finalize the pipeline run with the pipeline name, please check line 286 where
run_tasks_with_telemetry
is called withpipeline_id
instead ofpipeline_name
. This may be an unintentional mismatch.Would you like to confirm usage in line 286 with a quick repository search to ensure consistency?
294-296
: Properly capturing pipeline_name on error.Including pipeline_name in the error path helps diagnose issues quickly. The logic here is consistent with the rest of the code.
cognee/api/v1/add/add.py (7)
1-5
: Updated imports appear correct.These changes streamline the module references, and there are no apparent issues.
12-12
: UUID import for pipeline construction.Importing
uuid5
andNAMESPACE_OID
is appropriate for generating consistent dataset or pipeline identifiers.
16-16
: Updated type hints look good.Using lowercase
list[...]
is consistent with modern Python typing conventions.
20-20
: Comment clarifies database creation.This comment improves readability and communicates intent for setting up databases.
39-39
: Task list definition is concise and clear.Creating tasks for directory resolution and ingestion is straightforward and maintainable.
41-41
: Generating dataset_id with uuid5.Using dataset name to derive a stable UUID is a solid approach to maintain consistent IDs.
42-43
: Invoking run_tasks with a custom pipeline_name.Passing
"add_pipeline"
as the pipeline name clearly reflects the operation's nature. No concerns identified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
cognee/tests/test_library.py
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (33)
- GitHub Check: run_dynamic_steps_example_test / test
- GitHub Check: run_eval_framework_test / test
- GitHub Check: run_simple_example_test / test
- GitHub Check: Test on macos-15
- GitHub Check: run_notebook_test / test
- GitHub Check: Test on macos-15
- GitHub Check: run_multimedia_example_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: run_simple_example_test / test
- GitHub Check: run_networkx_metrics_test / test
- GitHub Check: Test on macos-15
- GitHub Check: Test on macos-13
- GitHub Check: Test on macos-13
- GitHub Check: run_notebook_test / test
- GitHub Check: Test on macos-13
- GitHub Check: run_notebook_test / test
- GitHub Check: Test on ubuntu-22.04
- GitHub Check: test
- GitHub Check: chromadb test
- GitHub Check: Test on ubuntu-22.04
- GitHub Check: test
- GitHub Check: test
- GitHub Check: Test on ubuntu-22.04
- GitHub Check: test
- GitHub Check: test
- GitHub Check: Test on ubuntu-22.04
- GitHub Check: test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: windows-latest
- GitHub Check: Build Cognee Backend Docker App Image
- GitHub Check: run_simple_example_test
- GitHub Check: docker-compose-test
Description
DCO Affirmation
I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.