Create Python script pipes component docs ADOPT-1618 #31574
base: master
This stack of pull requests is managed by Graphite. Learn more about stacking. |
Deploy preview for dagster-docs ready! Preview available at https://dagster-docs-jp8212r7k-elementl.vercel.app Direct link to changed pages: |
Left some suggestions, but this looks great overall, thanks! I recommend putting this in /guides/build/external-pipelines instead, like the "Build pipelines with Spark Connect or Databricks Connect" doc.
@@ -0,0 +1,133 @@ | |||
--- | |||
title: 'Dagster & Python scripts with components' |
title: 'Dagster & Python scripts with components' | |
title: Build pipelines with Python scripts |
sidebar_position: 403 | ||
--- | ||
|
||
Dagster provides a ready-to-use `PythonScriptComponent` which can be used to execute Python scripts as assets in your Dagster project. This component runs your Python scripts in a subprocess using Dagster Pipes, allowing you to leverage existing Python scripts while benefiting from Dagster's orchestration and observability features. This guide will walk you through how to use the `PythonScriptComponent` to execute your Python scripts. |
Dagster provides a ready-to-use `PythonScriptComponent` which can be used to execute Python scripts as assets in your Dagster project. This component runs your Python scripts in a subprocess using Dagster Pipes, allowing you to leverage existing Python scripts while benefiting from Dagster's orchestration and observability features. This guide will walk you through how to use the `PythonScriptComponent` to execute your Python scripts. | |
Dagster provides a `PythonScriptComponent` that you can use to execute Python scripts as assets in your Dagster project. This component runs your Python scripts in a subprocess using [Dagster Pipes](/guides/build/external-pipelines), allowing you to leverage existing Python scripts while benefiting from Dagster's orchestration and observability features. This guide will walk you through how to use the `PythonScriptComponent` to execute your Python scripts. |
|
||
<CliInvocationExample contents="source ../.venv/bin/activate" /> | ||
|
||
## 2. Scaffold a Python script component |
## 2. Scaffold a Python script component | |
## 2. Scaffold a Python script component definition |
|
||
## 2. Scaffold a Python script component | ||
|
||
Now that you have a Dagster project, you can scaffold a Python script component. You'll need to provide a name for your component. In this example, we'll create a component that will execute a Python script to process sales data and generate a revenue report. |
Now that you have a Dagster project, you can scaffold a Python script component. You'll need to provide a name for your component. In this example, we'll create a component that will execute a Python script to process sales data and generate a revenue report. | |
Now that you have a Dagster project, you can scaffold a Python script component definition. In this example, we'll create a component definition called `generate_revenue_report` that will execute a Python script to process sales data and generate a revenue report. |
|
||
<CliInvocationExample path="docs_snippets/docs_snippets/guides/components/integrations/python-script-component/3-tree.txt" /> | ||
|
||
## 3. Create your Python script |
## 3. Create your Python script | |
## 3. Create a Python script (if needed) |
With Dagster Pipes, you can: | ||
|
||
- **Log structured information**: Use `context.log.info()` to send logs directly to Dagster | ||
- **Report asset metadata**: Use `context.report_asset_materialization()` to attach rich metadata that appears in the Dagster UI |
- **Report asset metadata**: Use `context.report_asset_materialization()` to attach rich metadata that appears in the Dagster UI | |
- **Report asset metadata**: Use <PyObject section="libraries" module="dagster-pipes" object="PipesContext.report_asset_materialization()" displayText="context.report_asset_materialization()" /> to attach rich metadata that appears in the Dagster UI. |
|
||
- **Log structured information**: Use `context.log.info()` to send logs directly to Dagster | ||
- **Report asset metadata**: Use `context.report_asset_materialization()` to attach rich metadata that appears in the Dagster UI | ||
- **Handle errors**: Exception information is automatically captured and reported to Dagster |
- **Handle errors**: Exception information is automatically captured and reported to Dagster | |
- **Handle errors**: Exception information is automatically captured and reported to Dagster. |
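
As a rough illustration of the Pipes features listed above, here is a minimal sketch of a script that such a component could run. The `summarize_sales` helper, the `SAMPLE_ROWS` data, and the metadata keys are hypothetical and do not come from this PR; `open_dagster_pipes`, `context.log.info()`, and `context.report_asset_materialization()` are provided by the `dagster-pipes` package. The `ImportError` fallback is an assumption added so that the script still runs on its own when `dagster-pipes` is not installed.

```python
# Hypothetical sales data; a real script would read this from a file or database.
SAMPLE_ROWS = [
    {"quantity": 2, "unit_price": 10.0},
    {"quantity": 1, "unit_price": 5.5},
]


def summarize_sales(rows):
    """Return the row count and total revenue for a list of sales rows."""
    total = sum(row["quantity"] * row["unit_price"] for row in rows)
    return {"row_count": len(rows), "total_revenue": round(total, 2)}


def main():
    summary = summarize_sales(SAMPLE_ROWS)
    try:
        # Imported lazily so the script stays usable without dagster-pipes.
        from dagster_pipes import open_dagster_pipes
    except ImportError:
        # Assumed fallback for standalone runs: just print the result.
        print(summary)
        return summary

    # An exception raised inside this block is reported back to Dagster.
    with open_dagster_pipes() as context:
        # Log lines sent through the context show up in the Dagster run logs.
        context.log.info(f"Processed {summary['row_count']} sales rows")
        # Attach the summary as metadata on the asset materialization.
        context.report_asset_materialization(metadata=summary)
    return summary


if __name__ == "__main__":
    main()
```

Calling `report_asset_materialization()` without an `asset_key` assumes the script is launched for a single asset. Keeping the computation in a plain function separate from the Pipes calls is one way to leave an existing script testable outside of Dagster.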
|
||
### Orchestrate multiple Python scripts | ||
|
||
You can define multiple Python script components in a single `defs.yaml` file using the `---` separator syntax. This allows you to run different scripts for different assets: |
You can define multiple Python script components in a single `defs.yaml` file using the `---` separator syntax. This allows you to run different scripts for different assets: | |
You can define multiple Python script component instances in a single `defs.yaml` file using the `---` separator syntax. This allows you to run different scripts for different assets: |
language="yaml" | ||
/> | ||
|
||
Each component instance runs independently and can execute different Python scripts. This approach is useful when you have multiple related data processing tasks that should be organized together but run separately. |
Each component instance runs independently and can execute different Python scripts. This approach is useful when you have multiple related data processing tasks that should be organized together but run separately. | |
Each component instance runs independently and can execute different Python scripts. This approach is useful when you have multiple related data processing tasks that should be organized together, but run separately. |
|
||
### Automate Python scripts | ||
|
||
You can configure when assets should be automatically materialized using automation conditions: |
You can configure when assets should be automatically materialized using automation conditions: | |
You can configure when assets should be automatically materialized using [declarative automation](/guides/automate/declarative-automation) conditions: |
Summary & Motivation
Followed the structure of the other integration docs and added some Pipes-specific advanced usage.
How I Tested These Changes
https://08-05-create-python-script-pipes-component-docs-adopt-1618.archive.dagster-docs.io/guides/build/components/integrations/python-script-component-tutorial
Also covered by the unit test for snapshots.
Changelog