Create Python script pipes component docs ADOPT-1618 #31574

Open
wants to merge 3 commits into
base: master

Conversation

yuhan
Contributor

@yuhan yuhan commented Aug 5, 2025

Summary & Motivation

Followed the structure of the other integration docs and added some Pipes-specific advanced usage.

How I Tested These Changes

https://08-05-create-python-script-pipes-component-docs-adopt-1618.archive.dagster-docs.io/guides/build/components/integrations/python-script-component-tutorial

and unit test for snapshots

Changelog

Insert changelog entry or delete this section.

@yuhan yuhan requested a review from a team as a code owner August 5, 2025 23:37
Contributor Author

yuhan commented Aug 5, 2025

Contributor

@neverett neverett left a comment

Left some suggestions, but this looks great overall, thanks! I recommend putting this in /guides/build/external-pipelines instead, like the "Build pipelines with Spark Connect or Databricks Connect" doc.

@@ -0,0 +1,133 @@
---
title: 'Dagster & Python scripts with components'

Suggested change
title: 'Dagster & Python scripts with components'
title: Build pipelines with Python scripts

sidebar_position: 403
---

Dagster provides a ready-to-use `PythonScriptComponent` which can be used to execute Python scripts as assets in your Dagster project. This component runs your Python scripts in a subprocess using Dagster Pipes, allowing you to leverage existing Python scripts while benefiting from Dagster's orchestration and observability features. This guide will walk you through how to use the `PythonScriptComponent` to execute your Python scripts.

Suggested change
Dagster provides a ready-to-use `PythonScriptComponent` which can be used to execute Python scripts as assets in your Dagster project. This component runs your Python scripts in a subprocess using Dagster Pipes, allowing you to leverage existing Python scripts while benefiting from Dagster's orchestration and observability features. This guide will walk you through how to use the `PythonScriptComponent` to execute your Python scripts.
Dagster provides a `PythonScriptComponent` that you can use to execute Python scripts as assets in your Dagster project. This component runs your Python scripts in a subprocess using [Dagster Pipes](/guides/build/external-pipelines), allowing you to leverage existing Python scripts while benefiting from Dagster's orchestration and observability features. This guide will walk you through how to use the `PythonScriptComponent` to execute your Python scripts.
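
To make the subprocess model concrete, here is a stdlib-only sketch of the general pattern: a parent process launches a script and reads structured messages back from it. This is an illustration under stated assumptions, not how Dagster Pipes is implemented. The names `CHILD_SCRIPT` and `run_script`, and the JSON message shape, are hypothetical.

```python
import json
import subprocess
import sys

# Hypothetical child script: emits one JSON message per line on stdout.
CHILD_SCRIPT = """
import json
print(json.dumps({"type": "log", "message": "processing sales data"}))
print(json.dumps({"type": "materialization", "metadata": {"total_revenue": 1250.0}}))
"""


def run_script(source: str) -> list[dict]:
    """Run Python source in a subprocess and parse its JSON-lines output."""
    result = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the script exits non-zero
    )
    return [json.loads(line) for line in result.stdout.splitlines() if line.strip()]


messages = run_script(CHILD_SCRIPT)
for message in messages:
    print(message["type"], message)
```

The orchestrator stays in charge of launching and observing the script, while the script itself needs no knowledge of the orchestrator beyond the message format.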


<CliInvocationExample contents="source ../.venv/bin/activate" />

## 2. Scaffold a Python script component

Suggested change
## 2. Scaffold a Python script component
## 2. Scaffold a Python script component definition


## 2. Scaffold a Python script component

Now that you have a Dagster project, you can scaffold a Python script component. You'll need to provide a name for your component. In this example, we'll create a component that will execute a Python script to process sales data and generate a revenue report.

Suggested change
Now that you have a Dagster project, you can scaffold a Python script component. You'll need to provide a name for your component. In this example, we'll create a component that will execute a Python script to process sales data and generate a revenue report.
Now that you have a Dagster project, you can scaffold a Python script component definition. In this example, we'll create a component definition called `generate_revenue_report` that will execute a Python script to process sales data and generate a revenue report.


<CliInvocationExample path="docs_snippets/docs_snippets/guides/components/integrations/python-script-component/3-tree.txt" />

## 3. Create your Python script

Suggested change
## 3. Create your Python script
## 3. Create a Python script (if needed)

With Dagster Pipes, you can:

- **Log structured information**: Use `context.log.info()` to send logs directly to Dagster
- **Report asset metadata**: Use `context.report_asset_materialization()` to attach rich metadata that appears in the Dagster UI

Suggested change
- **Report asset metadata**: Use `context.report_asset_materialization()` to attach rich metadata that appears in the Dagster UI
- **Report asset metadata**: Use <PyObject section="libraries" module="dagster-pipes" object="PipesContext.report_asset_materialization()" displayText="context.report_asset_materialization()" /> to attach rich metadata that appears in the Dagster UI.


- **Log structured information**: Use `context.log.info()` to send logs directly to Dagster
- **Report asset metadata**: Use `context.report_asset_materialization()` to attach rich metadata that appears in the Dagster UI
- **Handle errors**: Exception information is automatically captured and reported to Dagster

Suggested change
- **Handle errors**: Exception information is automatically captured and reported to Dagster
- **Handle errors**: Exception information is automatically captured and reported to Dagster.
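
As a rough illustration of the capabilities listed above, a script run through Pipes might look like the following sketch. It assumes the `dagster-pipes` package is installed in the script's environment. The sample rows, the `summarize_sales` helper, and the metadata keys are hypothetical and are not taken from the docs under review.

```python
from typing import Any

# Hypothetical sample data standing in for a real sales source.
SAMPLE_ROWS = [
    {"quantity": 2, "unit_price": 10.0},
    {"quantity": 1, "unit_price": 5.5},
]


def summarize_sales(rows: list[dict[str, Any]]) -> dict[str, Any]:
    """Compute report metadata from sales rows (plain Python, no Dagster needed)."""
    total = sum(row["quantity"] * row["unit_price"] for row in rows)
    return {"row_count": len(rows), "total_revenue": round(total, 2)}


def main() -> None:
    # Imported lazily so the helper above can be tested without dagster-pipes.
    from dagster_pipes import open_dagster_pipes

    with open_dagster_pipes() as context:
        # Log lines are forwarded to the Dagster run.
        context.log.info(f"Processing {len(SAMPLE_ROWS)} sales rows")
        # Metadata is attached to the asset materialization.
        context.report_asset_materialization(metadata=summarize_sales(SAMPLE_ROWS))
        # Exceptions need no special handling here; per the list above,
        # they are captured and reported automatically.


# In a real script, finish with:
#     if __name__ == "__main__":
#         main()
```

Keeping the business logic in a plain function such as `summarize_sales` means it can be unit tested on its own, with the Pipes calls confined to `main()`.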


### Orchestrate multiple Python scripts

You can define multiple Python script components in a single `defs.yaml` file using the `---` separator syntax. This allows you to run different scripts for different assets:

Suggested change
You can define multiple Python script components in a single `defs.yaml` file using the `---` separator syntax. This allows you to run different scripts for different assets:
You can define multiple Python script component instances in a single `defs.yaml` file using the `---` separator syntax. This allows you to run different scripts for different assets:

language="yaml"
/>

Each component instance runs independently and can execute different Python scripts. This approach is useful when you have multiple related data processing tasks that should be organized together but run separately.

Suggested change
Each component instance runs independently and can execute different Python scripts. This approach is useful when you have multiple related data processing tasks that should be organized together but run separately.
Each component instance runs independently and can execute different Python scripts. This approach is useful when you have multiple related data processing tasks that should be organized together, but run separately.


### Automate Python scripts

You can configure when assets should be automatically materialized using automation conditions:

Suggested change
You can configure when assets should be automatically materialized using automation conditions:
You can configure when assets should be automatically materialized using [declarative automation](/guides/automate/declarative-automation) conditions:
