Remove the need for `Environment` classes entirely #2928

Currently a registered flow is parametrized by the following:

- An `Environment`, containing an `Executor` and some other configuration, specifying how and where to run the flow
- A `Storage` object specifying where to get the flow from
- An `Environment.metadata` dict specifying general metadata about the flow. By convention, the `image` key in that mapping tells the agent which image to use.
- The `Agent` that kicks off the flow run.

This is a *lot* of different concepts, and the different options can be confusing to beginners and experienced users alike.

Say I'm a user who wants to run a flow as a K8s Job. I see there's a `KubernetesJobEnvironment` - that sounds like what I want. But there's also a `KubernetesAgent`. Do I need both? Only one? Why would I pick one or the other? Too many options and combinations of options can be overwhelming.

With #2805 we deprecated (or plan to deprecate) all of the environments except for:

- `LocalEnvironment`
- `FargateTaskEnvironment`
- `KubernetesJobEnvironment`

I believe that by making the agents more configurable we can drop the last two entirely, so that environment configuration is no longer something a user needs to think about.

The `FargateTaskEnvironment` and `KubernetesJobEnvironment` exist to allow users to configure a k8s job or fargate task without exposing that configuration to Prefect Cloud (the configuration is stored with the flow in the `Storage` object, not in Cloud's DB). If users only want a single job spec for all k8s jobs, they can use a `LocalEnvironment` with the `KubernetesAgent` and things work out fine.

A different way to expose configurable specs to users would be to let them customize how the agent generates these specs from the flow run information. Currently this is hardcoded into each agent (template in a few flow-specific things, a few user-configured things, then kick off the k8s job/fargate task). We might make this a configurable callable, to let users fully customize this process, and also provide a few builtin versions for common patterns.

A possible proposal for the `KubernetesAgent`:

- We add a configurable callback `generate_job_spec(flow_run) -> V1JobSpec`. This defaults to an implementation similar to the one provided already.
- If they want to customize the job spec in a simple-ish way, we can let them override the template used by the builtin callback. All jobs would still look the same, but would be customized in some static way (say adding a static `Secret` to all pods).
- If a user wants super-custom-behavior, they can configure their own callback to be used. In the callback they'd have access to the `metadata` dict associated with the flow, which can contain whatever info they want.
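To make the three options above concrete, here is a minimal sketch of what the agent side might look like. None of this is existing Prefect API: `FlowRun`, `SketchKubernetesAgent`, `make_default_generate_job_spec`, the `FLOW_RUN_ID` variable, and the `my-secret` name are all hypothetical, and a plain dict stands in for a real `V1JobSpec` so the example runs without the Kubernetes client.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional

JobSpec = Dict[str, Any]  # plain dict standing in for a V1JobSpec


@dataclass
class FlowRun:
    """Hypothetical stand-in for the flow run info an agent receives."""
    id: str
    image: str
    metadata: Dict[str, Any] = field(default_factory=dict)


# Hypothetical builtin template; the agent fills in the flow-specific parts.
DEFAULT_TEMPLATE: JobSpec = {
    "template": {
        "spec": {
            "restartPolicy": "Never",
            "containers": [{"name": "flow", "image": None, "env": []}],
        }
    }
}


def make_default_generate_job_spec(
    template: JobSpec = DEFAULT_TEMPLATE,
) -> Callable[[FlowRun], JobSpec]:
    """Build the builtin callback, optionally with a user-supplied template."""

    def generate_job_spec(flow_run: FlowRun) -> JobSpec:
        spec = copy.deepcopy(template)  # never mutate the shared template
        container = spec["template"]["spec"]["containers"][0]
        container["image"] = flow_run.image
        container["env"].append({"name": "FLOW_RUN_ID", "value": flow_run.id})
        return spec

    return generate_job_spec


class SketchKubernetesAgent:
    """Hypothetical agent that delegates spec generation to a callback."""

    def __init__(
        self, generate_job_spec: Optional[Callable[[FlowRun], JobSpec]] = None
    ) -> None:
        self.generate_job_spec = generate_job_spec or make_default_generate_job_spec()

    def build_job(self, flow_run: FlowRun) -> JobSpec:
        # A real agent would submit this spec to the cluster; here we return it.
        return self.generate_job_spec(flow_run)


run = FlowRun(id="abc", image="example/image:latest")

# Option 1: the default callback.
default_spec = SketchKubernetesAgent().build_job(run)

# Option 2: same callback, custom template that adds a static Secret to all pods.
secret_template = copy.deepcopy(DEFAULT_TEMPLATE)
secret_template["template"]["spec"]["containers"][0]["envFrom"] = [
    {"secretRef": {"name": "my-secret"}}
]
secret_agent = SketchKubernetesAgent(make_default_generate_job_spec(secret_template))
secret_spec = secret_agent.build_job(run)
```

Option 3 is then just passing any other callable with the same signature to the agent.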

With the third option, they could implement behaviors like:

- Having a flow switch between a few different job specs based on a value in `metadata` (e.g. `metadata["job_profile"]`). I can see this being a common option, and maybe one we'd like to support out-of-the-box.
- Provide additional values to use in the templates via the `metadata` mapping. Perhaps `metadata["memory"]` or `metadata["cpu"]`.
- The custom callback could poll an external service where things are stored to allow for full flexibility. I've found this flexibility valuable when working on [dask-gateway](gateway.dask.org/) or JupyterHub extensions - power users often value having hooks into the infrastructure they're deploying.
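As a rough illustration of the first two behaviors, a user-written callback might look something like the sketch below. The profile names, the resource values, `DEFAULT_IMAGE`, and the shape of `flow_run` (a plain dict with a `metadata` key) are assumptions made for the example, and plain dicts again stand in for real job specs.

```python
import copy
from typing import Any, Dict

# Hypothetical fallback image, used when metadata has no "image" key.
DEFAULT_IMAGE = "example/default-image:latest"


def _profile(cpu: str, memory: str) -> Dict[str, Any]:
    """Build a minimal job spec dict with the given resource requests."""
    return {
        "template": {
            "spec": {
                "restartPolicy": "Never",
                "containers": [
                    {
                        "name": "flow",
                        "resources": {"requests": {"cpu": cpu, "memory": memory}},
                    }
                ],
            }
        }
    }


# Hypothetical named job specs a flow can pick via metadata["job_profile"].
JOB_PROFILES: Dict[str, Dict[str, Any]] = {
    "default": _profile(cpu="500m", memory="512Mi"),
    "high-memory": _profile(cpu="500m", memory="8Gi"),
}


def generate_job_spec(flow_run: Dict[str, Any]) -> Dict[str, Any]:
    """Pick a profile from the flow's metadata, then apply per-flow overrides."""
    metadata = flow_run.get("metadata", {})

    profile = metadata.get("job_profile", "default")
    if profile not in JOB_PROFILES:
        raise ValueError(f"Unknown job profile: {profile!r}")

    spec = copy.deepcopy(JOB_PROFILES[profile])  # keep the profiles pristine
    container = spec["template"]["spec"]["containers"][0]
    container["image"] = metadata.get("image", DEFAULT_IMAGE)

    # Optional overrides supplied through metadata["cpu"] / metadata["memory"].
    requests = container["resources"]["requests"]
    for key in ("cpu", "memory"):
        if key in metadata:
            requests[key] = str(metadata[key])

    return spec


spec = generate_job_spec(
    {
        "id": "1",
        "metadata": {
            "job_profile": "high-memory",
            "cpu": "2",
            "image": "example/img:1",
        },
    }
)
```

Raising on an unknown profile is one possible choice; silently falling back to the default profile would also be reasonable, depending on how strict the deployment wants to be.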

This got a bit rambling, so in quick summary:

- I think we can let users customize the k8s jobs/fargate tasks on the agent rather than in the `Environment`, by providing callbacks and more configuration on the agent side and encouraging users to template off of values they store in the flow's `metadata` dict.
- I think we can keep this customization simple for simple use cases, but provide the possibility for complicated behaviors for users that need them.
- By removing the need for an `Environment` class at all, we remove a concept that many users find confusing. Prefect has a lot of concepts, and simplifying things can help users get started more quickly and easily.

If we were to go down this route, an incremental path might be:
- [ ] Expose these kinds of configurations to the `FargateAgent` and `KubernetesAgent`
- [ ] Slowly deprecate all but the default `LocalEnvironment`
- [ ] Move configuration that lives on the `Environment` onto the `Flow` instead. For example, we might configure an `Executor` with the flow instead of on the `Environment`. This provides nice parity, since `flow.run()` and `flow.register()` could both "use" the same configured `Executor`.
- [ ] Remove the `Environment` entirely.
