Currently a registered flow is parametrized by the following:
- An Environment, containing an Executor and some other configuration, specifying how and where to run the flow
- A Storage object specifying where to get the flow from
- An Environment.metadata dict specifying general metadata about the flow. There's a convention that the image key in that mapping tells the agent which image to use
- The Agent that kicks off the flow run.
This is a lot of different concepts, and the different options can be confusing to beginners and experienced users alike.
Say I'm a user who wants to run a flow as a K8s Job. I see there's a KubernetesJobEnvironment - that sounds like what I want. But there's also a KubernetesAgent. Do I need both? Only one? Why would I pick one or the other? Too many options and combinations of options can be overwhelming.
With #2805 we deprecated (or plan to deprecate) all of the environments except for:
- LocalEnvironment
- FargateTaskEnvironment
- KubernetesJobEnvironment
I believe that by making the agents more configurable we can drop the last two entirely, so that environment configuration is no longer something a user needs to think about.
The FargateTaskEnvironment and KubernetesJobEnvironment exist to allow users to configure a k8s job or fargate task without exposing that configuration to Prefect Cloud (the configuration is stored with the flow in the Storage object, not in Cloud's DB). If users only want a single job spec for all k8s jobs, they can use a LocalEnvironment with the KubernetesAgent and things work out fine.
A different way to expose configurable specs to users would be to let them customize how the agent generates these specs from the flow run information. Currently this is hardcoded into each agent (template in a few flow-specific things, a few user-configured things, then kick off the k8s job/fargate task). We might make this a configurable callable, to let users fully customize this process, and also provide a few builtin versions for common patterns.
A possible proposal for the KubernetesAgent:
- We add a configurable callback generate_job_spec(flow_run) -> V1JobSpec. This defaults to an implementation similar to the one provided already.
- If they want to customize the job spec in a simple-ish way, we can let them override the template used by the builtin callback. All jobs would still look the same, but would be customized in some static way (say adding a static Secret to all pods).
- If a user wants super-custom-behavior, they can configure their own callback to be used. In the callback they'd have access to the metadata dict associated with the flow, which can contain whatever info they want.
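To make the three options above concrete, here is a rough sketch of what a pluggable callback could look like. Everything in it is an assumption for illustration: FlowRun, SketchKubernetesAgent, DEFAULT_TEMPLATE, make_default_generator and the FLOW_RUN_ID variable are hypothetical names, not existing Prefect APIs, and a plain dict stands in for a real V1JobSpec so the example runs without a Kubernetes client.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional

# Plain dict used in place of kubernetes.client.V1JobSpec (assumption, to
# keep the sketch dependency-free).
JobSpec = Dict[str, Any]


@dataclass
class FlowRun:
    """Hypothetical stand-in for the flow run info an agent receives."""
    id: str
    image: str
    metadata: Dict[str, Any] = field(default_factory=dict)


# Hypothetical builtin template; a user could supply their own (option 2).
DEFAULT_TEMPLATE: JobSpec = {
    "template": {
        "spec": {
            "containers": [{"name": "flow", "image": None, "env": []}],
            "restartPolicy": "Never",
        }
    }
}


def make_default_generator(
    template: JobSpec = DEFAULT_TEMPLATE,
) -> Callable[[FlowRun], JobSpec]:
    """Build the builtin callback around a (possibly user-supplied) template."""

    def generate_job_spec(flow_run: FlowRun) -> JobSpec:
        # Copy so the shared template is never mutated between flow runs.
        spec = copy.deepcopy(template)
        container = spec["template"]["spec"]["containers"][0]
        # Template in the flow-specific pieces.
        container["image"] = flow_run.image
        container["env"].append({"name": "FLOW_RUN_ID", "value": flow_run.id})
        return spec

    return generate_job_spec


class SketchKubernetesAgent:
    """Hypothetical agent that delegates spec generation to a callback."""

    def __init__(
        self, generate_job_spec: Optional[Callable[[FlowRun], JobSpec]] = None
    ) -> None:
        # Option 1: fall back to the builtin behavior.
        # Option 3: accept any user-provided callable.
        self.generate_job_spec = generate_job_spec or make_default_generator()

    def deploy_flow(self, flow_run: FlowRun) -> JobSpec:
        # A real agent would submit this to the cluster; here we return it.
        return self.generate_job_spec(flow_run)
```

In this sketch, the "static customization" case amounts to passing make_default_generator(my_template) to the agent, while the fully custom case passes any callable that takes a flow run and returns a spec.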
With the third option, they could implement behaviors like:
- Having a flow switch between a few different job specs based on a value in metadata (e.g. metadata["job_profile"]). I can see this being a common option, and maybe one we'd like to support out-of-the-box.
- Provide additional values to use in the templates via the metadata mapping. Perhaps metadata["memory"] or metadata["cpu"].
- The custom callback could poll an external service where things are stored to allow for full flexibility. I've found this flexibility valuable when working on dask-gateway or JupyterHub extensions - power users often value having hooks into the infrastructure they're deploying.
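As an illustration of the first two behaviors, a custom callback might look roughly like the following. This is only a sketch under assumptions: JOB_PROFILES, the profile names, the resource values and generate_job_spec_from_metadata are all made up for the example, and the spec is again a plain dict rather than a real V1JobSpec.

```python
import copy
from typing import Any, Dict

JobSpec = Dict[str, Any]

# Hypothetical named profiles an operator might define on the agent side.
JOB_PROFILES: Dict[str, Dict[str, Any]] = {
    "default": {"resources": {"requests": {"cpu": "500m", "memory": "512Mi"}}},
    "high-memory": {"resources": {"requests": {"cpu": "1", "memory": "8Gi"}}},
}


def generate_job_spec_from_metadata(image: str, metadata: Dict[str, Any]) -> JobSpec:
    """Pick a profile via metadata["job_profile"], then apply overrides."""
    profile_name = metadata.get("job_profile", "default")
    if profile_name not in JOB_PROFILES:
        raise ValueError(f"Unknown job profile: {profile_name!r}")

    # Copy so per-run overrides never leak into the shared profile.
    container = copy.deepcopy(JOB_PROFILES[profile_name])
    container.update(name="flow", image=image)

    # Optional per-flow overrides, e.g. metadata["memory"] or metadata["cpu"].
    requests = container["resources"]["requests"]
    for key in ("cpu", "memory"):
        if key in metadata:
            requests[key] = str(metadata[key])

    return {
        "template": {
            "spec": {"containers": [container], "restartPolicy": "Never"}
        }
    }
```

Rejecting unknown profile names rather than silently falling back is a design choice of this sketch; an agent could just as reasonably log a warning and use the default profile.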
This got a bit rambling, so in quick summary:
- I think we can let users customize the k8s jobs/fargate tasks on the agent rather than in the Environment by providing callbacks and more configuration on the agent side, and encouraging the use/templating off of things the user stores in the flow's metadata dict.
- I think we can keep this customization simple for simple use cases, but provide the possibility for complicated behaviors for users that need them.
- By removing the need for an Environment class at all, we remove a concept that many users find confusing. Prefect has a lot of concepts; simplifying things can help users get started more quickly and easily.
If we were to go down this route, an incremental path might be:
- Expose these kinds of configurations to the FargateAgent and KubernetesAgent
- Slowly deprecate all but the default LocalEnvironment
- Move configuration that lives on the Environment onto the Flow instead. For example, we might configure an Executor with the flow instead of on the Environment. This provides nice parity, since flow.run() and flow.register() could both "use" the same configured Executor.
- Remove the Environment entirely.