> What's the back story behind this PR?
When we introduced JobSets into Argo Workflows, we had to implement a
few workarounds (hacks) in the JobSet manifest construction. One
major hack we **had to implement** was replacing
`inputs.parameters.workerCount` in the JobSet manifest handed to the Argo
Workflow template with an un-quoted templating string:
`{{=asInt(inputs.parameters.workerCount)}}`. This
templating string gets quoted when we dump the manifest to a dictionary,
but leaving it quoted would mean the parameter is cast as a string in
the JobSet manifest, and the JobSet controller would reject the resource
creation.
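To illustrate the quoting problem, here is a minimal sketch (not the actual Metaflow code; the `manifest` shape and the quote-stripping step are assumptions for illustration):

```python
import yaml

# The Argo templating expression that needs to land un-quoted in the manifest.
TEMPLATE = "{{=asInt(inputs.parameters.workerCount)}}"

# Hypothetical, minimal stand-in for the compiled JobSet manifest.
manifest = {"spec": {"replicas": TEMPLATE}}

dumped = yaml.dump(manifest)
# PyYAML must quote the value because a plain YAML scalar cannot start
# with '{', so the dump contains:
#   replicas: '{{=asInt(inputs.parameters.workerCount)}}'
# A quoted value reaches the JobSet controller as a string, and the
# controller rejects a string where an integer is required.

# One possible workaround: strip the quotes after dumping, so Argo's
# templating substitutes a bare integer before the resource is created.
unquoted = dumped.replace("'%s'" % TEMPLATE, TEMPLATE)
```

The un-quoted form is not valid YAML on its own, which is exactly why only the Argo controller (which templates before parsing) can handle it.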
Argo, on the other hand, accepts storing the "malformed" manifest when we
submit the workflow template and trigger it via the k8s resource-creation
API. When we trigger via the k8s API, the Argo controller
applies the templating to the manifest before executing it. Argo does
not accept such malformed manifests when we submit workflows via the
argo CLI, i.e. it crashes with errors like `Failed to submit workflow:
templates.TestEventFlow.tasks.pre-eval-foreach-parallel
templates.pre-eval-foreach-parallel.tasks.train
templates.train.resource.manifest must be a valid yaml`
When sensors were originally introduced into Metaflow, we used the
[ArgoWorkflowTrigger](https://pkg.go.dev/github.com/argoproj/argo-events@v1.7.6/pkg/apis/sensor/v1alpha1#ArgoWorkflowTrigger)
as a means to trigger workflows. The ArgoWorkflowTrigger uses the Argo
CLI under the hood to submit workflows. This breaks when the
workflow template contains manifests compiled for JobSets: the sensor
crashes with logs like:
```log
{"level":"info","ts":1753152856.4700022,"logger":"argo-events.sensor","caller":"argo-workflow/argo-workflow.go:207","msg":"submitting command &exec.Cmd{Path:\"/usr/local/bin/argo\", Args:[]string{\"argo\", \"-n\", \"jobs-default\", \"submit\", \"/tmp/testeventflow-2654745760\", \"--kubeconfig\", \"/etc/obp/data-plane-kubeconfig/config\"}, Env:[]string(nil), Dir:\"\", Stdin:io.Reader(nil), Stdout:(*os.File)(0xc00012c008), Stderr:(*os.File)(0xc00012c010), ExtraFiles:[]*os.File(nil), SysProcAttr:(*syscall.SysProcAttr)(nil), Process:(*os.Process)(nil), ProcessState:(*os.ProcessState)(nil), ctx:context.Context(nil), Err:error(nil), Cancel:(func() error)(nil), WaitDelay:0, childIOFiles:[]io.Closer(nil), parentIOPipes:[]io.Closer(nil), goroutine:[]func() error(nil), goroutineErr:(<-chan error)(nil), ctxResult:(<-chan exec.ctxResult)(nil), createdByStack:[]uint8(nil), lookPathErr:error(nil)}","sensorName":"testeventflow","triggerName":"testeventflow","triggerType":"ArgoWorkflow"}
time="2025-07-22T02:54:16.560Z" level=fatal msg="Failed to submit workflow: templates.TestEventFlow.tasks.pre-eval-foreach-parallel templates.pre-eval-foreach-parallel.tasks.train templates.train.resource.manifest must be a valid yaml"
```
Since the sensor deployments for workflows using `@parallel` were broken,
we had to ship this commit/PR to resolve the issue.
> How did we fix it?
We fixed the issue with two patches:
1. Patching Argo Workflows in Metaflow: Argo Events also offers
[StandardK8sTrigger](https://pkg.go.dev/github.com/argoproj/argo-events/pkg/apis/sensor/v1alpha1#StandardK8STrigger),
which creates K8s resources directly. Since we are still creating
workflows, and are only switching the abstraction that triggers them,
this change should ideally (hopefully) be a no-op. From a
Metaflow POV, all the labels we propagate down are mostly captured.
2. Patching JobSets: JobSets used to dump the manifest to JSON when we
compiled it for Argo Workflows. Because we were using JSON, the manifest
became malformed when the workflow was triggered by an event. We now
dump to YAML instead.
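Why the dump format matters can be sketched as follows (an illustrative example with a hypothetical minimal manifest, not the actual JobSet compilation code): JSON has no unquoted strings, so the templating expression can never be emitted bare without producing invalid JSON, whereas a YAML dump can be post-processed to carry the bare expression.

```python
import json

TEMPLATE = "{{=asInt(inputs.parameters.workerCount)}}"
manifest = {"spec": {"replicas": TEMPLATE}}  # hypothetical minimal manifest

as_json = json.dumps(manifest)
# In JSON the expression is necessarily emitted as a quoted string:
#   "replicas": "{{=asInt(inputs.parameters.workerCount)}}"
# so the parameter is stuck being a string.

# Stripping the quotes to un-quote the expression breaks the document:
broken = as_json.replace('"%s"' % TEMPLATE, TEMPLATE)
# json.loads(broken) raises json.JSONDecodeError, whereas the
# equivalent post-processed YAML can still be stored by Argo and
# templated by the controller before the JobSet is created.
```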