Report
A ScaledJob with one or more failing triggers swallows the error (it is only logged at verbosity level 1).
Yet the status reports that the ScaledJob is "defined correctly and is ready to scaling", while it is not:
```yaml
status:
  conditions:
  - message: ScaledJob is defined correctly and is ready to scaling
    reason: ScaledJobReady
    status: "True"
    type: Ready
  - message: Scaling is not performed because triggers are not active
    reason: ScalerNotActive
    status: "False"
    type: Active
```
Logically the Active condition is "False" (the scaling metric could not be read), but nothing tells the user that the trigger is failing.
At least an event is emitted:
```
75s  Warning  KEDAScalerFailed  scaledjob/sample-sj  error requesting metrics endpoint: Get "http://failing-source.com/metrics": dial tcp [::1]:8027: connect: connection refused
```
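Until the status surfaces the failure, this event is the only signal. It can be filtered with standard kubectl field selectors (the `reason` field is selectable on Event objects):

```shell
# List only the scaler-failure events in the ScaledJob's namespace
kubectl get events --field-selector reason=KEDAScalerFailed -n default
```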
Expected Behavior
When a trigger for a ScaledJob fails, the error should not be suppressed; it should be surfaced to the user both in the ScaledJob status and in the operator log.
Actual Behavior
When a trigger for a ScaledJob fails, the status still says "ScaledJob is defined correctly and is ready to scaling".
If all triggers of the ScaledJob are failing (most commonly, the failing trigger is the only one), the ScaledJob can never become active because its scaling metric can never be read, and the status only says "Scaling is not performed because triggers are not active".
It does not say that scaling is skipped because of the trigger failure.
Steps to Reproduce the Problem
- Create a ScaledJob with a failing trigger:
```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: sample-sj
  namespace: default
spec:
  failedJobsHistoryLimit: 5
  jobTargetRef:
    template:
      spec:
        containers:
        - args:
          - /bin/sh
          - -c
          - sleep 30
          image: busybox
          name: busybox-worker
        restartPolicy: Never
  maxReplicaCount: 10
  pollingInterval: 30
  successfulJobsHistoryLimit: 3
  triggers:
  - metadata:
      targetValue: "1"
      url: http://failing-source.com/metrics
      valueLocation: value
    type: metrics-api
```
- Check the events:

```shell
kubectl get events
```
- Check the ScaledJob status
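The conditions shown above can be inspected directly (standard kubectl jsonpath output; names match the manifest in this report):

```shell
# Print only the status conditions of the ScaledJob
kubectl get scaledjob sample-sj -n default -o jsonpath='{.status.conditions}'
```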
Logs from KEDA operator
No logs, because the error is swallowed and printed only at an elevated log level.
KEDA Version
2.14.0
Kubernetes Version
1.29
Platform
None
Scaler Details
No response
Anything else?
No response