
ScaledJob ignores failing trigger(s) error #5922

@josefkarasek

Description

Report

A ScaledJob with failing trigger(s) swallows the error (it is only printed at log level 1).

But the status still reports that the ScaledJob is "defined correctly and is ready to scaling", while it is not:

status:
  conditions:
  - message: ScaledJob is defined correctly and is ready to scaling
    reason: ScaledJobReady
    status: "True"
    type: Ready
  - message: Scaling is not performed because triggers are not active
    reason: ScalerNotActive
    status: "False"
    type: Active

The Active condition is logically false (we failed to read the scaling metric), but nothing in the conditions tells the user about the failing trigger.

At least an event is emitted:

75s         Warning   KEDAScalerFailed     scaledjob/sample-sj      error requesting metrics endpoint: Get "http://failing-source.com/metrics": dial tcp [::1]:8027: connect: connection refused
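The misleading part shows up as soon as the conditions are evaluated the way any consumer (for example `kubectl wait --for=condition=Ready`) would read them. A minimal sketch using the exact conditions reported above; `condition_status` is a hypothetical helper, not KEDA code:

```python
# Conditions exactly as reported in the ScaledJob status above.
conditions = [
    {"type": "Ready", "status": "True",
     "reason": "ScaledJobReady",
     "message": "ScaledJob is defined correctly and is ready to scaling"},
    {"type": "Active", "status": "False",
     "reason": "ScalerNotActive",
     "message": "Scaling is not performed because triggers are not active"},
]

def condition_status(conditions, cond_type):
    """Return the status string of a condition, as a consumer would read it."""
    for cond in conditions:
        if cond["type"] == cond_type:
            return cond["status"]
    return "Unknown"

# Ready reads "True" even though the only trigger cannot be queried,
# and no condition mentions the trigger failure at all.
print(condition_status(conditions, "Ready"))   # → True
print(condition_status(conditions, "Active"))  # → False
```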

Expected Behavior

When a trigger for a ScaledJob fails, the error should not be suppressed; it should be propagated to the user in the status and in the operator log.

Actual Behavior

When a trigger for a ScaledJob fails, the status says that "ScaledJob is defined correctly and is ready to scaling".
If all triggers for the ScaledJob are failing (or, more commonly, the failing trigger is the only one), the ScaledJob is never active, because it can never read its scaling metric, and the status says "Scaling is not performed because triggers are not active".

It does not say that scaling is not performed because of the trigger failure.
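One shape the expected behavior could take is a Ready condition that carries the scaler error instead of swallowing it. A sketch only, not KEDA's actual implementation; `ready_condition`, the reason string, and the message format are hypothetical:

```python
def ready_condition(scaler_errors):
    """Build a hypothetical Ready condition that propagates trigger errors
    instead of swallowing them."""
    if scaler_errors:
        return {
            "type": "Ready",
            "status": "False",
            "reason": "ScaledJobCheckFailed",
            "message": "ScaledJob has failing trigger(s): " + "; ".join(scaler_errors),
        }
    return {
        "type": "Ready",
        "status": "True",
        "reason": "ScaledJobReady",
        "message": "ScaledJob is defined correctly and is ready to scaling",
    }

# With a failing trigger the user would see the error in the status.
err = 'error requesting metrics endpoint: Get "http://failing-source.com/metrics": connection refused'
print(ready_condition([err])["status"])  # → False
```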

Steps to Reproduce the Problem

  1. Create a ScaledJob with a failing trigger:
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: sample-sj
  namespace: default
spec:
  failedJobsHistoryLimit: 5
  jobTargetRef:
    template:
      spec:
        containers:
        - args:
          - /bin/sh
          - -c
          - sleep 30
          image: busybox
          name: busybox-worker
        restartPolicy: Never
  maxReplicaCount: 10
  pollingInterval: 30
  successfulJobsHistoryLimit: 3
  triggers:
  - metadata:
      targetValue: "1"
      url: http://failing-source.com/metrics
      valueLocation: value
    type: metrics-api
  2. Check events: kubectl get events
  3. Check the ScaledJob status
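The failure itself is easy to reproduce outside KEDA: the metrics-api trigger simply cannot reach its endpoint. A small Python sketch that fetches the metric the way the scaler would and returns the error instead of discarding it (the `fetch_metric` helper and the closed local port are assumptions for illustration; the error text mirrors the issue's event):

```python
import urllib.request
import urllib.error

def fetch_metric(url, timeout=2):
    """Fetch the metric endpoint; return (body, None) on success or
    (None, error_message) on failure instead of swallowing the error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode(), None
    except (urllib.error.URLError, OSError) as exc:
        return None, f'error requesting metrics endpoint: Get "{url}": {exc}'

# Port 9 on localhost is almost always closed, so this reproduces the
# "connection refused" error reported in the KEDAScalerFailed event.
value, err = fetch_metric("http://127.0.0.1:9/metrics")
print(err)
```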

Logs from KEDA operator

None; the error is swallowed and printed only at an elevated log level.

KEDA Version

2.14.0

Kubernetes Version

1.29

Platform

None

Scaler Details

No response

Anything else?

No response

Labels: bug