Description
I have KEDA v2.15.1 deployed on AKS (Kubernetes version 1.29.7) using workload identity.
My ScaledJob is triggered by Azure Event Hubs. The KEDA operator reports the error "unable to get unprocessedEventCount for metrics: unable to get checkpoint from storage: %!w(<nil>)".
The same setup worked fine with KEDA v2.14.2 on AKS (Kubernetes version 1.29.7) using workload identity.
The ScaledJob shows the following status:
Status:
  Conditions:
    Message:  Some triggers defined in ScaledJob are not working correctly
    Reason:   PartialTriggerError
    Status:   Unknown
    Type:     Ready
    Message:  Scaling is not performed because triggers are not active
    Reason:   ScalerNotActive
    Status:   False
    Type:     Active
    Status:   Unknown
    Type:     Fallback
    Status:   Unknown
    Type:     Paused
Events:
  Type     Reason              Age                    From           Message
  ----     ------              ----                   ----           -------
  Normal   KEDAScalersStarted  7m16s (x4 over 7m16s)  scale-handler  Scaler azure-eventhub is built.
  Normal   KEDAScalersStarted  7m16s                  scale-handler  Started scalers watch
  Normal   ScaledJobReady      7m16s                  keda-operator  ScaledJob is ready for scaling
  Warning  KEDAScalerFailed    7m16s (x2 over 7m16s)  scale-handler  unable to get runtimeInfo for metrics: context canceled
  Warning  KEDAScalerFailed    2m16s (x61 over 7m14s) scale-handler  unable to get unprocessedEventCount for metrics: unable to get checkpoint from storage: %!w(<nil>)
The KEDA operator pod log shows the following:
2024-08-16T12:57:17Z INFO scaleexecutor Scaling Jobs {"scaledJob.Name": "mydemo-scaledjob", "scaledJob.Namespace": "avalanche", "Number of running Jobs": 0}
2024-08-16T12:57:17Z INFO scaleexecutor Scaling Jobs {"scaledJob.Name": "mydemo-scaledjob", "scaledJob.Namespace": "avalanche", "Number of pending Jobs": 0}
2024-08-16T12:57:22Z ERROR scale_handler Error getting scaler metrics and activity, but continue {"scaledJob.Name": "mydemo-scaledjob", "Scaler": "*scalers.azureEventHubScaler:", "error": "unable to get unprocessedEventCount for metrics: unable to get checkpoint from storage: %!w(<nil>)"}
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).getScaledJobMetrics
/workspace/pkg/scaling/scale_handler.go:853
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).isScaledJobActive
/workspace/pkg/scaling/scale_handler.go:897
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).checkScalers
/workspace/pkg/scaling/scale_handler.go:262
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).startScaleLoop
/workspace/pkg/scaling/scale_handler.go:182
If I deploy KEDA v2.14.2 or v2.14.3 on top of v2.15.1 without changing anything else in my setup, everything starts working again, and the status of my ScaledJob returns to normal, as shown below.
Status:
  Conditions:
    Message:  ScaledJob is defined correctly and is ready to scaling
    Reason:   ScaledJobReady
    Status:   True
    Type:     Ready
    Message:  Scaling is not performed because triggers are not active
    Reason:   ScalerNotActive
    Status:   False
    Type:     Active
    Status:   Unknown
    Type:     Fallback
    Status:   Unknown
    Type:     Paused
Events:
  Type     Reason              Age                 From           Message
  ----     ------              ----                ----           -------
  Normal   KEDAScalersStarted  20m (x4 over 20m)   scale-handler  Scaler azure-eventhub is built.
  Normal   KEDAScalersStarted  20m                 scale-handler  Started scalers watch
  Normal   ScaledJobReady      20m                 keda-operator  ScaledJob is ready for scaling
  Warning  KEDAScalerFailed    20m (x2 over 20m)   scale-handler  unable to get runtimeInfo for metrics: context canceled
  Warning  KEDAScalerFailed    19m (x18 over 20m)  scale-handler  unable to get unprocessedEventCount for metrics: unable to get checkpoint from storage: %!w(<nil>)
  Normal   KEDAScalersStarted  16m (x2 over 16m)   scale-handler  Scaler azure-eventhub is built.
  Normal   KEDAScalersStarted  16m                 scale-handler  Started scalers watch
  Normal   ScaledJobReady      16m                 keda-operator  ScaledJob is ready for scaling
  Normal   KEDAJobsCreated     16m                 scale-handler  Created 1 jobs
  Normal   KEDAScalersStarted  14m (x2 over 14m)   scale-handler  Scaler azure-eventhub is built.
  Normal   KEDAScalersStarted  14m                 scale-handler  Started scalers watch
  Normal   KEDAJobsCreated     12m (x22 over 14m)  scale-handler  Created 0 jobs
Below is more information about my setup.
I deployed KEDA using the following commands:
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm upgrade keda kedacore/keda --install `
--namespace keda `
--version 2.15.1 `
--set serviceAccount.operator.create=true `
--set serviceAccount.operator.name=keda-operator `
--set podIdentity.azureWorkload.enabled=true `
--set podIdentity.azureWorkload.clientId=$(sys_aks_uai_client_id) `
--set podIdentity.azureWorkload.tenantId=$(tenantid)
The KEDA TriggerAuthentication is set up as follows:
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: av-keda-trigger-auth
  namespace: mynamespace
spec:
  podIdentity:
    provider: azure-workload
My scaled job triggers
triggers:
  - type: azure-eventhub
    metadata:
      consumerGroup: largevideogenerator
      unprocessedEventThreshold: "1"
      activationUnprocessedEventThreshold: "0"
      blobContainer: largevideogenerator-largevideogenerationrequired
      eventHubNamespace: myeventhubnamespace
      eventHubName: largevideogenerationrequired
      storageAccountName: mystoragename
      checkpointStrategy: blobMetadata
    authenticationRef:
      name: av-keda-trigger-auth
  - type: azure-eventhub
    metadata:
      consumerGroup: largevideogenerator
      unprocessedEventThreshold: "1"
      activationUnprocessedEventThreshold: "0"
      blobContainer: largevideogenerator-regeneratelargevideo
      eventHubNamespace: myeventhubnamespace
      eventHubName: regeneratelargevideo
      storageAccountName: mystoragename
      checkpointStrategy: blobMetadata
    authenticationRef:
      name: av-keda-trigger-auth
I can provide more information and logs if required.
In summary, this is what happens:
- A fresh setup of KEDA v2.15.1, with everything else identical, does not work and logs "unable to get unprocessedEventCount for metrics: unable to get checkpoint from storage: %!w(<nil>)" in the KEDA operator pod.
- In the failing v2.15.1 setup, if I deploy KEDA v2.14.2 or v2.14.3 over it, the triggers run and jobs are created as expected.
- A fresh setup of KEDA v2.14.2, with everything else identical, works fine.