Description
Describe the bug
When utilizing custom resource health checks with glob patterns, Lua scripts are inconsistently invoked, resulting in the application controller randomly failing to report resource health.
Upon investigation of the codebase, I identified the underlying issue:
To locate the health Lua script associated with a resource, the controller iterates over all resource override keys until it finds a key matching the resource group/kind.
The primary problem is that Go map keys are not ordered. Consequently, if a resource group/kind matches multiple resource override glob keys, the controller selects whichever matching key the iteration happens to visit first, and that order changes from one lookup to the next. As a result, a different Lua script (or none at all, if the selected override has no script) may be used on each evaluation.
While it might seem improbable to define multiple globs matching the same group/kind resource, another issue arises. The controller adds default resource overrides under the */* key.
Due to this additional resource override, which matches all resources, the iteration over map keys may randomly match this key instead of the user-defined key. The */* resource override lacks a Lua script, causing the resource to have no health check evaluated.
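The behavior described above can be reproduced outside Argo CD with a small sketch. It is not the controller's code: firstMatch and countFirstMatches are hypothetical helpers, the override map is reduced to key and script string, and the standard library's path.Match stands in for Argo CD's glob.Match (its wildcard semantics differ slightly, but they agree for these keys). The counts vary between runs, which is the point.

```go
package main

import (
	"fmt"
	"path"
)

// firstMatch mimics the lookup described above: it ranges over the map and
// returns the first key whose glob matches target. path.Match is a stdlib
// stand-in (an assumption) for Argo CD's glob.Match.
func firstMatch(overrides map[string]string, target string) string {
	for key := range overrides {
		if ok, err := path.Match(key, target); err == nil && ok {
			return key
		}
	}
	return ""
}

// countFirstMatches repeats the lookup and tallies which key won each time.
func countFirstMatches(overrides map[string]string, target string, trials int) map[string]int {
	counts := make(map[string]int)
	for i := 0; i < trials; i++ {
		counts[firstMatch(overrides, target)]++
	}
	return counts
}

func main() {
	overrides := map[string]string{
		"*/*":            "",                  // default override, no Lua script
		"*.upbound.io/*": "hs = {} return hs", // placeholder user-defined script
	}
	// Both keys match this group/kind, so the winner depends on map order.
	counts := countFirstMatches(overrides, "s3.aws.upbound.io/Bucket", 1000)
	for key, n := range counts {
		fmt.Printf("%q selected %d times\n", key, n)
	}
}
```

Every lookup that lands on */* corresponds to a reconciliation in which the resource gets no health check.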
To address this, I propose a fix that involves sorting keys by string length (from longer key to shorter):
func GetWildcardConfigMapKey(vm VM, gvk schema.GroupVersionKind) string {
	gvkKeyToMatch := GetConfigMapKey(gvk)
	resourceOverridesKeys := make([]string, len(vm.ResourceOverrides))
	i := 0
	for k := range vm.ResourceOverrides {
		resourceOverridesKeys[i] = k
		i++
	}
	// Sort the keys so that the longest key is first
	sort.Slice(resourceOverridesKeys, func(i, j int) bool {
		return len(resourceOverridesKeys[i]) > len(resourceOverridesKeys[j])
	})
	for _, key := range resourceOverridesKeys {
		if glob.Match(key, gvkKeyToMatch) {
			return key
		}
	}
	return ""
}
By sorting the keys, the */* key is consistently evaluated last. This not only addresses the issue with the random evaluation but also provides an added benefit: it allows the definition of a health check script with a glob and its subsequent override with a longer glob. For instance, defining *.upbound.io/* and *.aws.upbound.io/* health checks ensures that the health evaluation for the s3.aws.upbound.io group will use the *.aws.upbound.io/* Lua script.
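The longest-key-first selection can be checked in isolation with the following minimal sketch. It makes several assumptions that are not part of the proposal: selectOverrideKey is a hypothetical standalone function, the overrides are a plain map from key to script, and path.Match from the standard library stands in for Argo CD's glob.Match. It also adds a lexicographic tie-breaker, since sort.Slice is not stable and two matching keys of equal length would otherwise still be ordered by the random map iteration.

```go
package main

import (
	"fmt"
	"path"
	"sort"
)

// selectOverrideKey returns the longest override key whose glob matches
// target, or "" if none matches. path.Match is a stand-in (an assumption)
// for Argo CD's glob.Match.
func selectOverrideKey(overrides map[string]string, target string) string {
	keys := make([]string, 0, len(overrides))
	for k := range overrides {
		keys = append(keys, k)
	}
	// Longest key first; equal lengths fall back to lexicographic order
	// (an addition to the proposal) so the result is fully deterministic.
	sort.Slice(keys, func(i, j int) bool {
		if len(keys[i]) != len(keys[j]) {
			return len(keys[i]) > len(keys[j])
		}
		return keys[i] < keys[j]
	})
	for _, key := range keys {
		if ok, err := path.Match(key, target); err == nil && ok {
			return key
		}
	}
	return ""
}

func main() {
	overrides := map[string]string{
		"*/*":                "", // default override, no Lua script
		"*.upbound.io/*":     "generic upbound script (placeholder)",
		"*.aws.upbound.io/*": "aws-specific script (placeholder)",
	}
	targets := []string{
		"s3.aws.upbound.io/Bucket",
		"foo.upbound.io/Thing",
		"apps/Deployment",
	}
	for _, target := range targets {
		fmt.Printf("%s -> %s\n", target, selectOverrideKey(overrides, target))
	}
	// Prints:
	// s3.aws.upbound.io/Bucket -> *.aws.upbound.io/*
	// foo.upbound.io/Thing -> *.upbound.io/*
	// apps/Deployment -> */*
}
```

Key length is only a proxy for specificity, so a short exact key could still lose to a longer glob; whether that matters depends on which keys are actually configured.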
Version
argocd: v2.9.3+6eba5be
BuildDate: 2023-12-01T23:05:50Z
GitCommit: 6eba5be864b7e031871ed7698f5233336dfe75c7
GitTreeState: clean
GoVersion: go1.21.3
Compiler: gc
Platform: linux/amd64