Custom resource health checks randomly not evaluated when using glob pattern #16905

@duizabojul

Describe the bug

When custom resource health checks are keyed by glob patterns, their Lua scripts are not invoked consistently, so the application controller intermittently fails to report resource health.

Upon investigation of the codebase, I identified the underlying issue:

To locate the health Lua script associated with a resource, the controller iterates over all resource override keys until it finds a key matching the resource group/kind.

The primary problem is that Go map keys are not ordered. Consequently, if a resource group/kind matches multiple resource override glob keys, the controller picks whichever matching key the iteration happens to reach first. When those keys carry different Lua scripts, the script that gets used varies from one evaluation to the next.

While it might seem improbable to define multiple globs matching the same group/kind resource, another issue arises. The controller adds default resource overrides under the */* key.

Due to this additional resource override, which matches all resources, the iteration over map keys may randomly match this key instead of the user-defined key. The */* resource override lacks a Lua script, causing the resource to have no health check evaluated.
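The behaviour described above can be reproduced outside the controller with a minimal sketch. It assumes the standard library's `path.Match` as a stand-in for the controller's glob helper, and `firstMatch`, `matchingKeys`, and the example target `s3.aws.upbound.io/Bucket` are hypothetical names chosen for illustration, not Argo CD code.

```go
package main

import (
	"fmt"
	"path"
	"sort"
)

// matchingKeys returns every override key whose glob matches target, sorted
// lexicographically so the result is stable. path.Match is only a stand-in
// for the controller's glob helper.
func matchingKeys(overrides map[string]string, target string) []string {
	var keys []string
	for k := range overrides {
		if ok, err := path.Match(k, target); err == nil && ok {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	return keys
}

// firstMatch mimics a first-match-wins lookup over a map: it returns whichever
// matching key the (unordered) map iteration yields first.
func firstMatch(overrides map[string]string, target string) string {
	for k := range overrides {
		if ok, err := path.Match(k, target); err == nil && ok {
			return k
		}
	}
	return ""
}

func main() {
	overrides := map[string]string{
		// Default override added by the controller: no Lua script.
		"*/*": "",
		// User-defined override carrying a health script (placeholder text).
		"*.upbound.io/*": "-- health.lua",
	}
	target := "s3.aws.upbound.io/Bucket"

	fmt.Println("candidates:", matchingKeys(overrides, target))

	// Count which key wins over many lookups; the split varies between runs.
	seen := map[string]int{}
	for i := 0; i < 1000; i++ {
		seen[firstMatch(overrides, target)]++
	}
	fmt.Println("first-match counts:", seen)
}
```

Both keys are valid candidates for the same resource, so any lookup that returns the first match found during map iteration can land on the script-less `*/*` entry.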

To address this, I propose a fix that involves sorting keys by string length (from longer key to shorter):

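// GetWildcardConfigMapKey returns the longest resource override key whose glob
// matches the group/kind key of gvk, or "" if no key matches.
// Requires the "sort" package in addition to the file's existing imports.
func GetWildcardConfigMapKey(vm VM, gvk schema.GroupVersionKind) string {
	gvkKeyToMatch := GetConfigMapKey(gvk)

	// Map iteration order is random, so collect the keys into a slice first.
	resourceOverridesKeys := make([]string, 0, len(vm.ResourceOverrides))
	for k := range vm.ResourceOverrides {
		resourceOverridesKeys = append(resourceOverridesKeys, k)
	}

	// Sort the keys so that the longest key is first. Ties are broken
	// lexicographically so that equal-length keys are also ordered deterministically.
	sort.Slice(resourceOverridesKeys, func(i, j int) bool {
		a, b := resourceOverridesKeys[i], resourceOverridesKeys[j]
		if len(a) != len(b) {
			return len(a) > len(b)
		}
		return a < b
	})

	// Return the first (longest) key whose glob matches.
	for _, key := range resourceOverridesKeys {
		if glob.Match(key, gvkKeyToMatch) {
			return key
		}
	}

	return ""
}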

Sorting the keys guarantees that the */* key is evaluated last. Besides removing the randomness, this has an added benefit: a health check script defined under one glob can be overridden by a script defined under a longer, more specific glob. For instance, if both *.upbound.io/* and *.aws.upbound.io/* health checks are defined, resources in the s3.aws.upbound.io group will use the *.aws.upbound.io/* Lua script.
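The precedence described here can be checked with a standalone sketch. It again assumes `path.Match` as a stand-in for the controller's glob helper. `mostSpecificKey` and the `example.upbound.io/Widget` and `apps/Deployment` targets are hypothetical, and the lexicographic tie-break for equal-length keys is an addition made only to keep the ordering fully deterministic.

```go
package main

import (
	"fmt"
	"path"
	"sort"
)

// mostSpecificKey returns the longest key whose glob matches target, or ""
// when nothing matches. Hypothetical helper for illustration only.
func mostSpecificKey(keys []string, target string) string {
	// Work on a copy so the caller's slice is left untouched.
	sorted := append([]string(nil), keys...)
	sort.Slice(sorted, func(i, j int) bool {
		if len(sorted[i]) != len(sorted[j]) {
			return len(sorted[i]) > len(sorted[j]) // longest first
		}
		return sorted[i] < sorted[j] // deterministic tie-break
	})
	for _, k := range sorted {
		if ok, err := path.Match(k, target); err == nil && ok {
			return k
		}
	}
	return ""
}

func main() {
	keys := []string{"*/*", "*.upbound.io/*", "*.aws.upbound.io/*"}

	fmt.Println(mostSpecificKey(keys, "s3.aws.upbound.io/Bucket"))  // *.aws.upbound.io/*
	fmt.Println(mostSpecificKey(keys, "example.upbound.io/Widget")) // *.upbound.io/*
	fmt.Println(mostSpecificKey(keys, "apps/Deployment"))           // */*
}
```

Key length is only a proxy for specificity, but for nested globs like these it gives the expected precedence, with `*/*` as the fallback.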

Version

argocd: v2.9.3+6eba5be
  BuildDate: 2023-12-01T23:05:50Z
  GitCommit: 6eba5be864b7e031871ed7698f5233336dfe75c7
  GitTreeState: clean
  GoVersion: go1.21.3
  Compiler: gc
  Platform: linux/amd64
