[Flaky Test] E2E tests for extensions can fail due to unavailability of `clusterrole-machine-controller-manager.local.extensions.gardener.cloud` webhook

**How to categorize this issue?**

/area testing
/kind flake

**Which test(s)/suite(s) are flaking**:
E2E tests for extensions that use the KinD setup can occasionally flake during the step that deploys the extension's charts to the local KinD cluster.

**CI link**:
- https://prow.gardener.cloud/view/gs/gardener-prow/pr-logs/pull/gardener_gardener-extension-shoot-rsyslog-relp/34/pull-gardener-extension-shoot-rsyslog-relp-e2e-kind/1723972989701066752
- https://prow.gardener.cloud/view/gs/gardener-prow/pr-logs/pull/gardener_gardener-extension-shoot-rsyslog-relp/38/pull-gardener-extension-shoot-rsyslog-relp-e2e-kind/1731974321494036480

**Reason for failure**:
This can happen if the extensions' charts contain a `clusterrole` resource. E.g. the [`shoot-rsyslog-relp`](https://github.com/gardener/gardener-extension-shoot-rsyslog-relp) extension deploys a [ClusterRole](https://github.com/gardener/gardener-extension-shoot-rsyslog-relp/blob/main/charts/gardener-extension-shoot-rsyslog-relp-admission/charts/application/templates/cluster-role.yaml) as part of the skaffold deployment for the [`shoot-rsyslog-relp-admission`](https://github.com/gardener/gardener-extension-shoot-rsyslog-relp/blob/5ad228a5d4548d8f790b9db0054eeb02c86aba4f/skaffold.yaml#L65-L88) used for the e2e tests.

This skaffold deployment can fail with the following error:
```
Error: INSTALLATION FAILED: 1 error occurred:
	* Internal error occurred: failed calling webhook "clusterrole-machine-controller-manager.local.extensions.gardener.cloud": failed to call webhook: Post "https://gardener-extension-provider-local.extension-provider-local-5mf8n.svc:443/clusterrole-machine-controller-manager?timeout=5s": dial tcp 10.2.126.230:443: connect: connection refused
```

The reason for the failure is that the `gardener-extension-provider-local` pods can be evicted by VPA during the deployment of the extension charts, which makes the `gardener-extension-provider-local` webhook server temporarily unavailable.

The `clusterrole-machine-controller-manager.local.extensions.gardener.cloud` webhook does not have any selectors, so it is called for every ClusterRole:
https://github.com/gardener/gardener/blob/bcaed6dcbf8e06ea9a3b9a7ea726aaaf7c3dbbd2/pkg/provider-local/webhook/machinecontrollermanager/add.go#L71-L80

However, it is only responsible for the `system:machine-controller-manager-runtime` ClusterRole and ignores all others:
https://github.com/gardener/gardener/blob/bcaed6dcbf8e06ea9a3b9a7ea726aaaf7c3dbbd2/pkg/provider-local/webhook/machinecontrollermanager/mutator.go#L30-L32

Therefore, anything that tries to deploy a ClusterRole while the `gardener-extension-provider-local` pods are down will fail.
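The interaction described above can be illustrated with a small, self-contained model. This is only a sketch and does not use any Gardener or Kubernetes code: the `webhook` type, its fields, and the `extension-admission` ClusterRole name are hypothetical, and the fail-closed behaviour is an assumption inferred from the installation error shown earlier.

```go
package main

import (
	"errors"
	"fmt"
)

// webhook is a toy stand-in for an admission webhook registration.
// All fields are hypothetical and exist only for this illustration.
type webhook struct {
	name        string
	failClosed  bool   // assumption: unreachable webhook rejects the request
	serverUp    bool   // whether the backing pods are currently serving
	handledName string // the only object name the handler acts on
}

var errUnavailable = errors.New("failed to call webhook: connection refused")

// admit models an admission call for a ClusterRole with the given name.
// Because there are no selectors, every ClusterRole request reaches the
// webhook, whether or not the handler would do anything with it.
func (w webhook) admit(clusterRoleName string) (mutated bool, err error) {
	if !w.serverUp {
		if w.failClosed {
			return false, fmt.Errorf("%s: %w", w.name, errUnavailable)
		}
		return false, nil // fail-open: admitted without modification
	}
	// Mirrors the name check in the handler: other objects pass through.
	return clusterRoleName == w.handledName, nil
}

func main() {
	w := webhook{
		name:        "clusterrole-machine-controller-manager.local.extensions.gardener.cloud",
		failClosed:  true,
		serverUp:    false, // e.g. the pods were just evicted
		handledName: "system:machine-controller-manager-runtime",
	}
	// "extension-admission" is a hypothetical, unrelated ClusterRole.
	for _, name := range []string{w.handledName, "extension-admission"} {
		_, err := w.admit(name)
		fmt.Printf("%s -> rejected=%t\n", name, err != nil)
	}
}
```

In this model the unrelated ClusterRole is rejected while the server is down, although the handler would have left it untouched once reachable.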

**Anything else we need to know**:

For reference, the webhook definition from the `add.go` permalink above, which sets no selectors:

```go
	return &extensionswebhook.Webhook{
		Name:           name,
		Provider:       provider,
		Types:          types,
		Target:         target,
		Path:           name,
		Webhook:        &admission.Webhook{Handler: handler, RecoverPanic: true},
		FailurePolicy:  &failurePolicy,
		TimeoutSeconds: pointer.Int32(5),
	}, nil
```


The corresponding name check from the `mutator.go` permalink above:

```go
	if newObj.GetName() != "system:machine-controller-manager-runtime" {
		return nil
	}
```
