[GEP-28] Handle networking and control plane components #11892
Skipping CI for Draft Pull Request.
/assign
Thanks a lot for making autonomous shoot clusters even more usable. The pull request already looks awesome, but I have a few comments.
f5ae781 to a43b8ee
a43b8ee to a5c2ad2
36320bb to d23d2d2
- there is no VPN :)
- They might serve webhooks necessary for the subsequent steps (e.g., deployment of `kube-proxy`)
- Only for autonomous shoots
- We now deploy the control plane namespace (in fact, we apply some labels to `kube-system`, among them `gardener.cloud/role=shoot`, to make the webhooks work)
- This basically adds the `gardener.cloud/purpose=kube-system` label to the `kube-system` namespace (see https://github.com/gardener/gardener/blob/0df3391737ca7c36cd58279845a46c832ce4b422/pkg/component/shoot/namespaces/namespaces.go#L90-L101)
- The `ControllerInstallation` controller now passes information to extension controller Helm charts (when deploying them) in case the shoot is an autonomous shoot cluster
- Extensions can then use this information to adapt their behaviour (see next commit)
- If an extension runs in an ASC, it now merges the shoot webhooks into the seed webhooks. Without this, they would only be deployed when `gardenadm` or `gardenlet` creates the `ControlPlane` resource later (not yet implemented).
- The `ControlPlane` controller does not include the shoot webhooks in its reconciliation when the `ControlPlane` belongs to an ASC (no need, since they are already deployed via the seed webhooks).
- We don't have any information about a garden cluster in `gardenadm init`, so we cannot inject anything
- This becomes relevant only now because we deploy the extensions into the pod network (in bootstrap mode, this was skipped)
- Otherwise, the pods in the pod network cannot talk to the API server (or resolve DNS)
- Without `.spec.region`, the validation webhook fails with `2025-04-16T11:44:57.927Z ERROR Error {"flow": "init", "task": "Deploying shoot control plane components", "error": "admission webhook \"validation.extensions.controlplanes.resources.gardener.cloud\" denied the request: ControlPlane.extensions.gardener.cloud \"root\" is invalid: spec.region: Required value: field is required"}`
- Without a cloud provider secret, the generic control plane actuator fails with ` * task "Waiting until shoot control plane has been reconciled" failed: Error while waiting for ControlPlane kube-system/root to become ready: error during reconciliation: Error reconciling ControlPlane: could not get secret 'kube-system/cloudprovider': Secret "cloudprovider" not found`
Otherwise, we would attempt to deploy a real pod into the cluster (which we don't want). Instead, we want a `StatefulSet` with `replicas=0` that we can translate into a static pod.
Result:
```
root@machine-0:/# k get deploy,sts
NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
...
deployment.apps/kube-apiserver            0/0     0            0           2m14s
deployment.apps/kube-controller-manager   0/0     0            0           2m14s
deployment.apps/kube-scheduler            0/0     0            0           2m14s

NAME                             READY   AGE
statefulset.apps/etcd-events-0   0/0     2m14s
statefulset.apps/etcd-main-0     0/0     2m14s
```
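To illustrate the `replicas=0` idea, here is a minimal sketch of how a zero-replica workload could be turned into a static pod description. The `Workload` and `StaticPod` types, the `ToStaticPod` function, and the image name are hypothetical and heavily simplified; they are not the types or helpers used by `gardenadm`, and rejecting non-zero replicas is an assumption made for the example.

```go
package main

import "fmt"

// Workload is a hypothetical, simplified stand-in for a Deployment or
// StatefulSet. Real objects would come from k8s.io/api/apps/v1.
type Workload struct {
	Name     string
	Replicas int32
	Image    string // single-container pod template, for brevity
}

// StaticPod is a hypothetical, simplified static pod description.
type StaticPod struct {
	Name  string
	Image string
}

// ToStaticPod copies the pod template of a workload into a static pod.
// It rejects workloads with replicas != 0 (an assumption of this sketch),
// because such a workload would also result in a real pod in the cluster.
func ToStaticPod(w Workload) (StaticPod, error) {
	if w.Replicas != 0 {
		return StaticPod{}, fmt.Errorf("workload %q has %d replicas, expected 0", w.Name, w.Replicas)
	}
	return StaticPod{Name: w.Name, Image: w.Image}, nil
}

func main() {
	// The image reference is a placeholder, not a real Gardener image.
	pod, err := ToStaticPod(Workload{Name: "etcd-main-0", Replicas: 0, Image: "example.invalid/etcd:tag"})
	fmt.Println(pod.Name, err)
}
```

The workload object stays in the API server so that extension webhooks can still mutate it, while only the derived static pod actually runs on the node.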
- We have to deactivate the NodeAgentAuthorizer feature for now since it only works with Machine objects (which we don't have)
- We have to adapt the feature later (separately)
- We don't have a machine image name/type for now, and that's a sane configuration
- GRM's system-components-config webhook now propagates the respective tolerations to the pods managed by Gardener
- GRM itself tolerates the taint if it is configured to propagate it to system component pods
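The toleration propagation above can be sketched roughly as follows. `Toleration` is a simplified stand-in for `corev1.Toleration`, `AddToleration` is a hypothetical helper, and the `NoSchedule` effect is an assumption; this is not the actual webhook code in gardener-resource-manager.

```go
package main

import "fmt"

// Toleration is a simplified stand-in for corev1.Toleration.
type Toleration struct {
	Key      string
	Operator string
	Effect   string
}

// AddToleration appends t to existing unless an identical toleration is
// already present, so repeated webhook invocations stay idempotent.
func AddToleration(existing []Toleration, t Toleration) []Toleration {
	for _, e := range existing {
		if e == t {
			return existing
		}
	}
	return append(existing, t)
}

func main() {
	// Assumption: the control plane taint uses the NoSchedule effect.
	controlPlane := Toleration{
		Key:      "node-role.kubernetes.io/control-plane",
		Operator: "Exists",
		Effect:   "NoSchedule",
	}

	var podTolerations []Toleration
	podTolerations = AddToleration(podTolerations, controlPlane)
	podTolerations = AddToleration(podTolerations, controlPlane) // second call is a no-op
	fmt.Println(len(podTolerations))
}
```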
d23d2d2 to 87a33f7
/lgtm
/approve
LGTM label has been added. Git tree hash: 48d2c79d95638300e8069a72db4f0e82a249f2a5
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: ScheererJ
The change merging the shoot and seed webhooks for autonomous shoot clusters (gardener#11892) assumed that extensions always have both validating and mutating webhooks. However, some extensions have only one of them. They might also have no seed webhooks at all, only shoot webhooks. In all those cases, the initialisation was incomplete, leading to potential nil dereferences when adding the shoot webhooks. This change resolves these problems.
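A nil-safe merge of the kind this follow-up describes could look like the sketch below. `WebhookConfig`, `Configs`, `mergeInto`, and `MergeShootIntoSeed` are hypothetical names invented for illustration; the real code works on the Kubernetes admission registration types and is structured differently.

```go
package main

import "fmt"

// WebhookConfig is a hypothetical stand-in for a validating or mutating
// webhook configuration; webhooks are represented by their names only.
type WebhookConfig struct {
	Webhooks []string
}

// Configs holds the optional configurations of an extension.
// Either field may be nil if the extension has no webhooks of that kind.
type Configs struct {
	Validating *WebhookConfig
	Mutating   *WebhookConfig
}

// mergeInto appends the webhooks of src to dst. If dst is nil, it is
// allocated first, which avoids the nil dereference. A non-nil dst is
// modified in place.
func mergeInto(dst, src *WebhookConfig) *WebhookConfig {
	if src == nil {
		return dst
	}
	if dst == nil {
		dst = &WebhookConfig{}
	}
	dst.Webhooks = append(dst.Webhooks, src.Webhooks...)
	return dst
}

// MergeShootIntoSeed merges shoot webhooks into seed webhooks, tolerating
// missing validating or mutating configurations on either side.
func MergeShootIntoSeed(seed, shoot Configs) Configs {
	return Configs{
		Validating: mergeInto(seed.Validating, shoot.Validating),
		Mutating:   mergeInto(seed.Mutating, shoot.Mutating),
	}
}

func main() {
	// Seed has only mutating webhooks, shoot has only validating webhooks.
	seed := Configs{Mutating: &WebhookConfig{Webhooks: []string{"seed-mutating"}}}
	shoot := Configs{Validating: &WebhookConfig{Webhooks: []string{"shoot-validating"}}}

	merged := MergeShootIntoSeed(seed, shoot)
	fmt.Println(len(merged.Validating.Webhooks), len(merged.Mutating.Webhooks))
}
```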
…2040) The change merging the shoot and seed webhooks for autonomous shoot clusters (#11892) assumed that extensions always have both validating and mutating webhooks. However, some extensions have only one of them. They might also have no seed webhooks at all, only shoot webhooks. In all those cases, the initialisation was incomplete, leading to potential nil dereferences when adding the shoot webhooks. This change resolves these problems.
* Bump github.com/gardener/gardener from 1.117.6 to 1.118.3
* Adapt to breaking changes introduced in gardener/gardener#11892
How to categorize this PR?
/area ipcei
/kind enhancement
What this PR does / why we need it:
This PR is the next increment for `gardenadm init`. It deploys `kube-proxy`, the `Network` extension, CoreDNS, the `NetworkPolicy`s, and finally re-deploys `gardener-resource-manager` and extension controllers into the pod network.

In addition, the `ControlPlane` resource is deployed as well as `Deployment`s/`StatefulSet`s for the control plane components that have to be translated to static pods. These resources are used to make extension webhooks work (i.e., still allow extensions to modify the control plane components).

Finally, the `gardener-node-agent` systemd unit is activated, and the node is properly tainted with the common `node-role.kubernetes.io/control-plane` taint.

Which issue(s) this PR fixes:
Part of #2906
Special notes for your reviewer:
/cc @ScheererJ
Still in draft because some unit tests are still missing.

Release note: