rfranzke
Member

@rfranzke rfranzke commented Apr 16, 2025

How to categorize this PR?

/area ipcei
/kind enhancement

What this PR does / why we need it:
This PR is the next increment for gardenadm init. It deploys kube-proxy, the Network extension, CoreDNS, and the NetworkPolicies, and finally re-deploys gardener-resource-manager and the extension controllers into the pod network.

In addition, the ControlPlane resource is deployed, as well as the Deployments/StatefulSets for the control plane components that have to be translated to static pods. These resources are used to make extension webhooks work (i.e., to still allow extensions to modify the control plane components).

Finally, the gardener-node-agent systemd unit is activated, and the node is properly tainted with the common node-role.kubernetes.io/control-plane taint.

Which issue(s) this PR fixes:
Part of #2906

Special notes for your reviewer:
/cc @ScheererJ
Still in draft because some unit tests are still missing.

Release note:

NONE

Contributor

gardener-prow bot commented Apr 16, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@gardener-prow gardener-prow bot requested a review from ScheererJ April 16, 2025 10:21
@gardener-prow gardener-prow bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. area/ipcei IPCEI (Important Project of Common European Interest) kind/enhancement Enhancement, improvement, extension cla: yes Indicates the PR's author has signed the cla-assistant.io CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Apr 16, 2025
@ScheererJ
Member

/assign

Member

@ScheererJ ScheererJ left a comment

Thanks a lot for making autonomous shoot clusters even more usable. The pull request already looks awesome, but I have a few comments.

@rfranzke rfranzke requested a review from ScheererJ April 16, 2025 12:19
@gardener-prow gardener-prow bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 16, 2025
@gardener-prow gardener-prow bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 16, 2025
@rfranzke rfranzke changed the title [GEP-28] Handle networking components [GEP-28] Handle networking and control plane components Apr 16, 2025
@rfranzke rfranzke marked this pull request as ready for review April 16, 2025 15:33
@gardener-prow gardener-prow bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 16, 2025
@rfranzke rfranzke force-pushed the gep28/networking branch 2 times, most recently from 36320bb to d23d2d2 Compare April 17, 2025 06:23
- They might serve webhooks necessary for the subsequent steps (e.g.,
  deployment of `kube-proxy`)
- only for autonomous shoots
- we now deploy the control plane namespace (in fact, we apply some
  labels to `kube-system`, among others `gardener.cloud/role=shoot`, to
  make the webhooks work)
- This basically adds the `gardener.cloud/purpose=kube-system` label
  to the `kube-system` namespace (see
  https://github.com/gardener/gardener/blob/0df3391737ca7c36cd58279845a46c832ce4b422/pkg/component/shoot/namespaces/namespaces.go#L90-L101)
- the `ControllerInstallation` controller now passes information to
  extension controller Helm charts (when deploying them) in case the
  shoot is an autonomous shoot cluster
- extensions can then use this information to adapt their behaviour (see
  next commit)
- if an extension runs in an ASC (autonomous shoot cluster), it now merges the shoot webhooks into the
  seed webhooks (without this, they would only be deployed when
  `gardenadm` or `gardenlet` create the `ControlPlane` resource later
  (not yet implemented))
- the `ControlPlane` controller does not include the shoot webhooks as
  part of its reconciliation in case the `ControlPlane` is for an ASC
  (no need since they are already deployed via the seed webhooks)
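
The `kube-system` labelling mentioned in the list above can be sketched roughly as follows. This is only an illustration, not the code from this PR: `ensureLabels` is a hypothetical helper, and the pre-existing label on the namespace is an assumed example. Only the two `gardener.cloud/...` labels are taken from the commit messages.

```go
package main

import "fmt"

// ensureLabels returns a copy of existing with every label from wanted set.
// Labels that exist already and are not listed in wanted are preserved.
// Hypothetical helper for illustration only.
func ensureLabels(existing, wanted map[string]string) map[string]string {
	out := make(map[string]string, len(existing)+len(wanted))
	for k, v := range existing {
		out[k] = v
	}
	for k, v := range wanted {
		out[k] = v
	}
	return out
}

func main() {
	// Assumed example of a label the namespace might already carry.
	current := map[string]string{"kubernetes.io/metadata.name": "kube-system"}

	// The two labels named in the commit messages.
	wanted := map[string]string{
		"gardener.cloud/role":    "shoot",
		"gardener.cloud/purpose": "kube-system",
	}

	fmt.Println(ensureLabels(current, wanted))
}
```

Returning a copy instead of mutating the input keeps the sketch free of side effects; a real controller would typically patch the namespace object instead.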
rfranzke added 13 commits April 17, 2025 12:37
- We don't have any information of a garden cluster in `gardenadm init`,
  so we cannot inject anything
- This becomes relevant only now because we deploy the extensions into
  the pod network (in bootstrap mode, this was skipped)
- Otherwise, the pods in the pod network cannot talk to the API server
  (or resolve DNS)
- without `.spec.region`, the validation webhook fails with
  `2025-04-16T11:44:57.927Z	ERROR	Error	{"flow": "init", "task":
"Deploying shoot control plane components", "error": "admission webhook
\"validation.extensions.controlplanes.resources.gardener.cloud\" denied
the request: ControlPlane.extensions.gardener.cloud \"root\" is invalid:
spec.region: Required value: field is required"}`

- without a cloud provider secret, the generic control plane actuator
  fails with `	* task "Waiting until shoot control plane has been
reconciled" failed: Error while waiting for ControlPlane
kube-system/root to become ready: error during reconciliation: Error
reconciling ControlPlane: could not get secret
'kube-system/cloudprovider': Secret "cloudprovider" not found`
Otherwise, we would attempt to deploy a real pod into the cluster (which
we don't want). Instead, we want a `StatefulSet` with `replicas=0` that
we can translate into a static pod.
Result:

```
root@machine-0:/# k get deploy,sts
NAME                                                 READY   UP-TO-DATE   AVAILABLE   AGE
...
deployment.apps/kube-apiserver                       0/0     0            0           2m14s
deployment.apps/kube-controller-manager              0/0     0            0           2m14s
deployment.apps/kube-scheduler                       0/0     0            0           2m14s

NAME                             READY   AGE
statefulset.apps/etcd-events-0   0/0     2m14s
statefulset.apps/etcd-main-0     0/0     2m14s
```
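
A rough sketch of the idea behind the scaled-down workloads shown above: the pod template of a `Deployment`/`StatefulSet` with `replicas=0` is copied into a static pod manifest. All types and the `toStaticPod` function below are simplified, hypothetical stand-ins (they are not the Kubernetes API types and not the implementation in this PR), and rejecting non-zero replicas is an assumption made for illustration.

```go
package main

import "fmt"

// PodTemplate and Workload are simplified, hypothetical stand-ins for the
// pod template and the Deployment/StatefulSet types.
type PodTemplate struct {
	Labels     map[string]string
	Containers []string // container names, for illustration only
}

type Workload struct {
	Kind     string
	Name     string
	Replicas int32
	Template PodTemplate
}

// StaticPod is a hypothetical, minimal representation of a static pod manifest.
type StaticPod struct {
	Name       string
	Namespace  string
	Labels     map[string]string
	Containers []string
}

// toStaticPod copies the pod template of a scaled-down workload into a
// static pod. It refuses workloads with replicas != 0 (an assumption of
// this sketch), since those would also run as regular pods in the cluster.
func toStaticPod(w Workload, namespace string) (StaticPod, error) {
	if w.Replicas != 0 {
		return StaticPod{}, fmt.Errorf("%s %q has %d replicas, expected 0", w.Kind, w.Name, w.Replicas)
	}
	labels := make(map[string]string, len(w.Template.Labels))
	for k, v := range w.Template.Labels {
		labels[k] = v
	}
	return StaticPod{
		Name:       w.Name,
		Namespace:  namespace,
		Labels:     labels,
		Containers: append([]string(nil), w.Template.Containers...),
	}, nil
}

func main() {
	sts := Workload{
		Kind:     "StatefulSet",
		Name:     "etcd-main-0",
		Replicas: 0,
		Template: PodTemplate{
			Labels:     map[string]string{"app": "etcd"}, // illustrative label
			Containers: []string{"etcd"},
		},
	}
	pod, err := toStaticPod(sts, "kube-system")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", pod)
}
```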
- we have to deactivate the NodeAgentAuthorizer feature for now since it
  only works with Machine objects (which we don't have) - we have to
  adapt the feature later (separately)
- we don't have a machine image name/type for now, and that's a sane
  configuration
- GRM's system-components-config webhook now propagates respective
  tolerations to the pods managed by Gardener
- GRM itself tolerates the taint if it is configured to propagate it to
  system component pods
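
The toleration propagation described in the last two items could look roughly like the sketch below. `Toleration` is a simplified, hypothetical stand-in for the Kubernetes type, `addTolerations` is not the actual webhook code, and the `Exists`/`NoSchedule` values are assumptions; only the taint key `node-role.kubernetes.io/control-plane` comes from the PR description.

```go
package main

import "fmt"

// Toleration is a simplified, hypothetical stand-in for the Kubernetes
// toleration type (only the fields needed for this sketch).
type Toleration struct {
	Key      string
	Operator string
	Effect   string
}

// addTolerations returns existing plus every toleration from extra that is
// not already present, so applying it repeatedly does not create duplicates.
func addTolerations(existing, extra []Toleration) []Toleration {
	out := append([]Toleration(nil), existing...)
	for _, t := range extra {
		found := false
		for _, e := range out {
			if e == t {
				found = true
				break
			}
		}
		if !found {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	// Operator and effect are assumed values for illustration.
	controlPlane := Toleration{
		Key:      "node-role.kubernetes.io/control-plane",
		Operator: "Exists",
		Effect:   "NoSchedule",
	}

	podTolerations := addTolerations(nil, []Toleration{controlPlane})
	// A second pass (e.g. on a pod update) leaves the list unchanged.
	podTolerations = addTolerations(podTolerations, []Toleration{controlPlane})

	fmt.Println(len(podTolerations), podTolerations[0].Key)
}
```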
@rfranzke rfranzke requested a review from ScheererJ April 17, 2025 10:49
Member

@ScheererJ ScheererJ left a comment

/lgtm
/approve

@gardener-prow gardener-prow bot added the lgtm Indicates that a PR is ready to be merged. label Apr 17, 2025
Contributor

gardener-prow bot commented Apr 17, 2025

LGTM label has been added.

Git tree hash: 48d2c79d95638300e8069a72db4f0e82a249f2a5

Contributor

gardener-prow bot commented Apr 17, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ScheererJ

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@gardener-prow gardener-prow bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 17, 2025
@gardener-prow gardener-prow bot merged commit fbf76ed into gardener:master Apr 17, 2025
19 checks passed
@rfranzke rfranzke deleted the gep28/networking branch April 17, 2025 12:16
ScheererJ added a commit to ScheererJ/gardener that referenced this pull request May 9, 2025
The change merging the shoot and seed webhooks for autonomous shoot
clusters (gardener#11892) assumed that extensions always had both validating as
well as mutating webhooks. However, there are extensions that only
have one of them. They might also not have seed webhooks at all, but
only shoot webhooks. In all those cases, the initialisation was
incomplete, leading to potential nil dereferences when adding the shoot
webhooks.
This change resolves these problems.
gardener-prow bot pushed a commit that referenced this pull request May 12, 2025
…2040)

The change merging the shoot and seed webhooks for autonomous shoot
clusters (#11892) assumed that extensions always had both validating as
well as mutating webhooks. However, there are extensions that only
have one of them. They might also not have seed webhooks at all, but
only shoot webhooks. In all those cases, the initialisation was
incomplete, leading to potential nil dereferences when adding the shoot
webhooks.
This change resolves these problems.
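
The failure mode described in this commit message can be illustrated with a small sketch of a nil-safe merge. The types `WebhookConfig` and `Configs` and the functions below are hypothetical and heavily simplified; they do not mirror the actual Gardener extension library, and the webhook names are made up.

```go
package main

import "fmt"

// WebhookConfig is a hypothetical, simplified webhook configuration that
// only carries webhook names.
type WebhookConfig struct {
	Webhooks []string
}

// Configs holds an optional validating and an optional mutating
// configuration. Either pointer may be nil if an extension has none.
type Configs struct {
	Validating *WebhookConfig
	Mutating   *WebhookConfig
}

// mergeInto appends the webhooks of src to dst and returns dst. If dst is
// nil it is allocated first, which avoids the nil dereference that occurs
// when the target configuration was never initialised.
func mergeInto(dst, src *WebhookConfig) *WebhookConfig {
	if src == nil || len(src.Webhooks) == 0 {
		return dst
	}
	if dst == nil {
		dst = &WebhookConfig{}
	}
	dst.Webhooks = append(dst.Webhooks, src.Webhooks...)
	return dst
}

// mergeShootIntoSeed merges the shoot webhooks into the seed webhooks,
// tolerating missing configurations on either side.
func mergeShootIntoSeed(seed, shoot Configs) Configs {
	return Configs{
		Validating: mergeInto(seed.Validating, shoot.Validating),
		Mutating:   mergeInto(seed.Mutating, shoot.Mutating),
	}
}

// count returns the number of webhooks, treating nil as empty.
func count(c *WebhookConfig) int {
	if c == nil {
		return 0
	}
	return len(c.Webhooks)
}

func main() {
	// An extension without seed webhooks and with only a mutating shoot webhook.
	seed := Configs{}
	shoot := Configs{Mutating: &WebhookConfig{Webhooks: []string{"example-mutator"}}}

	merged := mergeShootIntoSeed(seed, shoot)
	fmt.Println(count(merged.Validating), count(merged.Mutating))
}
```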
dimitar-kostadinov added a commit to dimitar-kostadinov/gardener-extension-registry-cache that referenced this pull request Jun 17, 2025
gardener-prow bot pushed a commit to gardener/gardener-extension-registry-cache that referenced this pull request Jun 17, 2025
* Bump github.com/gardener/gardener from 1.117.6 to 1.118.3

* Adapt to breaking changes introduced in gardener/gardener#11892