Conversation

timebertt
Member

@timebertt timebertt commented Jun 27, 2025

How to categorize this PR?

/area ipcei
/kind enhancement

What this PR does / why we need it:

This PR continues #12267 and deploys a Bastion in gardenadm bootstrap for connecting to the control plane machines.

gardenadm bootstrap deploys a Bastion.extensions.gardener.cloud (not a Bastion.operations.gardener.cloud) with a dedicated SSH key pair. Similar to gardenctl, the user's public IPs are detected using https://ipify.org and used as ingress restrictions for the Bastion (this can be overridden using --bastion-ingress-cidr). Once the Bastion is healthy, an SSH connection is opened.
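
As a rough illustration of the ingress restriction described above, here is a minimal Go sketch of how detected public IPs could be turned into single-host CIDRs, with explicitly passed CIDRs taking precedence. The helper names `ingressCIDR` and `resolveIngressCIDRs` are hypothetical and not taken from this PR, and the precedence rule is an assumption based on the description of --bastion-ingress-cidr. The actual lookup against ipify is left out so the example runs offline.

```go
package main

import (
	"fmt"
	"net/netip"
)

// ingressCIDR converts one public IP address (for example, the value a service
// like ipify would return) into the narrowest CIDR: /32 for IPv4, /128 for IPv6.
// Hypothetical helper, not part of the PR.
func ingressCIDR(ip string) (string, error) {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return "", fmt.Errorf("parsing public IP %q: %w", ip, err)
	}
	// Unmap so that an IPv4-mapped IPv6 address (::ffff:a.b.c.d) yields a /32.
	addr = addr.Unmap()
	return netip.PrefixFrom(addr, addr.BitLen()).String(), nil
}

// resolveIngressCIDRs returns the explicitly configured CIDRs if any were given
// (assumed to mirror the flag override) and otherwise derives single-host CIDRs
// from the detected public IPs. Hypothetical helper, not part of the PR.
func resolveIngressCIDRs(explicit, detectedIPs []string) ([]string, error) {
	if len(explicit) > 0 {
		// Validate user input early instead of failing at the provider.
		for _, cidr := range explicit {
			if _, err := netip.ParsePrefix(cidr); err != nil {
				return nil, fmt.Errorf("invalid ingress CIDR %q: %w", cidr, err)
			}
		}
		return explicit, nil
	}

	cidrs := make([]string, 0, len(detectedIPs))
	for _, ip := range detectedIPs {
		cidr, err := ingressCIDR(ip)
		if err != nil {
			return nil, err
		}
		cidrs = append(cidrs, cidr)
	}
	return cidrs, nil
}

func main() {
	// Documentation-only example addresses (RFC 5737 and RFC 3849).
	cidrs, err := resolveIngressCIDRs(nil, []string{"203.0.113.7", "2001:db8::1"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cidrs)
}
```

Restricting ingress to single-host prefixes keeps the Bastion reachable only from the machine that runs the bootstrap, which is presumably why the detected IPs are used by default.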

Which issue(s) this PR fixes:
Part of #2906

Special notes for your reviewer:

/cc @maboehm @ScheererJ @rfranzke

The Bastion component in this PR uses pkg/utils/ssh introduced in #12366, and the e2e tests need the Bastion implementation in provider-local:

Release note:

NONE

Contributor

gardener-prow bot commented Jun 27, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@gardener-prow gardener-prow bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 27, 2025
@gardener-prow gardener-prow bot requested a review from ScheererJ June 27, 2025 07:49
@gardener-prow gardener-prow bot added area/ipcei IPCEI (Important Project of Common European Interest) kind/enhancement Enhancement, improvement, extension labels Jun 27, 2025
@gardener-prow gardener-prow bot requested a review from rfranzke June 27, 2025 07:49
Contributor

gardener-prow bot commented Jun 27, 2025

@timebertt: GitHub didn't allow me to request PR reviews from the following users: maboehm.

Note that only gardener members and repo collaborators can review this PR, and authors cannot review their own PRs.

@gardener-prow gardener-prow bot added cla: yes Indicates the PR's author has signed the cla-assistant.io CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jun 27, 2025
@timebertt
Member Author

/test pull-gardener-unit pull-gardener-e2e-kind-gardenadm

@timebertt
Member Author

/test pull-gardener-unit pull-gardener-e2e-kind-gardenadm

@timebertt timebertt force-pushed the gardenadm-bootstrap-bastion branch from 4f95b42 to e49ca4f Compare July 1, 2025 07:38
@timebertt
Member Author

This PR is ready, now that #12366 has been merged.

@timebertt timebertt marked this pull request as ready for review July 1, 2025 07:42
@gardener-prow gardener-prow bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 1, 2025
@rfranzke
Member

rfranzke commented Jul 1, 2025

/assign

Member

@rfranzke rfranzke left a comment

Very nice :)

@@ -195,7 +196,18 @@ func run(ctx context.Context, opts *Options) error {
		// TODO(timebertt): add b.Shoot.Components.Extensions.Worker.Wait when
		// https://github.com/gardener/machine-controller-manager/issues/994 has been implemented

		deployBastion = g.Add(flow.Task{
			Name: "Deploying and connecting to bastion host",
Member

Can you outline the next step/next iteration (now that we have a bastion and an SSH connection)?

Member Author

In the next step, we can connect to the first control plane machine via the Bastion connection. However, this requires gardener/machine-controller-manager#1007 (or a temporary hack for provider-local).
Once this has been established, we can prepare the ShootState, copy it along with the manifests to the machine via SCP, and then we can execute gardenadm init there.

Member

Perfect, thanks!
Do you plan to run gardenadm init from within gardenadm bootstrap? I thought the idea would be to just prepare the infrastructure and the nodes via gardenadm bootstrap, and then let the human user SSH into them (via the bastion) to run gardenadm init themselves?

Member Author

Yes, in my vision, gardenadm init would be executed by gardenadm bootstrap (at least, by default).

Now that I checked the GEP section again, I found the sentence that I had forgotten:

machine-controller-manager will also take over the creation of the control plane nodes, however, bootstrapping and joining them must still be performed by the user (meaning that step 3 remains)

But it also says:

It still needs to be investigated how much automation by machine-controller-manager is possible for the control plane nodes

Because of this, I was under the impression that we strive for as much automation in gardenadm bootstrap as possible. It would be gardenadm bootstrapping/joining the control plane nodes instead of MCM, though.
But in the end, I don't see a reason for letting the user execute gardenadm init themselves.
gardenadm bootstrap already has the manifests, the SSH connection, etc. So it would be the simplest user experience to finish gardenadm bootstrap with a fully bootstrapped autonomous shoot cluster and output the kubeconfig.

WDYT? Do you have a case in mind where the user would need to run gardenadm init themselves?

Member

Sure, fair enough. We thought it might be easier to not couple this in the beginning, but left this open for the future to further automate this. If you think we are good to go down this path right away, even better.

Personally, I was wondering whether the user has to SSH into the nodes manually anyway (at least into the control plane node). If they want to inspect the cluster (kubectl get nodes,pods or whatever) after bootstrapping, there is no way around it as of today, since we don't expose the API server or export the kubeconfig somehow (the latter could eventually be automated by gardenadm bootstrap).
Yet, there is nothing blocking the user from doing so, even if gardenadm bootstrap already calls init or join. They can always use the bastion and SSH into the nodes to inspect whatever they need (e.g., in case init or join fails), right?

TL;DR: Go ahead :)

Member Author

Great, I see we're on the same page :)
Yeah, we still need to clarify such details. I can imagine an option to keep the Bastion for troubleshooting/access purposes. I can also imagine that the Infrastructure adds a Load Balancer for the control plane.

@timebertt timebertt requested a review from rfranzke July 3, 2025 13:02
Member

@rfranzke rfranzke left a comment

/lgtm
/approve

@gardener-prow gardener-prow bot added the lgtm Indicates that a PR is ready to be merged. label Jul 4, 2025
Contributor

gardener-prow bot commented Jul 4, 2025

LGTM label has been added.

Git tree hash: 41b3616deaa3c5db3ab692df1463993ea5cb1fb9

Contributor

gardener-prow bot commented Jul 4, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rfranzke

@gardener-prow gardener-prow bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 4, 2025
@gardener-prow gardener-prow bot merged commit b83c1b2 into gardener:master Jul 4, 2025
22 checks passed
@timebertt timebertt deleted the gardenadm-bootstrap-bastion branch July 4, 2025 13:54