Skip to content

daemon: Add require-k8s-connectivity flag to enable usage with cloud NLB services #39434

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

ya-makariy
Copy link
Contributor

Add 'require-k8s-connectivity' flag to Cilium agent (default: true)
This flag controls whether agent health checks require Kubernetes connectivity. Previously this was only possible using a custom HTTP header, which doesn't work with cloud Network Load Balancers. Setting this to 'false' allows health checks to succeed when K8s connectivity is unavailable.

Related PR: #32724

@ya-makariy ya-makariy requested review from a team as code owners May 8, 2025 12:42
@ya-makariy ya-makariy requested review from thorn3r and tklauser May 8, 2025 12:42
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label May 8, 2025
@github-actions github-actions bot added the kind/community-contribution This was a contribution made by a community member. label May 8, 2025
@tklauser
Copy link
Member

tklauser commented May 8, 2025

/test

Copy link
Member

@tklauser tklauser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See one comment inline about the naming of the flag.

Also, the command reference would need to be updated using make -C Documentation update-cmdref to contain the new flag. Usually CI should point this out, but it seems this isn't the case right now. Will look into it.

…NLB services

Add 'require-k8s-connectivity' flag to Cilium agent (default: true)
This flag controls whether agent health checks require Kubernetes connectivity.
Previously this was only possible using a custom HTTP header, which doesn't
work with cloud Network Load Balancers. Setting this to 'false' allows health checks
to succeed when K8s connectivity is unavailable.

Signed-off-by: ya-makariy <me@ya-makariy.com>
@ya-makariy ya-makariy force-pushed the add-require-k8s-connectivity-flag-to-agent branch from 9b08572 to 0b32d28 Compare May 9, 2025 17:18
@ya-makariy ya-makariy requested a review from tklauser May 9, 2025 17:23
@tklauser
Copy link
Member

/test

@tklauser tklauser added the release-note/minor This PR changes functionality that users may find relevant to operating Cilium. label May 13, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label May 13, 2025
Copy link
Member

@tklauser tklauser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@tklauser tklauser enabled auto-merge May 13, 2025 12:18
@tklauser tklauser added this pull request to the merge queue May 13, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks May 13, 2025
Copy link
Contributor

@thorn3r thorn3r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks!

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 13, 2025
@joestringer
Copy link
Member

cc @tkna what was the original rationale for adding this into the healthz API? This approach seems much simpler overall.

@ya-makariy
Copy link
Contributor Author

@tklauser it looks like merge checks failed unexpectedly, could you please add this pr back to merge queue?

@tklauser tklauser added this pull request to the merge queue May 14, 2025
Merged via the queue into cilium:main with commit 2dd402e May 14, 2025
68 checks passed
@tkna
Copy link
Contributor

tkna commented May 15, 2025

@joestringer

cc @tkna what was the original rationale for adding this into the healthz API? This approach seems much simpler overall.

The original rationale was:

  • It does not require restarting the cilium-agent to take effect.
  • It allows behavior changes per request.

Ref: #32724 (comment)

That said, I think there's no problem with supporting both approaches.
One concern I have is that whether we should also consider the Cilium API's /healthz endpoint:

  1. From livenessProbe, etc. -> Pod’s /healthz endpoint -> GetStatus()
  2. From cilium-dbg status command, etc. -> Cilium API’s /healthz endpoint -> GetStatus()

As I understand it, this PR sets the daemon config for the Pod’s /healthz endpoint, but not for the Cilium API’s /healthz endpoint.
As a result, the daemon config is not taken into account when using cilium-dbg status.
This might be a bit confusing, though probably not critical.

@joestringer
Copy link
Member

@tkna thanks, I think that's fine then 👍

The change here is setting the default value for whether to require k8s connectivity via a command-line flag. If the healthz API call specifies this attribute, it will override the behavior.

@ya-makariy ya-makariy deleted the add-require-k8s-connectivity-flag-to-agent branch May 18, 2025 17:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/community-contribution This was a contribution made by a community member. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/minor This PR changes functionality that users may find relevant to operating Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants