-
Notifications
You must be signed in to change notification settings - Fork 3.4k
add support to override not ready taint to be ignored by the CA #19247
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add support to override not ready taint to be ignored by the CA #19247
Conversation
a3de006
to
4a02e27
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small questions as per below, but LGTM in general, thanks.
pkg/option/config.go
Outdated
@@ -621,6 +622,14 @@ const ( | |||
// K8sNamespaceName is the name of the K8sNamespace option | |||
K8sNamespaceName = "k8s-namespace" | |||
|
|||
// AgentNotReadyNodeTaintPrefixDefault hold the default value of | |||
// AgentNotReadyNodeTaintPrefix | |||
AgentNotReadyNodeTaintPrefixDefault = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this default param used anywhere ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually meant to use that since it didn't look like viper.GetString
will return a default empty-string value. I'm a bit confused though as I'm trying to figure out the behaviors. For example, k8s-namespace
doesn't appear to have a default value set anywhere except in bugtool
. I'll start work on an edit to use that default if the config value isn't set, but if you happen to know if there's a standard way to set those defaults I'd welcome a pointer.
pkg/option/config.go
Outdated
@@ -2999,6 +3020,7 @@ func (c *DaemonConfig) Populate() { | |||
c.HTTP403Message = viper.GetString(HTTP403Message) | |||
c.DisableEnvoyVersionCheck = viper.GetBool(DisableEnvoyVersionCheck) | |||
c.K8sNamespace = viper.GetString(K8sNamespaceName) | |||
c.AgentNotReadyNodeTaintPrefix = viper.GetString(AgentNotReadyNodeTaintPrefixName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as this flag is hidden, I reckon you need to use helm extraConfig param to set value, right ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's right from reading, but I'm not actually sure whether that's desirable behavior. I'm still working on a test of this on my own infra to verify, so I may make changes based on how that goes (was building images on top of the 1.11.2 tag to test with).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, appreciate if you can share your testing later on.
let's wait until #19074 and one other PR is not open yet is merged first. |
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
@thejosephstevens the PR #19074 was merged, can you rebase and squash this PR against master? Thank you |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @thejosephstevens! I have a few comments:
- As it currently stands, users will have to resort to
extraConfig
in Helm to specifyagent-not-ready-taint-prefix
, because a corresponding Helm value is currently not being added. It would be great if you could add this (I suggestagentNotReadyTaintPrefix
so it matched the name of the option). - Also, the choice being given to users seems to be between the old taint key (
agent.cilium.io/not-ready
) and<custom-prefix>/cilium-agent-not-ready
. While this is fine and works, I wonder if we should instead give users full control over the taint key (i.e., insteadl ofagent-not-ready-taint-prefix
, we'd support something likeagent-not-ready-taint-key
). It seems to me the work is essentially done, so this shouldn't imply a big change. - There is no documentation being added/updated as part of the PR. I suggest adding the Helm value and document it there, as that should be picked up by
helm-docs
, and at least making a mention to the new property and the reason for introducing it in this page. - I also believe that we should update CI to exercise the new code path. This should amount to updating this line (and the corresponding ones in the
-1.10
and-1.11
files), and setting the new option somewhere around here.
Thanks for the review @bmcustodio ! We're working through the suggestions (they look pretty straightforward - I expect we'll have something ready later today). Just a note, @XiaoxiHuang21 is taking over lead on this PR. Feel free to tag me, but you'll likely get quicker responses tagging her directly. |
d4213cd
to
ee59e5f
Compare
682ed53
to
f53e1c1
Compare
I have forced pushed your commit here after resolving conflicts with master, as well as minor formatting changes to make CI happy. Thanks for your support. |
/test Job 'Cilium-PR-K8s-1.22-kernel-5.4' failed: Click to show.Test Name
Failure Output
If it is a flake and a GitHub issue doesn't already exist to track it, comment Job 'Cilium-PR-K8s-GKE' failed: Click to show.Test Name
Failure Output
If it is a flake and a GitHub issue doesn't already exist to track it, comment Job 'Cilium-PR-K8s-1.16-kernel-4.9' failed: Click to show.Test Name
Failure Output
If it is a flake and a GitHub issue doesn't already exist to track it, comment |
f53e1c1
to
df893fe
Compare
/test Job 'Cilium-PR-K8s-1.22-kernel-5.4' failed: Click to show.Test Name
Failure Output
If it is a flake and a GitHub issue doesn't already exist to track it, comment Job 'Cilium-PR-K8s-GKE' failed: Click to show.Test Name
Failure Output
If it is a flake and a GitHub issue doesn't already exist to track it, comment |
df893fe
to
162316c
Compare
/test Job 'Cilium-PR-K8s-1.22-kernel-5.4' failed: Click to show.Test Name
Failure Output
If it is a flake and a GitHub issue doesn't already exist to track it, comment |
If you scale a node pool down to zero nodes the Cluster Autoscaler won't add new nodes because pending pods don't tolerate the cilium agent-not-ready taint. If you're managing your own CA you can ignore the cilium taint with a command line flag on the CA. However if you're running on managed k8s (e.g. GKE), you may not have access to configure the CA, and will need to use a custom taint prefix to be ignored by the autoscaler when scaling up from zero nodes in a pool. This commit adds config to add a prefix to the agent-not-ready taint so it can be ignored by the cluster-autoscaler. Fixes: cilium#19241 Co-authored-by: Xiaoxi Huang <Xiaoxi@ascend.io> Signed-off-by: Joseph Stevens <thejosephstevens@gmail.com>
162316c
to
c2a455a
Compare
Forced pushed after rebase with master and fixed upgrade tests. |
/test |
/test-runtime Job 'Cilium-PR-K8s-1.23-kernel-net-next' failed: Click to show.Test Name
Failure Output
If it is a flake and a GitHub issue doesn't already exist to track it, comment |
/test-1.23-net-next |
reviews are in, required CI tests are passed. I am marking this ready to merge, please let me know if you think otherwise. |
Thanks so much for taking this across the line @sayboras !!! |
Please ensure your pull request adheres to the following guidelines:
description and a
Fixes: #XXX
line if the commit addresses a particularGitHub issue.
As described in this issue, if you scale a node pool down to zero, agent-not-ready taints can prevent the cluster autoscaler from scaling the node pool up from zero nodes because it thinks nothing will be able to run on the node (or at least the pending app pods won't be able to).
This PR adds the ability to pass a prefix that will be prepended to the agent-not-ready taint in order for it to be ignored by the cluster autoscaler when generating sanitized node templates in order to make scaling decisions. More specifically, this allows scaling node pools up from zero in GKE, where you can not customize the command line arguments of the cluster autoscaler, and where a multi-zone node pool is made up of multiple node pools under the hood, making it very easy to end up with a "node pool" with no nodes in it if you scale below 1 node per zone.
Fixes: #19241