cilium, gops: remap to fixed port to avoid collision with nodeport range #14329
test-me-please
Nice catch! Could you please also update Hubble Relay, as I think it's also affected (nodeport protection is global)? It also enables gops by default with the default listen address (see `hubble-relay/cmd/serve/serve.go`).
Ah, good point, that one fell through the cracks :/ will add.
The overall changes LGTM; however, there are a couple of points that we need to take care of:
- It would likely be better to `Fatal` if we can't initialize gops. If gops isn't running at all, we won't be able to debug Cilium when we need to.
- Currently, the default is `127.0.0.1:0` and we are changing it to `localhost:XXXX`; maybe we should keep the `127.0.0.1`.
- I don't see a "reuse port" in gops, so it is likely that we will hit something similar to #11573.
Marked for backport to v1.8 as this also fixes an issue with a port collision between the gops agent and the proxy, see #13400. I'll send a manual backport PR as cherry-picking is not straightforward.
1.8 backport PR: #15634
Apologies, I forgot to mark this as |
[ upstream commit 7757d31 ]

Manually backported from cilium#14329 to address cilium#13400 for v1.8.

Lee reported that the kube-proxy log had a warning that its bind protection couldn't bind a specific port in the NodePort range. It turns out gops was already using this particular port through its auto-binding (127.0.0.1:0). This means that if gops collides with a NodePort service, we might not be able to pull gops data from that port, since either kube-proxy or the kube-proxy-free variant will redirect us to the actual service instead.

Given that it is rather unpredictable which port the agent will bind for gops, remap it to a fixed default port and add a user-configurable knob that allows using a different one if necessary. Given the agent, operator, clustermesh-apiserver and hubble-relay all start the gops listener, add the --gops-port flag to each of them. The CNI does not have gops enabled by default but only in debug mode, hence no changes there for now, given it's unlikely to be used this way in production.

Fixes: cilium#14218
Reported-by: Lee Hu via Slack
Co-authored-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Tobias Klauser <tobias@cilium.io>
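The collision described in the commit message can be illustrated with a small standard-library sketch. It binds to `127.0.0.1:0`, reads back the port the kernel picked, and checks it against the default Kubernetes NodePort range (30000-32767), then does the same for a fixed port. The helper names `inNodePortRange` and `gopsAddr` and the port value `9890` are assumptions made for illustration; they are not taken from the PR text.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// Default Kubernetes NodePort range. Clusters may configure a different one.
const (
	nodePortMin = 30000
	nodePortMax = 32767
)

// inNodePortRange reports whether port lies inside the default NodePort range.
func inNodePortRange(port int) bool {
	return port >= nodePortMin && port <= nodePortMax
}

// gopsAddr builds a loopback listen address for a fixed port.
// Hypothetical helper, named here only for illustration.
func gopsAddr(port int) string {
	return net.JoinHostPort("127.0.0.1", strconv.Itoa(port))
}

func main() {
	// With port 0 the kernel picks a free port, so the result is not
	// predictable and may or may not fall inside the NodePort range.
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer l.Close()
	auto := l.Addr().(*net.TCPAddr).Port
	fmt.Printf("auto-bound port %d, in NodePort range: %v\n", auto, inNodePortRange(auto))

	// A fixed port outside the range cannot collide with a NodePort service.
	const fixedPort = 9890 // illustrative value, an assumption of this sketch
	fmt.Printf("fixed address %s, in NodePort range: %v\n", gopsAddr(fixedPort), inNodePortRange(fixedPort))
}
```

The range check only covers the default NodePort range; a cluster with a custom range would need its own bounds.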
See commit msg.