Skip to content

Conversation

joamaki
Copy link
Contributor

@joamaki joamaki commented Jan 26, 2024

Replace the global map of BPF masquerade addresses with a query to Table[NodeAddress] in the loader's patchHostNetdevDatapath and remove the now unused code around masquerade addresses in pkg/node.

@joamaki joamaki added the release-note/misc This PR makes changes that have no direct user impact. label Jan 26, 2024
@joamaki joamaki requested review from a team as code owners January 26, 2024 12:36
@joamaki joamaki requested review from ldelossa, asauber and lmb January 26, 2024 12:36
@joamaki
Copy link
Contributor Author

joamaki commented Jan 26, 2024

@dylandreimerink Please take a look and let me know whether you're OK with these incremental changes. I don't think they're entirely taking the loader to the right direction, so if you have a better idea what steps we should take let me know and I'll refactor this.

@joamaki
Copy link
Contributor Author

joamaki commented Jan 29, 2024

/test

@joamaki joamaki force-pushed the pr/joamaki/cleanup-ipmasqaddrs branch from df12459 to 6c3ad04 Compare January 30, 2024 07:57
@joamaki
Copy link
Contributor Author

joamaki commented Jan 30, 2024

/test

Copy link
Contributor

@ti-mo ti-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Left a few minor comments.

@joamaki
Copy link
Contributor Author

joamaki commented Jan 30, 2024

A config flag starting with a verb (Derive in this case) typically conveys that this is a boolean, though. (e.g. derive the masq IP or not?) Also e2manywords; let's call this what it is: MasqueradeDevice. The fact that it's (currently) used to derive the masq IP can go in a comment on the field, though if it's extended later, that will go stale.

I would like to keep it as it was originally (as bad as it was) and not start changing it now. MasqueradeDevice is also confusing as this is about overriding the BPF masquerade IP address by forcing it to be derived from the given device (e.g. not about what device to "use", but how to find the IP). I actually want to keep the option badly named and obscure as it's a hidden option and only used in a very specific deployment by one user of Cilium to work around an odd setup and we want to get rid of it as soon as we can.

Unless we can tackle #17158 in the very short term (1.16) and it's actually planned, let's not assume that this is temporary and just document this as a feature/requirement of the loader.

Good point, I'll adjust the comments.
Will check with @brb on how feasible it is to actually get rid of this in v1.16.

One more thing, this string can be empty. It's not clear what the caller should expect when it's left empty. If it's optional, it needs to be specified why.

Will add comments to explain that default is empty which means the address is taken from the actual device and not overridden by this.

@joamaki
Copy link
Contributor Author

joamaki commented Jan 30, 2024

Adding reconciliation later is perfectly fine, of course. Though I'd prefer if the query happens as early as possible so we can get rid of the stringly input param asap and pass around link objects instead. Not sure if there's some kind of startup ordering here, I assume this device exists before the agent starts?

Yes for this particular hidden option it is assumed that the device exists before the agent starts.
And agreed on querying much earlier. This will follow, but I didn't yet want to start making guesses on how we want to structure it.

We could also just forget this PR and incorporate it into a follow-up that @dylandreimerink is working on that adds the loader reconciliation on device changes?

@joamaki joamaki force-pushed the pr/joamaki/cleanup-ipmasqaddrs branch from 6c3ad04 to 17afcf9 Compare January 30, 2024 13:08
@joamaki joamaki requested a review from ti-mo January 30, 2024 13:09
@joamaki joamaki force-pushed the pr/joamaki/cleanup-ipmasqaddrs branch from 17afcf9 to a00d084 Compare January 30, 2024 13:09
@joamaki
Copy link
Contributor Author

joamaki commented Jan 30, 2024

One more thing, this string can be empty. It's not clear what the caller should expect when it's left empty. If it's optional, it needs to be specified why.

A Config struct is populated via Hive from command-line flags and config files and the defaults for it are specified when defining the flag. Only in tests is it manually populated (well we do still have the exception in cilium-dbg but that'll go away). I added comments on the defaults and the optionality.

@joamaki joamaki force-pushed the pr/joamaki/cleanup-ipmasqaddrs branch 2 times, most recently from f522367 to a7be16f Compare February 19, 2024 13:02
Copy link
Member

@asauber asauber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new logic looks good to me, besides the above comment.

I'll assume that the function is moved if needed as discussed here #30457 (comment)

@joamaki joamaki force-pushed the pr/joamaki/cleanup-ipmasqaddrs branch from a7be16f to 2dbea5e Compare February 28, 2024 10:08
@joamaki joamaki requested a review from a team as a code owner February 28, 2024 10:08
@brb brb self-requested a review March 1, 2024 10:22
Copy link
Member

@brb brb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@brb brb added the dont-merge/needs-rebase This PR needs to be rebased because it has merge conflicts. label Mar 4, 2024
@joamaki joamaki force-pushed the pr/joamaki/cleanup-ipmasqaddrs branch from 2dbea5e to 00926a5 Compare March 4, 2024 14:59
@joamaki
Copy link
Contributor Author

joamaki commented Mar 4, 2024

/test

@joamaki joamaki enabled auto-merge March 4, 2024 15:09
@joamaki joamaki removed the dont-merge/needs-rebase This PR needs to be rebased because it has merge conflicts. label Mar 5, 2024
@joamaki joamaki force-pushed the pr/joamaki/cleanup-ipmasqaddrs branch 2 times, most recently from 44ed79f to 2f25197 Compare March 6, 2024 15:09
@joamaki
Copy link
Contributor Author

joamaki commented Mar 6, 2024

/test

Copy link
Contributor

@ldelossa ldelossa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@joamaki joamaki force-pushed the pr/joamaki/cleanup-ipmasqaddrs branch from 2f25197 to c629a52 Compare March 11, 2024 10:44
@joamaki
Copy link
Contributor Author

joamaki commented Mar 11, 2024

/test

@bimmlerd
Copy link
Member

bimmlerd commented Mar 11, 2024

CI triage (WIP)

  • Clustermesh Upgrade failed waiting for the clustermesh to become ready. Could be a real issue or a flake, investigating. https://github.com/cilium/cilium/actions/runs/8231576562/job/22507953395#step:24:3642 clustermesh-apiserver 1 pods of Deployment clustermesh-apiserver are not ready
  • Clustermesh conformance:
    • (6, vxlan, ipv4, disabled, none, clustermesh, cluster, cluster... also failed waiting for the mesh, this time both hubble relay and clustermesh apiserver:
Errors:                hubble-relay             hubble-relay             1 pods of Deployment hubble-relay are not ready
                       clustermesh-apiserver    clustermesh-apiserver    1 pods of Deployment clustermesh-apiserver are not ready
Waiting until key rotation starts (seeing 1 keys)
Waiting until key rotation starts (seeing 1 keys)
Waiting until key rotation completes (seeing 2 keys)
Waiting until key rotation completes (seeing 2 keys)
Error from server: error dialing backend: dial tcp 192.168.183.34:10250: i/o timeout
  • ci ginkgo has many failed tests, typically FAIL: Kubernetes DNS did not become ready in time - failing to get the cluster into a healthy state initially
  • ci-e2e also has lots of failed tests, lots of timeouts in connectivity tests; after which at some point seemingly all the sysdumps overwhelm the k8s api server.

Given the amount of redness in these tests I don't think these are just flakes.

joamaki added 2 commits March 12, 2024 15:10
Replace the global map of BPF masquerade addresses with a query to
Table[NodeAddress] in the loader's patchHostNetdevDatapath and
remove the now unused code around masquerade addresses in pkg/node.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
Some setups that use e.g. ECMP may not have addresses assigned to all
devices. For BPF masquerading one can use DeriveMasqIPFromDevice to work
around this, but we don't have a similar mechanism for the direct routing
address.

To fix this and retain the semantics of v1.14, add a "wildcard device"
(*) to the NodeAddress table that picks suitable fallback addresses from
all devices in the system. The criteria (in order) is to prefer public over
private, lower scope, lower ifindex and finally lower address.

To allow overlapping addresses in the NodeAddress table, use a primary key
that combines address with device name.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
@joamaki joamaki force-pushed the pr/joamaki/cleanup-ipmasqaddrs branch from c629a52 to f538065 Compare March 12, 2024 15:40
@joamaki
Copy link
Contributor Author

joamaki commented Mar 12, 2024

/test

@joamaki joamaki added this pull request to the merge queue Mar 13, 2024
Merged via the queue into cilium:main with commit 76c85c5 Mar 13, 2024
@joamaki joamaki deleted the pr/joamaki/cleanup-ipmasqaddrs branch March 13, 2024 17:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note/misc This PR makes changes that have no direct user impact.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants