Skip to content

Conversation

pippolo84
Copy link
Member

@pippolo84 pippolo84 commented Oct 1, 2024

When a new node is discovered, in case of a multi-devices environment, we loop over each neighbor discovery link to found the next hop IP for the given node. For each link, if a next hop is found, it is inserted in the related neighNextHopByNode4 map.

In this setup, it is possible that a node is reachable only by a subset of the neighbor discovery links. This case should not be considered an error, as long as the node is reachable by at least one link. Therefore, the errors handling logic is changed to propagate an error only when it is not possible to find a next hop for any link.

This also reduces the amount of memory used for joining all the strings related to the false error condition described above, besides reducing the amount of useless retry runs of the node-neighbor-link-updater controller.

Related: #34020

@pippolo84 pippolo84 added kind/bug This is a bug in the Cilium logic. area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/misc This PR makes changes that have no direct user impact. labels Oct 1, 2024
@pippolo84

This comment was marked as outdated.

@pippolo84 pippolo84 marked this pull request as ready for review October 1, 2024 10:49
@pippolo84 pippolo84 requested review from a team as code owners October 1, 2024 10:49
@pippolo84 pippolo84 force-pushed the pr/pippolo84/insert-neigh-err-mgmt branch from cb1041f to 324355b Compare October 2, 2024 14:03
@pippolo84

This comment was marked as outdated.

@pippolo84 pippolo84 requested a review from ovidiutirla October 2, 2024 14:05
@pippolo84 pippolo84 force-pushed the pr/pippolo84/insert-neigh-err-mgmt branch from 324355b to 624e2d8 Compare October 2, 2024 15:08
@pippolo84
Copy link
Member Author

/test

Copy link
Contributor

@ovidiutirla ovidiutirla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks 🎉

Add AllMatch function that returns true if the predicate evaluates to
true for each element in the slice.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
When a new node is discovered, in case of a multi-devices environment,
we loop over each neighbor discovery link to found the next hop IP for
the given node. For each link, if a next hop is found, it is inserted in
the related neighNextHopByNode4 map.

In this setup, it is possible that a node is reachable only by a subset
of the neighbor discovery links. This case should not be considered an
error, as long as the node is reachable by at least one link. Therefore,
the errors handling logic is changed to propagate an error only when it
is not possible to find a next hop for any link.

This also reduces the amount of memory used for joining all the strings
related to the false error condition described above, besides reducing
the amount of useless retry runs of the node-neighbor-link-updater
controller.

Fixes: cilium#34020

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
@pippolo84 pippolo84 force-pushed the pr/pippolo84/insert-neigh-err-mgmt branch from 624e2d8 to 192a35e Compare October 3, 2024 08:32
@pippolo84
Copy link
Member Author

/test

@pippolo84 pippolo84 added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 3, 2024
@pippolo84
Copy link
Member Author

Reviews are in and all required tests are green, marking ready to merge

@tklauser tklauser added this pull request to the merge queue Oct 3, 2024
@julianwiedmann
Copy link
Member

👋 is this worth backporting?

Merged via the queue into cilium:main with commit 7e6195b Oct 3, 2024
62 of 63 checks passed
@BenoitKnecht
Copy link
Contributor

👋 is this worth backporting?

@julianwiedmann I think it should be, yes, as evidenced by #35926. I can submit a PR if needed (both commits apply cleanly to v1.16), but I imagine there might be an automated backport process instead.

@pippolo84
Copy link
Member Author

👋 is this worth backporting?

Yep, being a bug it should be backported.

I can submit a PR if needed (both commits apply cleanly to v1.16), but I imagine there might be an automated backport process instead.

Np, I'm gonna take care of that, thanks anyway! 🤝

@pippolo84 pippolo84 added the needs-backport/1.16 This PR / issue needs backporting to the v1.16 branch label Nov 21, 2024
@pippolo84 pippolo84 mentioned this pull request Nov 21, 2024
1 task
@pippolo84 pippolo84 added backport-pending/1.16 The backport for Cilium 1.16.x for this PR is in progress. and removed needs-backport/1.16 This PR / issue needs backporting to the v1.16 branch labels Nov 21, 2024
@github-actions github-actions bot added backport-done/1.16 The backport for Cilium 1.16.x for this PR is done. and removed backport-pending/1.16 The backport for Cilium 1.16.x for this PR is in progress. labels Nov 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. backport-done/1.16 The backport for Cilium 1.16.x for this PR is done. kind/bug This is a bug in the Cilium logic. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants