-
Notifications
You must be signed in to change notification settings - Fork 3.4k
routing: Delete stale ingress and egress rules #39734
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
julianwiedmann
merged 4 commits into
cilium:main
from
pippolo84:pr/pippolo84/ipam-delete-stale-rules
May 28, 2025
Merged
routing: Delete stale ingress and egress rules #39734
julianwiedmann
merged 4 commits into
cilium:main
from
pippolo84:pr/pippolo84/ipam-delete-stale-rules
May 28, 2025
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This comment was marked as outdated.
This comment was marked as outdated.
Refactor computeTableIDFromIfaceNumber to take the compat flag as a parameter. Depending on the compat flag value, the helper returns the iface number to be used directly as the table ID or a the same iface number plus a constant offset. This refactor does not change anything but it will be useful in a subsequent change. Related: 67dc983 ("routing: Fix route collisions in AWS ENI") Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
In order to simplify the ingress and engress routing rules deletion logic, removes the per family functions and rely on the underlying deleteRule function directly, passing a family parameter. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Instead of relying on t.Log to separate each subtest output, use the idiomatic table-driven test structure calling t.Run for each subtest. Also, improve the error message reporting the error returned by Delete. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
4eff60e
to
ea7f1a1
Compare
This comment was marked as outdated.
This comment was marked as outdated.
e2f982e
to
c87bb06
Compare
This comment was marked as outdated.
This comment was marked as outdated.
c87bb06
to
8782267
Compare
This comment was marked as outdated.
This comment was marked as outdated.
8782267
to
ac21f25
Compare
This comment was marked as outdated.
This comment was marked as outdated.
The current rule deletion logic skips the removal in case more than one rule that matches the requested template is found. Therefore, if a stale rule is not garbage collected properly and another rule for the same address is installed, Cilium stops removing the installed rules, accumulating them indefinitely. The change addresses this issue deleting all the stale ingress and egress rules matching the specified template, even when more than one occurrence is found. In order to do so, a more generic deleteRulesFiltered is added, where all the installed rules matching a given template are then evaluated by a series of filter to decide if it should be removed or not. In case of egress rules, the new logic skips the deletion of those that refers to a table ID not managed by Cilium, that is, a table ID outside of the per ENI routing table range. The unit tests are changed accordingly, therefore expecting a success instead of an error when more than one rule is found. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
ac21f25
to
181a22c
Compare
This comment was marked as outdated.
This comment was marked as outdated.
tklauser
approved these changes
May 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for @cilium/endpoint
YutaroHayakawa
approved these changes
May 27, 2025
christarazi
approved these changes
May 28, 2025
|
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
affects/v1.16
This issue affects v1.16 branch
affects/v1.17
This issue affects v1.17 branch
area/datapath
Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
area/eni
Impacts ENI based IPAM.
area/ipam
IP address management, including cloud IPAM
ready-to-merge
This PR has passed all tests and received consensus from code owners to merge.
release-note/minor
This PR changes functionality that users may find relevant to operating Cilium.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Among the others, the result of the last
TestDelete
sub-test inpkg/datapath/linux/routing/routing_test.go
, added in #24933, is altered by this PR.The subtest was meant to verify that a change in the masquerading option generates an error when trying to remove the previously installed rules. Specifically, the workflow simulated in the test was:
Configure
)But the test was actually failing due to the fact that in the test setup two rules where installed:
In other words, the previous logic was not logging any error when a single stale egress rule with a non-nil destination CIDR was found. The error seen in the test was just a consequence of the limitation related to the multiple rules found.
With the changes in this PR, we remove all stale egress rules with a non-nil destination CIDR, since the template rule passed to
deleteRulesFiltered
has a nil dst CIDR, corresponding to a0.0.0.0/0
CIDR. Therefore, all the previous rules are selected byroute.ListRules
and removed.Note that the opposite (that is, restarting the agent after enabling masquerading) still leaves stale rules behind. This was not solved by the previous approach either, but requires #38285 instead.
Notes to reviewer: please review each commit individually