Skip to content

Conversation

antonipp
Copy link
Contributor

@antonipp antonipp commented Jun 20, 2025

Description

See detailed description of the issue in #35501

This PR ensures that when BPF Masquerading is enabled together with the BPF ip-masq-agent, the Cilium Agent managed datapath ip rules are actually consistent with the ip-masq-agent configuration. The idea is that all nonMasqueradeCIDRs from ip-masq-agent configuration actually need to be propagated down to the datapath to ensure that these CIDRs are not masqueraded.

As mentioned in the issue, the actual datapath ip rule setup is here:

if info.Masquerade && info.IpamMode == ipamOption.IPAMENI {
// Lookup a VPC specific table for all traffic from an endpoint to the
// CIDR configured for the VPC on which the endpoint has the IP on.
// ReplaceRule function doesn't handle all zeros cidr and return `file exists` error,
// so we need to normalize the rule to cidr here and in Delete
for _, cidr := range info.IPv4CIDRs {
if err := route.ReplaceRule(route.Rule{
Priority: egressPriority,
From: &ipWithMask,
To: normalizeRuleToCIDR(&cidr),
Table: tableID,
Protocol: linux_defaults.RTProto,
}); err != nil {
return fmt.Errorf("unable to install ip rule: %w", err)
}
}

The To CIDRs are coming from the response to the CNI IP Allocation request. The response is populated here:

cilium/pkg/ipam/crd.go

Lines 686 to 691 in cad67ba

result.CIDRs = []string{eni.VPC.PrimaryCIDR}
result.CIDRs = append(result.CIDRs, eni.VPC.CIDRs...)
// Add manually configured Native Routing CIDR
if a.conf.IPv4NativeRoutingCIDR != nil {
result.CIDRs = append(result.CIDRs, a.conf.IPv4NativeRoutingCIDR.String())
}

The newly added logic in my PR simply reads the ip-masq-agent config file and returns the non-masqueradable CIDRs in the CNI IP allocation response.

It's somewhat of a hotpath so I made sure that it fails gracefully if something goes wrong. It also makes the best effort to read the CIDRs from memory and not from disk. I also moved the fsnotify Watcher instantiation to the Start() function to avoid its overhead in the codepath. Finally, in order to avoid race conditions with the ip-masq-agent startup and the IP allocation requests (the IPAM Allocator is started before the ip-masq-agent), this code doesn't rely on the ip-masq-agent being in running state. -> This part of the description is obsolete since I ended up doing a refactoring of the ip-masq-agent in #40347 to extract it into a separate Cell.

One small limitation is that the ip rules will not be updated dynamically if the ip-masq-agent configuration changes. The new CIDRs will apply to all newly started pods however existing pods will need to be restarted in order for the new CIDRs to be applied. IMO it's not a big deal since the situation is already much better than what we had previously.

Testing

Before the PR

Taking the same AWS setup as described in my original issue, when we enable masquerading for a given pod, its ip rules look something like this:

$ ip rule | grep 10.130.100.133
111:	from 10.130.100.133 to 10.128.0.0/13 lookup 11
111:	from 10.130.100.133 to 100.112.0.0/13 lookup 11

The to CIDRs are only coming from the VPC CIDRs auto-detected by the Operator. If we enable the ip-masq-agent, the rules stay the same.

After the PR

Once we deploy the PR and then enable ip-masq-agent with the following config:

  config.yaml: |
    nonMasqueradeCIDRs:
    - 10.0.0.0/8
    - 172.16.0.0/12
    - 100.64.0.0/10
    - 192.168.0.0/16
    masqLinkLocal: false

The CIDRs are now properly propagated down to the ip rule level:

$ ip rule | grep 10.112.49.248
111:	from 10.112.49.248 to 10.0.0.0/8 lookup 11
111:	from 10.112.49.248 to 100.64.0.0/10 lookup 11
111:	from 10.112.49.248 to 169.254.0.0/16 lookup 11
111:	from 10.112.49.248 to 172.16.0.0/12 lookup 11
111:	from 10.112.49.248 to 192.168.0.0/16 lookup 11

We can also confirm that cross-VPC traffic is no longer dropped.

Fixes: #35501

ip-masq-agent: Ensure ip rules on the host match the BPF ip-masq-agent configuration in AWS ENI mode. Note that rules are set up once at pod creation and will not be regenerated if the ip-masq-agent configuration changes.

@antonipp antonipp requested review from a team as code owners June 20, 2025 09:16
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jun 20, 2025
@antonipp
Copy link
Contributor Author

/test

Copy link
Member

@mhofstetter mhofstetter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for your contribution @antonipp!

I think it would help to modularize the IPMasqAgent and provide it via Hive Cell first. See my comments inline.

Are you familiar with Hive and willing to do the extra steps?

@joestringer joestringer added the release-note/bug This PR fixes an issue in a previous release of Cilium. label Jun 20, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jun 20, 2025
Copy link
Member

@pippolo84 pippolo84 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of minor nits and a question left inline, but overall LGTM from an IPAM point of view.

I second Marco's suggestions about the ip-masq-agent "cellification" and the improvement of NonMasqCIDRsFromConfig cache usage.

…mode

Signed-off-by: Anton Ippolitov <anton.ippolitov@datadoghq.com>
@antonipp antonipp force-pushed the ai/ip-rules-ip-masq-agent branch from 67a551f to 842410a Compare July 28, 2025 13:39
@antonipp
Copy link
Contributor Author

Alright, my ip-masq-agent Hive refactoring PR #40347 has been merged, so I rebased this current PR off the new changes. This simplified the code, LMK what you think!

@antonipp
Copy link
Contributor Author

/test

Copy link
Member

@mhofstetter mhofstetter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sig-agent looks good to me. Thanks for extracting the IP Masq Agent into its own Hive Cell! It looks way cleaner this way.

@mhofstetter mhofstetter added the area/ipam IP address management, including cloud IPAM label Jul 28, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jul 28, 2025
@joestringer joestringer enabled auto-merge July 28, 2025 18:53
Copy link
Member

@pippolo84 pippolo84 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! 💯

@joestringer joestringer added this pull request to the merge queue Jul 29, 2025
Merged via the queue into cilium:main with commit caf8f14 Jul 29, 2025
65 of 66 checks passed
@antonipp antonipp deleted the ai/ip-rules-ip-masq-agent branch July 29, 2025 13:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ipam IP address management, including cloud IPAM ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Inconsistent ip rules with BPF ip-masq-agent nonMasqueradeCIDRs
4 participants