poblahblahblah
Contributor

Please ensure your pull request adheres to the following guidelines:

  • For first time contributors, read Submitting a pull request
  • [?] All code is covered by unit and/or runtime tests where feasible.
  • All commits contain a well written commit description including a title,
    description and a Fixes: #XXX line if the commit addresses a particular
    GitHub issue.
  • If your commit description contains a Fixes: <commit-id> tag, then
    please add the commit author[s] as reviewer[s] to this issue.
  • All commits are signed off. See the section Developer’s Certificate of Origin
  • Provide a title or release-note blurb suitable for the release notes.
  • [?] Are you a user of Cilium? Please add yourself to the Users doc
  • Thanks for contributing!

This allows a user to specify a regex that will cause a link to be ignored when enabling XDP. We need this ability since we are running physical hosts where the primary NICs do support XDP, but have additional vlan devices that do not have XDP support in the driver.

More background can be found in #24768, but to summarize: we have 1 physical device (ens785np0) per host, and due to our network design we need an additional vlan-type interface (ens785np0.3). Regrettably, the vlan driver in the kernel does not support XDP. When XDP is enabled without this change, Cilium enters a crashloop with this error:

Failed to compile XDP program" error="program cil_xdp_entry: attaching XDP program to interface ens785np0.3: operation not supported" subsys=datapath-loader

With this change applied and xdp-ignore-device-name-regex set in the Cilium config, XDP is enabled on the physical device and the vlan device gets tc only:

$ sudo bpftool net
xdp:
ens785np0(2) driver id 22899

tc:
ens785np0(2) clsact/ingress cil_from_netdev-ens785np0 id 23063
ens785np0(2) clsact/egress cil_to_netdev-ens785np0 id 23077
cilium_net(4) clsact/ingress cil_to_host-cilium_net id 23052
cilium_host(5) clsact/ingress cil_to_host-cilium_host id 23005
cilium_host(5) clsact/egress cil_from_host-cilium_host id 23051
cilium_vxlan(6) clsact/ingress cil_from_overlay-cilium_vxlan id 22907
cilium_vxlan(6) clsact/egress cil_to_overlay-cilium_vxlan id 22903
lxc4ecafafd804a(10) clsact/ingress cil_from_container-lxc4ecafafd804a id 23022
lxc9e4006f5d109(12) clsact/ingress cil_from_container-lxc9e4006f5d109 id 22950
lxcb6797ab24b31(14) clsact/ingress cil_from_container-lxcb6797ab24b31 id 22914
lxc22d432e999bb(16) clsact/ingress cil_from_container-lxc22d432e999bb id 22913
lxcecd9c72cbc66(18) clsact/ingress cil_from_container-lxcecd9c72cbc66 id 22983
lxc123aaba50ec5(20) clsact/ingress cil_from_container-lxc123aaba50ec5 id 22937
lxc602295b67e63(22) clsact/ingress cil_from_container-lxc602295b67e63 id 23028
lxcd5f791798257(24) clsact/ingress cil_from_container-lxcd5f791798257 id 22917
lxc3bd4bca11864(26) clsact/ingress cil_from_container-lxc3bd4bca11864 id 23017
lxc1ecd38cdf625(28) clsact/ingress cil_from_container-lxc1ecd38cdf625 id 23039
lxcb36787663030(30) clsact/ingress cil_from_container-lxcb36787663030 id 22922
ens785np0.3(57) clsact/ingress cil_from_netdev-ens785np0.3 id 23090
ens785np0.3(57) clsact/egress cil_to_netdev-ens785np0.3 id 23103
lxc_health(79) clsact/ingress cil_from_container-lxc_health id 23008

flow_dissector:
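
For readers who want a feel for the filtering itself, here is a minimal, self-contained sketch of how an ignore regex could split devices into "attach XDP" and "tc only". It is not the code from this PR: the function name `splitXDPDevices` and the example pattern `\.\d+$` are assumptions made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// splitXDPDevices is a hypothetical helper (not part of Cilium). It partitions
// device names into those that would get an XDP program and those skipped
// because they match the ignore pattern. An empty pattern skips nothing.
func splitXDPDevices(devices []string, ignorePattern string) (xdp, skipped []string, err error) {
	if ignorePattern == "" {
		return devices, nil, nil
	}
	re, err := regexp.Compile(ignorePattern)
	if err != nil {
		return nil, nil, fmt.Errorf("invalid ignore pattern %q: %w", ignorePattern, err)
	}
	for _, d := range devices {
		if re.MatchString(d) {
			skipped = append(skipped, d) // left to tc only
		} else {
			xdp = append(xdp, d)
		}
	}
	return xdp, skipped, nil
}

func main() {
	// Assumed pattern: any name ending in ".<digits>", e.g. a tagged vlan device.
	xdp, skipped, err := splitXDPDevices([]string{"ens785np0", "ens785np0.3"}, `\.\d+$`)
	if err != nil {
		panic(err)
	}
	fmt.Println("xdp:", xdp)
	fmt.Println("tc only:", skipped)
}
```

With the two device names from this PR, the physical NIC lands in the first list and the vlan device in the second.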

@borkmann indicated in #24768 (comment) that a regex to filter out a device name would be acceptable.

Happy to make any adjustments and I am looking forward to getting this merged!

Note: I have not yet added support for this option to the Helm chart. Happy to do so in this PR or in a follow up if desired. I am holding off on it just to ensure that the variable names I chose are acceptable.

Fixes: #24768

Adds ability to filter out device names via regex when enabling XDP

@poblahblahblah poblahblahblah requested review from a team as code owners June 2, 2023 16:29
@poblahblahblah poblahblahblah requested review from thorn3r, squeed and rgo3 June 2, 2023 16:30
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jun 2, 2023
@github-actions github-actions bot added the kind/community-contribution This was a contribution made by a community member. label Jun 2, 2023
@poblahblahblah poblahblahblah marked this pull request as draft June 2, 2023 16:42
@poblahblahblah poblahblahblah marked this pull request as ready for review June 2, 2023 17:04
@squeed
Contributor

squeed commented Jun 6, 2023

Out of curiosity, should the vlan devices be completely ignored, or just for the sake of XDP?

@squeed
Contributor

squeed commented Jun 6, 2023

Question for you, @borkmann -- given the usual XDP use-case, would an allowing regex make more sense (i.e. use XDP for devices matching the regex)? Or should we stick with an excluding regex?

Separately, are there any other specifiers that might make sense? Only interfaces with a given driver?

@poblahblahblah
Contributor Author

> Out of curiosity, should the vlan devices be completely ignored, or just for the sake of XDP?

I'm not positive!

I think that, regardless of whether it's beneficial to use tc on a vlan-type device, the change would be useful for other groups who want XDP but have mixed NICs where one supports XDP and the other does not.

Contributor

@thorn3r thorn3r left a comment

changes lgtm, pending the question @squeed raised about inverting the option. Mind squashing the commits?

@poblahblahblah
Contributor Author

Would you all like me to update the Helm chart and docs while I am in here?

@poblahblahblah poblahblahblah requested review from a team as code owners June 8, 2023 21:11
@poblahblahblah
Contributor Author

Hello, is there anything else you would like to see for this change?

@ti-mo ti-mo added kind/enhancement This would improve or streamline existing functionality. area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. dont-merge/wait-until-release Freeze window for current release is blocking non-bugfix PRs release-note/minor This PR changes functionality that users may find relevant to operating Cilium. labels Jun 20, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jun 20, 2023
@poblahblahblah
Contributor Author

Hello!

Wondering if there's anything I can do to help move this change along.

Let me know!

Contributor

@rgo3 rgo3 left a comment

Apologies for the wait @poblahblahblah. We should still clarify some of the questions by @squeed above, but a release is coming up so a lot of folks are stretched a bit thin. Apart from the two code style nits, this LGTM so I'll approve.

@rgo3 rgo3 added the dont-merge/needs-rebase This PR needs to be rebased because it has merge conflicts. label Jun 23, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Aug 18, 2023
@poblahblahblah
Contributor Author

Rebased!

Contributor

@ti-mo ti-mo left a comment

Thanks, left a few nits.

@ti-mo ti-mo removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Aug 18, 2023
@poblahblahblah
Contributor Author

@ti-mo updated per your recommendations

This allows a user to specify a regex that will cause a link to be ignored
when enabling XDP. We need this ability since we are running physical hosts
where the primary NICs do support XDP, but have additional `vlan` devices that
_do not_ have XDP support in the driver.

Fixes: cilium#24768

Signed-off-by: Patrick O’Brien <patrick.obrien@thetradedesk.com>
@poblahblahblah
Contributor Author

@ti-mo all suggested changes have been made. I also rebased against the latest main

Thanks for the input - I appreciate the opportunity to improve!

Contributor

@ti-mo ti-mo left a comment

Looks good! Thanks for addressing all of my feedback. 🙏

@ti-mo
Contributor

ti-mo commented Aug 24, 2023

/test

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Aug 24, 2023
@poblahblahblah
Contributor Author

It looks like a single test failed - can that test be retried?

@@ -1543,6 +1543,13 @@ func initEnv(vp *viper.Viper) {
)
}
}

// Ensure the xdp-ignore-device-name-regex compiles
if len(option.Config.XDPIgnoreDeviceNameRegex) > 0 {
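
The hunk above is cut off at the line under review. As a rough illustration of the fail-fast check it describes, the following standalone sketch validates a pattern with `regexp.Compile` before anything else uses it. The helper name `validateIgnoreRegex` and the error wording are assumptions, not the PR's actual code.

```go
package main

import (
	"fmt"
	"regexp"
)

// validateIgnoreRegex is a hypothetical helper: it returns an error if a
// non-empty pattern does not compile, so a bad value is rejected at startup
// rather than when devices are enumerated.
func validateIgnoreRegex(pattern string) error {
	if pattern == "" {
		return nil // option not set, nothing to validate
	}
	if _, err := regexp.Compile(pattern); err != nil {
		return fmt.Errorf("xdp-ignore-device-name-regex %q does not compile: %w", pattern, err)
	}
	return nil
}

func main() {
	// The last pattern is deliberately malformed (unclosed group).
	for _, p := range []string{"", `\.\d+$`, `ens(`} {
		fmt.Printf("%q -> %v\n", p, validateIgnoreRegex(p))
	}
}
```
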
Member

Was this question from @squeed ever discussed? Scanning through I could see a few folks saying there should be a discussion but I couldn't see a clear resolution to the question:

> Question for you, @borkmann -- given the usual XDP use-case, would an allowing regex make more sense (i.e. use XDP for devices matching the regex)? Or should we stick with an excluding regex?
>
> Separately, are there any other specifiers that might make sense? Only interfaces with a given driver?

Member

Sorry for the huge delay on my side. First the release, and then I was out on a long PTO, so it slipped off my radar.

Just thinking some more on this topic, @poblahblahblah / @joestringer / @joamaki: what if, instead of a regex, we just had a simple bool option that says "if XDP attach fails on this device, then just fall back to tc BPF"? Would this be more generic and easier to use than specifying a device regex? Something like --bpf-lb-acceleration-fallback=true [ In case BPF load balancing acceleration via native XDP is not supported for some devices, fall back to tc BPF attachment ]. Another option could be to have a new mode, --bpf-lb-acceleration=best-effort.

Contributor

Makes sense to me, no need for the config flag even IMO. If the device is expected to be managed, we should make it so.

Member

@borkmann borkmann Aug 31, 2023

You mean just to fall back in case of failure with the native option? I think this could lead to false conclusions: people will try to activate it, some ethtool driver channel setting is not correct and the attach fails, and people then assume XDP is in place when we are actually running in tc BPF mode. So I would bail out by default and let users make a conscious decision to fall back upon failure. What would also help in this context is to let cilium status --verbose dump the devices which have Cilium progs on XDP, so that this is easily visible for troubleshooting and in sysdumps.

Contributor Author

So you are proposing the following:

  • Keep the default behavior of bailing out in place
  • Implement --bpf-lb-acceleration=best-effort which, if set, would have devices fall back to tc

If I updated this PR to do the above, would the PR be considered mergeable?
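
To make the two bullets concrete, here is a rough, self-contained sketch of the control flow being discussed. It is not Cilium code: `attachAll`, `attachFunc`, the mode constants, and the stand-in error `errNotSupported` are all hypothetical names, and restricting the fallback to "not supported" failures is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotSupported stands in for the kernel's "operation not supported" error.
var errNotSupported = errors.New("operation not supported")

type attachMode string

const (
	modeNative     attachMode = "native"      // bail out on any failure (default)
	modeBestEffort attachMode = "best-effort" // fall back to tc where XDP is unsupported
)

// attachFunc is a hypothetical hook that tries to attach XDP to one device.
type attachFunc func(device string) error

// attachAll returns which datapath ("xdp" or "tc") each device ended up with.
func attachAll(devices []string, mode attachMode, attachXDP attachFunc) (map[string]string, error) {
	result := make(map[string]string, len(devices))
	for _, d := range devices {
		err := attachXDP(d)
		switch {
		case err == nil:
			result[d] = "xdp"
		case mode == modeBestEffort && errors.Is(err, errNotSupported):
			result[d] = "tc" // conscious, opt-in fallback
		default:
			return nil, fmt.Errorf("attaching XDP program to interface %s: %w", d, err)
		}
	}
	return result, nil
}

// fakeAttach simulates a vlan device whose driver lacks XDP support.
func fakeAttach(device string) error {
	if device == "ens785np0.3" {
		return errNotSupported
	}
	return nil
}

func main() {
	devs := []string{"ens785np0", "ens785np0.3"}
	_, err := attachAll(devs, modeNative, fakeAttach)
	fmt.Println("native:", err)
	res, err := attachAll(devs, modeBestEffort, fakeAttach)
	fmt.Println("best-effort:", res, err)
}
```

Returning the per-device result also gives a place to surface which devices actually run XDP, in the spirit of the status output suggested above.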

Member

> If I updated this PR to do the above, would the PR be considered mergeable?

Yes, definitely - if the code looks good, then I consider it mergeable.

Contributor Author

@borkmann Sorry for the delay. I made a first pass at this change here: #28666

@joestringer joestringer requested a review from borkmann August 24, 2023 18:44
@joestringer joestringer added dont-merge/discussion A discussion is ongoing and should be resolved before merging, regardless of reviews & tests status. and removed ready-to-merge This PR has passed all tests and received consensus from code owners to merge. labels Aug 24, 2023
@poblahblahblah
Contributor Author

I would love to see this PR merged so we can enable XDP in our environment!

Is there anything I can do to help move this along further?

@lmb
Contributor

lmb commented Sep 25, 2023

Note to future triagers: this is waiting for #25870 (comment) to be implemented.

@ti-mo
Contributor

ti-mo commented Oct 4, 2023

@lmb Small note: we can convert PRs to drafts if they need a course change.

@poblahblahblah Please mark ready for review if this is ready for another round.

@ti-mo ti-mo marked this pull request as draft October 4, 2023 11:41
@poblahblahblah poblahblahblah mentioned this pull request Oct 18, 2023
6 tasks
@poblahblahblah
Contributor Author

Closing in favor of #28666

Labels
area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. dont-merge/discussion A discussion is ongoing and should be resolved before merging, regardless of reviews & tests status. kind/community-contribution This was a contribution made by a community member. kind/enhancement This would improve or streamline existing functionality. release-note/minor This PR changes functionality that users may find relevant to operating Cilium.
Development

Successfully merging this pull request may close these issues.

XDP: Failing to attach XDP program to a tagged VLAN interface
10 participants