-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Allow users to ignore links when enabling XDP #25870
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
d9e433e
to
9c3aa70
Compare
Out of curiosity, should the vlan devices be completely ignored, or just for the sake of XDP? |
Question for you, @borkmann -- given the usual XDP use-case, would an allowing regex make more sense (i.e. use XDP for devices matching the regex)? Or should we stick with an excluding regex? Separately, are there any other specifiers that might make sense? Only interfaces with a given driver? |
I'm not positive! I think regardless of whether it's beneficial to use tc on a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changes lgtm, pending the question @squeed raised about inverting the option. Mind squashing the commits?
Would you all like me to update the Helm chart and docs while I am in here? |
f0f5cb8
to
b1ca84a
Compare
Hello, is there anything else you would like to see for this change? |
Hello! Wondering if there's anything I can do to help move this change along. Let me know! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for the wait @poblahblahblah. We should still clarify some of the questions by @squeed above, but a release is coming up so a lot of folks are stretched a bit thin. Apart from the two code style nits, this LGTM so I'll approve.
dc4b2a4
to
2f97fb2
Compare
Rebased! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, left a few nits.
2f97fb2
to
618ab40
Compare
@ti-mo updated per your recommendations |
618ab40
to
73307cd
Compare
This allows a user to specify a regex that will cause a link to be ignored when enabling XDP. We need this ability since we are running physical hosts where the primary NICs do support XDP, but have additional `vlan` devices that _do not_ have XDP support in the driver. Fixes: cilium#24768 Signed-off-by: Patrick O’Brien <patrick.obrien@thetradedesk.com>
73307cd
to
3b9a40c
Compare
@ti-mo all suggested changes have been made. I also rebased against the latest Thanks for the input - I appreciate the opportunity to improve! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Thanks for addressing all of my feedback. 🙏
/test |
It looks like a single test failed - can that test be retried? |
@@ -1543,6 +1543,13 @@ func initEnv(vp *viper.Viper) { | |||
) | |||
} | |||
} | |||
|
|||
// Ensure the xdp-ignore-device-regex compiles | |||
if len(option.Config.XDPIgnoreDeviceNameRegex) > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this question from @squeed ever discussed? Scanning through I could see a few folks saying there should be a discussion but I couldn't see a clear resolution to the question:
Question for you, @borkmann -- given the usual XDP use-case, would an allowing regex make more sense (i.e. use XDP for devices matching the regex)? Or should we stick with an excluding regex?
Separately, are there any other specifiers that might make sense? Only interfaces with a given driver?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the huge delay on my side. First release, and then out for long PTO, it slipped my radar.
Just thinking some more on this topic, @poblahblahblah / @joestringer / @joamaki, what if instead of regex we just had a simple bool option where we say "if XDP attach fails on this device, then just fallback to tc BPF". Would this be more generic and easier to use rather than specifying device regex? Something like --bpf-lb-acceleration-fallback=true [ In case BPF load balancing acceleration via native XDP is not supported for some devices, fall back to tc BPF attachment ]
. Other option could be to have a new mode --bpf-lb-acceleration=best-effort
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me, no need for the config flag even IMO. If the device is expected to be managed, we should make it so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean just to fallback in case of failure with the native
option? I think this perhaps could lead to false conclusions, meaning people will try to active, then some ethtool driver channel setting is not correct and it'll fail and people just assume XDP would be there now but actually we run in tc BPF mode. So I would bail out by default and let users make a conscious decision to fallback upon failure. Also what would help as well in this context would be to let cilium status --verbose
dump the devices which have Cilium progs on XDP so that this is easily visible for troubleshooting / and sysdump.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you are proposing the following:
- Keep the default behavior of bailing out in place
- Implement
--bpf-lb-acceleration=best-effort
which if set would have devices fall back to TC
If I updated this PR to do the above, would the PR be considered mergeable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I updated this PR to do the above, would the PR be considered mergeable?
Yes, definitely - if code looks good then I consider it mergable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would love to see this MR merged so we can enable XDP in our environment! Is there anything I can do to help move this along further? |
Note to future triagers: this is waiting for #25870 (comment) to be implemented. |
@lmb Small note: we can convert PRs to drafts if they need a course change. @poblahblahblah Please mark ready for review if this is ready for another round. |
Closing in favor of #28666 |
Please ensure your pull request adheres to the following guidelines:
description and a
Fixes: #XXX
line if the commit addresses a particularGitHub issue.
Fixes: <commit-id>
tag, thenplease add the commit author[s] as reviewer[s] to this issue.
This allows a user to specify a regex that will cause a link to be ignored when enabling XDP. We need this ability since we are running physical hosts where the primary NICs do support XDP, but have additional
vlan
devices that do not have XDP support in the driver.More background can be found in #24768, but to summarize we have 1 physical device (ens785np0) per host and due to network design we need an additional
vlan
type interface (ens785np0.3). Regrettably thevlan
driver in the kernel does not support XDP. When enabling XDP without this change Cilium enters a crashloop with this error:With this change applied and by specifying
xdp-ignore-device-name-regex
in the Cilium config we now get XDP enabled on the device andtc
for the vlan device:@borkmann indicated in #24768 (comment) that a regex to filter out a device name would be acceptable.
Happy to make any adjustments and I am looking forward to getting this merged!
Note: I have not yet added support for this option to the Helm chart. Happy to do so in this PR or in a follow up if desired. I am holding off on it just to ensure that the variable names I chose are acceptable.
Fixes: #24768