XDP: Failing to attach XDP program to a tagged VLAN interface #24768

@JunPark

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

With Cilium 1.13.0, we could successfully enable XDP when a node had only a single native VLAN. However, after we added a tagged VLAN (vlan3, with VLAN ID 3) on the same node alongside the native VLAN and restarted the Cilium agent, we got the following error.

level=fatal msg="Failed to compile XDP program" error="program cil_xdp_entry: attaching XDP program to interface vlan3: operation not supported" subsys=datapath-loader
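To see how vlan3 differs from the parent NIC, it may help to check whether each interface is backed by a hardware driver. The following is a minimal sketch, not part of Cilium. It assumes the usual Linux sysfs layout, where a NIC with a bus device exposes a `device` entry under `/sys/class/net/<name>` and a virtual device such as an 802.1Q VLAN sub-interface does not. The function name `netdev_kind` and its optional sysfs-root argument are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# netdev_kind NAME [SYSFS_ROOT]
# Prints "physical" if the interface has a backing bus device in sysfs,
# "virtual" if the interface exists without one, and "missing" otherwise.
# SYSFS_ROOT defaults to /sys; overriding it is an assumption made here
# so the function can be exercised against a fake directory tree.
netdev_kind() {
    dev="$1"
    root="${2:-/sys}"
    path="$root/class/net/$dev"
    if [ ! -e "$path" ]; then
        echo "missing"
    elif [ -e "$path/device" ]; then
        echo "physical"
    else
        echo "virtual"
    fi
}
```

On the affected node this could be called as `netdev_kind ens785np0` and `netdev_kind vlan3`. My understanding is that native (driver-mode) XDP depends on support in the device's own driver, so a "virtual" result for vlan3 would be consistent with the "operation not supported" error, but I have not confirmed that this is the cause.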

Here is our installation parameter set. We use BGP to advertise both the Pod CIDR and the Service CIDR, with VXLAN tunneling for pod-to-pod communication.

helm upgrade --namespace kube-system cilium cilium/cilium --version 1.13.0 --set bgpControlPlane.enabled=true,bpf.masquerade=true,cluster.id=0,cluster.name=k8s-cluster-01,encryption.nodeEncryption=false,ipMasqAgent.enabled=true,ipam.mode=kubernetes,ipam.operator.clusterPoolIPv4MaskSize=26,ipam.operator.clusterPoolIPv4PodCIDR=172.16.0.0/12,kubeProxyReplacement=strict,loadBalancer.acceleration=native,operator.replicas=1,serviceAccounts.cilium.name=cilium,serviceAccounts.operator.name=cilium-operator,tunnel=vxlan
Release "cilium" has been upgraded. Happy Helming!
NAME: cilium
LAST DEPLOYED: Tue Apr  4 15:47:21 2023
NAMESPACE: kube-system
STATUS: deployed
REVISION: 9
TEST SUITE: None
NOTES:
You have successfully installed Cilium with Hubble.

Your release version is 1.13.0.

For reference, here is the output from the working case, when only a single VLAN is configured. I believe it shows that our Mellanox driver supports XDP, since the program is attached in driver mode.
However, I could no longer get this result after adding a tagged VLAN on top of the native VLAN interface.

# bpftool net
xdp:
ens785np0(2) driver id 1641

tc:
ens785np0(2) clsact/ingress cilium-ens785np0 id 1754
ens785np0(2) clsact/egress cilium-ens785np0 id 1756
cilium_net(4) clsact/ingress cilium-cilium_net id 1737
cilium_host(5) clsact/ingress cilium-cilium_host id 1718
cilium_host(5) clsact/egress cilium-cilium_host id 1724
cilium_vxlan(6) clsact/ingress bpf_overlay.o:[from-overlay] id 1576
cilium_vxlan(6) clsact/egress bpf_overlay.o:[to-overlay] id 1587
lxc_health(11) clsact/ingress cilium-lxc_health id 1674
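To compare the working and failing cases quickly, the XDP section of this output can be reduced to one line per attachment. This is a rough sketch that assumes the plain-text layout shown above (a section header such as `xdp:` followed by `name(ifindex) mode id N` lines); the helper name `xdp_attachments` is hypothetical.

```shell
#!/bin/sh
# xdp_attachments: read `bpftool net` plain-text output on stdin and print
# "<interface> <mode>" for every entry listed in the "xdp:" section.
xdp_attachments() {
    awk '
        /^xdp:/      { in_xdp = 1; next }   # start of the XDP section
        /^[a-z_]+:$/ { in_xdp = 0; next }   # any other section header ends it
        in_xdp && NF >= 2 {
            name = $1
            sub(/\([0-9]+\)$/, "", name)    # strip the trailing "(ifindex)"
            print name, $2
        }
    '
}
```

Usage would be `bpftool net | xdp_attachments`. For the output above it prints `ens785np0 driver`, meaning the program is attached in driver mode on the physical interface.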

Cilium Version

1.13.0

Kernel Version

Linux worknode-1 5.15.0-67-generic #74~20.04.1-Ubuntu SMP Wed Feb 22 14:52:34 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Kubernetes Version

v1.24.10+rke2r1

Sysdump

My sysdump file is 189 MB, which exceeds the 25 MB upload limit, so I cannot attach it here.

Relevant log output

$ cilium status
...
...
Errors:           cilium             cilium-jmgtm    unable to retrieve cilium status: container cilium-agent is in CrashLoopBackOff, exited with code 1: level=fatal msg="Failed to compile XDP program" error="program cil_xdp_entry: attaching XDP program to interface vlan3: operation not supported" subsys=datapath-loader
...

Anything else?

lspci | grep -i ether

01:00.0 Ethernet controller: Intel Corporation Ethernet Controller 10G X550T (rev 01)
65:00.0 Ethernet controller: Mellanox Technologies MT28908 Family [ConnectX-6]

Code of Conduct

  • I agree to follow this project's Code of Conduct

Labels: area/datapath, kind/bug, kind/community-report, needs/triage, stale