Cilium agent fails to start when bpf.lbModeAnnotation is true. Invalid value for --bpf-lb-dsr-dispatch #37659

@pushpinderbal

Description


Is there an existing issue for this?

  • I have searched the existing issues

Version

equal or higher than v1.17.0 and lower than v1.18.0

What happened?

Version - 1.17.1

I'm trying to use annotation-based DSR, but the Cilium agent does not start when annotation mode is on.
The result is the same even with loadBalancer.dsrDispatch=geneve and tunnelProtocol=geneve.

It works as expected when not using annotations and setting loadBalancer.mode=dsr instead.
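For comparison, this is a minimal sketch of the fragment that starts cleanly (assuming the rest of the values stay as in the repro below; only the annotation toggle differs):

```yaml
# Working variant: global DSR, annotation mode left at its default (false).
bpf:
  masquerade: true
loadBalancer:
  mode: dsr
```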

How can we reproduce the issue?

  1. values.yml
---

routingMode: native
ipv4NativeRoutingCIDR: 10.42.0.0/16
autoDirectNodeRoutes: true
kubeProxyReplacement: true
bgpControlPlane:
  enabled: true
k8sServiceHost: k3s.local.s1ngh.ca
k8sServicePort: 1443
rollOutCiliumPods: true
envoy:
  rollOutPods: true
operator:
  rollOutPods: true
ipam:
  operator:
    clusterPoolIPv4PodCIDRList:
      - 10.42.0.0/16
ingressController:
  enabled: true
  default: true
bpf:
  lbModeAnnotation: true
  masquerade: true
loadBalancer:
  mode: dsr
  2. Install Cilium with the Cilium CLI: cilium install --version 1.17.1 -f values.yml

Cilium Version

1.17.1

Kernel Version

Linux k3s-1 6.8.0-52-generic #53-Ubuntu SMP PREEMPT_DYNAMIC Sat Jan 11 00:06:25 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Kubernetes Version

Client Version: v1.31.3
Kustomize Version: v5.4.2
Server Version: v1.31.5+k3s1

Regression

No response

Sysdump

cilium-sysdump-20250215-130618.zip

Relevant log output

time="2025-02-15T18:01:46.015591377Z" level=info msg="Starting Hubble server" address=":4244" subsys=hubble tls=true
time="2025-02-15T18:01:46.017110611Z" level=info msg="Using Managed Neighbor Kernel support" subsys=daemon
time="2025-02-15T18:01:46.017190797Z" level=info msg="Auto-enabling \"enable-node-port\", \"enable-external-ips\", \"bpf-lb-sock\", \"enable-host-port\", \"enable-session-affinity\" features" subsys=daemon
time="2025-02-15T18:01:46.017239064Z" level=error msg="unable to initialize kube-proxy replacement options" error="Invalid value for --bpf-lb-dsr-dispatch: opt" subsys=daemon
time=2025-02-15T18:01:46Z level=error msg="Start hook failed" function="cmd.newDaemonPromise.func1 (cmd/daemon_main.go:1638) (agent.controlplane.daemon)" error="daemon creation failed: unable to initialize kube-proxy replacement options: Invalid value for --bpf-lb-dsr-dispatch: opt"
time=2025-02-15T18:01:46Z level=error msg="Start failed" error="daemon creation failed: unable to initialize kube-proxy replacement options: Invalid value for --bpf-lb-dsr-dispatch: opt" duration=356.780963ms
time=2025-02-15T18:01:46Z level=info msg=Stopping
time="2025-02-15T18:01:46.018241775Z" level=info msg="Stopping fswatcher" config=tls-server subsys=hubble
time="2025-02-15T18:01:46.018629266Z" level=info msg="Datapath signal listener exiting" subsys=signal
time="2025-02-15T18:01:46.019880476Z" level=info msg="Datapath signal listener done" subsys=signal
time=2025-02-15T18:01:47Z level=info msg="Stop hook executed" function="*job.group.Stop (agent.datapath.iptables)" duration=1.488657108s
time=2025-02-15T18:01:47Z level=error msg="Stop hook failed" function="cell.newIPCache.func1 (.../ipcache/cell/cell.go:53) (agent.controlplane.ipcache)" error="unable to find controller ipcache-inject-labels"
time="2025-02-15T18:01:47.513717685Z" level=error msg="Close() called without calling InitIdentityAllocator() first" subsys=identity-cache
time=2025-02-15T18:01:47Z level=info msg="agent.controlplane.auth.observer-job-auth-gc-identity-events (rev=50)" module=health
time=2025-02-15T18:01:47Z level=info msg="agent.controlplane.auth.observer-job-auth-request-authentication (rev=48)" module=health
time=2025-02-15T18:01:47Z level=info msg="agent.controlplane.auth.timer-job-auth-gc-cleanup (rev=49)" module=health
time=2025-02-15T18:01:47Z level=info msg="agent.controlplane.daemon.job-sync-hostips (rev=46)" module=health
time=2025-02-15T18:01:47Z level=info msg="agent.controlplane.endpoint-manager.endpoint-gc (rev=8)" module=health
time=2025-02-15T18:01:47Z level=info msg="agent.controlplane.hubble.job-hubble (rev=44)" module=health
time=2025-02-15T18:01:47Z level=info msg="agent.controlplane.identity.timer-job-id-alloc-update-policy-maps (rev=59)" module=health
time=2025-02-15T18:01:47Z level=info msg="agent.controlplane.l2-announcer.job-l2-announcer-lease-gc (rev=24)" module=health
time=2025-02-15T18:01:47Z level=info msg="agent.controlplane.policy.observer-job-policy-importer (rev=54)" module=health
time=2025-02-15T18:01:47Z level=info msg="agent.controlplane.service-manager.job-health-check-event-watcher (rev=56)" module=health
time=2025-02-15T18:01:47Z level=info msg="agent.controlplane.service-resolver.job-service-reloader-initializer (rev=53)" module=health
time=2025-02-15T18:01:47Z level=info msg="agent.datapath.iptables.ipset.job-ipset-init-finalizer (rev=22)" module=health
time=2025-02-15T18:01:47Z level=info msg="agent.datapath.iptables.ipset.job-reconcile (rev=58)" module=health
time=2025-02-15T18:01:47Z level=info msg="agent.datapath.iptables.ipset.job-refresh (rev=57)" module=health
time=2025-02-15T18:01:47Z level=info msg="agent.datapath.iptables.job-iptables-reconciliation-loop (rev=52)" module=health
time=2025-02-15T18:01:47Z level=info msg="agent.datapath.mtu.job-mtu-updater (rev=47)" module=health
time=2025-02-15T18:01:47Z level=info msg="agent.datapath.node-address.job-node-address-update (rev=55)" module=health
time=2025-02-15T18:01:47Z level=info msg="agent.datapath.orchestrator.job-reinitialize (rev=45)" module=health
time=2025-02-15T18:01:47Z level=info msg="agent.datapath.sysctl.job-reconcile (rev=61)" module=health
time=2025-02-15T18:01:47Z level=info msg="agent.datapath.sysctl.job-refresh (rev=60)" module=health
time=2025-02-15T18:01:47Z level=info msg="agent.infra.k8s-synced-crdsync.job-sync-crds (rev=7)" module=health
time="2025-02-15T18:01:47.515790241Z" level=info msg="Stopped gops server" address="127.0.0.1:9890" subsys=gops
time="2025-02-15T18:01:47.515928694Z" level=fatal msg="failed to start: daemon creation failed: unable to initialize kube-proxy replacement options: Invalid value for --bpf-lb-dsr-dispatch: opt\nfailed to stop: unable to find controller ipcache-inject-labels" subsys=daemon
time="2025-02-15T18:01:47.739960698Z" level=info msg="Stopped reading perf buffer" startTime="2025-02-15 18:01:46.011500858 +0000 UTC m=+1.924258270" subsys=monitor-agent

Anything else?

https://docs.cilium.io/en/stable/network/kubernetes/kubeproxy-free/#annotation-based-dsr-and-snat-mode
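If I read the linked docs correctly, annotation-based DSR is meant to work by enabling bpf.lbModeAnnotation globally and then selecting the forwarding mode per Service with an annotation along these lines (annotation name and values are taken from the linked docs page; verify them against your Cilium version, as this is illustrative, not authoritative):

```yaml
# Hypothetical per-service DSR selection via annotation,
# per the linked "Annotation Based DSR and SNAT Mode" docs.
apiVersion: v1
kind: Service
metadata:
  name: my-service
  annotations:
    service.cilium.io/forwarding-mode: dsr  # per-service override of the LB mode
spec:
  type: LoadBalancer
  ports:
    - port: 80
```

The failure reported above happens before any Service annotation is evaluated, since the agent refuses to start at all.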

Cilium Users Document

  • Are you a user of Cilium? Please add yourself to the Users doc

Code of Conduct

  • I agree to follow this project's Code of Conduct

Metadata

Assignees

No one assigned

    Labels

    • area/documentation: Impacts the documentation, including textual changes, sphinx, or other doc generation code.
    • area/loadbalancing: Impacts load-balancing and Kubernetes service implementations.
    • feature/dsr: Relates to Cilium's Direct-Server-Return feature for KPR.
    • good-first-issue: Good starting point for new developers, which requires minimal understanding of Cilium.
    • kind/community-report: This was reported by a user in the Cilium community, e.g. via Slack.
    • kind/enhancement: This would improve or streamline existing functionality.
    • pinned: These issues are not marked stale by our issue bot.

    Projects

    Status

    Done
