Loopback CNI is not copied atomically #29461

@akhilles

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

If copying the loopback CNI binary is interrupted, a truncated binary is left on the node, which causes CNI operations to fail. cilium-agent cannot recover the node from this state even after a restart, because it treats loopback as already installed.

cilium-cni isn't affected because it is installed with an atomic copy (cp to a temporary name, then mv into place). I think we should do the same for loopback.

Relevant code: https://github.com/cilium/cilium/blob/1aa21ecc3697a78d66a1fbd3f9e4cc3d7f6ebcc6/plugins/cilium-cni/install-plugin.sh#L21C2-L21C4
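
For illustration, the cp + mv pattern could look roughly like the sketch below. This is not the actual Cilium install script: the function name `atomic_install` and the `.new` temporary suffix are hypothetical, and it assumes the temporary file and the destination live in the same directory, so that mv is a rename on one filesystem and never exposes a partially written file.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the Cilium repository): install a
# binary into a directory so that readers never see a partial copy.
# Usage: atomic_install /path/to/source/loopback /path/to/cni/bin
atomic_install() {
    src=$1
    dest_dir=$2
    name=$(basename "$src")
    # Assumed naming scheme: a hidden temporary file next to the destination.
    tmp="${dest_dir}/.${name}.new"

    # Step 1: copy to the temporary name. If this is interrupted, only the
    # temporary file is truncated; the final path is untouched.
    if ! cp "$src" "$tmp"; then
        rm -f "$tmp"
        return 1
    fi

    # Step 2: rename into place. Within one filesystem this replaces the
    # destination in a single step, so it also overwrites a previously
    # truncated binary instead of skipping it.
    mv -f "$tmp" "${dest_dir}/${name}"
}
```

Because the rename always runs after a complete copy, rerunning the helper on a node that already has a truncated loopback file replaces it, which would also address the recovery problem described above.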

Cilium Version

All

Kernel Version

N/A

Kubernetes Version

N/A

Sysdump

N/A

Relevant log output

plugin type="loopback" failed (delete): netplugin failed with no error message: signal: segmentation fault

Anything else?

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

Labels

  • area/agent: Cilium agent related.
  • area/cni: Impacts the Container Networking Interface between Cilium and the orchestrator.
  • kind/bug: This is a bug in the Cilium logic.
  • kind/community-report: This was reported by a user in the Cilium community, eg via Slack.
  • needs/triage: This issue requires triaging to establish severity and next steps.