bug: endpoint regeneration proceeds before policy synchronization has completed #31865

@squeed

I was working on another change when I noticed this.

After an agent restart, I noticed endpoints undergoing their first regeneration before policy had been fully applied, likely as a result of #30322. This is a problem because those endpoints briefly enforce incorrect policy.

I added some additional logging and saw something like this:

2024-04-09T10:35:53.379194584Z time="2024-04-09T10:35:53Z" level=debug msg="resource \"cilium/v2::CiliumNetworkPolicy\" cache has synced, stopping timeout watcher" subsys=k8s
2024-04-09T10:36:01.638609794Z time="2024-04-09T10:36:01Z" level=info msg="CacheStatus closed, proceeding with regeneration" subsys=daemon
2024-04-09T10:36:01.687707197Z time="2024-04-09T10:36:01Z" level=debug msg="waiting for cache to synchronize" kubernetesResource="cilium/v2::CiliumNetworkPolicy" subsys=k8s
2024-04-09T10:36:01.687726580Z time="2024-04-09T10:36:01Z" level=debug msg="Is CNP synced?" subsys=policy-k8s-watcher v=false
2024-04-09T10:36:01.691903550Z time="2024-04-09T10:36:01Z" level=debug msg="Adding CiliumNetworkPolicy" ciliumNetworkPolicyName=allow-one-fqdn-http k8sApiVersion=cilium.io/v2 k8sNamespace=foo subsys=policy-k8s-watcher
2024-04-09T10:36:01.692351378Z time="2024-04-09T10:36:01Z" level=info msg="Policy Add Request" ciliumNetworkPolicy= [snip]
2024-04-09T10:36:01.692450366Z time="2024-04-09T10:36:01Z" level=info msg="Imported CiliumNetworkPolicy" ciliumNetworkPolicyName=allow-one-fqdn-http k8sApiVersion=cilium.io/v2 k8sNamespace=foo subsys=policy-k8s-watcher
2024-04-09T10:36:01.788546485Z time="2024-04-09T10:36:01Z" level=debug msg="Is CNP synced?" subsys=policy-k8s-watcher v=false
2024-04-09T10:36:01.788585344Z time="2024-04-09T10:36:01Z" level=debug msg="CNP sync complete" subsys=policy-k8s-watcher

I also saw the endpoint in question undergo two regenerations, one of which definitely ran before the PolicyRepository was fully populated.
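The ordering seen in the log can be sketched as a small standalone Go program. This is not Cilium code: `policyRepo`, `regenerate`, `simulate`, and both channels are hypothetical names made up for illustration. It assumes the underlying problem is that regeneration is gated on a "cache synced" signal that can fire before the watcher's event handlers have finished importing rules into the repository.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// policyRepo is a hypothetical stand-in for the agent's policy repository.
type policyRepo struct {
	mu    sync.Mutex
	rules []string
}

func (r *policyRepo) add(rule string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.rules = append(r.rules, rule)
}

func (r *policyRepo) count() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return len(r.rules)
}

// regenerate blocks until gate is closed, then reports how many rules
// were visible in the repository at that moment.
func regenerate(repo *policyRepo, gate <-chan struct{}) int {
	<-gate
	return repo.count()
}

// simulate models one agent restart (an assumption-laden sketch).
// cacheSynced is closed as soon as the watcher's cache is populated;
// policyImported is closed only after the rule is in the repository.
func simulate(waitForImport bool) int {
	repo := &policyRepo{}
	cacheSynced := make(chan struct{})
	policyImported := make(chan struct{})

	go func() {
		close(cacheSynced)                // cache reports "synced"...
		time.Sleep(20 * time.Millisecond) // ...while handlers are still running
		repo.add("allow-one-fqdn-http")
		close(policyImported)
	}()

	gate := cacheSynced
	if waitForImport {
		gate = policyImported
	}
	return regenerate(repo, gate)
}

func main() {
	fmt.Println("rules seen when gating on cache sync:", simulate(false))
	fmt.Println("rules seen when gating on policy import:", simulate(true))
}
```

Gating on `policyImported` always observes the imported rule. Gating on `cacheSynced` depends on scheduling and will usually observe an empty repository, which matches the early regeneration described above.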

Labels

area/agent: Cilium agent related.
priority/high: This is considered vital to an upcoming release.
release-blocker/1.16: This issue will prevent the release of the next version of Cilium.
sig/policy: Impacts whether traffic is allowed or denied based on user-defined policies.
