Inefficient resolution of toServices in CNPs #35273

@marseel

Is there an existing issue for this?

  • I have searched the existing issues

Version

equal or higher than v1.16.0 and lower than v1.17.0

What happened?

While running CNP scalability tests with "regular" policies that have no toServices rules, I noticed that a significant share of memory allocations comes from resolveServices.
pprof memory allocation profile: [image not included]

resolveToServices is called for every CNP, regardless of whether the CNP has a toServices rule. On top of that, we iterate through all services and all rules in the CNP: https://github.com/cilium/cilium/blame/d73443647bd44bca9ea883d68a2874db813e93f0/pkg/policy/k8s/service.go#L117-L146

This is very inefficient and wasteful. Instead, we should cache whether a particular CNP affects services and only process it when:

  • the CNP has a toServices rule
  • the CNP previously had a toServices rule, but the rule was removed
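
A minimal sketch of the caching idea above, not the actual Cilium code. All names here (`Rule`, `CNP`, `serviceTracker`, `needsResolution`) are hypothetical stand-ins invented for illustration; the real types in pkg/policy/k8s differ. The sketch assumes each CNP is identified by a string key and that a rule "has toServices" when its list of service references is non-empty.

```go
package main

// Rule is a hypothetical, trimmed-down stand-in for a policy rule.
// Only the field relevant to this sketch is modelled.
type Rule struct {
	ToServices []string // service references; empty means no toServices rule
}

// CNP is a hypothetical stand-in for a CiliumNetworkPolicy.
type CNP struct {
	Key   string // assumed unique identifier, e.g. "namespace/name"
	Rules []Rule
}

// hasToServices reports whether any rule in the CNP references services.
// This scans only the rules of one CNP, not all known services.
func hasToServices(cnp CNP) bool {
	for _, r := range cnp.Rules {
		if len(r.ToServices) > 0 {
			return true
		}
	}
	return false
}

// serviceTracker caches which CNPs referenced services the last time
// they were processed.
type serviceTracker struct {
	affectsServices map[string]struct{}
}

func newServiceTracker() *serviceTracker {
	return &serviceTracker{affectsServices: make(map[string]struct{})}
}

// needsResolution reports whether the expensive service resolution must run
// for this CNP, and updates the cache. It returns true when the CNP has a
// toServices rule now, or had one when it was last processed (so that the
// previously resolved state can be cleaned up once).
func (t *serviceTracker) needsResolution(cnp CNP) bool {
	_, had := t.affectsServices[cnp.Key]
	has := hasToServices(cnp)
	if has {
		t.affectsServices[cnp.Key] = struct{}{}
	} else {
		delete(t.affectsServices, cnp.Key)
	}
	return has || had
}

// forget drops the cache entry when a CNP is deleted.
func (t *serviceTracker) forget(key string) {
	delete(t.affectsServices, key)
}
```

With something like this in place, the caller would check `needsResolution` first and skip service resolution entirely for policies that never used toServices, which is the common case in the scalability test described above.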

How can we reproduce the issue?

N/A

Cilium Version

~main branch

Kernel Version

N/A

Kubernetes Version

N/A

Regression

Potentially a regression from #31062, but I haven't checked in detail how the code handled this previously.

Sysdump

No response

Relevant log output

No response

Anything else?

No response

Cilium Users Document

  • Are you a user of Cilium? Please add yourself to the Users doc

Code of Conduct

  • I agree to follow this project's Code of Conduct

Labels

  • affects/v1.16: This issue affects v1.16 branch
  • area/agent: Cilium agent related.
  • kind/bug: This is a bug in the Cilium logic.
  • kind/regression: This functionality worked fine before, but was broken in a newer release of Cilium.
  • sig/policy: Impacts whether traffic is allowed or denied based on user-defined policies.
  • sig/scalability: Impacts how well Cilium handles a high rate of events or churn.
