Skip to content

Conversation

MrFreezeex
Copy link
Member

@MrFreezeex MrFreezeex commented Nov 7, 2024

Please ensure your pull request adheres to the following guidelines:

  • For first time contributors, read Submitting a pull request
  • All code is covered by unit and/or runtime tests where feasible.
  • All commits contain a well written commit description including a title,
    description and a Fixes: #XXX line if the commit addresses a particular
    GitHub issue.
  • If your commit description contains a Fixes: <commit-id> tag, then
    please add the commit author[s] as reviewer[s] to this issue.
  • All commits are signed off. See the section Developer’s Certificate of Origin
  • Provide a title or release-note blurb suitable for the release notes.
  • Are you a user of Cilium? Please add yourself to the Users doc
  • Thanks for contributing!

Before this PR two types of resources were exposed for services and endpoints: one excluding headless services/endpoints and one including them. This PR drop the non headless resource (and make the necessary changes to accommodate that) to make sure we only start one informer for services and endpoints and lower the load the apiserver.

Fixes: #35797

Remove duplicated watch on services and endpoint in the cilium-agent

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Nov 7, 2024
@MrFreezeex
Copy link
Member Author

/test

@MrFreezeex
Copy link
Member Author

MrFreezeex commented Nov 7, 2024

All the usage of service/endpoints resources and service cache in the daemon:

  • Service cache support headless services:

    svcID, newService := ParseService(k8sSvc, addrs)
    ->
    headless := false

  • Policy: services are from service cache

    p.svcCache.ForEachService(func(svcID k8s.ServiceID, svc *k8s.Service, eps *k8s.EndpointSlices) bool {
    &
    svcCacheNotifications <-chan k8s.ServiceNotification
    . And policy handling doesn't seems to require the frontend IP. Letting headless services be processed should result in CNP being able to process headless services without any new handling.

  • BGP speaker (metallb integration): Doesn't seems to read the cluster IP at all but rather the IP in the service status, see:

    ing := make([]v1.LoadBalancerIngress, len(in.Status.LoadBalancer.Ingress))

  • Datapath from service cache: The datapath already ignores headless services after getting them from the service cache:

    // Headless services do not need any datapath implementation
    &
    // Headless services do not need any datapath implementation
    . However the is headless is not entirely correct according to a SIG network test (see the commit on this PR for the related fix)

  • Redirect policy: Seems to be handling headless services well by ignoring service without frontend ips:

    ip := rpm.svcCache.GetServiceFrontendIP(*config.serviceID, lb.SVCTypeClusterIP)

@MrFreezeex MrFreezeex force-pushed the resources-svc-unfiltered-headless branch from e07aba6 to 5d2f4b7 Compare November 7, 2024 12:54
@MrFreezeex
Copy link
Member Author

/test

1 similar comment
@MrFreezeex
Copy link
Member Author

/test

@MrFreezeex MrFreezeex force-pushed the resources-svc-unfiltered-headless branch from a6a7dab to 9bbcdbb Compare November 7, 2024 14:48
@MrFreezeex MrFreezeex changed the title daemon: drop the non headless resource daemon: drop the non headless service/endpoint resources Nov 7, 2024
@MrFreezeex
Copy link
Member Author

/test

@MrFreezeex MrFreezeex force-pushed the resources-svc-unfiltered-headless branch from 9bbcdbb to bed90e7 Compare November 7, 2024 15:11
@MrFreezeex
Copy link
Member Author

/test

@MrFreezeex MrFreezeex force-pushed the resources-svc-unfiltered-headless branch 2 times, most recently from 7fae3b2 to 84337ab Compare November 7, 2024 15:43
@MrFreezeex
Copy link
Member Author

/test

1 similar comment
@MrFreezeex
Copy link
Member Author

/test

@MrFreezeex MrFreezeex force-pushed the resources-svc-unfiltered-headless branch from 84337ab to 43e5479 Compare November 8, 2024 10:39
@MrFreezeex
Copy link
Member Author

/test

@MrFreezeex
Copy link
Member Author

/ci-l4lb

@MrFreezeex MrFreezeex marked this pull request as ready for review November 8, 2024 12:09
@MrFreezeex MrFreezeex requested review from a team as code owners November 8, 2024 12:09
@MrFreezeex
Copy link
Member Author

MrFreezeex commented Nov 8, 2024

I believe this should be ready for reviews. Although, it needs reviews from more groups than sig-k8s/sig-agent, and in particular (according to code owners) @cilium/sig-bgp, @cilium/sig-policy , @cilium/sig-lb (for redirect policy).

I tried to summarize the impact of this change here: #35838 (comment), TLDR: according to my research it should have no impact outside of policies that will start to include headless Services (which might count as an additional feature?). But this needs reviews from all the groups mentioned above to validate that and have their opinion.

Also cc @giorio94 @marseel that found this issue (thanks again for finding it!)

@MrFreezeex MrFreezeex force-pushed the resources-svc-unfiltered-headless branch from 43e5479 to c27f2d6 Compare November 8, 2024 12:31
@MrFreezeex MrFreezeex requested a review from a team as a code owner November 8, 2024 12:31
@MrFreezeex MrFreezeex requested a review from thorn3r November 8, 2024 12:31
@MrFreezeex
Copy link
Member Author

/test

@joestringer joestringer added the release-note/misc This PR makes changes that have no direct user impact. label Nov 8, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Nov 8, 2024
Copy link
Member

@giorio94 giorio94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing this! A few minor comments inline.

Redirect policy: Seems to be handling headless services well by ignoring service without frontend ips.

I assume this may be problematic in case of services marked as headless via the label, but associated with a ClusterIP, given that they would have a frontend IP.

And policy handling doesn't seems to require the frontend IP. Letting headless services be processed should result in CNP being able to process headless services without any new handling.

I personally don't have enough context to comment on this.

@MrFreezeex MrFreezeex force-pushed the resources-svc-unfiltered-headless branch from 912ba08 to ed917f4 Compare November 11, 2024 15:16
@giorio94
Copy link
Member

Note that AFAICT the handling of a service is treated as headless if the label has service.kubernetes.io/headless is not super common outside of kube-proxy/associated conformance tests IIUC.

I agree, that looks really a corner case, which is probably not handled consistently by every project.

Copy link
Member

@YutaroHayakawa YutaroHayakawa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BGP looks good to me. Thanks!

Headless services may have frontend IPs if labeled with
`service.kubernetes.io/headless` and those should not be taken into
account in the datapath. This commit allow to handle the cleanup of such
services.

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
Before this commit two types of resources were exposed for services and
endpoints: one excluding headless services/endpoints and one including
them. This commit drop the non headless resource to make sure we only start one
informer for services and endpoints and lower the load the apiserver.
The consumers of the resources excluding the headless services/endpoint
already handles headless services.

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
@MrFreezeex MrFreezeex force-pushed the resources-svc-unfiltered-headless branch from ed917f4 to 894b91a Compare November 12, 2024 11:44
Copy link
Contributor

@thorn3r thorn3r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the quick attention 👍

Copy link
Member

@giorio94 giorio94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I just noticed I had not explicitly approved it.

@giorio94
Copy link
Member

/test

@MrFreezeex
Copy link
Member Author

/ci-e2e-upgrade

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Nov 19, 2024
@tklauser tklauser added this pull request to the merge queue Nov 19, 2024
Merged via the queue into cilium:main with commit 4263231 Nov 19, 2024
64 checks passed
@rastislavs rastislavs mentioned this pull request Nov 20, 2024
14 tasks
@rastislavs rastislavs added backport-pending/1.16 The backport for Cilium 1.16.x for this PR is in progress. and removed needs-backport/1.16 This PR / issue needs backporting to the v1.16 branch labels Nov 20, 2024
@github-actions github-actions bot added backport-done/1.16 The backport for Cilium 1.16.x for this PR is done. and removed backport-pending/1.16 The backport for Cilium 1.16.x for this PR is in progress. labels Nov 26, 2024
marseel added a commit that referenced this pull request Feb 18, 2025
For cases when envoyConfig was enabled, we were starting separate
informers for services and endpoints. This resulted in a duplicated
informers, adding overhead for both cilium-agent and k8s apiserver.
Before, existing informer did not handle headless services, which was
changed in #35838 so now we can rely on a single informer instead.

Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
github-merge-queue bot pushed a commit that referenced this pull request Feb 18, 2025
For cases when envoyConfig was enabled, we were starting separate
informers for services and endpoints. This resulted in a duplicated
informers, adding overhead for both cilium-agent and k8s apiserver.
Before, existing informer did not handle headless services, which was
changed in #35838 so now we can rely on a single informer instead.

Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
jschwinger233 pushed a commit that referenced this pull request Feb 19, 2025
[ upstream commit a4fecbe ]

For cases when envoyConfig was enabled, we were starting separate
informers for services and endpoints. This resulted in a duplicated
informers, adding overhead for both cilium-agent and k8s apiserver.
Before, existing informer did not handle headless services, which was
changed in #35838 so now we can rely on a single informer instead.

Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
Signed-off-by: gray <greyschwinger@gmail.com>
jschwinger233 pushed a commit that referenced this pull request Feb 21, 2025
[ upstream commit a4fecbe ]

For cases when envoyConfig was enabled, we were starting separate
informers for services and endpoints. This resulted in a duplicated
informers, adding overhead for both cilium-agent and k8s apiserver.
Before, existing informer did not handle headless services, which was
changed in #35838 so now we can rely on a single informer instead.

Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
Signed-off-by: gray <greyschwinger@gmail.com>
github-merge-queue bot pushed a commit that referenced this pull request Feb 21, 2025
[ upstream commit a4fecbe ]

For cases when envoyConfig was enabled, we were starting separate
informers for services and endpoints. This resulted in a duplicated
informers, adding overhead for both cilium-agent and k8s apiserver.
Before, existing informer did not handle headless services, which was
changed in #35838 so now we can rely on a single informer instead.

Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
Signed-off-by: gray <greyschwinger@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-done/1.16 The backport for Cilium 1.16.x for this PR is done. kind/performance There is a performance impact of this. kind/regression This functionality worked fine before, but was broken in a newer release of Cilium. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cilium agents start duplicate Service/Endpointslice informers
10 participants