Skip to content

Conversation

robscott
Copy link
Member

This PR adds support for Service Traffic Distribution, a Kubernetes feature that should go beta/available-by-default in v1.31. For more information, refer to the following resources:

TrafficDistribution represents the latest and hopefully final iteration in Kubernetes topology aware routing. The default logic is quite simple - route traffic within the same zone if there are any healthy endpoints in that zone. The kube-proxy implementation of that is still based on the EndpointSlice hints field, but does not strictly need to be,

This KEP also comes with a more fundamental change to the implementation logic. As part of a broader goal of separating concerns, we've removed the check from Kube-Proxy that verified that a topology annotation was set before honoring hints on EndpointSlices. Now we simply check that all endpoints for an EndpointSlice have hints, and honor them if so.

In this PR I've left some logic to determine if a Service is likely to have hints set, but that is now only used to determine if a Service should be reconciled again if the local node labels change.

Cilium now supports Kubernetes Service TrafficDistribution. To access this feature, use `--enable-service-topology` when running Cilium.

@robscott robscott requested a review from a team as a code owner May 23, 2024 00:22
@robscott robscott requested a review from tommyp1ckles May 23, 2024 00:22
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label May 23, 2024
@github-actions github-actions bot added the kind/community-contribution This was a contribution made by a community member. label May 23, 2024
@robscott
Copy link
Member Author

cc @gauravkghildiyal

@tommyp1ckles
Copy link
Contributor

tommyp1ckles commented May 28, 2024

Hi @robscott , thanks for the PR, could you provide more context in the Git commit message as described here.

@aanm
Copy link
Member

aanm commented May 31, 2024

@robscott the changes LGTM but there are some files that need to be regenerated accordingly the CI.

Copy link
Member

@borkmann borkmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also add a small doc update to the topology awareness section to document support for TrafficDistribution? https://docs.cilium.io/en/stable/network/kubernetes/kubeproxy-free/#topology-aware-hints

@joestringer
Copy link
Member

/test

@robscott robscott force-pushed the trafficdistribution branch 2 times, most recently from c6b4e0d to 2aa69ae Compare June 3, 2024 23:50
@robscott robscott requested review from a team as code owners June 3, 2024 23:50
@robscott robscott requested review from ysksuzuki and qmonnet June 3, 2024 23:50
@robscott
Copy link
Member Author

robscott commented Jun 3, 2024

Hi @robscott , thanks for the PR, could you provide more context in the Git commit message as described here.

Updated, let me know if I missed anything @tommyp1ckles.

@robscott the changes LGTM but there are some files that need to be regenerated accordingly the CI.

Ran make generate-k8s-api, hopefully that's all I was missing.

Could you also add a small doc update to the topology awareness section to document support for TrafficDistribution? https://docs.cilium.io/en/stable/network/kubernetes/kubeproxy-free/#topology-aware-hints

Added, thanks for the idea @borkmann!

@robscott robscott force-pushed the trafficdistribution branch 2 times, most recently from 47c53df to 397a751 Compare June 4, 2024 00:10
@qmonnet qmonnet added the release-note/minor This PR changes functionality that users may find relevant to operating Cilium. label Jun 4, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jun 4, 2024
@qmonnet
Copy link
Member

qmonnet commented Jun 4, 2024

/test

Copy link
Member

@qmonnet qmonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thank you

This commit adds support for Service Traffic Distribution, a Kubernetes feature
that is on track to go beta/available-by-default in v1.31.

TrafficDistribution represents the latest and hopefully final iteration in
Kubernetes topology aware routing. The default logic is quite simple - route
traffic within the same zone if there are any healthy endpoints in that zone.
The kube-proxy implementation of that is still based on the EndpointSlice hints
field, but does not strictly need to be.

This KEP also comes with a more fundamental change to the implementation logic.
As part of a broader goal of separating concerns, we've removed the check from
Kube-Proxy that verified that a topology annotation was set before honoring
hints on EndpointSlices. Now we simply check that all endpoints for an
EndpointSlice have hints, and honor them if so.

In this commit I've left some logic to determine if a Service is likely to have
hints set, but that is now only used to determine if a Service should be
reconciled again if the local node labels change.

Signed-off-by: Rob Scott <robertjscott@google.com>
@aanm aanm force-pushed the trafficdistribution branch from 397a751 to 9be68a8 Compare June 5, 2024 07:32
@aanm
Copy link
Member

aanm commented Jun 5, 2024

/test

@aanm aanm enabled auto-merge June 5, 2024 07:32
@robscott
Copy link
Member Author

robscott commented Jun 5, 2024

/ci-e2e
/ci-e2e-upgrade
/ci-ingress

@joestringer
Copy link
Member

@robscott unfortunately, the bot that responds to comments cannot accept multiple commands in the same message. Additionally, you will need to become an organization member in order to be able to run tests.

@joestringer
Copy link
Member

/ci-e2e

@joestringer
Copy link
Member

/ci-e2e-upgrade

@joestringer
Copy link
Member

/ci-ingress

@joestringer
Copy link
Member

joestringer commented Jun 6, 2024

Test Scenario: An Ingress with a trailing slashes in a prefix path rule should ignore the trailing slash and send traffic to the matching backend service # features/path_rules.feature:188 failed on a couple of runs in the ci-ingress workflow. Is there any chance this could be impacted by the content of this PR?

@robscott
Copy link
Member Author

robscott commented Jun 6, 2024

Thanks for the info @joestringer! I've filed cilium/community#127 to become an organization member.

Test Scenario: An Ingress with a trailing slashes in a prefix path rule should ignore the trailing slash and send traffic to the matching backend service # features/path_rules.feature:188 failed on a couple of runs in the ci-ingress workflow. Is there any chance this could be impacted by the content of this PR?

I can't think of anything in this PR that should affect the results of Ingress conformance tests. As far as I can tell, this is a common flake per #31857.

@joestringer
Copy link
Member

/ci-ingress

@aanm aanm added this pull request to the merge queue Jun 10, 2024
Merged via the queue into cilium:main with commit f1b5334 Jun 10, 2024
@joestringer
Copy link
Member

I wonder if we should actually bump this to release-note/major, given that it's a feature in k8s that we're now supporting and it could allow users to better manage the load in their clusters?

@joestringer joestringer added release-note/major This PR introduces major new functionality to Cilium. and removed release-note/minor This PR changes functionality that users may find relevant to operating Cilium. labels Jun 18, 2024
@qmonnet
Copy link
Member

qmonnet commented Jun 18, 2024

Fine by me, I hesitated between the two options when setting the label.

@julianwiedmann julianwiedmann added the area/loadbalancing Impacts load-balancing and Kubernetes service implementations label Mar 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/loadbalancing Impacts load-balancing and Kubernetes service implementations kind/community-contribution This was a contribution made by a community member. release-note/major This PR introduces major new functionality to Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants