Skip to content

Conversation

MrFreezeex
Copy link
Member

@MrFreezeex MrFreezeex commented Aug 18, 2024

Please ensure your pull request adheres to the following guidelines:

  • For first time contributors, read Submitting a pull request
  • All code is covered by unit and/or runtime tests where feasible.
  • All commits contain a well written commit description including a title,
    description and a Fixes: #XXX line if the commit addresses a particular
    GitHub issue.
  • If your commit description contains a Fixes: <commit-id> tag, then
    please add the commit author[s] as reviewer[s] to this issue.
  • All commits are signed off. See the section Developer’s Certificate of Origin
  • Provide a title or release-note blurb suitable for the release notes.
  • Are you a user of Cilium? Please add yourself to the Users doc
  • Thanks for contributing!

This adds the last needed MCS-API controller to sync ServiceImport from remote clusters in the clustermesh. This PR also comes with an example that will be later used in the user documentation, small fixes to the helm deployment so that this controller is able to patch statuses and a small typo fix with the interface that was added for this controller in previous commits.

Fixes: #32712
Related to #27902

clustermesh: add Multi-cluster Service API support

@MrFreezeex MrFreezeex requested review from a team as code owners August 18, 2024 13:27
@MrFreezeex MrFreezeex requested review from thorn3r and squeed August 18, 2024 13:27
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Aug 18, 2024
@MrFreezeex MrFreezeex force-pushed the svcimport-controller branch 2 times, most recently from 8e823c6 to a205c8d Compare August 18, 2024 13:41
@MrFreezeex
Copy link
Member Author

/test

@MrFreezeex
Copy link
Member Author

MrFreezeex commented Aug 18, 2024

While this is the last "big step" to make mcs-api a thing in Cilium note that there is still a few things to do, like mainly the following things:

  • Add a user documentation
  • Add some integration tests (planning to check out mcs-api conformance test here https://github.com/kubernetes-sigs/mcs-api/tree/master/conformance, which is also being improved by other mcs-api implementers but might not be ready right now, will check with k8s sig-multicluster folks if it is)
  • During initial testing of this I think I spotted that ServiceExportsSynced/ServiceSynced seems to actually not wait for a full initial sync, I need to check a bit more what's happening here and create a related issue and/or PR depending of what I am able to find EDIT: After double checking this doesn't seems to be true
  • Probably adding service annotation synchronization as well (more details on that later, I didn't want to do that here since I need to also update the export side and types and this PR is already large enough)
  • Most likely updating mcs api version as we should get a beta version in the coming month(s)
  • And lastly but this is not directly related to cilium, would like to see with kubernetes sig-multicluster if there wouldn't be a way to simplify the coredns multicluster plugin installation/activation (https://github.com/coredns/multicluster/) a bit further as people need to rebuild a coredns docker image with this included right now.

And that's probably all, at least from what I can think of right now :D.

Note that if you would like to try this locally you need to install MCS-API crds (https://github.com/kubernetes-sigs/mcs-api/tree/master/config/crd), have this fix #34295, enable clustermesh.enableMCSAPISupport from helm and then if you apply the examples from this PR you should be able to see the ServiceImports + derived services (named derived-$hash). There is a bit more high level schema here #32712 as well for more context.

@giorio94 giorio94 self-requested a review August 19, 2024 10:11
@squeed
Copy link
Contributor

squeed commented Aug 19, 2024

I don't see any objects with OwnerReferences. Is that intended? Of course, OwnerReferences doesn't work cross-cluster, so ignore this if there are no "dependent" objects in the same cluster.

Copy link
Member

@giorio94 giorio94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't know why a bunch of comments got published earlier... In any case, just a few nits from an initial high level pass, but I still need to fully digest the logic and perform a few tests before providing more in-depth comments.

Copy link
Member

@giorio94 giorio94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more comments inline, but the overall logic looks good to me 🚀. I'll perform a final test once addressed.

@giorio94 giorio94 added area/clustermesh Relates to multi-cluster routing functionality in Cilium. release-note/major This PR introduces major new functionality to Cilium. labels Aug 20, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Aug 20, 2024
@giorio94
Copy link
Member

I've labeled this PR as release-note/major, given that it looks to me the last big chunk of work to introduce the support for MCS-API. Would you mind generalizing a bit the release note part of the PR description, to emphasize that aspect?

@giorio94 giorio94 removed the request for review from thorn3r August 20, 2024 13:13
@MrFreezeex MrFreezeex force-pushed the svcimport-controller branch 4 times, most recently from 1173c4c to 007c8a6 Compare August 26, 2024 12:47
@MrFreezeex
Copy link
Member Author

/test

@MrFreezeex
Copy link
Member Author

/ci-clustermesh

@giorio94
Copy link
Member

giorio94 commented Sep 27, 2024

Yes the KEP is currently saying that the condition should be on every service exports whether or not the local service is involved. There is two parts that may change in the KEP though:

  • I am trying to add something to the KEP conditions to encode whether or not the local service is involved in the conflict (while still keeping the Conflict conditions everywhere)

  • The KEP may relax the fact that this conflict should really appear on every ServiceExport as some implementations (Submariner mainly for now) might not be able to technically do that due to their design (but we do have that info, so probably no change for us on this part).

This made me check again what I am doing for the conflict checking on other fields than the port and indeed it was only publishing a conflict if the local service is different 🤦‍♂️ so I did fixed that too. So basically in my last push I essentially applied your suggestion and corrected the non port checking to also be that way!

Perfect, thanks for the clarification and the updates!

@MrFreezeex MrFreezeex force-pushed the svcimport-controller branch 4 times, most recently from 39403c0 to 53409ff Compare September 27, 2024 14:46
@MrFreezeex
Copy link
Member Author

/test

Copy link
Member

@giorio94 giorio94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@MrFreezeex
Copy link
Member Author

/test

1 similar comment
@MrFreezeex
Copy link
Member Author

/test

@MrFreezeex MrFreezeex force-pushed the svcimport-controller branch from 53409ff to f99c81b Compare October 10, 2024 07:53
@MrFreezeex
Copy link
Member Author

MrFreezeex commented Oct 10, 2024

(rebasing for the CI; there was no conflict/other changes)

@MrFreezeex
Copy link
Member Author

/test

@giorio94
Copy link
Member

Note that you'll need to rebase once more to fix the linting issues from testifylint as #35237 got just merged. Sorry for the inconvenience.

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
MCS-API Service{Import,Export} also have statuses that we have to update
for MCS-API implementation so this adds the missing permissions.

This commit also remove permission to MCS-API resources when only
endpointslice sync is enabled as this is not necessary.

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
@MrFreezeex MrFreezeex force-pushed the svcimport-controller branch from f99c81b to 97ecab2 Compare October 10, 2024 09:22
@MrFreezeex
Copy link
Member Author

Note that you'll need to rebase once more to fix the linting issues from testifylint as #35237 got just merged. Sorry for the inconvenience.

No worries, thanks for letting me know 🙏!

@MrFreezeex
Copy link
Member Author

/test

This adds the last needed MCS-API controller to sync ServiceImport from
remote clusters in the clustermesh.

This controller pulls ServiceExport info from remote clusters and merge
those into a ServiceImport that it creates. The conflict resolution
algorithm favors the oldest ServiceExport for every properties excepts
the service ports. Service ports are the union of every similar
ServiceExports from the local cluster and the remote clusters.

The Service Import controller also manages ServiceExport status as it
contains the conflict messages and we already have all of this info when
we run the merge to create the ServiceImport in the corresponding
controller.

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
As this Service controller also are for ServiceImport (while it truly
isn't about ServiceImport as its just a trick documented in the code) it
conflict with the name of the actual ServiceImport controller. This
commit rename the controller to make sure it does not.

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
@MrFreezeex MrFreezeex force-pushed the svcimport-controller branch from 97ecab2 to a7e3758 Compare October 10, 2024 09:37
@MrFreezeex
Copy link
Member Author

/test

@MrFreezeex
Copy link
Member Author

/ci-integration

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 10, 2024
@julianwiedmann julianwiedmann added this pull request to the merge queue Oct 10, 2024
Merged via the queue into cilium:main with commit 3873b81 Oct 10, 2024
63 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/clustermesh Relates to multi-cluster routing functionality in Cilium. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/major This PR introduces major new functionality to Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CFP: ServiceImport auto creation for MCS-API
4 participants