-
Notifications
You must be signed in to change notification settings - Fork 3.4k
BGP Control Plane #18860
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BGP Control Plane #18860
Conversation
Commit bebe911c7693ad6d39f302c3f6d6eb4e8a3b7d74 does not contain "Signed-off-by". Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin |
bebe911
to
c246ead
Compare
8ee4bec
to
ddba5a0
Compare
5283cfb
to
56122e9
Compare
de7c296
to
4b2154b
Compare
@kaworu thanks - fixed and rebased. |
/test |
@ldelossa travis ci failure => https://app.travis-ci.com/github/cilium/cilium/jobs/566155290#L9792
|
Thanks! Do you run anything locally to check that this is required? |
eb962a3
to
3419144
Compare
this commit wires up the bgp control plane by adding the "--enable-bgp-control-plane" flag and adds it to helm. this commit also adds a package level README which outlines details of the BGP control plane implementation. Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
3419144
to
e49763e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚢
/test |
1 similar comment
/test |
ci-l4lb => https://github.com/cilium/cilium/actions/runs/2106008869 looks like a cluster creation failure. re-triggering. |
/ci-l4lb |
/ci-nat46x64 |
https://jenkins.cilium.io/job/Cilium-PR-K8s-1.23-kernel-net-next/957/ failed to provision VMs. restarting... |
/test-1.23-net-next |
1 similar comment
/test-1.23-net-next |
@michi-covalent i think this is safe to mark as ready to merge. The final failing test seems to be unrelated to this change. |
``routerID`` field in each defined ``CiliumBGPVirtualRouter`` object | ||
within a ``CiliumBGPPeeringPolicy`` | ||
|
||
- Cilium running on a IPv6 single stack cluster cannot reliably |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calico uses the hash of the ipv6 address to derive a router-id if it's not provided. Defining a router-id for each server is rather annoying especially in presence of auto-scaling and ephemeral nodes. Would it be possible to have some form of "IPAM" for router-IDs? Having to manually set them does not scale well :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good point.
We have discussed a bit about IPAM for router-IDs, and it definitely wasn't a no-go. More, a time constraint desicion not to add it for this initial merge.
The hashing of the IPv6 address is an interesting one for an "auto generated" way of going about it.
I think we had discussed with our teams how network admins may want to hard code router ids, as they pen-and-paper design their networks. Generation can causes surprises in some cases.
I'd be happy to cater to both use cases, ones where the net admins want to be completely explicit, and ones where generation is OK and helps lessen the administrative burden.
------------ | ||
|
||
The ``BGP Control Plane`` is split into a ``Agent-Side Control Plane`` | ||
and a ``Operator-Side`` control plane (not yet implemented). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the point of the Operator-Side ? not clear from me from the documentation but might be nice to know
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now this is a bit in flux.
When the PR started we were looking to fork lift MetalLB code into our new native solution.
MetalLB couples the "LoadBalancer IP" advertisements with its internal "Bare Metal LoadBalancer" feature. They are one thing, and they act at the "cluster-wide" operator level. So when the initial CFP for this feature took place, we were planning on emulating that.
After a big of further discussion, I think the consensus is, a "Bare Metal LoadBalancer" feature is orthoganol in implementation to a "LoadBalancer IP" advertisement feature. One can exist without the other, but the latter can integrate with the former.
Moving forward, I think (don't quote me on this yet, we are still planning) we will create a "bare metal load balancer" feature on its own and potentially drop this "Operator-Side" architecture, or change its usages to suppor something like "router-id IPAM" as you mentioned earlier.
For now, we can probably omit it in the docs, but it slid in with the large change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed a couple of anomalies in this PR while reading through this code, PTAL.
synced.CRDResourceName(v2alpha1.BGPPName): "SKIP", // Handled in BGP control plane | ||
synced.CRDResourceName(v2alpha1.BGPPoolName): "SKIP", // Handled in BGP control plane |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that the k8s watcher initialization is done separately from this code? I would much prefer if we can avoid skipping here if possible, since this is the mechanism that we use to properly disable k8s resource sync unconditionally across the codebase, so any exceptions can end up having repercussions on builds where we explicitly intend to disable certain functionality (for instance, in EKSA we use this mechanism to remove features that downstream do not wish to support, which includes this BGP support for now).
Feel free to hit me up for more context on this, maybe that'll help to figure out a good way to make this more maintainable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth cross-linking the remarks to #19376
(Adding this comment so that at least it will show up in the Github UI for that ticket :) )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joestringer we skip the watchers registration here due to the BGP control plane managing its own informers and listers:
cilium/pkg/bgpv1/agent/controller.go
Lines 150 to 172 in 636edd0
clientset := k8s.CiliumClient() | |
factory := externalversions.NewSharedInformerFactory(clientset, 0) | |
selfTweakList := informers.WithTweakListOptions(func(lo *metav1.ListOptions) { | |
lo.FieldSelector = "metadata.name=" + nodetypes.GetName() | |
}) | |
k8sfactory := informers.NewSharedInformerFactoryWithOptions(k8s.Client(), 0, selfTweakList) | |
nodeLister := k8sfactory.Core().V1().Nodes().Lister() | |
nodeInformer := k8sfactory.Core().V1().Nodes().Informer() | |
nodeInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{ | |
AddFunc: sig.Event, | |
UpdateFunc: func(_ interface{}, _ interface{}) { sig.Event(struct{}{}) }, | |
DeleteFunc: sig.Event, | |
}) | |
policyLister := factory.Cilium().V2alpha1().CiliumBGPPeeringPolicies().Lister() | |
policyInformer := factory.Cilium().V2alpha1().CiliumBGPPeeringPolicies().Informer() | |
policyInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{ | |
AddFunc: sig.Event, | |
UpdateFunc: func(_ interface{}, _ interface{}) { sig.Event(struct{}{}) }, | |
DeleteFunc: sig.Event, | |
}) |
This was an initial attempt to avoid head of line blocking in the watcher code, and an implementation of: https://docs.google.com/document/d/1gybVpFOrnjJFLZEMwp_VQJZGQSZ30_pVo2VA2fylBFM/edit#
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 okay cool, maybe I'll just hit you up then to see if there's a way to control the k8s registration all from one place then with this model then, because otherwise it'll be a pain to continue to support things like EKS-A going forwards if we have to track each feature individually registering CRD resources and also k8s watchers all over the codebase. Background is this PR: #17677
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^ We discussed out of band, it's on me to come up with a concrete proposal of how we might improve this.
This PR implements a new BGP control plane.
The PR has self-contained documentation here located at
/pkg/bgpv1/README.md
and is worth reading through before an initial review.The initial implementation moves over the existing PodCIDR advertisements feature with the feature-adds of dual stack and IPv6 single stack support.
This support encompasses both IPv6 peering and IPv6 MP BGP advertisements.
This PR also implements a new event handling pattern outlined here by this CFP
The large file count is due to vendoring new depdencies. The vendoring changes are isolated to their own commits and can be jumped over while reviewing.