-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Extended Clustermesh (Clustermesh511) #27520
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
/test |
Commits cea43b08fe3677c661a309147b2bbf8a575402ac, 7087d391c7bb56088c93c7187d793272324d7240, 6c480d6ee5bc1bd74e8a24fae06f3eda6142bf9d, c2df5b3606e4bb6e400bcfdccc0c5bd1b073d8f0, 39de33a81ce87bc3a469490e89896a70b85e5064 do not contain "Signed-off-by". Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin |
39de33a
to
c521d55
Compare
/test |
c521d55
to
d223290
Compare
/test |
1523fc4
to
6b8e504
Compare
/test |
e1417c6
to
8fc3bbd
Compare
/test |
cb6016d
to
50495b3
Compare
/test |
50495b3
to
0eb31f8
Compare
/test |
tested and passed against the new Clustermesh upgrade/downgrade tests |
241905b
to
4757068
Compare
/test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for bearing with me! Left a few nits, good to go otherwise. 🚀
4757068
to
74f3274
Compare
/test |
74f3274
to
f7bb191
Compare
/test |
looks like the CLI tests were updated and now |
I see a CI run with the line |
Not sure yet whether this test failure is legitimate or the test is flakey. All tests were passing prior to this being merged, it looks like we are testing some new paths for 1.15 now. I noticed there are a lot of recent failures in the job history, though. Opened this to track until I can investigate more |
Extend the cluster_id field of the keys/values for the following maps from uint8 to uint16 to support larger ClusterMesh: - lxc - tunnel - ipcache - lb_backends Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
MinimalAllocationIdentity and MaximumAllocationIdentity are global variables that can be changed at runtime by the InitMinMaxIdentityAllocation function. Since these are dependent on ClusterIDShift, which is no longer a constant, this commit replaces the globals with Getter methods to compute the correct value and avoid a race condition in which a global is accessed before it is correctly initialized. Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
This commit adds a new flag '--max-connected-clusters' to enable users to configure the bit allocation of cluster ID and identity in a numeric identity to enable support for a clustermesh with more than 255 clusters. Currently the only supported values for this flag are 255 (default) and 511. Increasing this value will increase the maximum number of clusters/maximum cluster ID by reducing the maximum number of allocatable cluster-local identities. This feature changes the way identites are interpreted in the datapath. It should only be used for new clusters, as there is currently no migration path for live clusters. All clusters connected in a clustermesh will require this flag to be set to the same value. Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
This commit adds a new flag '--max-connected-clusters' to clustermesh-apiserver. The value will be written to the CiliumClusterConfig and is used by remote-clusters to validate that both clusters are running in the same identity/clusterID allocation mode prior to synchronizing resources. Clusters that are using different values for '--max-connected-clusters' should never be permitted to connect to each other, as this determines how identity is parsed by datapath BPF programs. Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
The field MaxConnectedClusters was added to CiliumClusterConfig in a previous commit to support the extended Clustermesh feature. This value must be written to cluster configs in the kvstore to ensure that all clusters in a Clustermesh are using the same identity/clusterID allocation more prior to synchronization. Clusters that are using different values for '--max-connected-clusters' should never be permitted to connect to each other, as this determines how identity is parsed by datapath BPF programs. Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
With the addition of MaxConnectedClusters to CiliumClusterConfig, the operator's KVStore watchdog must be aware of and set this value to ensure it is not removed on update after updating the heartbeat. Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
Extended clustermesh changes the bit allocation in skb->mark to repurpose bits from identity and grant them to cluster ID. Users can configure this with the agent flag '--max-connected-clusters'. This enables support for larger clustermesh/greater cluster ID but reduces the maximum allocatable identity. Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
Since overloadable_skb.h now requires macros defined in node_config.h, it is necessary to reorder dependencies in some tests that are using unspec.h and including bpf_sock.c, so that when overloadable_skb.h is included in lib/common.h, it occurs after bpf_sock.c includes node_config.h Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
Set maxConnectedClusters in ci-multicluster to test extended clustermesh feature. Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
Setting Cluster ID in the upgrade-downgrade test will provide test coverage for the updates made to BPF maps and the datapath to provide support for extended clustermesh. These changes include expanding the Cluster ID field on lxc, endpoint, lb4backend, and lb6backend maps, as well as modifications to skb->mark parsing for identity. Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
New unit tests have been added to cover changes made to the skb functions that get and set identity and cluster_id in skb->mark. Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
Remove IDENTITY_MAX from node_config.h and replace IDENTITY_LEN with a global config variable using DECLARE_CONFIG and ASSIGN_CONFIG. This will allow replacement at runtime in later iterations. New helper functions now calculate the other previously defined macros for IDENTITY_MAX, CLUSTER_ID_UPPER_MASK, and MARK_MAGIC_CLUSTER_ID_MASK, since these all depend on IDENTITY_LEN, which could change after compilation. Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
f7bb191
to
21cab86
Compare
/test |
Support extending Clustermesh to > 255 clusters
This PR introduces a new flag
--max-connected-clusters
which allows the bit allocation of Identity/ClusterID in a NumericIdentity to be configured to allow connecting up to 511 clusters (9 bits) in a Clustermesh, at the expense of shrinking the allocatable cluster-local identities to 32767 (15 bits).The ClusterID field of 5 BPF maps is extended to uint16 to accomodate larger ClusterIDs:
Map migrations were not necessary for anys maps due to:
Currently the only supported values for
max-connected-clusters
are 255 (default) and 511.This feature has no live-migration path and can only (currently) be used for brand new clusters (i.e. you cannot change a 255 cluster to 511).
Additional details available in individual commit messages
Release note: