Register cluster-id and cluster-name flags through hive #27823
Conversation
/test
Force-pushed from 0bd7bc2 to 438811b
/test
Rebased onto main to pick up CI changes
Force-pushed from 438811b to dac2456
Force-pushed from dac2456 to 1101132
/test
Code-wise looks good to me.
I'd be interested to know more about the motivation behind this change, and what the plan is going forward. The cluster name is something that is needed by many subsystems to namespace global resources: Node names, EC2 network interface tags in ENI IPAM, service labels, identity labels.
Is the goal that all those subsystems (node discovery, IPAM, service, identity allocation, IPCache) eventually rely on the clustermesh cell? Will that not cause potentially circular cell dependencies?
While I agree we should reduce global configuration state, the cluster name is something that is inherently global and relevant to a lot of subsystems. If we want to make the cluster name and cluster ID a dependency to be injected into other subsystems, it should probably be a leaf dependency that depends on nothing else, so I'm not sure if the clustermesh cell is the right place (since it itself has lots of dependencies already). Did you consider introducing a separate cluster-id cell instead?
Essentially this came up while refactoring the clustermesh-apiserver and kvstoremesh in preparation for merging them into a single binary (as part of a larger effort to unify all clustermesh-apiserver components into a single image for simplicity: #23770). The idea is to extract these two flags into a separate object injected through Hive, so that we can reuse them uniformly in all the different components, as well as remove the hacky conversions from the global DaemonConfig, which were for instance used before in the clustermesh cell. Over time, it would be great if the other subsystems requiring these parameters could also depend directly on the ClusterInfo struct.
The different subsystems can depend on the ClusterInfo struct alone, without pulling in the whole clustermesh cell.
Awesome, thanks for clarifying. I'm still not super familiar with the hive framework, so I didn't know one could depend on a struct alone this way. Sounds good to me!
@giorio94 Very nice work, Marco, and a nice cleanup! I think there might be an opportunity to improve further, but I am a bit out of context here, so feel free to disregard.
@@ -3036,8 +3030,8 @@ func (c *DaemonConfig) Populate(vp *viper.Viper) {
 	c.AutoCreateCiliumNodeResource = vp.GetBool(AutoCreateCiliumNodeResource)
 	c.BPFRoot = vp.GetString(BPFRoot)
 	c.CGroupRoot = vp.GetString(CGroupRoot)
-	c.ClusterID = vp.GetUint32(ClusterIDName)
-	c.ClusterName = vp.GetString(ClusterName)
+	c.ClusterID = vp.GetUint32(clustermeshTypes.OptClusterID)
Could we use the new type here, i.e. c.ClusterIDName = NewClusterIDName(...)?
This is tricky, because the only way to keep the current names of the fields (to avoid codebase-wide changes) would be to embed the ClusterInfo struct into DaemonConfig, which feels weird as it would also inherit the associated methods. Although suboptimal, I'd keep the current approach.
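The embedding concern can be seen in a minimal example (ClusterInfo and the two config types below are simplified stand-ins, not the real Cilium structs): embedding promotes both the fields and the methods of the inner struct onto the outer one, whereas a named field keeps the methods scoped behind the field name but forces every existing access to be rewritten to go through it.

```go
package main

import "fmt"

// Simplified stand-in for the type discussed in the thread.
type ClusterInfo struct {
	ID   uint32
	Name string
}

func (c ClusterInfo) Validate() error {
	if c.ID > 255 {
		return fmt.Errorf("invalid cluster id %d", c.ID)
	}
	return nil
}

// Embedding: the inner fields stay flat (e.ID, e.Name), but Validate is
// promoted onto the outer type too — the "inherits the associated
// methods" concern raised above.
type DaemonConfigEmbedded struct {
	ClusterInfo
}

// Named field: methods stay scoped (n.Cluster.Validate), but every
// existing access must now be qualified with the field name.
type DaemonConfigNamed struct {
	Cluster ClusterInfo
}

func main() {
	e := DaemonConfigEmbedded{ClusterInfo{ID: 3, Name: "west"}}
	fmt.Println(e.ID, e.Validate() == nil) // promoted field and method

	n := DaemonConfigNamed{Cluster: ClusterInfo{ID: 3, Name: "west"}}
	fmt.Println(n.Cluster.ID, n.Cluster.Validate() == nil) // qualified access
}
```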
Thanks Marco! I see what you are saying. Figured it might be an opportunity to begin cleaning up this insanely large struct in favor of segmenting it based on config domain. Using a named association would scope the delegation instead of inheriting the behaviors... But in this case ClusterInfo validation at configuration time would be a good thing anyway. We would need to establish patterns across the board, though, which is a big undertaking at this stage ;(
I definitely second the idea of removing configurations from the DaemonConfig struct. I took another pass at removing a few more usages of the ClusterID/ClusterName fields where possible (mainly the ones which would also be useful in the context of #27520), but I wouldn't go further, so as not to excessively increase the scope of this PR.
How about instead of pulling them from viper like this, we would fill them into DaemonConfig from an Invoke that is part of the config cell (see my earlier comment)? Or alternatively leave it as-is, if these are going away soon anyway.
The main issue with pulling it from an invoke hook is that it would postpone the initialization slightly, which would be a problem if any (legacy) component not directly depending on it tried to read these values. The current solution seems a bit less error-prone to me from that point of view.
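The ordering hazard being described can be sketched without hive at all (the names below are illustrative stand-ins, not Cilium's actual types): a component that reads a shared config struct before an invoke-style hook has populated it silently sees zero values, whereas parsing the flags eagerly in Populate makes the values available to everyone from the start.

```go
package main

import "fmt"

// Illustrative stand-in for the global DaemonConfig.
type DaemonConfig struct {
	ClusterID   uint32
	ClusterName string
}

var config DaemonConfig

// legacyRead models a component that does not declare a dependency on the
// cluster settings and just reads the global directly.
func legacyRead() (uint32, string) {
	return config.ClusterID, config.ClusterName
}

// populateFromInvoke models filling DaemonConfig from an Invoke hook that
// runs only after the dependency graph has been wired up.
func populateFromInvoke() {
	config.ClusterID = 3
	config.ClusterName = "west"
}

func main() {
	// Read before the hook has run: zero values — the failure mode
	// described in the comment above.
	id, name := legacyRead()
	fmt.Printf("before hook: id=%d name=%q\n", id, name)

	populateFromInvoke()
	id, name = legacyRead()
	fmt.Printf("after hook: id=%d name=%q\n", id, name)
}
```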
Force-pushed from 1101132 to be8e582
Force-pushed from be8e582 to 0efc7e0
/test
/test
Force-pushed from 6be4f09 to ee2d117
/test
Force-pushed from ee2d117 to 0895dae
Removed CI-related reviewers because that commit has been dropped.
/test
Force-pushed from 0895dae to 0a06061
I've reverted the changes affecting external workloads, as the scope of this PR was increasing too much. Let's see if it now passes all tests.
/test
Currently, the clustermesh package already leverages a ClusterIDName struct to wrap the ClusterID and ClusterName information. Let's use it also to register the associated flags, so that we can depend on it directly in the different components, and reuse it uniformly for the kvstoremesh and clustermesh-apiserver binaries.

To avoid the need for modifying all usages, we preserve the current fields inside DaemonConfig, which continue to be populated as before.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Force-pushed from 0a06061 to b325485
/test
@derailed PTAL
@@ -141,7 +140,6 @@ func NewCachingIdentityAllocator(owner cache.IdentityAllocatorOwner) fakeIdentit
 func (s *EndpointSuite) SetUpTest(c *C) {
 	/* Required to test endpoint CEP policy model */
 	kvstore.SetupDummy(c, "etcd")
-	identity.InitWellKnownIdentities(&fakeConfig.Config{})
Why do we not need these here anymore?
WellKnownIdentities are used only in very specific cases, mainly to bootstrap Cilium with etcd running in pod network (to break the chicken-and-egg dependency). Given that they were not necessary here (as well as in the clustermesh case), I just removed the line altogether instead of adapting the parameters.
@@ -100,6 +101,9 @@ var (
 	"operator-controlplane",
 	"Operator Control Plane",

+	cell.Config(cmtypes.DefaultClusterInfo),
+	cell.Invoke(func(cinfo cmtypes.ClusterInfo) error { return cinfo.Validate() }),
I think you don't need to wrap it, but rather you can just do cell.Invoke(cmtypes.ClusterInfo.Validate). Debatable whether this is less or more confusing, but it is shorter.. :D

Why do we need Validate here, but not in daemon/cmd/cells.go? If it's missing from there, how about doing a cmtypes.ConfigCell that's cell.Group(cell.Config(DefaultClusterInfo), cell.Invoke(ClusterInfo.Validate))? Preferably this config would be part of some module and not referenced naked here like this, but perhaps that's to come in the future?
> I think you don't need to wrap it, but rather you can just do cell.Invoke(cmtypes.ClusterInfo.Validate). Debatable whether this is less or more confusing, but it is shorter.. :D

Uh, that's right. I didn't realize it.

> Why do we need Validate here, but not in daemon/cmd/cells.go? If it's missing from there, how about doing cmtypes.ConfigCell that's cell.Group(cell.Config(DefaultClusterInfo), cell.Invoke(ClusterInfo.Validate))? Preferably this config would be part of some module and not referenced naked here like this, but perhaps that's to come in the future?

So, the thing is that depending on the component we need to perform a slightly different validation, mainly because for kvstoremesh we don't allow the cluster ID to be zero (which also means that the check needs to be performed from a start hook, as zero is the default value). The validation in the agent is currently still part of Config.Validate, mainly because that is performed earlier to prevent possible issues for components which do not directly depend on the ClusterInfo struct.
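The split validation being described can be sketched as follows. Method names and the ID limit here are illustrative assumptions, not the actual checks in pkg/clustermesh/types: the shared Validate tolerates the zero default, while a stricter variant, as required by kvstoremesh, additionally rejects an unset cluster ID.

```go
package main

import "fmt"

// Illustrative stand-in for the ClusterInfo struct discussed above.
type ClusterInfo struct {
	ID   uint32
	Name string
}

// maxClusterID is an assumed upper bound for this example; the real
// limit depends on the datapath configuration.
const maxClusterID = 255

// Validate performs the checks shared by all components. The zero ID is
// allowed here, since it is the flag's default value.
func (c ClusterInfo) Validate() error {
	if c.ID > maxClusterID {
		return fmt.Errorf("cluster id %d is out of range", c.ID)
	}
	if c.Name == "" {
		return fmt.Errorf("cluster name must not be empty")
	}
	return nil
}

// ValidateStrict additionally rejects the zero default, modeling the
// kvstoremesh requirement that a real cluster ID must be configured.
// Because zero is the default, this check only makes sense after the
// flags have actually been parsed, e.g. from a start hook.
func (c ClusterInfo) ValidateStrict() error {
	if err := c.Validate(); err != nil {
		return err
	}
	if c.ID == 0 {
		return fmt.Errorf("cluster id must be set to a non-zero value")
	}
	return nil
}

func main() {
	def := ClusterInfo{ID: 0, Name: "default"}
	fmt.Println("agent-style check:", def.Validate())
	fmt.Println("kvstoremesh-style check:", def.ValidateStrict())
}
```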
@giorio94 LGTM