Add CiliumFlowLog CRD #28220
Conversation
Force-pushed from 84c2e73 to bf6f5f2.
pkg/k8s/watchers/watcher.go
@@ -450,6 +451,7 @@ var ciliumResourceToGroupMapping = map[string]watcherInfo{
	synced.CRDResourceName(v2alpha1.CCGName): {start, k8sAPIGroupCiliumCIDRGroupV2Alpha1},
	synced.CRDResourceName(v2alpha1.L2AnnouncementName): {skip, ""}, // Handled by L2 announcement directly
	synced.CRDResourceName(v2alpha1.CPIPName): {skip, ""}, // Handled by multi-pool IPAM allocator
	synced.CRDResourceName(v2alpha1.CFLName): {afterNodeInit, k8sAPIGroupCiliumFlowLog},
I tried different combinations, but it seems that this patch prevents cilium-agent from becoming Ready. I can't figure it out from the logs alone.
Looping in @cilium/sig-hubble for your awareness and review.
// List of field names that will be kept in the log output.
//
// +kubebuilder:validation:Optional
FieldMask []string `json:"fieldmask,omitempty"`
json:"fieldMask,...
Capitalize Mask because field mask is two words.
I wonder if it should be FieldMasks?
FieldMasks isn't correct because it's one field mask made up of multiple field names.
I was going to suggest that maybe Fields is a simpler name while still effectively communicating what this is, but there is already a Hubble feature which refers to field masks, so we should go with the precedent:
https://docs.cilium.io/en/latest/observability/hubble-exporter/#field-mask
hubble-export-fieldmask: time source.identity source.namespace
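For illustration, here is a minimal Go sketch of how one field mask, made up of several field names, could be applied before a flow is written out. The applyFieldMask helper and the map-based flow representation are hypothetical; only top-level names are handled for brevity, while the Hubble exporter's field mask also accepts nested paths such as source.identity.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// applyFieldMask keeps only the listed top-level fields of a flow.
// An empty mask keeps everything.
func applyFieldMask(flow map[string]any, mask []string) map[string]any {
	if len(mask) == 0 {
		return flow
	}
	out := make(map[string]any, len(mask))
	for _, name := range mask {
		if v, ok := flow[name]; ok {
			out[name] = v
		}
	}
	return out
}

func main() {
	flow := map[string]any{
		"time":    "2023-09-20T12:00:00Z",
		"source":  map[string]any{"identity": 1234, "namespace": "default"},
		"verdict": "FORWARDED",
	}
	masked := applyFieldMask(flow, []string{"time", "source"})
	b, _ := json.Marshal(masked)
	fmt.Println(string(b)) // only "time" and "source" remain in the output
}
```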
// Filters specifies flows to include in the log output. Flow passes the
// filter if any of the entries match. Only Flows matching the filter, but
// not matching exclude-filter will be logged. The check is disabled when
// empty.
I'm not sure what "the check is disabled when empty" means. If we don't have any filters, is logging disabled? Or, when we don't have any filters, do we include all flows?
Maybe "When empty, all flows are included, except those in exclude filters"?
Whatever we do to fix this, let's do it in the ExcludeFilters doc comment as well.
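One possible rewording, sketched as the two fields would look with clarified comments. The FlowLogSpec type name and the excludeFilters JSON tag are assumptions here, and the exact phrasing is only a suggestion:

```go
// Sketch only; flowpb is the existing github.com/cilium/cilium/api/v1/flow
// package that this PR already uses for FlowFilter.
type FlowLogSpec struct {
	// Filters specifies which flows to include in the log output. A flow is
	// included if it matches at least one of the entries. When empty, all
	// flows are included, except those matched by ExcludeFilters.
	//
	// +kubebuilder:validation:Optional
	Filters []*flowpb.FlowFilter `json:"filters,omitempty"`

	// ExcludeFilters specifies which flows to exclude from the log output.
	// A flow is excluded if it matches at least one of the entries, even if
	// it also matches Filters. When empty, no flows are excluded.
	//
	// +kubebuilder:validation:Optional
	ExcludeFilters []*flowpb.FlowFilter `json:"excludeFilters,omitempty"`
}
```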
}
// FlowLogStatus is a status of a flow log task.
type FlowLogStatus struct{} |
Without status, we can't tell if it is working without digging into the node files.
We should consider some status, like number of flows logged per node, and conditions for whether the node is failing to log (maybe due to permissions or storage issues).
Agree. I would expect at least the Conditions to indicate whether the flow filter has been programmed successfully.
Who would own the object in order to update it?
I don't know what status would be present without seeing the implementation.
I imagine there would be status associated with each node so there could be a Nodes slice where each node owns and updates its own item in the slice.
We would have to have some central controller which could clean up stale entries for nodes which don't exist anymore or aren't running anetd.
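A rough sketch of what such a per-node status could look like; every type and field name below is illustrative rather than part of this PR, and metav1 refers to k8s.io/apimachinery/pkg/apis/meta/v1:

```go
package v2alpha1 // placement is illustrative only

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// FlowLogNodeStatus is one node's view of a flow log task. Each agent would
// update only its own entry; something central (e.g. the operator) would
// prune entries for nodes that no longer exist.
type FlowLogNodeStatus struct {
	// Node is the name of the node reporting this entry.
	Node string `json:"node"`

	// FlowsLogged is the number of flows this node has written so far.
	FlowsLogged int64 `json:"flowsLogged"`

	// Conditions reports, for example, whether the flow filter was programmed
	// successfully, or whether logging fails due to permission or storage issues.
	Conditions []metav1.Condition `json:"conditions,omitempty"`
}

// FlowLogStatus is the status of a flow log task, aggregated per node.
type FlowLogStatus struct {
	// Nodes holds one entry per node running the agent.
	//
	// +kubebuilder:validation:Optional
	Nodes []FlowLogNodeStatus `json:"nodes,omitempty"`
}
```

Per-node entries keep writes conflict-free, but, as noted above, something still has to garbage-collect entries for nodes that are gone.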
I wanted to avoid having a central controller for this. I thought that the state would be exposed by metrics:
flow_log_active_task{name="just-an-example"} 1
flow_log_processed_flows{name="just-an-example"} 12345
flow_log_dropped_flows{name="just-an-example"} 321
Does it need to be figured out in this PR, or can it be added later when we have more details from the implementation?
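As a sketch, the metrics above could be registered with the standard Prometheus Go client roughly like this. Metric and label names are taken from the examples; in practice Cilium's own metrics registry and naming conventions would apply:

```go
package flowlog

import "github.com/prometheus/client_golang/prometheus"

var (
	// flow_log_active_task{name="..."} is 1 while the named task is running.
	activeTask = prometheus.NewGaugeVec(prometheus.GaugeOpts{
		Name: "flow_log_active_task",
		Help: "Set to 1 while the named flow log task is active.",
	}, []string{"name"})

	// flow_log_processed_flows{name="..."} counts flows written to the log.
	processedFlows = prometheus.NewCounterVec(prometheus.CounterOpts{
		Name: "flow_log_processed_flows",
		Help: "Number of flows written by the named flow log task.",
	}, []string{"name"})

	// flow_log_dropped_flows{name="..."} counts flows that could not be written.
	droppedFlows = prometheus.NewCounterVec(prometheus.CounterOpts{
		Name: "flow_log_dropped_flows",
		Help: "Number of flows dropped by the named flow log task.",
	}, []string{"name"})
)

func init() {
	prometheus.MustRegister(activeTask, processedFlows, droppedFlows)
}
```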
// +kubebuilder:validation:Optional
Filters []*flowpb.FlowFilter `json:"filters,omitempty"`
// ExcludeFilters specifies flows to exclude from the log output. Flow |
"Flow passes the filter if any of the entries match" sounds like we would log the flow.
Maybe "Flows are excluded from logging if any of the exclusion filters match"
I also struggle with wording. Do you have other ideas (preferably for both)?
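To pin down the semantics that both comments are trying to describe, here is a small Go sketch of the decision logic. The matches parameter is a hypothetical stand-in for whatever FlowFilter matching Hubble already implements, and flowpb refers to github.com/cilium/cilium/api/v1/flow:

```go
package flowlog

import flowpb "github.com/cilium/cilium/api/v1/flow"

// shouldLog reports whether a flow is written to the log, assuming the
// semantics discussed above: a flow is logged when it matches at least one
// include filter (or when there are no include filters at all) and does not
// match any exclude filter.
func shouldLog(flow *flowpb.Flow, filters, excludeFilters []*flowpb.FlowFilter,
	matches func(*flowpb.Flow, *flowpb.FlowFilter) bool) bool {
	for _, f := range excludeFilters {
		if matches(flow, f) {
			return false // excluded, even if an include filter also matches
		}
	}
	if len(filters) == 0 {
		return true // no include filters: every non-excluded flow is logged
	}
	for _, f := range filters {
		if matches(flow, f) {
			return true
		}
	}
	return false
}
```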
Force-pushed from bf6f5f2 to a7f0616.
Signed-off-by: Aleksander Mistewicz <amistewicz@google.com>
Force-pushed from a7f0616 to e30a922.
LGTM
The only thing I'm missing here is the definition of the expected flow log output. In the other mentioned PR I can read it from the code, but since we start with a bare API, I would document how to configure the flow log output and where to look for the logs (or how to configure a log scraper).
Hey all, sorry for the late reply. I've been trying to construct my thoughts on this PR and the overall proposal, as I wasn't as involved as I should have been early on in the original proposal. First, thanks for the proposal and PRs.

In general, I really like the idea of a structured configuration file to manage more of the cilium agent's configuration. It gives us a lot of flexibility in terms of validation, better data types (than flags), and the ability to reload configuration on demand. However, after a lot of thought, I believe the approach of using a CRD has some trade-offs that I would like us to avoid, but I believe there is a middle ground where we can still obtain all these benefits without using a CRD. I have ideas for a different approach that I'll mention towards the end, but here are the main points on why I think we should avoid using CRDs for something like this:
So here's my proposal: keep everything the same, but no CRDs. I like the config format, and I see valid use-cases for every option. Let's just make a normal struct for the configuration, add a new flag to specify the file path for the configuration, and then implement the logic to handle reloading and updating the exporter configuration. This would still enable runtime reconfiguration, as files can be updated and the application can either use inotify or poll for changes to the file. The configuration file could be provided via a ConfigMap volume, or another mechanism out of band (e.g. a sidecar, a secret-management operator, etc.). What do you think?
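For illustration, a rough sketch of the file-watch-and-reload mechanism described above, using github.com/fsnotify/fsnotify. The package name, the reload callback, and the flag wiring are hypothetical; this only sketches the mechanism, not an actual implementation:

```go
package flowlogreload

import (
	"log"
	"os"

	"github.com/fsnotify/fsnotify"
)

// watchConfigFile calls reload with the file contents on startup and whenever
// the file changes. reload is a hypothetical hook into the exporter
// sub-system; error handling is kept minimal for brevity.
func watchConfigFile(path string, reload func(raw []byte) error) error {
	w, err := fsnotify.NewWatcher()
	if err != nil {
		return err
	}
	defer w.Close()
	if err := w.Add(path); err != nil {
		return err
	}

	apply := func() {
		raw, err := os.ReadFile(path)
		if err != nil {
			log.Printf("flowlog: read config: %v", err)
			return
		}
		if err := reload(raw); err != nil {
			log.Printf("flowlog: apply config: %v", err)
		}
	}
	apply() // initial load

	for {
		select {
		case ev := <-w.Events:
			if ev.Op&(fsnotify.Write|fsnotify.Create) != 0 {
				apply()
			}
		case err := <-w.Errors:
			log.Printf("flowlog: watch error: %v", err)
		}
	}
}
```

One caveat if the file comes from a ConfigMap volume: kubelet updates it through an atomic symlink swap, so watching the containing directory (or just polling the file's modification time) tends to be more robust than watching the file path itself.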
I'll leave some quick comments for now and elaborate later as needed.
Yes, this was the intent. Although having multiple CRDs capture the same flow may be wasteful, I think it is still better than the alternative, i.e. annotating flows with each CRD they matched.
We explored this alternative, but logging all flows and then slicing and dicing them later consumes too many resources.
Interesting observation, I didn't think about it in terms of configuration. I focused on having an option to dynamically change what's logged and where. ConfigMaps were one of the alternatives, but I feel they are too slow to propagate (usually ~30s).
In our environments, the person who wants to turn logging on/off is completely unrelated to the person installing Cilium. Admittedly, we haven't thought much about RBAC. In our case we will have a controller for managing this, with some user-friendly API on top of it, so our main focus is to make this dynamically configurable and preferably fast.
This sounds reasonable and I like the idea, but I need to think it through. In previous experiments I tried reloading Hubble and it had 2 main issues: it was slow and risky. The first can be easily improved (or optimized for our use case) and the second could have changed since version 1.12. Back then there were many assumptions about initialization order and reloading, but since the introduction of cell/hive it may have improved. I need to look further into that.
I don't enjoy using configmaps for anything :)
Logging a lot of stuff that you're going to throw away uses a lot of resources, both on the cilium and the log ingest side. Our initial experiments indicated that this feature is going to use a lot of resources so we want to have options to cut that down.
I've seen this argument a few times and I have trouble following the benefit. Instead of one API and one object, we create and manage two. It still has to be a singleton. The separation of concerns via an operator with CRDs to create config files makes a lot more sense if you're talking about deploying an application which isn't Kubernetes-aware. But Cilium is very tightly coupled with Kubernetes, so why would we need to separate concerns?
There are a lot of benefits from the complexity too, like having a versioned API and guarantees that we won't make breaking changes. Projects are always changing their configmaps and breaking things :(
Replying to @AwesomePatrol:
Totally get that, just mentioning it as an alternative, in case it hadn't been explored yet.
Yeah, I totally agree with this functionality: dynamically reconfiguring the logging/export sub-system.
Yes, kubelet does poll for changes to ConfigMaps and Secrets instead of using watches, for performance reasons. Some tools which only read files run a sidecar or another agent to handle "mounting" the ConfigMap/Secret to work around this, but that means a new connection to the API server, which is not ideal. Alternatively, we could consider reading from a single ConfigMap specified by a flag on the agent. This would still avoid the CRD but keep the configuration of "what" config to use scoped to installation time, and I believe it would also avoid a new connection to the API.
While I understand that use-case, I believe this is a very small and limited use-case for most Cilium users, and isn't enough to justify a new CRD to me. A ConfigMap could be used in this situation to support the use-case.
No matter how we expose the configuration, the implementation of reloading the export sub-system configuration is the same, so I don't think this impacts the discussion on the API for configuration.

Replying to @nathanperkins:
My main problem is that CRDs are a bit over-complicated for this particular use-case, IMO. They certainly have benefits, as you mention: versioning, schemas, etc. But they also have more features than we need in this particular case of reconfiguring export (e.g. multiple instances, new RBAC, new reconcilers/watchers, etc.).

Now, CRDs in general do make perfect sense for the general problem of "configuration". But what I want to avoid is introducing a bunch of separate CRDs for reconfiguring Cilium, with each CRD behaving in slightly different ways, having different semantics and behaviors, or only supporting re-configuration of some sub-systems. We already have

If we want to do reconfiguration of Cilium via CRDs, then I'd like to see a holistic vision of what that looks like for all of Cilium: what features should be supported, how it interacts with the various sub-systems, etc. However, this is a big problem to solve, and not one I'm expecting you to solve necessarily.

Since this is a difficult problem, I believe an easier way to tackle it is to build the underlying primitives to support "dynamic reconfiguration" where it matters, using simple techniques like config files + reload, as has been done in Unix applications for decades. This lets us handle orchestrating reconfiguration outside of Cilium, which should let you experiment and control the behavior better without having to solve the entire problem of CRD-based configuration of Cilium. With new primitives to support reconfiguration, this opens up more possibilities for implementing something that does solve the bigger problem, such as an "install operator".
@AwesomePatrol @nathanperkins @chancez Taking into account that this is an observability feature, I don't find it especially crucial to reconfigure the system in near real-time. The 30-60s delay from the kubelet ConfigMap remount seems reasonable to me. At this point I believe we should take a small step back and agree on the feature requirements and solution design. From what I understand, these are the requirements:
In such a case I would suggest the following solution:
Let me know what you think. If what I wrote seems reasonable to you, I can propose a YAML file format and later some PoC.
@marqc I generally agree with the idea. I'm not sure we need to support "multiple logging tasks", but I could be convinced otherwise. I think the format of the config could likely just be the CRD definition, adapted into a regular struct. But overall I agree with the idea.
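A sketch of what "the CRD definition adapted into a regular struct" could look like, together with a loader for a file passed via a new agent flag. Every name here (FlowLogConfig, LoadFlowLogConfig, the field set, and the choice of JSON as the on-disk format) is an assumption for illustration only:

```go
package flowlogconfig

import (
	"encoding/json"
	"os"

	flowpb "github.com/cilium/cilium/api/v1/flow"
)

// FlowLogConfig mirrors the fields of the proposed CRD spec as a plain struct.
type FlowLogConfig struct {
	// FilePath is where the flow log is written on the node.
	FilePath string `json:"filePath"`

	// FieldMask lists the flow field names kept in the output.
	FieldMask []string `json:"fieldMask,omitempty"`

	// Filters and ExcludeFilters reuse Hubble's FlowFilter type directly.
	Filters        []*flowpb.FlowFilter `json:"filters,omitempty"`
	ExcludeFilters []*flowpb.FlowFilter `json:"excludeFilters,omitempty"`
}

// LoadFlowLogConfig reads and parses the configuration file whose path would
// be provided by a new agent flag.
func LoadFlowLogConfig(path string) (*FlowLogConfig, error) {
	b, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	cfg := &FlowLogConfig{}
	if err := json.Unmarshal(b, cfg); err != nil {
		return nil, err
	}
	return cfg, nil
}
```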
I'm okay with either approach, but I do want to at least mention some CUJs (critical user journeys) that we are tracking which are not well supported if there is only a single configuration for logging tasks and no RBAC.
We are getting around some of these issues by providing our own CRD and controller, but it adds complexity to using the feature.
I thought that was just a matter of setting the
I understand the difficulty, but I think that's the right approach for this right now. Hopefully what you learn will inform us on the best approach for managing Cilium configuration longer term.
I'm going to close this out to remove the review requests. We can discuss topics further in the original issue.
Part of #25508
This is the CRD without implementation (as opposed to #26646). Let's agree on the definition first and then implement it.