
Conversation

AwesomePatrol
Contributor

Part of #25508

This is the CRD without implementation (as opposed to #26646). Let's agree on the definition first and then implement it.

@AwesomePatrol AwesomePatrol requested review from a team as code owners September 19, 2023 08:18
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Sep 19, 2023
@@ -450,6 +451,7 @@ var ciliumResourceToGroupMapping = map[string]watcherInfo{
synced.CRDResourceName(v2alpha1.CCGName): {start, k8sAPIGroupCiliumCIDRGroupV2Alpha1},
synced.CRDResourceName(v2alpha1.L2AnnouncementName): {skip, ""}, // Handled by L2 announcement directly
synced.CRDResourceName(v2alpha1.CPIPName): {skip, ""}, // Handled by multi-pool IPAM allocator
synced.CRDResourceName(v2alpha1.CFLName): {afterNodeInit, k8sAPIGroupCiliumFlowLog},
Contributor Author

I tried different combinations, but it seems that this patch prevents cilium-agent from becoming Ready. I can't figure out why from the logs alone.

@sayboras
Copy link
Member

Looping in @cilium/sig-hubble for your awareness and review.

// List of field names that will be kept in the log output.
//
// +kubebuilder:validation:Optional
FieldMask []string `json:"fieldmask,omitempty"`
Contributor

json:"fieldMask,...

Capitalize Mask because field mask is two words.

I wonder if it should be FieldMasks?

Contributor

@nathanperkins Sep 20, 2023

FieldMasks isn't correct because it's a single field mask made up of multiple field names.

I was going to suggest that maybe Fields is a simpler name while still effectively communicating what this is. But there is already a Hubble feature which refers to field masks, so we should go with that precedent:

https://docs.cilium.io/en/latest/observability/hubble-exporter/#field-mask

hubble-export-fieldmask: time source.identity source.namespace
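For concreteness, a sketch of the field with the suggested json tag (illustrative only, not the PR's final code):

// FieldMask is the list of field names that will be kept in the log output,
// following the precedent of the existing hubble-export-fieldmask option.
//
// +kubebuilder:validation:Optional
FieldMask []string `json:"fieldMask,omitempty"`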


// Filters specifies flows to include in the log output. Flow passes the
// filter if any of the entries match. Only Flows matching the filter, but
// not matching exclude-filter will be logged. The check is disabled when
Contributor

I'm not sure what "the check is disabled when empty" means. If we don't have any filters, then logging is disabled? Or when we don't have any filters, we will include all flows?

Maybe "When empty, all flows are included, except those in exclude filters"?

Whatever we do to fix this, let's do it in the ExcludeFilters doc comment as well.

}

// FlowLogStatus is a status of a flow log task.
type FlowLogStatus struct{}
Contributor

Without status, we can't tell if it is working without digging into the node files.

We should consider some status, like number of flows logged per node, and conditions for whether the node is failing to log (maybe due to permissions or storage issues).

Agree. I would expect at least the Conditions to indicate if the flow filter has been programmed successfully.

Contributor Author

Who would own the object in order to update it?

Contributor

I don't know what status would be present without seeing the implementation.

I imagine there would be status associated with each node so there could be a Nodes slice where each node owns and updates its own item in the slice.

We would have to have some central controller which could clean up stale entries for nodes which don't exist anymore or aren't running anetd.
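A hypothetical shape for such a status, along the lines discussed above; all names here are illustrative and not part of the PR:

package v2alpha1

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// FlowLogNodeStatus is the status reported by a single node for a flow log.
type FlowLogNodeStatus struct {
	// Node is the name of the node reporting this entry.
	Node string `json:"node"`
	// FlowsLogged is the number of flows this node has written so far.
	FlowsLogged int64 `json:"flowsLogged,omitempty"`
	// Conditions indicates whether the flow log is programmed and healthy
	// on this node (e.g. failing due to permissions or storage issues).
	Conditions []metav1.Condition `json:"conditions,omitempty"`
}

// FlowLogStatus is a status of a flow log task.
type FlowLogStatus struct {
	// Nodes holds one entry per node; each agent owns and updates its own
	// entry, while a central controller prunes entries for deleted nodes.
	Nodes []FlowLogNodeStatus `json:"nodes,omitempty"`
}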

Contributor Author

@AwesomePatrol Sep 22, 2023

I wanted to avoid having a central controller for this. I thought that the state would be exposed by metrics:

flow_log_active_task{name="just-an-example"} 1
flow_log_processed_flows{name="just-an-example"} 12345
flow_log_dropped_flows{name="just-an-example"} 321

Does it need to be figured out in this PR, or can it be added later when we have more details from the implementation?
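If the state were exposed only through metrics, the per-task series above could be declared with the standard Prometheus Go client roughly like this (metric and label names are taken from the comment; everything else is illustrative):

package metrics

import "github.com/prometheus/client_golang/prometheus"

var (
	// flow_log_active_task is 1 while the named flow log task is running.
	activeTask = prometheus.NewGaugeVec(prometheus.GaugeOpts{
		Name: "flow_log_active_task",
		Help: "Whether the named flow log task is currently active.",
	}, []string{"name"})

	// flow_log_processed_flows counts flows written by the named task.
	processedFlows = prometheus.NewCounterVec(prometheus.CounterOpts{
		Name: "flow_log_processed_flows",
		Help: "Number of flows written by the named flow log task.",
	}, []string{"name"})

	// flow_log_dropped_flows counts flows the named task failed to write.
	droppedFlows = prometheus.NewCounterVec(prometheus.CounterOpts{
		Name: "flow_log_dropped_flows",
		Help: "Number of flows dropped by the named flow log task.",
	}, []string{"name"})
)

func init() {
	prometheus.MustRegister(activeTask, processedFlows, droppedFlows)
}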

// +kubebuilder:validation:Optional
Filters []*flowpb.FlowFilter `json:"filters,omitempty"`

// ExcludeFilters specifies flows to exclude from the log output. Flow
Contributor

"Flow passes the filter if any of the entries match" sounds like we would log the flow.

Maybe "Flows are excluded from logging if any of the exclusion filters match"

Contributor Author

I'm also struggling with the wording. Do you have other ideas (preferably for both fields)?
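One possible wording that folds in both suggestions (purely illustrative; the excludeFilters json tag is an assumption):

// Filters specifies which flows to include in the log output. A flow is
// logged if it matches at least one entry in Filters and no entry in
// ExcludeFilters. When Filters is empty, all flows are included unless
// excluded by ExcludeFilters.
//
// +kubebuilder:validation:Optional
Filters []*flowpb.FlowFilter `json:"filters,omitempty"`

// ExcludeFilters specifies which flows to exclude from the log output. A
// flow is dropped if it matches at least one entry here, even if it also
// matches Filters. When empty, no flows are excluded.
//
// +kubebuilder:validation:Optional
ExcludeFilters []*flowpb.FlowFilter `json:"excludeFilters,omitempty"`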

Signed-off-by: Aleksander Mistewicz <amistewicz@google.com>
@sayboras sayboras requested review from a team, kaworu and chancez and removed request for a team and kaworu September 25, 2023 14:46
Contributor

@marqc left a comment

LGTM

The only thing I'm missing here is a definition of the expected flow log output. In the other PR mentioned, I can read it from the code, but since we are starting with the bare API, I would document how to configure the flow log output and where to look for the logs (or how to configure a log scraper).

@chancez
Contributor

chancez commented Sep 29, 2023

Hey all, sorry for the late reply. I've been trying to collect my thoughts on this PR and the overall proposal, as I wasn't as involved as I should have been early on in the original proposal.

First, thanks for the proposal and PRs. In general, I really like the idea of a structured configuration file to manage more of the cilium agent’s configuration. It gives us a lot of flexibility in terms of validation, better data types (than flags), and the ability to reload configuration on demand.



However, after a lot of thought, I believe the approach of using a CRD has some trade-offs that I would like us to avoid. I believe there is a middle ground where we can still get all of these benefits without using a CRD. I have ideas for a different approach that I’ll mention towards the end, but here are the main points on why I think we should avoid using CRDs for something like this:


  • Export is currently limited to a single file. CRDs are a bit awkward because multiple instances of a CRD do not really make sense for this use-case (merging the config values across multiple instances would be fairly difficult to work with and unintuitive). If we don’t support more than one of these CRDs, i.e. a global singleton CRD, that is generally an anti-pattern with CRDs and a sign that perhaps a CRD isn’t the right choice, or that it belongs within another CRD. Supporting more than one CRD by means of 1 CRD == 1 file is an alternative, but I dislike that idea, as file export can be quite resource hungry and I don’t believe that use-case makes sense in general.
  • This concept is about reconfiguring the agent and how/what it logs to disk, which arguably should be something only an administrator should be allowed to do. K8s RBAC can be used to isolate access to the CRD, or we could limit which namespaces/resource names of the CRD it watches, but this feels like yet another anti-pattern with CRDs.
  • An alternative that works with zero changes is to do all of the handling in your log agent. In general, removing fields, discarding specific log entries can already be done via any widely used logging agent/forwarder. That said, I do still believe being able to configure what’s logged makes sense, so I don’t think this is mutually exclusive with the idea of a export configuration file, but it’s worth noting you can achieve similar results without changing cilium at all.
  • This feature can lead to a slippery slope of features. I believe configuration of cilium via CRDs in general, can make sense, but where should it be handled, and for what options?
    • In general, for an approach like this, I would much rather have this handled in a “cilium installation operator” or something similar, where there’s a clear delineation between managing agent configuration and features, and which configuration options require a redeploy, and which are runtime reconfigurable. If we went with a configuration file instead of a CRD as part of the agent, then a future “cilium installation operator” could leverage a CRD which would eventually result in managing and updating the proposed config file. I like this separation of concerns much better than having the agent manage a CRD for its own config directly.
    • I'll admit that we already have CiliumNodeConfig which is similar, but IMO, I wish that feature was handled using a different mechanism (as a regular config file instead).
    • To summarize this point, I think if we want to do CRD based configuration of cilium, that should be done in a whole new Cilium installation operator, and config files in agent are a better alternative to CRDs which an “install operator” could still build on top of. See: prometheus-operator, grafana-agent-operator, etc for examples of other projects that take this approach.
  • Generally, I feel like this should be configured by the same person managing the installation of Cilium. This generally means specifying what configuration to use at deploy time, as part of the Helm chart. Note that this doesn’t mean the configuration cannot be dynamically updated, but rather that the administrator should be specifying what config to use (by specifying which configmap contains the export configuration, for example).
  • Lastly, CRDs are simply more work to implement, and more complexity we have to maintain, compared to reading a configuration file from disk. Adding the new CRD types, client-go reconcilers, RBAC permissions, etc. isn’t that difficult, but it’s not free either. A config file, by comparison, is mostly just the new type(s) and some code to detect that the file changed (which we already have for TLS certificates) and update the configuration (which needs to be done regardless of the implementation).

So here’s my proposal: keep everything the same, but no CRDs. I like the config format, I see valid use-cases for every option. Let’s just make a normal struct for the configuration, add a new flag to specify the file path for the configuration, and then implement the logic to handle reloading and updating the exporter configuration. This would still enable runtime reconfiguration, as files can be updated and the application can either use inotify or poll for changes to the file. The configuration file could be provided via a ConfigMap volume, or another mechanism out of band (eg: a sidecar, a secret-management operator, etc).

What do you think?
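To make the file-plus-reload idea concrete, here is a minimal sketch, assuming a hypothetical flowLogConfig struct and a watch on the file path passed via an agent flag; it is not Cilium's actual implementation:

package flowlog

import (
	"log"
	"os"

	"github.com/fsnotify/fsnotify"
	"sigs.k8s.io/yaml"
)

// flowLogConfig would mirror the fields proposed in this PR (filters,
// field mask, output path, ...), just as a plain struct instead of a CRD.
type flowLogConfig struct {
	FieldMask []string `json:"fieldMask,omitempty"`
	// Filters, ExcludeFilters, file path, etc. omitted for brevity.
}

// watchConfig re-reads and re-applies the configuration on every write or
// create event. Note that ConfigMap volumes update via a symlink swap, so a
// real implementation would watch the directory (or poll) rather than the
// file itself.
func watchConfig(path string, apply func(flowLogConfig) error) error {
	w, err := fsnotify.NewWatcher()
	if err != nil {
		return err
	}
	defer w.Close()
	if err := w.Add(path); err != nil {
		return err
	}
	for ev := range w.Events {
		if ev.Op&(fsnotify.Write|fsnotify.Create) == 0 {
			continue
		}
		raw, err := os.ReadFile(path)
		if err != nil {
			log.Printf("read config: %v", err)
			continue
		}
		var cfg flowLogConfig
		if err := yaml.Unmarshal(raw, &cfg); err != nil {
			log.Printf("parse config: %v", err)
			continue
		}
		if err := apply(cfg); err != nil {
			log.Printf("apply config: %v", err)
		}
	}
	return nil
}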

@AwesomePatrol
Contributor Author

I'll leave some quick comments for now and elaborate later as needed.

1 CRD == 1 file

Yes, this was the intent. Although having multiple CRDs capture the same flow may be wasteful, I think it is still better than the alternative, i.e. annotating flows with each CRD they matched.

An alternative that works with zero changes is to do all of the handling in your log agent

We explored this alternative, but logging all flows and slicing and dicing them later consumes too many resources.

I would much rather have this handled in a “cilium installation operator”

Interesting observation; I hadn't thought about it in terms of configuration. I focused on having an option to dynamically change what's logged and where.

A ConfigMap was one of the alternatives, but I feel it is too slow (usually ~30s).

this should be configured by the same person managing the installation of Cilium

In our environments, the person who wants to turn logging on/off is completely unrelated to the person installing Cilium. Admittedly we haven't thought much about RBAC. In our case we will have a controller for managing this with a user-friendly API on top, so our main focus is to make this dynamically configurable and preferably fast.

keep everything the same, but no CRDs

This sounds reasonable and I like the idea, but I need to think it through. In previous experiments I tried reloading Hubble and it had 2 main issues: it was slow and risky. The first can easily be improved (or optimized for our use case), and the second may have changed since version 1.12. Back then there were many assumptions about initialization order and reloading, but since the introduction of cell/hive it may have improved. I need to look further into that.

@nathanperkins
Contributor

I don't enjoy using configmaps for anything :)

An alternative that works with zero changes is to do all of the handling in your log agent. In general, removing fields, discarding specific log entries can already be done via any widely used logging agent/forwarder. That said, I do still believe being able to configure what’s logged makes sense, so I don’t think this is mutually exclusive with the idea of a export configuration file, but it’s worth noting you can achieve similar results without changing cilium at all.

Logging a lot of stuff that you're going to throw away uses a lot of resources, both on the Cilium side and on the log ingest side. Our initial experiments indicated that this feature is going to use a lot of resources, so we want to have options to cut that down.

In general, for an approach like this, I would much rather have this handled in a “cilium installation operator” or something similar, where there’s a clear delineation between managing agent configuration and features, and which configuration options require a redeploy, and which are runtime reconfigurable. If we went with a configuration file instead of a CRD as part of agent, then a future “cilium installation operator” could leverage a CRD which would eventually result in managing and updating the proposed config file. I like this separation of concerns much better than having the agent manage a CRD for it’s own config directly.

I've seen this argument a few times and I have trouble following the benefit. Instead of one API and one object, we create and manage two. It still has to be a singleton.

The separation of concerns via operator with CRDs to create config files makes a lot more sense if you're talking about deploying an application which isn't Kubernetes aware. But Cilium is very tightly coupled with Kubernetes so why would we need to separate concerns?

Lastly, CRDs are simply more work to implement, and more complexity we have to maintain, compared to reading a configuration file from disk.

There are a lot of benefits from the complexity too, like having a versioned API and guarantees that we won't make breaking changes. Projects are always changing their configmaps and breaking things :(

@chancez
Contributor

chancez commented Oct 2, 2023

Replying to @AwesomePatrol:

We explored this alternative, but logging all flows to later cut and dice consumes too much resources.

Totally get that, just mentioning it as an alternative, in case it hadn't been explored yet.

I focused on having an option to dynamically change what's logged and where.

Yeah, so I totally agree with this functionality. Dynamically reconfiguring the logging/export sub-system, I fully agree with.

ConfigMaps was one of the alternatives, but I feel it is too slow (usually ~30s).

Yes, kubelet polls for changes to ConfigMaps and Secrets instead of using watches, for performance reasons. Some tools which only read files run a sidecar or another agent to handle "mounting" the ConfigMap/Secret to work around this, but that means a new connection to the API server, which is not ideal.

Alternatively, we could consider reading from a single configmap specified by a flag on the agent. This would still avoid the CRD while keeping the configuration of "what" config to use scoped to installation time, and I believe it would also avoid a new connection to the API server.

In our environments person who wants to turn logging on/off is completely unrelated to person installing Cilium.

While I understand that use-case, I believe this is a very small and limited use-case for most Cilium users, and isn't enough to justify a new CRD to me. A ConfigMap could be used in this situation to support the use-case.

In previous experiments I tried reloading Hubble and it had 2 main issues: slow and risky.

No matter how we expose the configuration, implementation of reloading the export sub-system configuration is the same, so I don't think that this impacts the discussion on the API for configuration.


Replying to @nathanperkins :

I've seen this argument a few times and I have trouble following the benefit. Instead of one API and one object, we create and manage two. It still has to be a singleton.

My main problem is that CRDs are a bit over-complicated for this particular use-case, IMO. They certainly have benefits, as you mention: versioning, schemas, etc. But they also have more features than we need in this particular case of reconfiguring export (e.g. multiple instances, new RBAC, new reconcilers/watchers, etc.).

Now, CRDs in general do make perfect sense for the general problem of "configuration". But what I want to avoid is introducing a bunch of separate CRDs for reconfiguring Cilium, with each CRD behaving in slightly different ways, having different semantics and behaviors, or only supporting re-configuration of some sub-systems. We already have CiliumNodeConfig; if we do CiliumFlowLog now, then what's next? I don't think we want a dozen CRDs for reconfiguring Cilium, but this is a bigger discussion of how to manage Cilium's configuration as a whole.

If we want to do reconfiguration of Cilium via CRDs, then I'd like to see a holistic vision of what that looks like for all of Cilium. What features should be supported, how does it interact with the various sub-systems, etc. However, this is a big problem to solve, and not one I'm expecting you to solve necessarily.

Since this is a difficult problem, I believe an easier way to tackle it is to build the underlying primitives to support "dynamic reconfiguration" where it matters, using simple techniques like config files + reload, as has been done in Unix applications for decades. This lets us handle orchestrating reconfiguration outside of Cilium, which should let you experiment and control the behavior better without having to solve the entire problem of CRD-based configuration of Cilium.

With new primitives to support reconfiguration, this opens up more possibilities to implementing something that does solve the bigger problem, such as an "install operator".

@marqc
Contributor

marqc commented Oct 4, 2023

@AwesomePatrol @nathanperkins @chancez
@AwesomePatrol @nathanperkins @chancez
For me the difference between a CustomResource and a ConfigMap is just an implementation detail. If introducing a CRD is much more costly (for maintenance), I'm fine with going for a ConfigMap. The ConfigMap may even be an implementation detail here: the flow logging feature could be configured with a YAML file available in the cilium-agent's file system, which could be mounted from a ConfigMap, a host path, or a PV. That keeps the configuration quite flexible.

Given that this is an observability feature, I don't find it especially crucial to reconfigure the system in near real-time. The 30-60s delay from the kubelet ConfigMap remount seems reasonable to me.

At this point I believe we should take a little step back and find an agreement over the feature requirements and solution design.

From what I understand these are the requirements:

  • users can enable the flow logging feature on demand, without restarting the agent
  • users can enable multiple logging tasks concurrently
  • each logging task can have its own set of filters (allow/deny) applied to the flows to be logged
  • each logging task can specify its own log format (fields to include)
  • each logging task can specify its own target log file path/pattern and retention
  • each logging task may optionally have an end-of-collection timestamp

In such a case I would suggest the following solution:

  • the configuration will be managed via a YAML file
  • there will be a single YAML file with the configuration
  • in the default Helm config the YAML file will have an empty structure and will be kept in a ConfigMap mounted into the cilium-agent pod
  • the path to the YAML file will be configurable with an agent flag
  • the YAML file will contain a list of flow logging tasks
  • each flow logging task definition will have fields to configure the output file path, output format, filters, and end time
  • re-configuration within cilium-agent will be triggered by an inotify notification on YAML file change
  • each flow logging task will have an additional artificial id (string) that will be used to match the existing/previous configuration in memory
    • if no changes are detected, no action will be taken for the flow logging job
    • if configuration changes are detected, the currently running task will be stopped and a new task will be started with the new config
    • if the configuration for a running task is removed, the task will be stopped

Let me know what you think. If what I wrote seems reasonable to you, I can propose YAML file format and later some PoC.
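For illustration, the single-YAML-file layout described above could map onto Go types roughly like this (field and file names are a guess, not an agreed-upon format):

package flowlog

import (
	"time"

	flowpb "github.com/cilium/cilium/api/v1/flow"
)

// FlowLogTask configures one flow logging task.
type FlowLogTask struct {
	// ID is the artificial identifier used to match this task against the
	// previously loaded configuration.
	ID string `json:"id"`
	// FilePath is the output log file path or pattern.
	FilePath string `json:"filePath"`
	// FieldMask lists the flow fields to keep in the output.
	FieldMask []string `json:"fieldMask,omitempty"`
	// Filters and ExcludeFilters select which flows get logged.
	Filters        []*flowpb.FlowFilter `json:"filters,omitempty"`
	ExcludeFilters []*flowpb.FlowFilter `json:"excludeFilters,omitempty"`
	// End optionally stops collection at the given timestamp.
	End *time.Time `json:"end,omitempty"`
}

// Config is the content of the YAML file whose path is passed to the agent;
// an empty task list disables flow logging.
type Config struct {
	FlowLogs []FlowLogTask `json:"flowLogs,omitempty"`
}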

@chancez
Contributor

chancez commented Oct 4, 2023

@marqc I generally agree with the idea. I'm not sure we need to support "multiple logging tasks", but could be convinced otherwise. I think the format of the config could likely just be the CRD definition but adapted into a regular struct/etc. But overall I agree with the idea.

@nathanperkins
Contributor

nathanperkins commented Oct 4, 2023

I'm okay with either approach, but I do want to at least mention some CUJs (critical user journeys) that we are tracking which are not well supported if there is only a single configuration for logging tasks and no RBAC.

  • Operators with little connection to each other must cooperate to maintain a single config containing many filter configurations which each achieve their own purposes. It could result in a config that is hard to maintain because it is unclear to any specific operator whether leftover filters should be removed.
  • Some flows (like network policy drops) might be tracked for security auditing compliance. Other more detailed flows may be used for reliability engineering or debugging specific issues. These may go into observability pipelines with different visibility and have different retention policies.
  • Lack of granular RBAC controls makes it difficult to provide access to the feature while protecting flow filters for a hard requirement like compliance.
  • We want to have an expiration mechanism for specific flow filters which are added to debug an issue. This is hard to do with configmap.

We are getting around some of these issues by providing our own CRD and controller but it adds complexity to using the feature.

@chancez
Contributor

chancez commented Oct 5, 2023

We want to have an expiration mechanism for specific flow filters which are added to debug an issue. This is hard to do with configmap.

I thought that was just a matter of setting the end field in the config?

We are getting around some of these issues by providing our own CRD and controller but it adds complexity to using the feature.

I understand the difficulty, but I think that's the right approach for this right now. Hopefully what you learn will inform us on the best approach for managing Cilium configuration longer term.

@sayboras sayboras removed their request for review October 10, 2023 12:09
@chancez
Contributor

chancez commented Oct 26, 2023

I'm going to close this out to remove the review requests. We can discuss topics further in the original issue.
