-
Notifications
You must be signed in to change notification settings - Fork 3.4k
feat: add dynamically configured Hubble metrics #35185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
9e79903
to
8073372
Compare
Just noticing that 2 days ago a refactoring of the options flags was merged causing a conflictng with this PR - marking as ready for review while I fix that up since it shouldn't affect the core of the feature and I would like feedback on this change. FYI @chancez |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to be pretty careful about registration. This definitely will need some more robust tests that go through registering/deregistering/registering the same handler to verify this doesn't panic.
You will also want to test registering/registering/registering the same handler but with different configuration too, since we want to be able to change which labels it has, but I suspect there's going to be cases where that's not valid. For example it's my understanding that it's not valid to have a metric with 3 labels, then reconfigure it to only have 2 labels, at least with the default Prometheus registry.
Prometheus might be a bit more opinionated about this, and while it might not be valid with the default registry, we may still want to find a way to support it somehow, since it's currently possible to change labels by restarting with a different configuration. This could definitely complicate the implementation though, so I would write tests first, and see what happens, then we can figure out how to proceed. I think it's okay if the initial implementation has some limitations on reconfiguring labels, and we can come back with a custom registry implementation in a follow up PR if we need to.
We also will need more robust validation and we need to be careful about accidentally loading configurations that are invalid. I believe the correct behavior when a new config has any invalid constructs, is to not change the existing metrics configuration at all until the entire new configuration is correct.
Commit d3080de does not match "(?m)^Signed-off-by:". Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin |
d3080de
to
f200967
Compare
Commit d3080de does not match "(?m)^Signed-off-by:". Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin |
f200967
to
d1b87a0
Compare
d1b87a0
to
1037713
Compare
c344ad3
to
1efd209
Compare
@rolinh I've rebased on main to resolve a failing lint check coming from yesterday's common labels change: #35628
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
/test |
1 similar comment
/test |
Sets up new ConfigMap "cilium-dynamic-metrics-config" and new Helm options under ".Values.hubble.metrics.dynamic". Signed-off-by: Igor Klemenski <igor.klemenski@microsoft.com>
Adds new Hubble flag "hubble-dynamic-metrics-config-path". Signed-off-by: Igor Klemenski <igor.klemenski@microsoft.com>
Adds new logic in Hubble for dynamic metric registration. Adds new methods to the Handler interface: HandleConfigurationUpdate, Deinit. Signed-off-by: Igor Klemenski <igor.klemenski@microsoft.com>
Pass filters to Hubble metric handlers' Init method. Signed-off-by: Igor Klemenski <igor.klemenski@microsoft.com>
Fork metric setup logic into static and dynamic paths during Hubble launch. Split handler bulk creation methods into single iterators. Move ProcessFlow method into the Handler interface. Signed-off-by: Igor Klemenski <igor.klemenski@microsoft.com>
Stop Hubble dynamic metric config change watcher when context is closed. Return early in Hubble start when both dynamic and static metrics specified. Fix pointer ownership convention. Signed-off-by: Igor Klemenski <igor.klemenski@microsoft.com>
1efd209
to
e4978b4
Compare
/ci-clustermesh |
/ci-e2e-upgrade |
Fixes: cilium#35185 Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
Fixes: cilium#35185 Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
Fixes: #35185 Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
Fixes: cilium#35185 Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
Fixes: cilium#35185 Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
Following the example of the Hubble Dynamic Exporter, this PR implements the CFP to allow Hubble metrics to be dynamically configured with a ConfigMap, rather than a static string at startup: https://github.com/cilium/design-cfps/blob/main/hubble/CFP-30788-managing-hubble-metrics-cardinality.md
This PR has been scoped to allow turning metric handlers on/off and applying filters - changing the label set of a registered metric is moved to a follow-up change pending more design, because of constraints in Prometheus.
Changes:
launchHubble
to conditionally use static/dynamic metrics configHandler
interface withDeinit
andHandleConfigurationUpdate
functionsFlowProcessor
interface and moveProcessFlow
toHandler
: all metric handlers areFP
s and the onlyFP
that's not a handler isDropEventEmitter
, which is registered separately;struct Handlers
is thus no longer necessaryTesting:
TestReadMetricConfigFromCM
TestHandlersUpdatedInDfpOnConfigChange
TestHubbleServerWithDynamicMetrics
TestMetricReRegisterAndCollect