entangled90
Contributor

Description

HealthMonitor reports arbitrarily nested components in a tree-like structure.

CriticalHealthMonitor recursively updates its registered components in order to aggregate their health status into a single recursive data type.

HealthReport has been refactored by adding a children field that makes it recursive. Small refactoring has been done, primarily moving some functions from CriticalComponentHealthMonitor into HealthReport.
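
To illustrate the recursive shape described above, here is a minimal sketch, not the code from this PR. The `Report` record, its `leaf`/`fromChildren` helpers, the `Status` enum, and the component names are all hypothetical stand-ins chosen for the example. It assumes a parent's status is the worst status among its children.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HealthTreeSketch {
  // Assumed severity order, best to worst; the real enum may differ.
  enum Status { HEALTHY, UNHEALTHY, DEAD }

  // Hypothetical recursive report: each node carries its own status plus its children.
  record Report(String name, Status status, Map<String, Report> children) {
    static Report leaf(String name, Status status) {
      return new Report(name, status, Map.of());
    }

    // Builds a parent whose status is the worst status found among its children.
    static Report fromChildren(String name, List<Report> children) {
      Status worst = Status.HEALTHY;
      Map<String, Report> byName = new LinkedHashMap<>();
      for (Report child : children) {
        if (child.status().compareTo(worst) > 0) {
          worst = child.status();
        }
        byName.put(child.name(), child);
      }
      return new Report(name, worst, byName);
    }

    // Renders the tree with two spaces of indentation per nesting level.
    String render(String indent) {
      StringBuilder sb = new StringBuilder();
      sb.append(indent).append(name).append(": ").append(status).append('\n');
      for (Report child : children.values()) {
        sb.append(child.render(indent + "  "));
      }
      return sb.toString();
    }
  }

  public static void main(String[] args) {
    // Component names are made up for the example.
    Report raft = Report.leaf("Raft", Status.HEALTHY);
    Report processor = Report.leaf("StreamProcessor", Status.UNHEALTHY);
    Report partition = Report.fromChildren("ZeebePartition", List.of(processor));
    Report root = Report.fromChildren("Partition-1", List.of(raft, partition));
    System.out.print(root.render(""));
  }
}
```

Because each node only looks at its direct children, an unhealthy leaf propagates upward one level at a time, and the rendered tree shows which branch caused the parent's status.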

HealthStatus has been extended with a comparator, so that it's possible to take the "maximum" of a collection of HealthStatus in a way that returns the "worst" HealthStatus.
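
As a rough sketch of the "maximum means worst" idea, the example below orders statuses by severity and takes the maximum of a collection. The enum constants, the `BY_SEVERITY` comparator, and the `worst` helper are assumptions made for illustration; the comparator added in this PR may be defined differently.

```java
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

public class WorstStatusSketch {
  // Hypothetical enum: constants are declared from best to worst,
  // so a higher ordinal means a more severe status.
  enum HealthStatus {
    HEALTHY,
    UNHEALTHY,
    DEAD;

    // Assumed comparator: a "greater" status is a worse one.
    static final Comparator<HealthStatus> BY_SEVERITY =
        Comparator.comparingInt(HealthStatus::ordinal);
  }

  // Returns the worst status in the collection, or HEALTHY when it is empty.
  static HealthStatus worst(Collection<HealthStatus> statuses) {
    return statuses.stream().max(HealthStatus.BY_SEVERITY).orElse(HealthStatus.HEALTHY);
  }

  public static void main(String[] args) {
    List<HealthStatus> statuses =
        List.of(HealthStatus.HEALTHY, HealthStatus.UNHEALTHY, HealthStatus.HEALTHY);
    System.out.println(worst(statuses));
  }
}
```

Treating an empty collection as `HEALTHY` is a design choice of this sketch: a component with no children has nothing that could degrade it.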

Related issues

relates #16454

@entangled90 entangled90 self-assigned this Oct 23, 2024
@github-actions github-actions bot added component/zeebe Related to the Zeebe component/team component/operate Related to the Operate component/team component/tasklist Related to the Tasklist component/team component/optimize Related to Optimize component/team labels Oct 23, 2024
@entangled90 entangled90 force-pushed the cs-16454-visualize-critical-health-components branch from 4870196 to 7625f9e Compare October 23, 2024 17:01
@entangled90 entangled90 requested a review from npepinpe October 23, 2024 17:09
@entangled90 entangled90 force-pushed the cs-16454-visualize-critical-health-components branch 2 times, most recently from 2295e0a to b781879 Compare October 24, 2024 07:23
@entangled90 entangled90 removed component/operate Related to the Operate component/team component/tasklist Related to the Tasklist component/team component/optimize Related to Optimize component/team labels Oct 24, 2024
@github-actions github-actions bot added component/operate Related to the Operate component/team component/tasklist Related to the Tasklist component/team component/optimize Related to Optimize component/team labels Oct 24, 2024
@entangled90 entangled90 force-pushed the cs-16454-visualize-critical-health-components branch 3 times, most recently from 890fd11 to 194e659 Compare October 24, 2024 11:12
@entangled90 entangled90 marked this pull request as ready for review October 24, 2024 11:13
@entangled90 entangled90 force-pushed the cs-16454-visualize-critical-health-components branch from 194e659 to d83cfdf Compare October 24, 2024 11:42
Member

@npepinpe npepinpe left a comment

🚀 First, I really like the changes and how you approached the PR. I saw you initially started with recomputing on every update, and then changed that down the line 👍 There are a lot of comments, but I want to highlight that I generally like the PR 🙂

❓ My main question here is: why are we separating it from the partition status? Do we think it would pollute the result of the partition status call, so we want to keep it separate? That's a fair point, but honestly I'm not sure about it either. I would poll the other engineers in our team channel on it (I can do that for you so you see how we do it via Slack :))

I'm also not super into adding a new endpoint /actuator/partitionHealth. It adds another endpoint for little benefit, I think, and makes it harder to discover partition-related things. Why can't we use /actuator/partitions/health? We can simply add a new operation to the existing actuator endpoint (BrokerAdminServiceEndpoint). If that's complicated because of how it's set up now (as a plain Endpoint), one option is migrating it to a WebEndpoint.

One thing we've been wanting to do is de-Springify the modules, and move all Spring magic into a single module (dist), or leave it in Spring-specific modules (e.g. the REST API). This is to avoid complex object graphs which become hard to track (if anything can provide beans), and to minimize the Spring magic in general. So one opportunity here is moving the existing BrokerAdminServiceEndpoint to the dist module, for example. There's also some possible refactoring we can do as a follow-up in the admin service itself, from what I can see 😄 (but we can discuss that separately).

@entangled90 entangled90 removed component/operate Related to the Operate component/team component/tasklist Related to the Tasklist component/team component/optimize Related to Optimize component/team labels Oct 24, 2024
@entangled90 entangled90 force-pushed the cs-16454-visualize-critical-health-components branch from d83cfdf to 9e35394 Compare October 25, 2024 09:12
@github-actions github-actions bot added component/operate Related to the Operate component/team component/tasklist Related to the Tasklist component/team component/optimize Related to Optimize component/team labels Oct 25, 2024
@entangled90 entangled90 requested a review from npepinpe October 25, 2024 09:15
@entangled90 entangled90 force-pushed the cs-16454-visualize-critical-health-components branch from 9e35394 to c7ace53 Compare October 25, 2024 09:17
@entangled90 entangled90 removed component/operate Related to the Operate component/team component/tasklist Related to the Tasklist component/team component/optimize Related to Optimize component/team labels Oct 25, 2024
@entangled90 entangled90 force-pushed the cs-16454-visualize-critical-health-components branch from c7ace53 to 1a77cda Compare October 25, 2024 09:49
@github-actions github-actions bot added component/operate Related to the Operate component/team component/tasklist Related to the Tasklist component/team component/optimize Related to Optimize component/team labels Oct 25, 2024
@entangled90 entangled90 force-pushed the cs-16454-visualize-critical-health-components branch from 18e9e3d to 3d730ff Compare October 28, 2024 09:10
Member

@npepinpe npepinpe left a comment

🚀 nice

Please address all comments, but I'm pre-approving because I doubt we need another review (though if you want, you can ofc request one!)

@entangled90 entangled90 force-pushed the cs-16454-visualize-critical-health-components branch from 3d730ff to 4fc9497 Compare October 29, 2024 07:54
@entangled90 entangled90 force-pushed the cs-16454-visualize-critical-health-components branch from 4fc9497 to 42dc62d Compare October 29, 2024 08:17
@entangled90 entangled90 added this pull request to the merge queue Oct 29, 2024
Merged via the queue into main with commit e0801fd Oct 29, 2024
64 checks passed
@entangled90 entangled90 deleted the cs-16454-visualize-critical-health-components branch October 29, 2024 09:51
@npepinpe npepinpe added the backport stable/8.7 Backport a pull request to stable/8.7 label Aug 14, 2025
@npepinpe
Member

/backport

@backport-action
Collaborator

Created backport PR for stable/8.7:

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin backport-23929-to-stable/8.7
git worktree add --checkout .worktree/backport-23929-to-stable/8.7 backport-23929-to-stable/8.7
cd .worktree/backport-23929-to-stable/8.7
git reset --hard HEAD^
git cherry-pick -x 4fe0324a8902fce33f91d2a03d88c6d3ae913bc1 d894274b72edd020a31104f904a674981b614781 42dc62d82a5704355ca151941694a87a7d2765b7
git push --force-with-lease

github-merge-queue bot pushed a commit that referenced this pull request Aug 15, 2025
…component in a tree-like structure (#36806)

# Description
Backport of #23929 to `stable/8.7`.

relates to #16454
github-merge-queue bot pushed a commit that referenced this pull request Aug 15, 2025
…ts arbitrarily nested component in a tree-like structure (#36825)

# Description
Backport of #36806 to `stable/8.6`.

relates to #23929 #16454
Labels
backport stable/8.7 Backport a pull request to stable/8.7 component/operate Related to the Operate component/team component/optimize Related to Optimize component/team component/tasklist Related to the Tasklist component/team component/zeebe Related to the Zeebe component/team version:8.6.25 version:8.7.11
3 participants