Add more labels to prometheus output format #862

@sebhoss

Describe the feature:

The prometheus output format should include references to individual test cases to allow more fine-grained alerting. The current implementation only contains a summary of test cases for each resource type and a summary across all tests. Since test cases can have different criticality levels, neither of those summaries is sufficient to decide whether a person should be called right now (potentially in the middle of the night) or not.

This has been previously discussed in #607, and my understanding is that it was postponed, not rejected. Storage cost is a valid concern, so I don't think this should be enabled by default; it should rather be a format option for the prometheus output.

Describe the solution you'd like

The metric goss_tests_outcomes_total should contain additional labels that uniquely identify a single test case, or a different metric should be introduced that does exactly that.
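
To make the request concrete, here is a rough sketch of what per-test samples could look like in the Prometheus text exposition format. It is not goss code: the label names `resource_id` and `property` are hypothetical, the `type` and `outcome` labels are assumed to match what the existing summary metric carries, and the sample results are made up for illustration.

```python
def escape_label_value(value: str) -> str:
    # The Prometheus text format requires escaping backslash,
    # double quote and newline inside label values.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_sample(name: str, labels: dict, value: int) -> str:
    # Render one sample line: metric_name{label="value",...} value
    rendered = ",".join(
        f'{key}="{escape_label_value(val)}"' for key, val in labels.items()
    )
    return f"{name}{{{rendered}}} {value}"


def render_per_test(results: list, metric: str = "goss_tests_outcomes_total") -> list:
    # One sample per test case, so an alert rule can match a single test.
    return [render_sample(metric, labels, 1) for labels in results]


# Made-up results; `resource_id` and `property` are hypothetical label names.
results = [
    {"type": "file", "resource_id": "/etc/passwd", "property": "exists", "outcome": "pass"},
    {"type": "service", "resource_id": "sshd", "property": "running", "outcome": "fail"},
]

for line in render_per_test(results):
    print(line)
```

With labels like these, an alert could match only the tests considered critical instead of firing on any failure within a resource type. The cost is one time series per test case, which is why this would make more sense as an opt-in format option.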

Describe alternatives you've considered

My current workaround is an annotation I placed on the prometheus alert for goss tests that links to a runbook which tells the poor soul who has to deal with this to do:

$ ssh ...
$ goss --gossfile ... validate

This returns the failing tests and they can decide whether to fix it right now or go back to bed.
