This repository was archived by the owner on Dec 21, 2023. It is now read-only.

Keptn Core (remediation-service): React on problem.open and process predefined workflow: trigger action, wait, evaluate, continue remediation or send a remediation.finished #1849

@johannes-b

Description


The implementation of the remediation workflow is based on the event stream outlined here: #1616

Functionality of the remediation-service using remediation.yaml/0.2.0:

  • When receiving an sh.keptn.problem.open event:
  1. Get the remediation file for the problematic service (from the corresponding project/stage). If it is not found, send sh.keptn.event.remediation.finished with result: "Could not execute remediation action because service is not available or remediation file not configured" and status: errored.
  2. Verify that remediation.yaml/0.2.0 is used.
  3. Add sh.keptn.event.remediation.triggered to the materialized view.
  4. Get the first action specified for the problem type + state (open/resolved) (if problem type is not listed and there is a “*” configured, use that configuration)
  5. If an action is configured:
    • then, send an sh.keptn.event.action.triggered
    • else, send an sh.keptn.event.remediation.finished with result: triggered all actions and status: succeeded
  6. Add sh.keptn.event.remediation.status.changed, containing the triggered action, to the materialized view.
  • When receiving an sh.keptn.event.action.finished event:
  1. Wait for 10 minutes and then send an sh.keptn.event.start-evaluation event.
  • When receiving an sh.keptn.event.evaluation-done event with test-strategy=real-user:
  1. Get the remediation file for the problematic service (from the corresponding project/stage). If it is not found, send sh.keptn.event.remediation.finished with result: "Could not execute remediation action because service is not available or remediation file not configured" and status: errored.
  2. Verify that remediation.yaml/0.2.0 is used.
  3. Check whether the remediation is still open by querying the open remediations using the project, stage, service, and shkeptncontext data. If the remediation is open, the last remediation.status.changed event indicates the last executed action.
  4. Get the next action specified for the problem type + state (open/resolved) (if problem type is not listed and there is a “*” configured, use that configuration)
  5. If a next action is configured:
    • then, send an sh.keptn.event.action.triggered
    • else, send an sh.keptn.event.remediation.finished
  • When receiving a problem closed event:
  1. Get all open remediations for the service.
  2. If any open remediation matches the problem.id, delete it by calling the /DELETE endpoint for remediation actions in the configuration-service.
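
To make the control flow above easier to discuss, here is a minimal in-memory sketch of the action selection and the "next action or finish" decision. It is not the remediation-service implementation. The `RemediationService` and `OpenRemediation` classes, the `actions_for` helper, and the representation of the remediation file as a plain dict (problem type to ordered list of action names) are all hypothetical. The sketch also assumes that an open remediation is dropped once remediation.finished is sent and that the evaluation-done path reuses status: succeeded, neither of which is stated above. The 10-minute wait, the version check, and the result text are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

FINISHED = "sh.keptn.event.remediation.finished"
ACTION_TRIGGERED = "sh.keptn.event.action.triggered"


def actions_for(remediation: Dict[str, List[str]], problem_type: str) -> List[str]:
    """Return the actions for a problem type, falling back to the "*" entry."""
    if problem_type in remediation:
        return remediation[problem_type]
    return remediation.get("*", [])


@dataclass
class OpenRemediation:
    # Hypothetical stand-in for the materialized-view entry of one remediation.
    problem_id: str
    problem_type: str
    last_action_index: int


@dataclass
class RemediationService:
    # None models "remediation file not found" (assumption of this sketch).
    remediation: Optional[Dict[str, List[str]]]
    # Keyed by shkeptncontext.
    open_remediations: Dict[str, OpenRemediation] = field(default_factory=dict)
    # Events that would be sent, recorded as (type, data) for inspection.
    sent: List[Tuple[str, dict]] = field(default_factory=list)

    def _advance(self, context: str, problem_id: str, problem_type: str, index: int) -> None:
        if self.remediation is None:
            self.open_remediations.pop(context, None)
            self.sent.append((FINISHED, {"status": "errored"}))
            return
        actions = actions_for(self.remediation, problem_type)
        if index < len(actions):
            # Record the triggered action, then trigger it.
            self.open_remediations[context] = OpenRemediation(problem_id, problem_type, index)
            self.sent.append((ACTION_TRIGGERED, {"action": actions[index]}))
        else:
            # No (further) action configured: the remediation is finished.
            self.open_remediations.pop(context, None)
            self.sent.append((FINISHED, {"status": "succeeded"}))

    def on_problem_open(self, context: str, problem_id: str, problem_type: str) -> None:
        self._advance(context, problem_id, problem_type, 0)

    def on_evaluation_done(self, context: str) -> None:
        current = self.open_remediations.get(context)
        if current is None:
            return  # remediation is no longer open, nothing to continue
        self._advance(context, current.problem_id, current.problem_type,
                      current.last_action_index + 1)

    def on_problem_closed(self, problem_id: str) -> None:
        # Delete every open remediation that matches the problem id.
        for context in [c for c, r in self.open_remediations.items()
                        if r.problem_id == problem_id]:
            del self.open_remediations[context]
```

With a file that only configures `"*": ["scale", "rollback"]`, a problem.open event triggers `scale`, the first evaluation-done triggers `rollback`, and the second evaluation-done sends remediation.finished because no further action is configured.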

Definition of Done:
