@dnr dnr commented Dec 1, 2022

What was changed

Several changes to fix problems using dashboards with recent server versions. You might want to review each commit individually, and I can remove ones that we don't want.

  • Replace type tag with service_name: Most panels didn't work without this change.
  • Remove obsolete panels: self-explanatory
  • Set default time range for server dashboards to now-15m: The ranges were inconsistent between server-related dashboards before (I didn't change cloud). I find 15m the most useful when testing the server locally, but I can change to 1h if we want. I just want them to be consistent.
  • Use default data source for server general dashboard: The dashboard referred to an old data source value and Grafana complained. I think we can just use the default data source like the rest of the dashboards. (I didn't change the k8s one.)
  • Remove datasource template from visibility + worker dashboards: This was defaulting to an obsolete value and breaking the dashboards. I think they can use the default like the others.
  • Remove cluster/service templating from visibility and worker dashboards: This was breaking dashboards since
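As a rough illustration of the time-range bullet above, the snippet below shows one way to give several dashboards the same default range. It assumes each dashboard has been loaded as a dict with Grafana's top-level "time" object; the dashboard names and the helper set_default_time_range are hypothetical and not taken from this PR.

```python
import copy


def set_default_time_range(dashboard, start="now-15m", end="now"):
    """Return a copy of a dashboard dict with its default time range replaced."""
    updated = copy.deepcopy(dashboard)
    updated["time"] = {"from": start, "to": end}
    return updated


# Hypothetical dashboards with inconsistent default ranges.
dashboards = {
    "server-general": {"title": "Server General", "time": {"from": "now-1h", "to": "now"}},
    "history": {"title": "History", "time": {"from": "now-6h", "to": "now"}},
}

# Every dashboard ends up with the same range; the inputs are left untouched.
normalized = {name: set_default_time_range(d) for name, d in dashboards.items()}
```

Returning a copy rather than mutating in place keeps the originals available for comparison before anything is written back to disk.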

Why?

make things work better

Checklist

  1. Closes

  2. How was this tested:
    Tested with a local server + Prometheus + Grafana (make start-dependencies in the temporal repo) after fixing the path (will send a PR there after this is merged)

  3. Any docs updates needed?

@robholland robholland left a comment

LGTM, just clean up the redundant empty {} selectors.
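The cleanup requested here could also be scripted rather than applied by hand. The following is a minimal sketch, assuming the dashboard JSON has already been parsed into Python dicts and lists; strip_empty_selectors and clean_dashboard are hypothetical helper names, not part of this repository. It only rewrites "expr" values, so fields such as "legendFormat" are left alone. One caveat: the regular expression would also remove a literal {} inside a quoted label value, which none of the expressions in this PR contain.

```python
import re

# Matches a label selector that holds nothing but optional whitespace, e.g. "{}".
EMPTY_SELECTOR = re.compile(r"\{\s*\}")


def strip_empty_selectors(expr):
    """Remove empty `{}` selectors from a PromQL expression string."""
    return EMPTY_SELECTOR.sub("", expr)


def clean_dashboard(node):
    """Walk parsed dashboard JSON and clean every string "expr" value in place."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "expr" and isinstance(value, str):
                node[key] = strip_empty_selectors(value)
            else:
                clean_dashboard(value)
    elif isinstance(node, list):
        for item in node:
            clean_dashboard(item)
    return node


# A panel fragment shaped like the ones in this diff.
panel = {
    "targets": [
        {
            "expr": "sum without (instance, pod) (rate(executor_done{}[1m]))",
            "legendFormat": "{{operation}}",
        }
    ]
}
clean_dashboard(panel)
```

An empty selector matches the same series as the bare metric name, so the rewritten queries should return the same results.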

@@ -80,7 +80,7 @@
"targets": [
{
"exemplar": true,
- "expr": "(sum without (instance, pod) (started{cluster=\"$cluster\"})) - (sum without (instance, pod) (stopped{cluster=\"$cluster\"} or started{cluster=\"$cluster\"}*0))",
+ "expr": "(sum without (instance, pod) (started{})) - (sum without (instance, pod) (stopped{} or started{}*0))",

Suggested change
- "expr": "(sum without (instance, pod) (started{})) - (sum without (instance, pod) (stopped{} or started{}*0))",
+ "expr": "(sum without (instance, pod) (started)) - (sum without (instance, pod) (stopped or started*0))",

@@ -200,7 +200,7 @@
"targets": [
{
"exemplar": true,
- "expr": "sum without (instance, pod) (rate(executor_done{cluster=\"$cluster\"}[1m]))",
+ "expr": "sum without (instance, pod) (rate(executor_done{}[1m]))",

Suggested change
- "expr": "sum without (instance, pod) (rate(executor_done{}[1m]))",
+ "expr": "sum without (instance, pod) (rate(executor_done[1m]))",

@@ -282,7 +282,7 @@
"targets": [
{
"exemplar": true,
- "expr": "sum without (instance, pod) (rate(executor_dropped{cluster=\"$cluster\"}[1m]))",
+ "expr": "sum without (instance, pod) (rate(executor_dropped{}[1m]))",

Suggested change
- "expr": "sum without (instance, pod) (rate(executor_dropped{}[1m]))",
+ "expr": "sum without (instance, pod) (rate(executor_dropped[1m]))",

@@ -364,7 +364,7 @@
"targets": [
{
"exemplar": true,
- "expr": "sum without (instance, pod) (rate(executor_err{cluster=\"$cluster\"}[1m]))",
+ "expr": "sum without (instance, pod) (rate(executor_err{}[1m]))",

Suggested change
- "expr": "sum without (instance, pod) (rate(executor_err{}[1m]))",
+ "expr": "sum without (instance, pod) (rate(executor_err[1m]))",

@@ -446,7 +446,7 @@
"targets": [
{
"exemplar": true,
- "expr": "sum without (instance, pod) (rate(executor_deferred{cluster=\"$cluster\"}[1m]))",
+ "expr": "sum without (instance, pod) (rate(executor_deferred{}[1m]))",

Suggested change
- "expr": "sum without (instance, pod) (rate(executor_deferred{}[1m]))",
+ "expr": "sum without (instance, pod) (rate(executor_deferred[1m]))",

@@ -2035,7 +2035,7 @@
"targets": [
{
"exemplar": true,
- "expr": "sum without (instance, pod)(rate(replicator_dlq_enqueue_fails{cluster=\"$cluster\"}[1m]))",
+ "expr": "sum without (instance, pod)(rate(replicator_dlq_enqueue_fails{}[1m]))",

Suggested change
- "expr": "sum without (instance, pod)(rate(replicator_dlq_enqueue_fails{}[1m]))",
+ "expr": "sum without (instance, pod)(rate(replicator_dlq_enqueue_fails[1m]))",

@@ -2117,7 +2117,7 @@
"targets": [
{
"exemplar": true,
- "expr": "sum without (instance, pod)(rate(replicator_messages{cluster=\"$cluster\"}[1m]))",
+ "expr": "sum without (instance, pod)(rate(replicator_messages{}[1m]))",

Suggested change
- "expr": "sum without (instance, pod)(rate(replicator_messages{}[1m]))",
+ "expr": "sum without (instance, pod)(rate(replicator_messages[1m]))",

@@ -2198,15 +2198,15 @@
"targets": [
{
"exemplar": true,
- "expr": "sum without (instance, pod)(rate(replicator_messages_dropped{cluster=\"$cluster\"}[1m]))",
+ "expr": "sum without (instance, pod)(rate(replicator_messages_dropped{}[1m]))",

Suggested change
- "expr": "sum without (instance, pod)(rate(replicator_messages_dropped{}[1m]))",
+ "expr": "sum without (instance, pod)(rate(replicator_messages_dropped[1m]))",

"hide": false,
"interval": "",
"legendFormat": "{{operation}}",
"refId": "D"
},
{
"exemplar": true,
- "expr": "sum without (instance, pod)(rate(replicator_dlq_enqueue_fails{cluster=\"$cluster\"}[1m]))",
+ "expr": "sum without (instance, pod)(rate(replicator_dlq_enqueue_fails{}[1m]))",

Suggested change
- "expr": "sum without (instance, pod)(rate(replicator_dlq_enqueue_fails{}[1m]))",
+ "expr": "sum without (instance, pod)(rate(replicator_dlq_enqueue_fails[1m]))",

@@ -2288,7 +2288,7 @@
"targets": [
{
"exemplar": true,
- "expr": "sum without (instance, pod)(rate(replicator_errors{cluster=\"$cluster\"}[1m]))",
+ "expr": "sum without (instance, pod)(rate(replicator_errors{}[1m]))",

Suggested change
- "expr": "sum without (instance, pod)(rate(replicator_errors{}[1m]))",
+ "expr": "sum without (instance, pod)(rate(replicator_errors[1m]))",

@dnr dnr merged commit 590cbf3 into master Dec 7, 2022
@dnr dnr deleted the dnr/fixes1 branch December 7, 2022 03:51