Health metrics (Part 2) #2796

Merged

Nadine-H
Contributor

Part of #2736

Adding two custom metrics:

  • dstack_submit_to_provision_duration_seconds: Time from when a run is submitted to when its first job is provisioned
  • dstack_pending_runs_total: Total number of pending runs

We can add more metrics later, but for now these two help show whether runs are stuck in the SUBMITTED or PENDING state, which could point to an issue with dstack or the underlying infrastructure.
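
For readers who want a concrete picture of what the two metrics measure, here is a minimal, stdlib-only sketch. It is not the code in this PR: the `Run` record, its field names, the lowercase status strings, and the bucket bounds are all hypothetical, and it assumes the pending count includes only runs whose status is exactly pending.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Sequence


@dataclass
class Run:
    # Hypothetical record for illustration; not dstack's actual run model.
    status: str
    submitted_at: datetime
    first_job_provisioned_at: Optional[datetime] = None


def submit_to_provision_seconds(runs: Sequence[Run]) -> List[float]:
    """Seconds from submission to first job provisioning, for provisioned runs only."""
    return [
        (r.first_job_provisioned_at - r.submitted_at).total_seconds()
        for r in runs
        if r.first_job_provisioned_at is not None
    ]


def pending_runs_total(runs: Sequence[Run]) -> int:
    """Number of runs currently pending (assumed: status == 'pending' only)."""
    return sum(1 for r in runs if r.status == "pending")


def cumulative_buckets(values: Sequence[float], bounds: Sequence[float]) -> Dict[float, int]:
    """Prometheus-style cumulative histogram counts: observations <= each bound."""
    return {b: sum(1 for v in values if v <= b) for b in bounds}


# Example data: two provisioned runs, one pending, one still submitted.
t0 = datetime(2025, 6, 16, 11, 0, 0)
runs = [
    Run("running", t0, t0 + timedelta(seconds=42)),
    Run("running", t0, t0 + timedelta(seconds=400)),
    Run("pending", t0),
    Run("submitted", t0),
]
durations = submit_to_provision_seconds(runs)
pending = pending_runs_total(runs)
buckets = cumulative_buckets(durations, [60, 300, 600])
```

A duration distribution that shifts toward the upper buckets, or a pending count that keeps growing, is the kind of signal these metrics are meant to surface.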

@peterschmidt85 peterschmidt85 requested a review from un-def June 16, 2025 11:08
@Nadine-H Nadine-H force-pushed the nadine/2736_add-custom-health-metrics branch from 8941494 to 7227265 Compare June 18, 2025 18:22
@Nadine-H Nadine-H requested a review from un-def June 18, 2025 18:23
@un-def un-def merged commit 40ee802 into dstackai:master Jun 26, 2025
25 checks passed