@namasl (Contributor) commented Jun 26, 2025

This PR addresses #345, adding startup, liveness, and readiness probes to prefill and decode pods.

The startup probes are deliberately lenient so that they cover both small models that load quickly and large models with long load times. The values here allow up to 30 minutes for startup.
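For reviewers who want to sanity-check the 30-minute figure, here is a minimal sketch of how a startup probe's worst-case window is usually estimated: the initial delay plus `failureThreshold` times `periodSeconds`. The `ProbeTiming` class, the `startup_budget_seconds` helper, and the example values (180 failures at a 10-second period) are hypothetical and are not taken from this PR's chart; they are just one combination that works out to 30 minutes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeTiming:
    """Subset of Kubernetes probe timing fields.

    Defaults mirror the Kubernetes defaults (initialDelaySeconds=0,
    periodSeconds=10, failureThreshold=3). This class is illustrative only.
    """
    initial_delay_seconds: int = 0
    period_seconds: int = 10
    failure_threshold: int = 3


def startup_budget_seconds(probe: ProbeTiming) -> int:
    """Approximate time a container has to start before the probe gives up."""
    return probe.initial_delay_seconds + probe.period_seconds * probe.failure_threshold


# Hypothetical values, NOT the ones in this PR: 180 attempts, 10 s apart.
example = ProbeTiming(initial_delay_seconds=0, period_seconds=10, failure_threshold=180)
print(startup_budget_seconds(example) / 60)  # 30.0 (minutes)
```

Because liveness and readiness checks are held off until the startup probe succeeds, a generous budget like this should not slow down fast-loading models; they simply pass the startup probe early.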

@nerdalert (Member) commented:

@namasl this looks great. I tested it without any issues. Can you bump the chart version to v1.0.21 in https://github.com/llm-d/llm-d-deployer/blob/main/charts/llm-d/Chart.yaml and re-run pre-commit run -a?

cc/ @Gregory-Pereira

@nerdalert (Member) commented:

@namasl I apologize that this hasn't merged yet. I really like this feature. Could you bump https://github.com/llm-d/llm-d-deployer/blob/main/charts/llm-d/Chart.yaml one more time, to version: 1.0.22, and I will merge it 🙏

@nerdalert (Member) commented Jul 8, 2025

@namasl mind rebasing and running pre-commit run -a as well, please? ty!

namasl added 2 commits July 7, 2025 23:58
Signed-off-by: Nick Masluk <nick@randombytes.net>

@nerdalert (Member) left a comment

LGTM, ty for the feature.

@nerdalert nerdalert merged commit a51e9ca into llm-d:main Jul 8, 2025
3 checks passed