Forward unknown args to vLLM #291
Merged
Fixes #287
Changes proposed in this pull request:
This might be a bit of a silly approach, but there's a ton of arguments for vLLM, and patching them in one by one as each is needed seems unnecessary, so I'm just forwarding any unknown argument to vLLM. Happy to hear alternatives.
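As a rough sketch of the forwarding idea, `argparse.parse_known_args` can split recognized flags from everything else, with the leftovers passed straight through to the vLLM invocation. The flag names and command shape below are illustrative assumptions, not OlmOCR's actual CLI:

```python
# Hypothetical sketch of forwarding unknown arguments to vLLM.
# Flag names and the serve command are illustrative assumptions.
import argparse

def build_vllm_command(argv):
    parser = argparse.ArgumentParser()
    # A known, explicitly handled argument (name is an assumption)
    parser.add_argument("--model", default="some/model")
    # parse_known_args returns (parsed_namespace, list_of_unrecognized_args)
    args, unknown = parser.parse_known_args(argv)
    # Forward every unrecognized flag verbatim to the vLLM server command
    return ["vllm", "serve", args.model] + unknown
```

With this, something like `--disable-mm-preprocessor-cache` reaches vLLM without the wrapper having to declare it.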
I was testing OlmOCR with vLLM 0.10.0 and data parallelism on some B200s, since 0.10.0 has improved support for Blackwell, but I was running into issues with the multimodal input cache, so I needed to add
--disable-mm-preprocessor-cache
(which does solve the issue). This change lets me pass it from the command line.

Edit: I also needed to increase the wait time for vLLM startup, but I didn't add that here. Reasonable to add as a command line arg?
Before submitting
- … section of the CONTRIBUTING docs.
- Writing docstrings section of the CONTRIBUTING docs.