Add single lora adapter support for vLLM inference. #1679
Merged
Motivation
When evaluating SFT/DPO-trained models, vLLM-accelerated inference with a single LoRA adapter is often needed, but the current code does not support it. This PR adds a few lines of code to enable that functionality.
Modification
Added a lora_path argument to the VLLM model in opencompass/models/vllm.py and used it during generate, as sketched below.
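Roughly, the idea is to pass an optional LoRARequest to vLLM's generate call whenever lora_path is set. The following is a minimal standalone sketch, not the exact OpenCompass class code; the adapter name 'sft_adapter' is just an illustrative label.

from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# The engine must be created with enable_lora=True for adapters to load.
llm = LLM(model='llama-3-8b-instruct', enable_lora=True,
          tensor_parallel_size=2, dtype='bfloat16', max_model_len=4096)

lora_path = 'Llama3_8B_LoRA_checkpoints/checkpoint-1250/'
# Build a LoRARequest only when a lora_path is configured; otherwise pass None
# so the base model is used unchanged.
lora_request = LoRARequest('sft_adapter', 1, lora_path) if lora_path else None

sampling_params = SamplingParams(temperature=0.0, top_p=0.8, max_tokens=1024)
outputs = llm.generate(['Hello, my name is'], sampling_params,
                       lora_request=lora_request)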
Use cases (Optional)
With this change, vLLM inference with a LoRA adapter can be configured as shown below.
models = [
    dict(
        type=VLLM,
        abbr='Llama3_8B_LoRA_SFT',
        path='llama-3-8b-instruct',
        model_kwargs=dict(tensor_parallel_size=2, dtype='bfloat16', seed=0,
                          max_model_len=4096, enable_lora=True),
        max_out_len=100,
        max_seq_len=4096,
        batch_size=32,
        lora_path='Llama3_8B_LoRA_checkpoints/checkpoint-1250/',
        generation_kwargs=dict(temperature=0.0, top_p=0.8, max_tokens=1024),
        stop_words=['<|end_of_text|>', '<|eot_id|>'],
        run_cfg=dict(num_gpus=2),
    )
]
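Note that enable_lora=True must be set in model_kwargs so the underlying vLLM engine is initialized with LoRA support; without it, the lora_path setting would have no effect.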