This repo contains a proof-of-concept implementation of the system described in the paper [SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning](https://arxiv.org/abs/2504.07891).
- Create a conda environment:
  ```bash
  conda create -n specreason python=3.12 -y
  ```
- Activate the conda environment:
  ```bash
  conda activate specreason
  ```
- Install vLLM 0.8.2:
  ```bash
  pip install vllm==0.8.2
  ```
- To run speculative decoding, you need to fix an issue in vLLM and install it from source; see `spec_decoding_fix.md` for details (a sketch of the source-install flow follows this list).
- Install other dependencies:
  ```bash
  pip install datasets
  ```
  (and whatever shows up in error messages 🙂)
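For the speculative-decoding setup mentioned above, the build roughly follows vLLM's standard source install. A minimal sketch, assuming the `v0.8.2` tag matches the release used here; the actual patch to apply is described in `spec_decoding_fix.md`:

```bash
# Build vLLM from source after applying the fix
# (sketch only; see spec_decoding_fix.md for the actual change)
git clone https://github.com/vllm-project/vllm.git
cd vllm
git checkout v0.8.2   # match the release installed above
# ... apply the fix described in spec_decoding_fix.md ...
pip install -e .
```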
The script `spec_reason.py` requires two vLLM engines to be up and running. We use the following commands to launch a 32B base model and a 1.5B small model on two A6000 (48 GB) GPUs, both with TP=2. You might need to adjust how the GPU memory (and hence the KV-cache space) is partitioned between the two servers (via `--gpu-memory-utilization`) according to your hardware setup.
```bash
VLLM_USE_V1=0 vllm serve Qwen/QwQ-32B --dtype auto -tp 2 --max_model_len 8192 --gpu-memory-utilization 0.8 --enable-prefix-caching --port 30000
VLLM_USE_V1=0 vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --dtype auto -tp 2 --max_model_len 8192 --gpu-memory-utilization 0.1 --enable-prefix-caching --port 30001
```
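Once both servers are up, you can sanity-check that they are reachable. vLLM's server speaks the OpenAI-compatible API, so querying `/v1/models` on each port (assuming the ports above) should return the served model name:

```bash
# Quick readiness check for the two engines (assumes ports 30000/30001 above)
curl -s http://localhost:30000/v1/models   # should list Qwen/QwQ-32B
curl -s http://localhost:30001/v1/models   # should list DeepSeek-R1-Distill-Qwen-1.5B
```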
```bash
mkdir results
OUTPUT_DIR=./results
python spec_reason.py --dataset_name aime --problem_id 60 --repeat_id 0 --score_threshold 7.0 --score_method greedy --token_budget 8192 --output_dir "$OUTPUT_DIR"
```
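To run more than one problem, you can wrap the call in a shell loop. A minimal sketch, where the problem and repeat ranges are illustrative and the remaining flags match the command above:

```bash
# Sweep several AIME problems with multiple repeats per problem
# (illustrative ranges; flags match the single-run command above)
for problem_id in $(seq 60 64); do
  for repeat_id in 0 1 2; do
    python spec_reason.py --dataset_name aime --problem_id "$problem_id" \
      --repeat_id "$repeat_id" --score_threshold 7.0 --score_method greedy \
      --token_budget 8192 --output_dir "$OUTPUT_DIR"
  done
done
```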
If you find this work useful, please cite:

```bibtex
@article{pan2025specreason,
  title={SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning},
  author={Pan, Rui and Dai, Yinwei and Zhang, Zhihao and Oliaro, Gabriele and Jia, Zhihao and Netravali, Ravi},
  journal={arXiv preprint arXiv:2504.07891},
  year={2025}
}
```