```bash
cd DREAM
pip install -r requirements.txt
# Download the demo weights based on llava-v1.6-vicuna-7b
git clone https://huggingface.co/Alexhu1999/DREAM-llava-v1.6-vicuna-7b
```
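If you prefer not to use git, the same checkpoint can be fetched with the huggingface_hub client; a minimal sketch (requires `pip install huggingface_hub`):

```python
# Sketch: download the DREAM demo weights without git.
from huggingface_hub import snapshot_download

# Returns the local directory containing the checkpoint; pass it
# as --ea-model-path in the commands below.
local_dir = snapshot_download("Alexhu1999/DREAM-llava-v1.6-vicuna-7b")
print(local_dir)
```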
The inference code we provide automatically allocates model weights across multiple GPUs, allowing you to run models that exceed the memory of a single GPU.
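For reference, the usual Hugging Face / Accelerate pattern for this kind of sharded loading looks like the sketch below; DREAM's own loader may differ in its exact API.

```python
# Minimal sketch of multi-GPU weight allocation (assumes `accelerate`
# is installed); illustrative, not DREAM's actual loading code.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "/home/apc/models/vicuna-7b-v1.3",  # base-model path from the example below
    device_map="auto",   # shard layers across all visible GPUs
    torch_dtype="auto",  # keep the checkpoint's native precision
)
```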
We provide a web interface, which you can launch with the following command. Once the model has fully loaded, a URL will be printed in the terminal; open it in your browser to use the interface.
```bash
python -m dream.application.webui \
    --ea-model-path /home/apc/models/DREAM-Vicuna-7B-v1.3 \
    --base-model-path /home/apc/models/vicuna-7b-v1.3 \
    --model-type vicuna \
    --total-token 8
```
--total-token sets the number of draft tokens. For smaller models and more capable GPUs, this value can be set larger; tuning it to your specific device and model yields better speedups. If set to -1, DREAM configures this parameter automatically.
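To make the trade-off concrete, the toy sketch below (illustrative only, not DREAM's implementation) shows one draft-then-verify round of speculative decoding: a larger draft budget raises the best-case number of tokens accepted per target-model pass, but wastes draft compute when the acceptance rate is low.

```python
# Toy draft-then-verify round; draft_next/target_next stand in for the
# draft and target models (each maps a token sequence to the next token).
def speculative_step(draft_next, target_next, context, total_token):
    # Draft phase: propose total_token tokens autoregressively (cheap).
    proposal = []
    for _ in range(total_token):
        proposal.append(draft_next(context + proposal))
    # Verify phase: a single target pass keeps the longest agreeing prefix.
    accepted = []
    for tok in proposal:
        if target_next(context + accepted) == tok:
            accepted.append(tok)
        else:
            break
    # The target model always contributes one token, so progress is guaranteed.
    accepted.append(target_next(context + accepted))
    return context + accepted
```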
You can run the following command to generate the training data.
```bash
python -m dream.ge_data.allocation --outdir [path of data]
```
```bash
cd dream/model
deepspeed main_deepspeed.py \
    --deepspeed_config /home/apc/DREAM/dream/train/ds_config.json \
    --tmpdir /home/apc/Bingle/data/llava_vicuna_mmt_0/12_data/sharegpt_0_7999_mufp16 \
    --cpdir /home/apc/DREAM/dream/train/vicuna-7b-ckpt \
    --configpath /home/apc/DREAM/dream/train/vicuna_7B_config.json
```
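For reference, a DeepSpeed config along the lines of ds_config.json can be generated as below; the specific values are illustrative assumptions, not the settings shipped with DREAM.

```python
# Write an illustrative ds_config.json (values are assumptions).
import json

ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 1,
    "gradient_clipping": 1.0,
    "bf16": {"enabled": True},          # mixed-precision training
    "zero_optimization": {"stage": 2},  # shard optimizer state and gradients
}

with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```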
You can test the speed of DREAM on MT-bench using the following command.
```bash
python -m dream.evaluation.eval_llava \
    --ea-model-path [path of DREAM weight] \
    --base-model-path [path of the original model]
```
Each evaluation run generates a .jsonl file that records the generation results and wall times.
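To compare runs, you can total the recorded wall times from two such files; the field name "wall_time" below is an assumption about the .jsonl schema, so adjust it to match the actual output.

```python
# Sketch: compute the speedup of DREAM over a baseline run from two
# result files (the "wall_time" field name is assumed).
import json

def total_wall_time(path):
    with open(path) as f:
        return sum(json.loads(line)["wall_time"] for line in f)

baseline = total_wall_time("baseline.jsonl")
dream = total_wall_time("dream.jsonl")
print(f"speedup: {baseline / dream:.2f}x")
```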
This project draws on many excellent projects in the LLM community, such as Medusa, EAGLE, and FastChat. We are releasing the LLaVA version first; support for other models will be merged soon.
If you find our work useful, please consider citing:
```bibtex
@misc{hu2025dreamdraftingrefinedtarget,
      title={DREAM: Drafting with Refined Target Features and Entropy-Adaptive Cross-Attention Fusion for Multimodal Speculative Decoding},
      author={Yunhai Hu and Tianhua Xia and Zining Liu and Rahul Raman and Xingyu Liu and Bo Bao and Eric Sather and Vithursan Thangarasa and Sai Qian Zhang},
      year={2025},
      eprint={2505.19201},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.19201},
}
```