This repository is the official implementation of RealisDance-DiT. RealisDance-DiT is a structurally simple, empirically robust, and experimentally strong baseline model for controllable character animation in the wild.
- 2025-05-19: Released the inference code and weights of RealisDance-DiT.
- 2025-05-19: You may also be interested in our Uni3C.
- 2024-10-15: Released RealisDance and the code of pose preparation for RealisDance.
- 2024-09-10: Now you can try more interesting AI video editing in XunGuang.
- 2024-09-09: You may also be interested in our human part repair method RealisHuman.
Here are several character animations generated by RealisDance-DiT. Note that the GIFs shown here suffer some visual quality degradation; please visit our project page for the original videos.
- Inference code
- Model checkpoints
- RealisDance-Val dataset
- TeaCache speedup
- FSDP + Sequential parallel
- Pose preparation code
- SMPL retargeting
- New checkpoints with shifted RoPE on frame
Note: This released project has two slight differences from the paper.
- This version uses only SMPL-CS and HaMer conditions in order to support Uni3C, because DWPose is difficult to align and render from a new camera view.
- (Coming soon) In the shifted RoPE, we also shift the RoPE along the frame dimension, because we find that sharing the first-frame RoPE sometimes introduces slight artifacts at the first frame.
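As a rough illustration of the idea behind the frame-dimension shift (the actual offset and implementation in RealisDance-DiT may differ; `ref_offset` is a hypothetical value chosen here for demonstration only):

```python
def frame_position_ids(num_frames: int, shared: bool, ref_offset: int = 4):
    """Toy sketch of RoPE frame indices for one reference token plus video frames.

    shared=True: the reference token reuses the first video frame's index (0),
    the behavior the note above says can cause first-frame artifacts.
    shared=False: the reference index is shifted outside the video range.
    """
    video_ids = list(range(num_frames))
    ref_id = 0 if shared else num_frames + ref_offset  # hypothetical shift
    return ref_id, video_ids

# Shared RoPE: the reference token collides with frame 0.
ref_id, video_ids = frame_position_ids(8, shared=True)
assert ref_id in video_ids

# Shifted RoPE: the reference token gets a unique index outside the video range.
ref_id, video_ids = frame_position_ids(8, shared=False)
assert ref_id not in video_ids
```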
git clone https://github.com/theFoxofSky/RealisDance.git
cd RealisDance
conda create -n realisdance python=3.10
conda activate realisdance
pip install -r requirements.txt
# FA3 (Optional)
git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention
git checkout ea3ecea97a1393c092863330aff9a162bb5ce443 # very important: other FA3 commits may yield bad results
cd hopper
python setup.py install
Please download the checkpoints from the Hugging Face repo to `./pretrained_models`, and make sure the structure of `./pretrained_models` matches the one in the Hugging Face repo.
./pretrained_models
|---image_encoder/
|---...
|---scheduler/
|---...
|---text_encoder/
|---...
|---tokenizer/
|---...
|---transformer/
|---...
|---vae/
|---...
|---model_index.json
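Before running inference, you can sanity-check that the downloaded checkpoints follow the expected layout. This is a minimal helper sketch (not part of the repo), assuming the directory names listed above:

```python
from pathlib import Path

# Subdirectories expected inside ./pretrained_models, per the layout above.
EXPECTED = [
    "image_encoder", "scheduler", "text_encoder",
    "tokenizer", "transformer", "vae",
]

def check_pretrained(root="./pretrained_models"):
    """Return a list of expected entries missing from the checkpoint dir."""
    root = Path(root)
    missing = [name for name in EXPECTED if not (root / name).is_dir()]
    if not (root / "model_index.json").is_file():
        missing.append("model_index.json")
    return missing

if __name__ == "__main__":
    missing = check_pretrained()
    if missing:
        print("Missing from ./pretrained_models:", ", ".join(missing))
```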
- Inference with Demo sequences
python inference.py \
--ref __assets__/demo/ref.png \
--smpl __assets__/demo/smpl.mp4 \
--hamer __assets__/demo/hamer.mp4 \
--prompt "A blonde girl is doing somersaults on the grass. Behind the grass is a river, \
and behind the river are trees and mountains. The girl is wearing black yoga pants and a black sports vest." \
--save-dir ./output
- Inference with TeaCache acceleration (Optional; may cause slight quality degradation)
python inference.py \
--ref __assets__/demo/ref.png \
--smpl __assets__/demo/smpl.mp4 \
--hamer __assets__/demo/hamer.mp4 \
--prompt "A blonde girl is doing somersaults on the grass. Behind the grass is a river, \
and behind the river are trees and mountains. The girl is wearing black yoga pants and a black sports vest." \
--save-dir ./output \
--enable-teacache
- Inference with small GPU memory (Optional; will be much slower. Can be combined with TeaCache)
python inference.py \
--ref __assets__/demo/ref.png \
--smpl __assets__/demo/smpl.mp4 \
--hamer __assets__/demo/hamer.mp4 \
--prompt "A blonde girl is doing somersaults on the grass. Behind the grass is a river, \
and behind the river are trees and mountains. The girl is wearing black yoga pants and a black sports vest." \
--save-dir ./output \
--save-gpu-memory
- Inference with multiple GPUs (Optional. Can be combined with TeaCache)
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node=4 inference.py \
--ref __assets__/demo/ref.png \
--smpl __assets__/demo/smpl.mp4 \
--hamer __assets__/demo/hamer.mp4 \
--prompt "A blonde girl is doing somersaults on the grass. Behind the grass is a river, \
and behind the river are trees and mountains. The girl is wearing black yoga pants and a black sports vest." \
--save-dir ./output \
--multi-gpu
- Prepare your reference images and conditions (coming soon)
TODO
- The root dir should be structured as:
root/
|---ref/
|---1.png
|---2.png
|---...
|---smpl/
|---1.mp4
|---2.mp4
|---...
|---hamer/
|---1.mp4
|---2.mp4
|---...
|---prompt/
|---1.txt
|---2.txt
|---...
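To catch mismatched files before a long batch run, the layout above can be validated with a small helper. This is an illustrative sketch (not part of the repo), assuming samples are matched across subdirectories by file stem as shown above:

```python
from pathlib import Path

def check_batch_root(root):
    """Verify each ref/*.png has matching smpl/hamer videos and a prompt file."""
    root = Path(root)
    problems = []
    for ref in sorted((root / "ref").glob("*.png")):
        stem = ref.stem
        for sub, ext in (("smpl", ".mp4"), ("hamer", ".mp4"), ("prompt", ".txt")):
            if not (root / sub / f"{stem}{ext}").is_file():
                problems.append(f"{sub}/{stem}{ext} missing for ref/{ref.name}")
    return problems
```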
- Batch inference
python inference.py --save-dir ./output --root $PATH_TO_ROOT_DIR
This project is released for academic use. We disclaim responsibility for user-generated content.
Jingkai Zhou: fs.jingkaizhou@gmail.com
@article{zhou2025realisdance-dit,
title={RealisDance-DiT: Simple yet Strong Baseline towards Controllable Character Animation in the Wild},
author={Zhou, Jingkai and Wu, Yifan and Li, Shikai and Wei, Min and Fan, Chao and Chen, Weihua and Jiang, Wei and Wang, Fan},
journal={arXiv preprint arXiv:2504.14977},
year={2025}
}
@article{zhou2024realisdance,
title={RealisDance: Equip controllable character animation with realistic hands},
author={Zhou, Jingkai and Wang, Benzhi and Chen, Weihua and Bai, Jingqi and Li, Dongyang and Zhang, Aixi and Xu, Hao and Yang, Mingyang and Wang, Fan},
journal={arXiv preprint arXiv:2409.06202},
year={2024}
}
Thanks to Shikai Li for condition preparation and Chenjie Cao for pose alignment.