🎥 [CVPR 2025] VTON 360: High-Fidelity Virtual Try-On from Any Viewing Direction

Zijian He1, Yuwei Ning2, Yipeng Qin3, Guangrun Wang1, Sibei Yang4, Liang Lin1,5, Guanbin Li1,5*
* Corresponding author. 1Sun Yat-sen University, 2Huazhong University of Science and Technology,
3Cardiff University, 4ShanghaiTech University, 5Peng Cheng Laboratory

✨ News

  • [03.18.2025] We release our code and demo data for 3D lifting.

⚙️ Installation

Clone the repo and create a new conda env.

git clone https://github.com/scnuhealthy/VTON360.git
cd VTON360
conda create -n vton360 python=<version>  # Python version not pinned by the authors; to be checked
conda activate vton360

1. Diffusion Dependency

2. NeRF Studio

  1. Install NeRF Studio
python -m pip install --upgrade pip==24.2
pip install torch==2.1.2+cu118 torchvision==0.16.2+cu118 --extra-index-url https://download.pytorch.org/whl/cu118
conda install -c "nvidia/label/cuda-11.8.0" cuda-toolkit
pip install ninja git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
pip install nerfstudio==1.0.0
ns-install-cli # Optional, for tab completion.
  2. Install gsplat
pip install gsplat==0.1.2.1
  3. Install the required packages
pip install -r requirements.txt
  4. Install our customized Splatfacto. See src/splatfactox/README.md for more details.
cd src
pip install -e .
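
As an optional sanity check (not part of the repo), you can verify that the CUDA build of PyTorch and the pinned package versions are importable before moving on:

# Optional sanity check: confirm the CUDA build of PyTorch and the pinned
# package versions before proceeding.
from importlib.metadata import version

import torch

print(torch.__version__)           # expect 2.1.2+cu118
print(torch.cuda.is_available())   # expect True on a CUDA-capable machine
print(version("gsplat"))           # expect 0.1.2.1
print(version("nerfstudio"))       # expect 1.0.0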

🗄️ Data

Use Our Preprocessed Data

We provide several datasets rendered from Thuman2.1 and MVHumanNet here (extraction code: aq4f). You can also refer to the next section, Render from Thuman2.1, to render the data manually.

Render from Thuman2.1

A. Download Thuman2.1

... Download to /PATH/TO/Thuman2.1

B. Render Multi-view Images from Thuman2.1's .obj files

Change the thuman_root and save_root in src/render_from_thuman/render_multiview_images.py and run the script.

cd src/render_from_thuman/
python render_multiview_images.py
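
For intuition, multi-view rendering of this kind typically places virtual cameras on a ring around the subject. The following is an illustrative sketch only, not the repo's renderer; look_at and the num_views/radius/height values are hypothetical:

# Illustrative only: camera-to-world poses for evenly spaced views on a ring.
import numpy as np

def look_at(eye, target, up=np.array([0.0, 1.0, 0.0])):
    # Build a 4x4 camera-to-world matrix looking from `eye` toward `target`.
    forward = target - eye
    forward = forward / np.linalg.norm(forward)
    right = np.cross(forward, up)
    right = right / np.linalg.norm(right)
    true_up = np.cross(right, forward)
    c2w = np.eye(4)
    c2w[:3, 0] = right
    c2w[:3, 1] = true_up
    c2w[:3, 2] = -forward          # OpenGL convention: camera looks down -z
    c2w[:3, 3] = eye
    return c2w

num_views, radius, height = 36, 2.5, 1.0   # hypothetical settings
center = np.zeros(3)
poses = [
    look_at(np.array([radius * np.cos(t), height, radius * np.sin(t)]), center)
    for t in 2 * np.pi * np.arange(num_views) / num_views
]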

C. Process Multi-view Images into NeRF Studio's Format and Extract Masks

  1. Download the ckpt folder used for human parsing from here, and place it at src/render_from_thuman/ckpt.

  2. Change the root in src/render_from_thuman/process2ns_fmt.py to the directory of multi-view images rendered in the previous step, then run the script.

cd src/render_from_thuman/
python process2ns_fmt.py
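
For reference, extracting a foreground mask from a human-parsing model usually reduces to thresholding the predicted label map. A minimal sketch, assuming label 0 is background (the actual parsing network and label set come from the downloaded ckpt):

# Minimal sketch: turn a human-parsing label map into a binary person mask.
# Assumes label 0 is background; the repo's parsing model may differ.
import numpy as np
from PIL import Image

def parsing_to_mask(label_map: np.ndarray) -> Image.Image:
    mask = (label_map != 0).astype(np.uint8) * 255   # person = non-background
    return Image.fromarray(mask, mode="L")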

D. Extract Cloth From Rendered Images

Note that this step is optional if you use your own cloth images.

cd src/render_from_thuman/
python get_cloth_thuman.py
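
Conceptually, this step keeps only the pixels labeled as clothing and composites them onto a clean background to form the garment image. A rough sketch; the cloth label IDs below are hypothetical, and get_cloth_thuman.py may work differently:

# Rough sketch of cloth extraction via parsing labels.
import numpy as np
from PIL import Image

CLOTH_LABELS = [5, 6, 7]   # hypothetical IDs for upper/lower garments

def extract_cloth(image: np.ndarray, label_map: np.ndarray) -> Image.Image:
    cloth_mask = np.isin(label_map, CLOTH_LABELS)
    out = np.full_like(image, 255)            # white background
    out[cloth_mask] = image[cloth_mask]
    return Image.fromarray(out)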

E. Extract Intrinsic and Extrinsic Parameters

Revise the paths in inv_export.py, then run the script.

cd src/render_from_thuman/
python inv_export.py
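
If you adapt this step to your own cameras, note that NeRF Studio expects camera-to-world matrices in the OpenGL convention (x right, y up, z backward). A common conversion from an OpenCV-style world-to-camera pose looks like this (a sketch; inv_export.py may handle it differently):

# Sketch: convert an OpenCV-style world-to-camera pose (R, t) into the
# OpenGL-convention camera-to-world matrix that NeRF Studio expects.
import numpy as np

def opencv_w2c_to_nerfstudio_c2w(R: np.ndarray, t: np.ndarray) -> np.ndarray:
    w2c = np.eye(4)
    w2c[:3, :3], w2c[:3, 3] = R, t
    c2w = np.linalg.inv(w2c)
    c2w[:3, 1:3] *= -1            # flip the y and z axes: OpenCV -> OpenGL
    return c2w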

▶️ Get Started

1. Multi-view Consistent Try-On

A. Download the checkpoint and pre-trained models

  1. Put the checkpoints into 'src/multiview_consist_edit/checkpoints'. We provide two checkpoints: 'thuman_tryon_mvattn_multi/checkpoints-30000' and 'mvhumannet_tryon_mvattn_multi/checkpoints-40000'.

Checkpoint for Thuman: here (extraction code: 32h3).

Checkpoint for MVHumanNet: here (extraction code: mahx).

  2. Download clip-vit-base-patch32 and sd-vae-ft-mse from Hugging Face.

  3. Set the paths in 'src/multiview_consist_edit/config'.

  4. Download the parsing checkpoint from here, and put it into 'src/multiview_consist_edit/parse_tool/ckpt/'.

B. Image editing

cd src/multiview_consist_edit
python infer_tryon_multi.py

The edited results are saved into 'output_root'.

Due to GPU memory limitations, the front and back views are predicted in separate runs: set 'output_front=False' and rerun the script to obtain the back-view predictions.

python infer_tryon_multi.py

C. Post-processing

cd parse_tool
python postprocess_parse.py 'output_root'
cd ../
python postprocess_thuman.py --image_root 'output_root' --output_root 'output_post_root'        # for Thuman data
python postprocess_mvhumannet.py --image_root 'output_root' --output_root 'output_post_root'    # for MVHumanNet data

D. Training

accelerate config
accelerate launch train_tryon_multi.py

2. 3D Lifting

A. Prepare Your Data as NeRF Studio's Dataset Format

You need to prepare three components for 3D lifting with NeRF Studio.

  • images: multi-view images containing the target person.
  • mask: mask for the target human in multi-view images.
  • transforms.json: NeRF Studio's dataset configuration (a minimal example is sketched at the end of this section).

We provide a demo dataset in src/demo_data/splatfactox_demo_data.

If you use our try-on results, you can simply replace the images and mask directories.

You can refer to NeRF Studio's Dataset Format for more details if you want to use your own data.
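
As a reference for the expected layout, a minimal transforms.json with shared intrinsics can be generated as below. All numeric values are placeholders; the per-frame mask_path field is how NeRF Studio picks up the masks:

# Sketch: write a minimal NeRF Studio transforms.json with shared intrinsics.
# Numeric values are placeholders; fill in your real camera parameters.
import json

transforms = {
    "camera_model": "OPENCV",
    "fl_x": 1111.0, "fl_y": 1111.0,   # focal lengths in pixels
    "cx": 512.0, "cy": 512.0,         # principal point
    "w": 1024, "h": 1024,             # image resolution
    "frames": [
        {
            "file_path": "images/frame_00001.png",
            "mask_path": "masks/frame_00001.png",
            "transform_matrix": [[1, 0, 0, 0],   # 4x4 camera-to-world pose
                                 [0, 1, 0, 0],
                                 [0, 0, 1, 2.5],
                                 [0, 0, 0, 1]],
        },
        # ... one entry per view
    ],
}

with open("transforms.json", "w") as f:
    json.dump(transforms, f, indent=2)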

B. Run Our Customized Splatfacto

Splatfacto is NeRF Studio's implementation of 3D Gaussian Splatting (3DGS). You can refer to here for our customized version of Splatfacto.

cd src
bash scripts/splatfactox.sh

Citation

If you find this code or the paper useful for your research, please consider citing:

@article{he2025vton,
  title={VTON 360: High-fidelity virtual try-on from any viewing direction},
  author={He, Zijian and Ning, Yuwei and Qin, Yipeng and Wang, Guangrun and Yang, Sibei and Lin, Liang and Li, Guanbin},
  journal={arXiv preprint arXiv:2503.12165},
  year={2025}
}
