[ICCV 2025] This is the official implementation of "V2XPnP: Vehicle-to-Everything Spatio-Temporal Fusion for Multi-Agent Perception and Prediction", Zewei Zhou, Hao Xiang, Zhaoliang Zheng, Seth Z. Zhao, Mingyue Lei, Yun Zhang, Tianhui Cai, Xinyi Liu, Johnson Liu, Maheswari Bajji, Xin Xia, Zhiyu Huang, Bolei Zhou, Jiaqi Ma
V2XPnP is the first open-source V2X spatio-temporal fusion framework for cooperative perception and prediction. The framework combines an intermediate fusion strategy with one-step communication and integrates diverse attention fusion modules into a unified Transformer architecture for V2X spatio-temporal information. Our benchmark model zoo includes 11 SOTA models across no fusion, early fusion, late fusion, and intermediate fusion.
V2XPnP Sequential Dataset is the first large-scale, real-world V2X sequential dataset featuring multiple agents and all V2X collaboration modes, i.e., vehicle-to-vehicle (V2V), infrastructure-to-infrastructure (I2I), vehicle-centric (VC), and infrastructure-centric (IC).
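The actual fusion modules are defined in the paper and the code. Purely as orientation, below is a minimal PyTorch sketch of a Transformer-style spatio-temporal fusion block that first attends over each agent's feature history and then across agents; the tensor shapes, module names, and the temporal-then-spatial ordering are simplifying assumptions for illustration, not the V2XPnP implementation.

```python
# Minimal sketch of a spatio-temporal attention fusion block (illustrative only,
# NOT the V2XPnP module). Assumes per-agent features of shape
# (batch, agents, time, channels), fused first over time, then across agents.
import torch
import torch.nn as nn


class SpatioTemporalFusionBlock(nn.Module):
    def __init__(self, channels: int = 256, num_heads: int = 8):
        super().__init__()
        # Self-attention over each agent's own temporal feature sequence.
        self.temporal_attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        # Self-attention across agents at the fused current time step.
        self.spatial_attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(channels)
        self.norm2 = nn.LayerNorm(channels)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, A, T, C) -- batch, agents, history frames, feature channels
        B, A, T, C = feats.shape

        # 1) Temporal fusion: attend over each agent's history independently.
        x = feats.reshape(B * A, T, C)
        t_out, _ = self.temporal_attn(x, x, x)
        x = self.norm1(x + t_out)

        # 2) Keep the latest frame as each agent's temporal summary.
        x = x[:, -1, :].reshape(B, A, C)

        # 3) Spatial fusion: attend across agents (ego + cooperators).
        s_out, _ = self.spatial_attn(x, x, x)
        x = self.norm2(x + s_out)
        return x  # (B, A, C) fused features, e.g. take index 0 for the ego agent


if __name__ == "__main__":
    fused = SpatioTemporalFusionBlock()(torch.randn(2, 3, 4, 256))
    print(fused.shape)  # torch.Size([2, 3, 256])
```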
Supported by the UCLA Mobility Lab
- 2025/06: Our follow-up work TurboTrain is also accepted by ICCV 2025!
- 2025/06: V2XPnP is accepted by ICCV 2025!
- 2025/03: V2XPnP Dataset 1.0 release (Train [P1, P2, P3, P4], Val, Test, Map)
- 2024/12: V2XPnP paper release
- Supports both simulation and real-world V2X datasets
- Multiple tasks supported
- Cooperative perception and prediction
- Cooperative single-frame perception
- Cooperative temporal perception
- Cooperative prediction
- SOTA models supported (a sketch contrasting the fusion families follows this list)
- No Fusion (Decoupled)
- FaF [CVPR 2018] (No Fusion-End2End)
- Early Fusion
- Late Fusion (Decoupled)
- F-Cooper [SEC 2019]
- V2VNet [ECCV 2020]
- DiscoNet [NeurIPS 2021]
- V2X-ViT [ECCV 2022]
- CoBEVFlow [NeurIPS 2023]
- FFNet [NeurIPS 2023]
- V2XPnP [Ours]
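For readers new to the benchmark taxonomy, the following is a minimal Python sketch contrasting where cooperation happens in the early, intermediate, and late fusion families. It is conceptual only, not the repository's API; all callables and shapes are stand-ins.

```python
# Illustrative contrast of the three cooperative fusion families (NOT the repo's API).
# Each function shows *where* information from other agents enters the ego pipeline.
from typing import Any, Callable, Sequence

import numpy as np


def early_fusion(clouds: Sequence[np.ndarray], detector: Callable) -> Any:
    """Share raw LiDAR: concatenate all point clouds, then run a single detector."""
    return detector(np.concatenate(clouds, axis=0))


def intermediate_fusion(clouds: Sequence[np.ndarray],
                        encoder: Callable, fuser: Callable, head: Callable) -> Any:
    """Share features: encode per agent, fuse the feature maps, then decode once."""
    return head(fuser([encoder(c) for c in clouds]))


def late_fusion(clouds: Sequence[np.ndarray],
                detector: Callable, merger: Callable) -> Any:
    """Share detections: each agent detects on its own data, then boxes are merged."""
    return merger([detector(c) for c in clouds])


if __name__ == "__main__":
    toy_clouds = [np.random.rand(100, 4) for _ in range(3)]  # ego + 2 cooperating agents
    print(early_fusion(toy_clouds, detector=lambda pts: pts.shape))  # (300, 4)
```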
- 2024/12: ✅ Sample Data of V2XPnP in Google Drive
- 2025/03: ✅ V2XPnP Dataset 1.0 (68 scenarios - Train [P1, P2, P3, P4], Val, Test, Map)
- 2025/09: V2XPnP Codebase - Official Version 1.0
- 2025/10: V2XPnP Dataset 2.0 (all 100 scenarios)
The V2XPnP Dataset 1.0 can be downloaded as Train [P1, P2, P3, P4], Val, Test, and Map splits. Sample data from the V2XPnP Sequential Dataset is available on Google Drive, and the v2.0 dataset will be released later. The sequential perception data format follows OpenCOOD, and the trajectory dataset records the full trajectory of each agent in each scenario.
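As rough orientation, the sketch below walks an OpenCOOD-style scenario/agent/timestamp layout and slides a temporal window over each agent's frames. The directory structure, file extensions, and window length used here are assumptions for illustration; check the released data for the exact format.

```python
# Minimal sketch of iterating an OpenCOOD-style sequential dataset
# (illustrative only; layout, extensions, and window length are assumptions).
from pathlib import Path

HISTORY_FRAMES = 5  # hypothetical temporal window


def list_scenarios(root: str):
    """Yield (scenario_dir, {agent_id: sorted timestamp stems}) for each scenario."""
    for scenario in sorted(Path(root).iterdir()):
        if not scenario.is_dir():
            continue
        agents = {}
        for agent_dir in sorted(p for p in scenario.iterdir() if p.is_dir()):
            # Assume per-timestamp label files like <timestamp>.yaml next to point clouds.
            stamps = sorted(f.stem for f in agent_dir.glob("*.yaml"))
            if stamps:
                agents[agent_dir.name] = stamps
        yield scenario, agents


def temporal_windows(stamps, history=HISTORY_FRAMES):
    """Slide a fixed-length history window over one agent's timestamp list."""
    for i in range(history - 1, len(stamps)):
        yield stamps[i - history + 1 : i + 1]


if __name__ == "__main__":
    for scenario, agents in list_scenarios("path/to/V2XPnP_dataset/train"):
        for agent_id, stamps in agents.items():
            for window in temporal_windows(stamps):
                pass  # load the corresponding .pcd/.yaml files for this window here
```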
- End-to-end cooperative perception and prediction
- Cooperative temporal perception
- Cooperative prediction
To be added
The codebase is built upon OpenCOOD in the OpenCDA ecosystem, and V2X-Real, another project in the OpenCDA family, serves as one of the data sources for this project.
We want to thank the following data annotators and reviewers at UCLA:
Mingxuan Gao, Yuxin Bao, Yuhan Zhang, Anthony Chui, Jiajin Cui, Judas Lopez, Yingshi Ye, Michelle Zhao, Ethan Huang, Vincent Ton, Henry Wei, Yuxiang Wei, Aiden Wong, Julia Chen, Alex Gorin, Yanling Sang, Qizhen Zhao, Dongjun Chao, Jingyang Xu, XingXiang Huang.
If you find this repository useful for your research, please consider giving us a star 🌟 and citing our paper.
@article{zhou2024v2xpnp,
title={V2XPnP: Vehicle-to-Everything Spatio-Temporal Fusion for Multi-Agent Perception and Prediction},
author={Zhou, Zewei and Xiang, Hao and Zheng, Zhaoliang and Zhao, Seth Z. and Lei, Mingyue and Zhang, Yun and Cai, Tianhui and Liu, Xinyi and Liu, Johnson and Bajji, Maheswari and Xia, Xin and Huang, Zhiyu and Zhou, Bolei and Ma, Jiaqi},
journal={arXiv preprint arXiv:2412.01812},
year={2024}
}
Other useful citations:
@article{zhou2025turbotrain,
title={TurboTrain: Towards Efficient and Balanced Multi-Task Learning for Multi-Agent Perception and Prediction},
author={Zhou, Zewei and Zhao, Seth Z. and Cai, Tianhui and Huang, Zhiyu and Zhou, Bolei and Ma, Jiaqi},
journal={arXiv preprint arXiv:2508.04682},
year={2025}
}
@article{xiang2024v2xreal,
title={V2X-Real: a Large-Scale Dataset for Vehicle-to-Everything Cooperative Perception},
author={Xiang, Hao and Zheng, Zhaoliang and Xia, Xin and Xu, Runsheng and Gao, Letian and Zhou, Zewei and Han, Xu and Ji, Xinkai and Li, Mingxi and Meng, Zonglin and others},
journal={arXiv preprint arXiv:2403.16034},
year={2024}
}