Official implementation of CVPR 2024 paper "Retrieval-Augmented Open-Vocabulary Object Detection".

RALF

Introduction

This RAF branch contains the code for training RAF (Retrieval-Augmented visual Features), one of the two RALF modules.

Installation

  • Python 3.10
  • PyTorch 1.12.1

conda create -n raf python=3.10 -y
conda activate raf
conda install pytorch==1.12.1 torchvision==0.13.1 cudatoolkit=11.3 -c pytorch -y
pip install ftfy regex tqdm
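
After installation, it can help to confirm that the pinned versions are the ones actually installed. The following is a minimal sketch, not part of the repository: the helper name check_versions is hypothetical, and it assumes the conda packages register themselves under the distribution names torch and torchvision.

```python
from importlib import metadata

# Versions pinned in the installation commands above.
EXPECTED = {"torch": "1.12.1", "torchvision": "0.13.1"}


def check_versions(expected=EXPECTED):
    """Return {package: (expected, installed)} for every mismatch.

    `installed` is None when the package cannot be found.
    """
    mismatches = {}
    for name, want in expected.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            have = None
        # Some builds append a local tag such as "+cu113"; ignore it.
        if have is None or have.split("+")[0] != want:
            mismatches[name] = (want, have)
    return mismatches


if __name__ == "__main__":
    for name, (want, have) in check_versions().items():
        print(f"{name}: expected {want}, found {have}")
```

An empty result means both packages match the versions listed above.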

Preparation

Datasets

Download the COCO and LVIS datasets into the data directory.

~/data
    ├── coco
    │   ├── annotations/instances_val2017.json
    │   ├── train2017
    │   └── val2017
    └── lvis
        ├── lvis_v1_val.json
        ├── train2017
        └── val2017
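
Before generating features, the layout above can be checked with a short script. This is a sketch under assumptions: the helper missing_paths is hypothetical, and the script assumes it is run from the directory that contains data.

```python
from pathlib import Path

# Relative paths taken from the directory tree above.
EXPECTED_PATHS = [
    "coco/annotations/instances_val2017.json",
    "coco/train2017",
    "coco/val2017",
    "lvis/lvis_v1_val.json",
    "lvis/train2017",
    "lvis/val2017",
]


def missing_paths(data_root, expected=EXPECTED_PATHS):
    """Return the expected entries that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    for rel in missing_paths("data"):
        print(f"missing: data/{rel}")
```

The script prints nothing when every listed file and folder is present.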

To train RAF, we use object features extracted with OADP. Following the OADP instructions, prepare the OAKE object features for COCO and LVIS under the clip_region directory.

~/clip_region
    ├── coco_oake_object_train
    └── lvis_oake_object_train

Generate Features

Generate preprocessed region features from the annotation files using the following commands.

python make_gt_region_feats.py --dataset coco --train_val val
python make_gt_region_feats.py --dataset lvis --train_val val

This produces coco/val and lvis/val under clip_region, as follows.

~/clip_region
    ├── coco/val
    ├── lvis/val
    ├── coco_oake_object_train
    └── lvis_oake_object_train

Noun Chunks

Download v3det_{dataset}_strict.json and v3det_gpt_noun_chunk_{dataset}_strict.pkl, the noun chunk file generated by GPT, from here, and place them as follows.

~
├── v3det_coco_strict.json
├── v3det_lvis_strict.json
├── v3det_gpt_noun_chunk_coco_strict.pkl
└── v3det_gpt_noun_chunk_lvis_strict.pkl
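
To confirm that a downloaded noun chunk file loads correctly, its top-level container can be inspected. The internal structure of the .pkl files is not described here, so this sketch (with a hypothetical helper summarize_pickle) makes no assumption about it and only reports the type, length, and a few keys. Unpickling executes code from the file, so load only files obtained from a trusted source.

```python
import pickle
from pathlib import Path


def summarize_pickle(path):
    """Load a pickle file and describe its top-level object."""
    with Path(path).open("rb") as f:
        obj = pickle.load(f)
    summary = {"type": type(obj).__name__}
    if hasattr(obj, "__len__"):
        summary["length"] = len(obj)
    if isinstance(obj, dict):
        # Show only the first few keys to keep the output short.
        summary["sample_keys"] = list(obj)[:5]
    return summary


if __name__ == "__main__":
    path = Path("v3det_gpt_noun_chunk_coco_strict.pkl")
    if path.exists():
        print(summarize_pickle(path))
```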

Train RAF

Train RAF with the following commands.

COCO

python raf.py --dataset coco --work_dir output/raf_coco --concept_pkl_path v3det_gpt_noun_chunk_coco_strict.pkl --oake_file_path clip_region/coco_oake_info_strict.pkl

LVIS

python raf.py --dataset lvis --work_dir output/raf_lvis --concept_pkl_path v3det_gpt_noun_chunk_lvis_strict.pkl --oake_file_path clip_region/lvis_oake_info_strict.pkl
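
The two commands differ only in the dataset name, so they can be generated from one template. The sketch below uses only the flags shown above; the helper raf_command is hypothetical, and the script prints the commands rather than running them.

```python
import shlex


def raf_command(dataset):
    """Build the raf.py training command for 'coco' or 'lvis'."""
    return [
        "python", "raf.py",
        "--dataset", dataset,
        "--work_dir", f"output/raf_{dataset}",
        "--concept_pkl_path", f"v3det_gpt_noun_chunk_{dataset}_strict.pkl",
        "--oake_file_path", f"clip_region/{dataset}_oake_info_strict.pkl",
    ]


if __name__ == "__main__":
    for dataset in ("coco", "lvis"):
        # Print a shell-ready command line for each dataset.
        print(shlex.join(raf_command(dataset)))
```

To launch training from Python instead, each list could be passed to subprocess.run(..., check=True).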

The checkpoint is saved as output/raf_{dataset}/weight_10.pth and is used as {dataset}_strict.pth by the various baselines. It can also be downloaded from here.
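
The renaming step can be sketched as a small copy helper. This is an illustration only: export_checkpoint and its dest_dir argument are hypothetical, since the location where each baseline expects {dataset}_strict.pth is not specified here.

```python
import shutil
from pathlib import Path


def export_checkpoint(dataset, dest_dir, work_root="output", index=10):
    """Copy output/raf_{dataset}/weight_{index}.pth to dest_dir/{dataset}_strict.pth.

    dest_dir is a placeholder for wherever a baseline reads the checkpoint.
    """
    src = Path(work_root) / f"raf_{dataset}" / f"weight_{index}.pth"
    dest = Path(dest_dir) / f"{dataset}_strict.pth"
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dest)  # raises FileNotFoundError if training has not produced src
    return dest
```

For example, export_checkpoint("coco", "checkpoints") would create checkpoints/coco_strict.pth, where checkpoints is a placeholder directory name.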
