This issue will be kept open and pinned for a long time, as we hope to hear everyone's opinions, suggestions, and needs!
We want to make YOLO-World stronger and encourage more diverse applications, especially practical ones. We maintain an open and free attitude. YOLO-World is under active development, and we are working hard on both upstream pre-training and downstream deployment tools. Our manpower is currently limited, so please give us some time, and contribute your experience or help when you can!
If you have a good idea or a need, just reply to this issue and @ me. I will respond promptly when I see it and consider adding it to the TODO list.
TODO List (Community Version)
🎯: High priority or on-going.
- Optimize `torch.einsum` (👍 thanks to @taofuyu for replace einsum() with other ops #118).
- Support more language models: CLIP-Large (high priority), BEIT-3 (@mio410), and T5-Encoder.
- Support image prompts (Could YOLO-World use some images as category? #102)
- Support prompt tuning, few-shot learning (internal demand) #141
- Results on ODinW (Results on the ODinW benchmark #98).
- 🎯 Fix ONNX bugs, and add an ONNX demo and detailed ONNX documentation (ONNX input #27, ONNX export #33, onnx export #77, ONNX export questions #50).
- TensorRT export, TensorRT demo, and TensorRT documentation (TensorRT #29).
- 🎯 Fine-tune more 1280-resolution pre-trained models (More pre-trained YOLO-Worldv2 models with an input resolution of 1280 #142).
- 🎯 Investigate poor fine-tuning results on COCO without `mask-refine` (YOLO-WORLD-S fine-tuning on COCO cannot be reproduced, and validation mAP trends downward #160, finetuneing on custom dataset #72, yolo-wolrd-l fine-tuning on COCO cannot be successfully reproduced #76).
- 🎯 Evaluate open-vocabulary/zero-shot capability after fine-tuning or prompt-tuning (OpenSet Detection Issue after COCO Fine-Tuning #78, Zero-shot performance about YOLOWorldPromptDetector #154).
- 🎯 Demo with image prompts (Whether there is a demo about the detection of image prompts? #208).
- Optimize training pipelines to improve resource utilization (During DDP training, eight GPUs are temporarily locked at 100% usage. #165).
- Batch/distributed inference (image_demo runs too slowly; is there a way to recognize images quickly in succession? #246, Batch inference for object detection and instance segmentation #253)
- Video inference (inference on video #182, Perform inference on a video using custom weights #263)
- ONNX with text inputs & text embeddings (ONNX model export problem #285)
- Demos with caption inputs (How to detect any object using caption inputs ? #334, Text input #315)
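
For anyone picking up the einsum item in the list above, here is a minimal sketch of the general idea of replacing an einsum contraction with a reshape plus a batched matrix multiplication. This issue does not say which contraction is involved, so the subscript string `bchw,bkc->bkhw`, the tensor shapes, and the function names below are assumptions for illustration only. NumPy is used so the sketch runs on CPU without extra setup; `torch.einsum` and `torch.matmul` accept the same subscripts and shapes.

```python
import numpy as np


def similarity_einsum(x, w):
    """Reference version: contract the channel axis with einsum.

    x: (B, C, H, W) image features (assumed layout)
    w: (B, K, C) text embeddings (assumed layout)
    returns: (B, K, H, W) region-text similarity map
    """
    return np.einsum("bchw,bkc->bkhw", x, w)


def similarity_matmul(x, w):
    """Same contraction using only reshape and batched matmul."""
    b, c, h, wd = x.shape
    k = w.shape[1]
    # (B, K, C) @ (B, C, H*W) -> (B, K, H*W)
    out = np.matmul(w, x.reshape(b, c, h * wd))
    return out.reshape(b, k, h, wd)


rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8, 4, 5))
w = rng.standard_normal((2, 3, 8))

a = similarity_einsum(x, w)
m = similarity_matmul(x, w)
print(a.shape, np.allclose(a, m))
```

Checking the rewritten op against the einsum reference with `allclose`, as in the last lines, is a cheap way to confirm that a replacement leaves the outputs unchanged before touching export code.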