A collection of Model Context Protocol (MCP) servers and services that expose specialized computer vision capabilities to language models. This repository demonstrates how to build modular CV tools that can be composed and orchestrated through MCP.
- Object Detection MCP - YOLO-based object detection with MinIO integration
- OCR + Image Generation MCP - Combined OCR and image generation with iterative validation workflows
- Image Generator Server - FLUX.1-schnell diffusion model service
- OCR Server - Multi-model OCR service (Qwen-VL, Janus)
- Python 3.11+
- UV package manager
- Docker with GPU support
- MinIO server (required by the MCP servers)
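The prerequisites above can be partly checked before starting anything. The following is a minimal sketch, not part of this repository: the function name `check_prerequisites` is hypothetical, and it assumes the tools are installed under the executable names `uv` and `docker`. It does not verify Docker GPU support or that a MinIO server is reachable.

```python
import shutil
import sys

# Executable names are assumptions based on the prerequisites list.
REQUIRED_TOOLS = ("uv", "docker")
MIN_PYTHON = (3, 11)


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    major, minor = version_info[0], version_info[1]
    if (major, minor) < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, found {major}.{minor}"
        )
    for tool in REQUIRED_TOOLS:
        # shutil.which returns None when the executable is not on PATH.
        if which(tool) is None:
            problems.append(f"'{tool}' not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print(issue)
    sys.exit(1 if issues else 0)
```

Running the script exits with a non-zero status when something is missing, so it can be used as a guard in a setup script.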
# Object Detection (run from the repository root)
(cd object_detection_mcp && uv run object_detector.py)
# OCR + Image Generation (run from the repository root)
(cd ocr_imagen_mcp && uv run ocr_imagen.py)
# Image Generator (build from the repository root)
docker buildx build -t flux-schnell -f image_generator_server/Dockerfile .
docker run --gpus all -p 6070:6070 flux-schnell
# OCR Server (build from the repository root)
docker buildx build -t ocr-server -f ocr_server/Dockerfile .
docker run --gpus all -p 6080:6080 -p 6081:6081 ocr-server
Add the object detection server to your Claude Desktop configuration file (claude_desktop_config.json):
{
  "mcpServers": {
    "object_detection": {
      "command": "uv",
      "args": ["--directory", "/path/to/object_detection_mcp", "run", "object_detector.py"],
      "env": {
        "YOLO_MODEL_NAME": "yolo11m.pt",
        "YOLO_CONF_THRESHOLD": "0.45",
        "MINIO_URL": "localhost:9000",
        "MINIO_ACCESS_KEY": "your-key",
        "MINIO_SECRET_KEY": "your-secret"
      }
    }
  }
}
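The `env` block in the configuration above is passed to the server process as environment variables. As an illustration of how a server might consume them, here is a minimal sketch. The `DetectorSettings` class and `load_settings` function are hypothetical and are not taken from `object_detector.py`; the fallback defaults simply reuse the example values above and are an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectorSettings:
    """Hypothetical container for the variables shown in the example config."""
    model_name: str
    conf_threshold: float
    minio_url: str
    minio_access_key: str
    minio_secret_key: str


def load_settings(env=None):
    """Read settings from a mapping (defaults to os.environ) and validate them."""
    env = os.environ if env is None else env

    # Environment values are always strings, so the threshold must be parsed.
    threshold = float(env.get("YOLO_CONF_THRESHOLD", "0.45"))
    if not 0.0 <= threshold <= 1.0:
        raise ValueError(f"YOLO_CONF_THRESHOLD must be in [0, 1], got {threshold}")

    # Credentials have no sensible default, so fail early when they are absent.
    missing = [k for k in ("MINIO_ACCESS_KEY", "MINIO_SECRET_KEY") if not env.get(k)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))

    return DetectorSettings(
        model_name=env.get("YOLO_MODEL_NAME", "yolo11m.pt"),
        conf_threshold=threshold,
        minio_url=env.get("MINIO_URL", "localhost:9000"),
        minio_access_key=env["MINIO_ACCESS_KEY"],
        minio_secret_key=env["MINIO_SECRET_KEY"],
    )
```

Validating at startup means a misconfigured threshold or missing credential surfaces as a clear error when the server launches rather than on the first tool call.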
cv-mcp-tools/
├── object_detection_mcp/ # YOLO object detection MCP server
├── ocr_imagen_mcp/ # Combined OCR + image generation MCP
├── image_generator_server/ # Standalone FLUX image generation service
├── ocr_server/ # Standalone OCR service
└── CLAUDE.md # Development guide for Claude Code
- Automated Content Analysis - Object detection and OCR for document processing
- Iterative Image Generation - Generate images with text validation loops
- Multi-Modal Workflows - Combine vision and language models for complex tasks
- Modular CV Pipeline - Mix and match components as needed
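The iterative image generation use case above (generate, then validate the rendered text with OCR, then retry) can be sketched as a small loop. This is an illustration under stated assumptions, not the workflow implemented in `ocr_imagen_mcp`: `generate` and `read_text` are hypothetical callables standing in for the image generator and OCR services, and the case-insensitive substring match is one possible validation rule.

```python
from typing import Callable, Optional, Tuple


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so minor OCR spacing differences are ignored."""
    return " ".join(text.lower().split())


def generate_until_text_matches(
    prompt: str,
    expected_text: str,
    generate: Callable[[str], bytes],   # hypothetical: prompt -> image bytes
    read_text: Callable[[bytes], str],  # hypothetical: image bytes -> OCR text
    max_attempts: int = 3,
) -> Tuple[Optional[bytes], int]:
    """Regenerate until OCR finds expected_text in the image or attempts run out.

    Returns (image, attempts_used); image is None when no attempt validated.
    """
    target = _normalize(expected_text)
    for attempt in range(1, max_attempts + 1):
        image = generate(prompt)
        if target in _normalize(read_text(image)):
            return image, attempt
    return None, max_attempts
```

Passing the two services in as callables keeps the loop independent of how they are reached (HTTP, MCP tool call, or a local model), which fits the mix-and-match design described above.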
Each component has its own README with detailed setup instructions: