Open-source, 100% local audio transcription and subtitling suite with a full-featured web UI
Warning
This rewrite is under development. The initial stages of development are focused on enhancing the quality and reliability of the APIs. The goal is to ensure easier scalability, broader compatibility, and overall improved performance. After the APIs are reliable and ready, the focus will move to the implementation of a better web UI.
Tip
The WhisperX API, which powers Anysub, is available for testing. For instructions on how to run it, refer to the README.
- Transcribe any media to text: audio, video, etc.
  - Upload a file to transcribe.
  - Speaker detection and diarization.
  - WhisperX alignment.
  - Better segment splitting.
- Translate transcriptions to any language supported by LibreTranslate.
- 100% local: transcription, translation, and subtitle editing all happen on your machine (and can even work offline).
- Fast: uses WhisperX as the Whisper backend for much faster transcription times on CPU.
- Download transcriptions in:
  - VTT - Speakers colorized
  - ASS - Speakers colorized
  - JSON
  - TXT
- CPU: Anysub is optimized to run efficiently on CPU-only systems.
- GPU acceleration: leverage NVIDIA GPUs for significantly faster transcription times.
- Backend workers
  - Anysub can orchestrate multiple whisperx-api workers, balancing the job queue across all available resources. The job queue uses asynq.
- User authentication. You can now register multiple users with separate workspaces.
- Web UI
  - Create
  - Translate
  - Download subtitles
  - Summarize
  - Subtitle editor
- Transcribe from URLs (any source supported by yt-dlp)
- Subtitle editor
  - Transcription highlighting based on media position
  - CPS (characters per second) warnings
  - Segment splitting
  - Segment insertion
  - Subtitle language selection
- Quick and easy setup: use the quick start script, or run through a few steps
- AI summarization of transcriptions, using either OpenAI or Ollama
- No longer uses MongoDB; the backend is now MariaDB.
- Uses the WhisperX backend: better accuracy, speaker diarization, alignment...
- Anysub isn't limited to a single machine! With the worker system, you can set up multiple whisperx-api workers on different servers (or on the same one). Anysub will then handle the tasks, making the best use of all available resources.
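
The worker model described above can be pictured as a shared queue that several workers pull from. The following is a minimal sketch of that idea using only Go's standard library. It is not Anysub's implementation and does not use the asynq API; the `Job`, `Result`, and `dispatch` names are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// Job is a hypothetical transcription task (not Anysub's real job type).
type Job struct {
	ID   int
	File string
}

// Result records which worker handled which job.
type Result struct {
	JobID  int
	Worker string
}

// dispatch fans jobs out to the named workers over one shared queue.
// Each worker takes the next job as soon as it is free, so idle or
// faster workers naturally pick up more of the queue.
func dispatch(jobs []Job, workers []string) []Result {
	if len(workers) == 0 {
		return nil // nothing can consume the queue
	}
	queue := make(chan Job)
	results := make(chan Result, len(jobs)) // buffered so workers never block

	var wg sync.WaitGroup
	for _, name := range workers {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			for job := range queue {
				// Assumption: a real worker would call a whisperx-api instance here.
				results <- Result{JobID: job.ID, Worker: name}
			}
		}(name)
	}

	for _, job := range jobs {
		queue <- job
	}
	close(queue)
	wg.Wait()
	close(results)

	out := make([]Result, 0, len(jobs))
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	jobs := []Job{{ID: 1, File: "a.mp3"}, {ID: 2, File: "b.mp4"}, {ID: 3, File: "c.wav"}}
	for _, r := range dispatch(jobs, []string{"worker-1", "worker-2"}) {
		fmt.Printf("job %d handled by %s\n", r.JobID, r.Worker)
	}
}
```

Which worker handles which job varies from run to run, since each worker simply takes whatever is next in the queue. A production queue such as asynq adds persistence, retries, and distribution across machines on top of this basic pattern.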
At present, there is no testing documentation. Comprehensive testing guidelines will be provided once the To-Dos Before Release are completed.
The WhisperX API can also be tested standalone; see its README for instructions on running it.
You will need Go, templ, Docker, npm, and optionally gow.
- Run `docker compose up`.
- Run `npm run dev` to start the development environment.
- Visit http://localhost:1337
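
The last step, visiting http://localhost:1337, can also be scripted when you want to confirm that the development server has finished starting. Below is a small sketch in Go, assuming the server answers a plain GET on its root path. The `waitForServer` helper, the retry count, and the delay are illustrative choices and not part of Anysub.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// waitForServer polls url until it returns any HTTP response or the
// attempts run out. Hypothetical helper, not part of Anysub.
func waitForServer(url string, attempts int, delay time.Duration) error {
	if attempts < 1 {
		attempts = 1
	}
	client := &http.Client{Timeout: 2 * time.Second}
	var lastErr error
	for i := 0; i < attempts; i++ {
		resp, err := client.Get(url)
		if err == nil {
			resp.Body.Close()
			return nil // any response means the server is listening
		}
		lastErr = err
		time.Sleep(delay)
	}
	return fmt.Errorf("no response from %s after %d attempts: %w", url, attempts, lastErr)
}

func main() {
	// The URL comes from the setup steps; attempts and delay are arbitrary.
	if err := waitForServer("http://localhost:1337", 3, 500*time.Millisecond); err != nil {
		fmt.Println("dev server not reachable:", err)
		return
	}
	fmt.Println("dev server is up")
}
```

The check treats any HTTP status as success, because it only verifies that something is listening on the port, not that the page rendered correctly.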
- Local folder as media input.
- Full-text search all transcriptions.
- Audio recording from the browser.
- Backend:
- Frontend: