Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
Updated Jun 11, 2025 - Python
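Project that allows one to use a microphone with OpenAI Whisper.
⚡ A high-performance asynchronous API for automatic speech recognition (ASR) and translation. There is no need to purchase the Whisper API: it runs inference with a locally hosted Whisper model, supports multi-GPU concurrency, and is designed for distributed deployment. It also ships with built-in crawlers for social media platforms such as TikTok and Douyin, enabling seamless media processing across multiple social platforms and providing a powerful, scalable solution for automated processing of media content data.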
Private voice keyboard, AI chat, images, webcam, recordings, voice control with >= 4 GiB of VRAM.
A stream-translator fork with VAD based audio slicing & GPT / Gemini translation.
whisper.cpp bindings for Python
A feature-rich Python-based Telegram bot for OpenAI API & Perplexity API
Live translation tool that uses OpenAI's Whisper model for real-time audio transcription and translation into a language of your choice, using your own OpenAI API key (BYOK).
WhisperX FastAPI integration
Drop-in replacement for OpenAI's Whisper API that exposes the same API but runs locally
YouTube video summarization app built with open-source LLMs and frameworks such as Llama 2, Haystack, Whisper, and Streamlit. The app runs smoothly on CPU because the Llama 2 model is in GGUF format and loaded through Llama.cpp.
A working Speech to Speech AI assistant that can interact with you, manage your system, and more!
This repository provides a Flask app that processes voice messages recorded through Twilio or Twilio Studio, transcribes them using OpenAI's Whisper ASR, generates responses with GPT-3.5, and sends the replies as SMS using Twilio.
Discord bot that downloads and transcribes Twitter Space audio files
YASS.ai - Team Orange's entry to the Flow AI Hackathon 2023
A simple UI tool written in Python for recording audio from a microphone and automatically transcribing the recording using OpenAI's Whisper model via OpenAI's API.
A subtitle generator for videos up to 10 GB that automatically transcribes spoken content and translates it into Brazilian Portuguese. Ideal for multilingual content, this tool creates accurate `.srt` files for seamless integration with video players.
A Streamlit-based web application that transcribes audio files using OpenAI's Whisper API. You can either upload an MP3 file or input a YouTube URL to convert video audio into text within seconds.
The VoiceProcessingToolkit is a suite for voice detection, wake word recognition, text-to-speech synthesis, and advanced audio processing. It offers intuitive interfaces to streamline the integration of voice processing capabilities into your applications.