Note: This project is still under active development and some features may be unstable. Please use it with caution.
The English README is automatically generated. Please refer to the Chinese version for the most accurate information.
This is an AI-powered steward system based on large language models that can interact with users through voice or text to help control smart home devices and computer programs.
- 2024-12-18: Added support for HomeAssistant. You can now control HomeAssistant and Mi Home devices; see omni-ha for more details
- Supports multi-turn dialogue for continuous user interaction
- Supports tool calling to execute complex tasks on your computer
- Supports multiple LLM models that can be switched as needed
- Highly extensible - you can easily customize and share your own tools
- 🎤 Voice recognition and interaction
- 🏠 Smart home control (HomeAssistant/Bemfa devices/Mi Home devices)
- 💻 Computer program management (start/stop programs)
- 🔍 Online information retrieval (via Stepfun Web Search or Kimi AI)
- ⌨️ Command line operations
- 📂 File management (file search/read/write/compress/list directory)
We have prepared a series of demo videos. Please watch them to understand the main features and usage of the system.
- Python 3.8+
- Chrome browser (for Kimi AI functionality)
- Windows OS (some features are Windows-only; Linux and macOS are untested)
- Clone repository
git clone https://github.com/OmniSteward/OmniSteward.git
cd OmniSteward
- Install dependencies
pip install -r requirements.txt
See examples/env.cmd file
OPENAI_API_BASE=your_api_base # OpenAI format API base URL
OPENAI_API_KEY=your_api_key # OpenAI format API key
SILICON_FLOW_API_KEY=your_api_key # Silicon Flow API key for ASR, Rerank, see [LLM Platforms](docs/PLATFORM.md)
BEMFA_UID=your_bemfa_uid # Bemfa platform UID (optional, for smart home control)
BEMFA_TOPIC=your_bemfa_topic # Bemfa platform Topic (optional, for smart home control)
KIMI_PROFILE_PATH=path_to_chrome_profile # Chrome user data directory (optional, for Kimi AI, uses default path if not set)
LOCATION=your_location # Your geographic location (optional, for system prompts)
LLM_MODEL=your_llm_model # LLM model to use, optional, defaults to Qwen2.5-7B-Instruct
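As an illustration of how these variables might be consumed, here is a minimal Python sketch that reads them and applies the documented default for LLM_MODEL. The function name `load_settings` and the split into required and optional variables are assumptions made for this example (variables not marked "optional" above are treated as required); the project's own configuration code may work differently.

```python
import os
from typing import Mapping, Optional

# Variable names are taken from the list above. Treating these three as
# required is an assumption of this sketch, not confirmed project behavior.
REQUIRED = ("OPENAI_API_BASE", "OPENAI_API_KEY", "SILICON_FLOW_API_KEY")

# Optional variables and the value used when they are unset.
OPTIONAL_DEFAULTS = {
    "BEMFA_UID": None,
    "BEMFA_TOPIC": None,
    "KIMI_PROFILE_PATH": None,
    "LOCATION": None,
    "LLM_MODEL": "Qwen2.5-7B-Instruct",  # default stated in the list above
}


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED}
    for name, default in OPTIONAL_DEFAULTS.items():
        # An empty string is treated the same as an unset variable.
        settings[name] = env.get(name) or default
    return settings
```

Calling `load_settings()` after `call examples\env.cmd` would then either return a dictionary of settings or fail early with the names of the missing variables.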
To obtain an OpenAI-format API key and base URL, see LLM Platforms.
This project supports two usage modes:
- Command Line Interface (CLI): interact directly through the command line.
- Web Mode: requires the frontend project; you interact through the WebUI, which can be used remotely from a phone, tablet, or computer to manage smart home devices.
Please first configure the environment variables in the examples/env.cmd file (see Environment Variables Configuration).
First start the VAD service:
python -m servers.vad_rpc
Then open a new command prompt window and run:
REM Apply environment variables
call examples\env.cmd
REM Run CLI
python -m core.cli --config configs/cli.py
See examples/cli_voice.cmd for more details
REM Apply environment variables
call examples\env.cmd
python -m core.cli --query "open NetEase Music" --config configs/cli.py
REM Apply environment variables
call examples\env.cmd
python -m core.cli --query "print hello" --config configs/cli_custom_tool.py
This example adds a simple print tool, defined in configs/cli_custom_tool.py, that can print any string. See that file to learn how to add custom tools.
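To give a rough idea of what a tool of this kind involves, the sketch below pairs a Python function with a description in the OpenAI function-calling schema and dispatches a call by name. This is not the OmniSteward tool API: `print_tool`, `PRINT_TOOL_SCHEMA`, `TOOLS`, and `dispatch` are hypothetical names invented for illustration, and configs/cli_custom_tool.py remains the authoritative example.

```python
import json

def print_tool(text: str) -> str:
    """Hypothetical tool: print the given string and return it."""
    print(text)
    return text

# Description of the tool in the OpenAI function-calling format.
# How OmniSteward itself declares tools is not shown here (assumption).
PRINT_TOOL_SCHEMA = {
    "type": "function",
    "function": {
        "name": "print",
        "description": "Print any string.",
        "parameters": {
            "type": "object",
            "properties": {
                "text": {"type": "string", "description": "The string to print."}
            },
            "required": ["text"],
        },
    },
}

# Hypothetical registry mapping tool names to Python callables.
TOOLS = {"print": print_tool}

def dispatch(name: str, arguments_json: str) -> str:
    """Run the named tool with JSON-encoded arguments, as an LLM would supply them."""
    arguments = json.loads(arguments_json)
    return TOOLS[name](**arguments)
```

For example, `dispatch("print", '{"text": "hello"}')` prints `hello` and returns the same string, which is roughly what a query such as "print hello" would be expected to trigger.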
- Requires the frontend WebUI, OmniSteward-Frontend
- Environment variables must be configured, especially the Silicon Flow API key
- Frontend WebUI should run on http://localhost:3000; the backend will forward requests to the frontend when started
- Backend service should run on http://localhost:8000
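When something does not load, it can help to confirm that both services are actually listening on the ports listed above. The following is a small diagnostic sketch using only the Python standard library; the helper name `is_listening` is an assumption for this example and is not part of the project.

```python
import socket

def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False

def report() -> None:
    """Print the status of the frontend and backend ports named above."""
    for name, port in (("frontend", 3000), ("backend", 8000)):
        state = "up" if is_listening("localhost", port) else "not reachable"
        print(f"{name} (port {port}): {state}")
```

Running `report()` after starting both services should show each port as up; a "not reachable" line points to the service that still needs to be started.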
Please first configure the environment variables in the examples/env.cmd file (see Environment Variables Configuration), then run the following in the project root:
REM Apply environment variables
call examples\env.cmd
python -m servers.steward --config configs/backend.py
See OmniSteward-Frontend project.
Use the Chrome or Edge browser and open http://localhost:8000 to start using the system.
Note: For external network access, Chrome and Edge block the microphone on plain HTTP pages by default, so you need to set chrome://flags/#unsafely-treat-insecure-origin-as-secure to http://ip:port; otherwise the microphone cannot be used. See tutorial for reference.
Mobile phones can also use the Chrome or Edge browser: open http://ip:port to start using the system. The same setting as above is required.
See TOOL_LIST.md
- Some features require specific API keys and environment configuration
- Command line tools require user confirmation before execution
- Smart home control features require corresponding hardware support
This project is currently maintained by ElliottZheng. Issues and pull requests are welcome!
Thanks to the Stepfun Stars Program for supporting this project.
Copyright (c) 2024-present ElliottZheng
See steward-utils project for more custom tool examples.