Oumi v0.3 Changelog
🔧 Model Quantization (NEW)
Quantization is a crucially important family of methods for reducing model size, for example, prior to deployment. Oumi now supports applying Activation-aware Weight Quantization (AWQ) to all models. See how in our notebook.
Usage Example:
# Quick start - quantize TinyLlama to 4-bit
oumi quantize --method awq_q4_0 --model "TinyLlama/TinyLlama-1.1B-Chat-v1.0" --output quantized_model
# With configuration file
oumi quantize --config quantization_config.yaml
⚖️ Judge API V2 (MAJOR UPDATE)
LLM-as-a-Judge is a method for using foundation models to reliably evaluate other foundation models. We’ve overhauled Oumi’s LLM-as-Judge interface for ease-of-use and flexibility. Check out our notebook here.
Usage Example:
from oumi.judges.simple_judge import SimpleJudge
# Built-in truthfulness judge
simple_judge = SimpleJudge(judge_config="oumi://configs/projects/judges/generic/truthfulness.yaml")
dataset = [{"request": "What is the capital of France?", "response": "Rome"}]
outputs = simple_judge.judge(dataset)
🎯 Adaptive Inference (NEW)
💪 Adaptive Inference, as we term it, refers to new features in Oumi for resuming training (or any task) when a job has crashed, as well as optimizing inference parallelization to maximize bandwidth. Learn more in our notebook.
🛠️ Developer Experience
- Updated contributing guidelines
- Enhanced documentation
- Tutorial notebook fixes
- Improved error handling and testing
- MLflow integration improvements
- Multi-node verl Slurm job support
- Rich logging handler option
New Contributors
Full Changelog: v0.2.1...v0.3.0