v0.3.0

@amarpal

Oumi v0.3 Changelog

🔧 Model Quantization (NEW)

Quantization is a family of methods for reducing model size, for example before deployment. Oumi now supports applying Activation-aware Weight Quantization (AWQ) to all models. See how in our notebook.

Usage Example:

# Quick start - quantize TinyLlama to 4-bit
oumi quantize --method awq_q4_0 --model "TinyLlama/TinyLlama-1.1B-Chat-v1.0" --output quantized_model

# With configuration file
oumi quantize --config quantization_config.yaml
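To give a feel for what 4-bit weight quantization does, here is a minimal sketch of plain round-to-nearest quantization with one scale per group of weights. This is not AWQ (which also uses activation statistics to decide how to scale weights) and it is not Oumi's implementation; the function names and the `group_size` parameter are hypothetical and chosen only for illustration.

```python
def quantize_4bit(weights, group_size=4):
    """Round-to-nearest symmetric 4-bit quantization, one scale per group.

    Illustrative only: hypothetical helper, not part of Oumi and not AWQ.
    """
    codes, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # Signed 4-bit integers span [-8, 7]; map the largest magnitude to 7.
        scale = max_abs / 7 if max_abs > 0 else 1.0
        scales.append(scale)
        codes.extend(max(-8, min(7, round(w / scale))) for w in group)
    return codes, scales


def dequantize_4bit(codes, scales, group_size=4):
    """Recover approximate float weights from 4-bit codes and group scales."""
    return [c * scales[i // group_size] for i, c in enumerate(codes)]


weights = [0.7, -0.35, 0.1, 0.0, 1.4, -1.4, 0.2, 0.6]
codes, scales = quantize_4bit(weights)
restored = dequantize_4bit(codes, scales)

for w, r in zip(weights, restored):
    print(f"{w:+.3f} -> {r:+.3f}")
```

Each weight is stored as a 4-bit integer plus a shared per-group scale, so the reconstruction error for any weight is at most half of its group's scale. Smaller groups cost more scales but track the weights more closely.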

⚖️ Judge API V2 (MAJOR UPDATE)

LLM-as-a-Judge is a method for using foundation models to reliably evaluate other foundation models. We've overhauled Oumi's LLM-as-a-Judge interface for ease of use and flexibility. Check out our notebook.

Usage Example:

from oumi.judges.simple_judge import SimpleJudge

# Built-in truthfulness judge
simple_judge = SimpleJudge(judge_config="oumi://configs/projects/judges/generic/truthfulness.yaml")

dataset = [{"request": "What is the capital of France?", "response": "Rome"}]
outputs = simple_judge.judge(dataset)

🎯 Adaptive Inference (NEW)

💪 Adaptive Inference, as we term it, refers to new Oumi features for resuming training (or any other task) after a job has crashed, and for optimizing inference parallelization to maximize bandwidth. Learn more in our notebook.
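The general idea of resuming after a crash while running requests in parallel can be sketched as follows. This is a generic illustration under stated assumptions, not Oumi's implementation: `run_resumable`, its JSONL checkpoint format, and the `max_workers` parameter are all hypothetical names introduced here.

```python
import json
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def run_resumable(prompts, infer_fn, checkpoint_path, max_workers=4):
    """Run infer_fn over prompts in parallel, skipping results already saved.

    Hypothetical helper for illustration; not an Oumi API.
    """
    done = {}
    if os.path.exists(checkpoint_path):
        # Reload results written by a previous (possibly crashed) run.
        with open(checkpoint_path) as f:
            for line in f:
                record = json.loads(line)
                done[record["index"]] = record["output"]

    pending = [i for i in range(len(prompts)) if i not in done]
    with ThreadPoolExecutor(max_workers=max_workers) as pool, \
            open(checkpoint_path, "a") as f:
        outputs = pool.map(infer_fn, [prompts[i] for i in pending])
        for i, output in zip(pending, outputs):
            # Persist each result as soon as it is available.
            f.write(json.dumps({"index": i, "output": output}) + "\n")
            f.flush()
            done[i] = output
    return [done[i] for i in range(len(prompts))]


def flaky_infer(prompt):
    """Stand-in model call that crashes on one input."""
    if prompt == "c":
        raise RuntimeError("simulated crash")
    return prompt.upper()


second_calls = []


def working_infer(prompt):
    """Stand-in model call that records which prompts it was asked for."""
    second_calls.append(prompt)
    return prompt.upper()


checkpoint = os.path.join(tempfile.mkdtemp(), "results.jsonl")
prompts = ["a", "b", "c"]
try:
    run_resumable(prompts, flaky_infer, checkpoint)
except RuntimeError:
    pass  # results for "a" and "b" are already on disk

results = run_resumable(prompts, working_infer, checkpoint)
print(results, second_calls)
```

On the second run only the unfinished prompt is sent to the model, because the first two results are read back from the checkpoint file. The thread pool size is the knob that trades parallelism against rate or bandwidth limits.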

🛠️ Developer Experience

Updated contributing guidelines
Enhanced documentation
Tutorial notebook fixes
Improved error handling and testing
MLflow integration improvements
Multi-node verl Slurm job support
Rich logging handler option

New Contributors

@amarpal made their first contribution in #1831
@42Shawn made their first contribution in #1837

Full Changelog: v0.2.1...v0.3.0
