The TypeScript LLM Evaluation Library
Updated Jan 13, 2025 - TypeScript
RawBench: Powerful, minimal framework for LLM prompt evaluation with YAML configuration, tool execution support, and comprehensive result tracking.
Project page for our paper "REALY: Rethinking the Evaluation of 3D Face Reconstruction".
The TypeScript LLM Evaluation File
Repository for benchmark saturation research project.