Project to define a reference stack for running AI evaluations, using industry-standard tools and deployment configurations.

The-AI-Alliance/eval-ref-stack

Published Documentation

This repo contains the code and documentation for the AI Alliance Evaluation Reference Stack, which is designed to provide an out-of-the-box suite of tools for running almost any kind of model or application evaluation, e.g., for safety, performance, and alignment to use cases and requirements.

See also the Evaluation Is for Everyone project and the Achieving Confidence in Enterprise AI Applications project, which use this reference stack.

About the Evaluation Reference Stack

The project is developed by the AI Alliance Trust and Safety Work Group. Its goal is to provide enterprise developers with an easy-to-use, full-featured stack for running evaluations. The core of the stack consists of de facto industry-standard tools that many experts use to write published evaluations for trust and safety purposes. We also plan to offer an optional integration of these tools with Llama Stack, a popular, full-featured AI application framework.

The documentation describes how to use the stack.

Getting Involved

We welcome contributions as PRs. Please see our Alliance community repo for general information about contributing to any of our projects and initiatives. This section provides some specific details you need to know.

In particular, see the AI Alliance CONTRIBUTING instructions. You will need to agree to the AI Alliance Code of Conduct.

All code contributions are licensed under the Apache 2.0 LICENSE (which is also in this repo, LICENSE.Apache-2.0).

All documentation contributions are licensed under the Creative Commons Attribution 4.0 International license (which is also in this repo, LICENSE.CC-BY-4.0).

All data contributions are licensed under the Community Data License Agreement - Permissive - Version 2.0 (which is also in this repo, LICENSE.CDLA-2.0).

We use the "Developer Certificate of Origin" (DCO).

Warning

Before you make any git commits with changes, understand what's required for DCO.

See the Alliance contributing guide section on DCO for details. In practical terms, supporting this requirement means you must use the -s flag with your git commit commands.
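As a minimal sketch of what the -s flag does, the commands below create a throwaway repository, make one signed-off commit, and print the resulting commit message. The identity, file name, and commit message are hypothetical placeholders. In a real clone you would use your own configured user.name and user.email and skip the setup steps.

```shell
# Work in a throwaway repository so nothing real is touched.
repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init -q

# Hypothetical identity, for illustration only.
git config user.name "Example Contributor"
git config user.email "contributor@example.com"

# Stage a sample change (hypothetical file).
echo "sample change" > notes.txt
git add notes.txt

# -s (--signoff) appends a Signed-off-by trailer built from user.name and user.email.
git commit -q -s -m "Add sample notes"

# Print the full commit message, including the trailer.
commit_message="$(git log -1 --format=%B)"
echo "$commit_message"
```

If you forget the flag on your most recent commit, `git commit --amend -s --no-edit` adds the sign-off without changing the message text.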

About the Documentation Website

The website is published using GitHub Pages, where the pages are written in Markdown and served using Jekyll. We use the Just the Docs Jekyll theme.

See GITHUB_PAGES.md for more information.
