@basetenlabs

Baseten

Machine learning infrastructure for developers

Welcome to Baseten

Baseten is an AI infrastructure platform. We combine applied performance research, distributed multi-cloud infrastructure, and developer tooling to run models of all modalities in production.

Get started:

  • Deploy an open-source model in two clicks from the model library.
  • Read our docs to package and serve a fine-tuned or custom model.

Pinned

  1. truss Public

    The simplest way to serve AI/ML models in production

    Python · 1k stars · 88 forks

  2. truss-examples Public

    Examples of models deployable with Truss

    Python · 195 stars · 48 forks

Repositories

Showing 10 of 64 repositories
  • ml-cookbook Public

    Ready-to-use ML training recipes to help you build and deploy models on Baseten.

    2 stars · MIT · 0 forks · 0 issues · 2 pull requests · Updated Aug 16, 2025
  • truss Public

    The simplest way to serve AI/ML models in production

    Python · 1,040 stars · MIT · 88 forks · 64 issues (5 issues need help) · 26 pull requests · Updated Aug 15, 2025
  • genai-bench Public Forked from sgl-project/genai-bench

    Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serving systems.

    Python · 0 stars · MIT · 17 forks · 0 issues · 0 pull requests · Updated Aug 14, 2025
  • truss-examples Public

    Examples of models deployable with Truss

    Python · 195 stars · MIT · 48 forks · 14 issues · 57 pull requests · Updated Aug 11, 2025
  • LatentSync Public Forked from bytedance/LatentSync

    Taming Stable Diffusion for Lip Sync!

    Python · 0 stars · Apache-2.0 · 773 forks · 0 issues · 2 pull requests · Updated Aug 6, 2025
  • TensorRT-Model-Optimizer Public Forked from NVIDIA/TensorRT-Model-Optimizer

    A unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.

    Python · 1 star · 117 forks · 0 issues · 3 pull requests · Updated Aug 6, 2025
  • gorilla Public Forked from ShishirPatil/gorilla

    Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)

    Python · 0 stars · Apache-2.0 · 1,277 forks · 0 issues · 0 pull requests · Updated Aug 6, 2025
  • harmony Public Forked from openai/harmony

    Renderer for the harmony response format to be used with gpt-oss

    Rust · 0 stars · Apache-2.0 · 176 forks · 0 issues · 0 pull requests · Updated Aug 5, 2025
  • create-pull-request Public Forked from peter-evans/create-pull-request

    A GitHub action to create a pull request for changes to your repository in the actions workspace

    TypeScript · 0 stars · MIT · 536 forks · 0 issues · 2 pull requests · Updated Jul 22, 2025
  • action-slack Public Forked from 8398a7/action-slack

    Provides Slack notifications for GitHub Actions.

    TypeScript · 0 stars · MIT · 145 forks · 0 issues · 2 pull requests · Updated Jul 22, 2025
