Contributor

Copilot AI commented Aug 5, 2025

This PR addresses the documentation reorganization and feature documentation requirements outlined in the issue:

Changes Made

1. Proposal Reorganization

  • Moved all proposals from website/docs/proposals/ to docs/proposals/ at the repository root to prevent them from being rendered as website pages
  • Removed duplicate proposal folders from website/versioned_docs/version-v0.5.0/proposals/ and website/versioned_docs/version-v0.5.1/proposals/
  • Updated all proposal links in proposals.md to use GitHub URLs pointing to the new location (https://github.com/kaito-project/kaito/blob/main/docs/proposals/)

2. Model As OCI Artifacts Feature Documentation

Created a comprehensive new feature page at website/docs/model-as-oci-artifacts.md based on the proposal docs/proposals/20250609-model-as-oci-artifacts.md that includes:

  • Overview and motivation explaining the challenges with traditional containerized model distribution
  • Technical details about image building challenges and pulling inefficiencies
  • Solution architecture with ORAS push, improved compression, and split architecture approach
  • Performance analysis including the container pull process breakdown
  • Compatibility information for container runtimes and OCI registries
  • Benefits and performance results with the evaluation chart from the original proposal
  • Getting started guidance and next steps

3. Navigation and Versioning

  • Added "Model As OCI Artifacts" to the Features section in the main sidebar configuration
  • Made the feature page available in all versions from v0.5.0 to vNext by copying to versioned docs and updating versioned sidebars
  • Verified all links work correctly and the website builds successfully

Screenshot

The new feature page integrates seamlessly into the documentation with comprehensive content and proper navigation:

[Screenshot: Model As OCI Artifacts feature page]

The page includes detailed sections covering all aspects of the OCI Artifacts approach, from technical challenges to implementation benefits, with a professional layout and table of contents for easy navigation.

Fixes #1355.


chewong changed the title from "[WIP] Document Model-As-OCI-Artifacts feature in website" to "[WIP] docs: Document Model-As-OCI-Artifacts feature in website" Aug 5, 2025
Copilot AI and others added 2 commits August 5, 2025 17:44
Co-authored-by: chewong <10557231+chewong@users.noreply.github.com>
… page

Co-authored-by: chewong <10557231+chewong@users.noreply.github.com>
Copilot AI changed the title from "[WIP] docs: Document Model-As-OCI-Artifacts feature in website" to "Document Model-As-OCI-Artifacts feature and reorganize proposals" Aug 5, 2025
Copilot AI requested a review from chewong August 5, 2025 17:54
Copilot finished work on behalf of chewong August 5, 2025 17:54

## Overview

Currently, KAITO employs a solution where the runtime library and model files are packaged within a single image. This method ensures a reliable and self-contained environment, particularly effective for distributing small models. However, as large language models grow, bundling them within containerized images becomes impractical.
Collaborator

Suggested change
- Currently, KAITO employs a solution where the runtime library and model files are packaged within a single image. This method ensures a reliable and self-contained environment, particularly effective for distributing small models. However, as large language models grow, bundling them within containerized images becomes impractical.
+ Currently, KAITO employs a solution where the runtime library and model files are packaged within a single container image. This method ensures a reliable and self-contained environment, particularly effective for distributing small models. However, as large language models grow, bundling them within containerized images becomes impractical.


The system uses an initContainer to download model files as OCI artifacts using ORAS:

`` ```mermaid ``
Collaborator

This is needed to render the Mermaid diagram on the website:

export default {
  markdown: {
    mermaid: true,
  },
  themes: ['@docusaurus/theme-mermaid'],
};


Traditional containerized model distribution faces several challenges:

- **Build Time**: With KAITO hosting multiple models, base images are frequently updated due to vulnerability fixes and feature requests. Each time the base image is updated, every model image needs to be rebuilt.
Collaborator

Suggested change
- - **Build Time**: With KAITO hosting multiple models, base images are frequently updated due to vulnerability fixes and feature requests. Each time the base image is updated, every model image needs to be rebuilt.
+ - **Build Time**: With KAITO hosting multiple preset models, base images are frequently updated due to vulnerability fixes and feature requests. Each time the base image is updated, every model image needs to be rebuilt.


### 1. Build Image Using ORAS Push

Instead of sending large model files to docker builder context, KAITO uses ORAS push to add model files to OCI layout assembly. This achieves the same result as Docker build but is much more efficient.
Collaborator

Suggested change
- Instead of sending large model files to docker builder context, KAITO uses ORAS push to add model files to OCI layout assembly. This achieves the same result as Docker build but is much more efficient.
+ Instead of sending large model files to docker builder context, KAITO uses [ORAS](https://github.com/oras-project/oras) to add model weights and configuration files to OCI layout assembly. This achieves the same result as `docker build` but is much more efficient.
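
For illustration only, here is a minimal Python sketch of the ORAS-based build step discussed in this thread. It assembles an `oras push` invocation that writes model files into a local OCI image layout instead of a Docker build context. The layout directory, tag, file names, and the `build_oras_push_command` helper are hypothetical. The exact command KAITO runs is not shown in this conversation, so this assumes only the general `oras push --oci-layout <dir>:<tag> <file>...` form of the ORAS CLI.

```python
from typing import List


def build_oras_push_command(layout_dir: str, tag: str, files: List[str]) -> List[str]:
    """Build an argv list for `oras push` targeting a local OCI image layout.

    Hypothetical helper for illustration: it only assembles the command
    and does not execute it.
    """
    if not files:
        raise ValueError("at least one model file is required")
    # --oci-layout tells ORAS to write into an OCI layout directory on disk
    # rather than pushing to a remote registry.
    return ["oras", "push", "--oci-layout", f"{layout_dir}:{tag}", *files]


# Made-up file names standing in for model weights and configuration.
command = build_oras_push_command(
    "model-layout", "v1", ["model.safetensors", "config.json"]
)
print(" ".join(command))
```

The resulting list could be passed to `subprocess.run` in a build script. Keeping it as a list avoids shell quoting problems with file names.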


The containerized solution is split into two parts:

- **Base Image**: Contains the runtime and dependencies
Collaborator

Suggested change
- - **Base Image**: Contains the runtime and dependencies
+ - **Base Image**: Contains the inference runtime and dependencies

The containerized solution is split into two parts:

- **Base Image**: Contains the runtime and dependencies
- **OCI Artifacts**: Contains the model files
Collaborator

Suggested change
- - **OCI Artifacts**: Contains the model files
+ - **OCI Artifacts**: Contains the model weights and configuration files


### Model Files Download Process

The system uses an initContainer to download model files as OCI artifacts using ORAS:
Collaborator

Suggested change
- The system uses an initContainer to download model files as OCI artifacts using ORAS:
+ The system uses an [initContainer](https://kubernetes.io/docs/concepts/workloads/pods/init-containers/) to download model files as OCI artifacts using ORAS:
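
As a rough sketch of the split described here (base image plus model files pulled as OCI artifacts), the Python snippet below builds a Pod spec as a plain dictionary. An init container runs `oras pull` into an `emptyDir` volume, and the inference container mounts the same volume. All image references, container names, the mount path, and the `build_pod_spec` helper are hypothetical placeholders, not KAITO's actual manifest. It assumes only that init containers finish before the main container starts and that `oras pull <ref> --output <dir>` downloads into a directory.

```python
from typing import Any, Dict


def build_pod_spec(
    base_image: str,
    artifact_ref: str,
    mount_path: str = "/workspace/weights",
) -> Dict[str, Any]:
    """Return a Pod spec where an init container pulls model files with ORAS.

    Hypothetical sketch: names, images, and paths are placeholders.
    """
    shared_mount = {"name": "model-files", "mountPath": mount_path}
    return {
        # emptyDir volume shared by the init container and the main container.
        "volumes": [{"name": "model-files", "emptyDir": {}}],
        "initContainers": [
            {
                "name": "model-downloader",
                # Placeholder image assumed to ship the `oras` CLI.
                "image": "example.com/tools/oras:latest",
                "command": ["oras", "pull", artifact_ref, "--output", mount_path],
                "volumeMounts": [shared_mount],
            }
        ],
        "containers": [
            {
                "name": "inference",
                # Base image: inference runtime and dependencies, no model files.
                "image": base_image,
                "volumeMounts": [shared_mount],
            }
        ],
    }


spec = build_pod_spec("example.com/kaito/base:latest", "example.com/models/demo:v1")
print(spec["initContainers"][0]["command"])
```

Because both containers mount the same volume at the same path, the inference runtime finds the downloaded files where the init container wrote them.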

Collaborator

Also mention that for very large models, we allow direct download from Hugging Face instead of packaging them as OCI artifacts.

chewong closed this Aug 5, 2025
chewong reopened this Aug 5, 2025
chewong marked this pull request as ready for review August 5, 2025 21:08

Collaborator

chewong commented Aug 5, 2025

@copilot could you address the comments

codecov bot commented Aug 5, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

@@           Coverage Diff           @@
##             main    #1356   +/-   ##
=======================================
  Coverage   56.60%   56.60%           
=======================================
  Files          75       75           
  Lines        8252     8252           
=======================================
  Hits         4671     4671           
  Misses       3359     3359           
  Partials      222      222           
Components Coverage Δ
workspace 46.85% <ø> (ø)
presets 87.31% <ø> (ø)
main ∅ <ø> (∅)

kaito-pr-agent bot commented Aug 5, 2025

Title

Document Model-As-OCI-Artifacts feature and reorganize proposals


Description

  • Moved proposals from website/docs/proposals to docs/proposals to prevent rendering as website pages.

  • Removed duplicate proposal folders from website/versioned_docs.

  • Created a new feature page Model As OCI Artifacts under the "Features" section in the website.

  • Added "Model As OCI Artifacts" to the Features section in the main sidebar configuration.


Changes walkthrough 📝

Relevant files

Enhancement (11 files)
  • sidebars.js: Added "Model As OCI Artifacts" to Features section (+1/-0)
  • 20250704-keda-scaler-for-inference-workloads.md: Added new proposal for Keda Scaler for Inference Workloads [link]
  • 20250630-kaito-cli.md: Added new proposal for Kaito Kubectl CLI Plugin [link]
  • 20250325-distributed-inference.md: Added new proposal for Distributed Inference [link]
  • 20250609-model-as-oci-artifacts.md: Added new proposal for Model As OCI Artifacts [link]
  • 20250715-inference-aware-routing-layer.md: Added new proposal for Inference-Aware Routing Layer [link]
  • 20250611-workspace-subresource-scale-api.md: Added new proposal for Support scale subresource api for workspace [link]
  • model-as-oci-artifacts.md: Created new feature page for Model As OCI Artifacts (+146/-0)
  • model-as-oci-artifacts.md: Created new feature page for Model As OCI Artifacts (+146/-0)
  • 20250529-llama-3.3-70b-instruct.md: Added new proposal for Llama-3.3-70B-Instruct model support [link]
  • version-v0.5.0-sidebars.json: Added "Model As OCI Artifacts" to Features section (+2/-1)

Additional files (38 files)
  • 20240205-mistral-instruct.md [link]
  • 20240205-mistral.md [link]
  • 20240206-phi-2.md [link]
  • 20240527-phi3-instruct.md [link]
  • 20241212-phi4-instruct.md [link]
  • 20250103-qwen2.5-coder.md [link]
  • YYYYMMDD-model-template.md [link]
  • model-as-oci-artifacts.md +146/-0
  • proposals.md +11/-11
  • proposals.md +11/-11
  • 20240205-mistral-instruct.md +0/-51
  • 20240205-mistral.md +0/-50
  • 20240206-phi-2.md +0/-50
  • 20240527-phi3-instruct.md +0/-53
  • 20241212-phi4-instruct.md +0/-50
  • 20250103-qwen2.5-coder.md +0/-50
  • 20250325-distributed-inference.md +0/-259
  • 20250529-llama-3.3-70b-instruct.md +0/-105
  • 20250609-model-as-oci-artifacts.md +0/-256
  • 20250611-workspace-subresource-scale-api.md +0/-174
  • 20250704-keda-scaler-for-inference-workloads.md +0/-842
  • 20250715-inference-aware-routing-layer.md +0/-268
  • YYYYMMDD-model-template.md +0/-75
  • proposals.md +11/-11
  • 20240205-mistral-instruct.md +0/-51
  • 20240205-mistral.md +0/-50
  • 20240206-phi-2.md +0/-50
  • 20240527-phi3-instruct.md +0/-53
  • 20241212-phi4-instruct.md +0/-50
  • 20250103-qwen2.5-coder.md +0/-50
  • 20250325-distributed-inference.md +0/-259
  • 20250529-llama-3.3-70b-instruct.md +0/-105
  • 20250609-model-as-oci-artifacts.md +0/-256
  • 20250611-workspace-subresource-scale-api.md +0/-174
  • 20250704-keda-scaler-for-inference-workloads.md +0/-842
  • 20250715-inference-aware-routing-layer.md +0/-268
  • YYYYMMDD-model-template.md +0/-75
  • version-v0.5.1-sidebars.json +2/-1

chewong closed this Aug 5, 2025
chewong deleted the copilot/fix-1355 branch August 5, 2025 21:10
kaito-pr-agent bot commented Aug 5, 2025

PR Code Suggestions ✨

Explore these optional code suggestions:

Category: General
Suggestion: Verify sidebar link
Impact: Low

Ensure that the new sidebar entry is correctly linked to the corresponding documentation page.

website/sidebars.js [50]

+'model-as-oci-artifacts',
 
-

Suggestion importance[1-10]: 6

Why: The suggestion asks to verify the correctness of the sidebar link, which is a reasonable check but does not involve modifying the code. It is important to ensure links are correct, but this suggestion does not offer a direct improvement to the functionality or correctness of the code.

Copilot AI added a commit that referenced this pull request Aug 5, 2025
Co-authored-by: chewong <10557231+chewong@users.noreply.github.com>