Document Model-As-OCI-Artifacts feature and reorganize proposals #1356
Co-authored-by: chewong <10557231+chewong@users.noreply.github.com>
… page Co-authored-by: chewong <10557231+chewong@users.noreply.github.com>
## Overview | ||
Currently, KAITO employs a solution where the runtime library and model files are packaged within a single image. This method ensures a reliable and self-contained environment, particularly effective for distributing small models. However, as large language models grow, bundling them within containerized images becomes impractical. |
Currently, KAITO employs a solution where the runtime library and model files are packaged within a single image. This method ensures a reliable and self-contained environment, particularly effective for distributing small models. However, as large language models grow, bundling them within containerized images becomes impractical. | |
Currently, KAITO employs a solution where the runtime library and model files are packaged within a single container image. This method ensures a reliable and self-contained environment, particularly effective for distributing small models. However, as large language models grow, bundling them within containerized images becomes impractical. |
The system uses an initContainer to download model files as OCI artifacts using ORAS: | ||
```mermaid |
Need this to render the mermaid diagram on the website:
export default {
  markdown: {
    mermaid: true,
  },
  themes: ['@docusaurus/theme-mermaid'],
};
Traditional containerized model distribution faces several challenges: | ||
- **Build Time**: With KAITO hosting multiple models, base images are frequently updated due to vulnerability fixes and feature requests. Each time the base image is updated, every model image needs to be rebuilt. |
- **Build Time**: With KAITO hosting multiple models, base images are frequently updated due to vulnerability fixes and feature requests. Each time the base image is updated, every model image needs to be rebuilt. | |
- **Build Time**: With KAITO hosting multiple preset models, base images are frequently updated due to vulnerability fixes and feature requests. Each time the base image is updated, every model image needs to be rebuilt. |
### 1. Build Image Using ORAS Push | ||
Instead of sending large model files to docker builder context, KAITO uses ORAS push to add model files to OCI layout assembly. This achieves the same result as Docker build but is much more efficient. |
Instead of sending large model files to docker builder context, KAITO uses ORAS push to add model files to OCI layout assembly. This achieves the same result as Docker build but is much more efficient. | |
Instead of sending large model files to docker builder context, KAITO uses [ORAS](https://github.com/oras-project/oras) to add model weights and configuration files to OCI layout assembly. This achieves the same result as `docker build` but is much more efficient. |
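To make the OCI-layout idea above concrete, here is a minimal Python sketch of the kind of manifest an OCI artifact for model files carries: one layer descriptor per file, each identified by digest and size. It is not KAITO's build code and does not invoke ORAS. The manifest and empty-config media types come from the OCI image spec; the artifact type, the layer media type, and the demo file names are assumptions made up for illustration.

```python
import hashlib
import json

# Media types defined by the OCI image spec.
MANIFEST_MEDIA_TYPE = "application/vnd.oci.image.manifest.v1+json"
EMPTY_CONFIG_MEDIA_TYPE = "application/vnd.oci.empty.v1+json"

# Hypothetical values, chosen for illustration only.
ARTIFACT_TYPE = "application/vnd.example.model.v1"
LAYER_MEDIA_TYPE = "application/octet-stream"


def descriptor(media_type, blob, title=None):
    """Describe a blob by media type, sha256 digest, and size in bytes."""
    desc = {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),
        "size": len(blob),
    }
    if title is not None:
        # Standard OCI annotation key commonly used to carry a file name.
        desc["annotations"] = {"org.opencontainers.image.title": title}
    return desc


def build_manifest(files):
    """Build an OCI image manifest dict with one layer per model file.

    `files` maps a file name to that file's raw bytes.
    """
    return {
        "schemaVersion": 2,
        "mediaType": MANIFEST_MEDIA_TYPE,
        "artifactType": ARTIFACT_TYPE,
        # Artifacts without a real config use the two-byte empty JSON object.
        "config": descriptor(EMPTY_CONFIG_MEDIA_TYPE, b"{}"),
        "layers": [
            descriptor(LAYER_MEDIA_TYPE, data, title=name)
            for name, data in sorted(files.items())
        ],
    }


if __name__ == "__main__":
    demo_files = {
        "config.json": b'{"hidden_size": 8}',
        "model.safetensors": b"\x00" * 16,
    }
    print(json.dumps(build_manifest(demo_files), indent=2))
```

In a real build, the file blobs and a manifest like this one would be written into an OCI layout directory by ORAS, so large weights are never copied into a Docker build context.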
The containerized solution is split into two parts: | ||
- **Base Image**: Contains the runtime and dependencies |
- **Base Image**: Contains the runtime and dependencies | |
- **Base Image**: Contains the inference runtime and dependencies |
The containerized solution is split into two parts: | ||
- **Base Image**: Contains the runtime and dependencies | ||
- **OCI Artifacts**: Contains the model files |
- **OCI Artifacts**: Contains the model files | |
- **OCI Artifacts**: Contains the model weights and configuration files |
### Model Files Download Process | ||
The system uses an initContainer to download model files as OCI artifacts using ORAS: |
The system uses an initContainer to download model files as OCI artifacts using ORAS: | |
The system uses an [initContainer](https://kubernetes.io/docs/concepts/workloads/pods/init-containers/) to download model files as OCI artifacts using ORAS: |
Also mention that for very large models, we allow direct download from Hugging Face instead of packaging them as OCI artifacts.
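For readers following this thread, the initContainer flow discussed above can be sketched roughly as follows. This is a hedged illustration, not the manifest KAITO generates: the Python function builds a Pod spec as a plain dict, in which an init container runs `oras pull` into a shared `emptyDir` volume that the inference container then mounts. The image names, artifact reference, container names, and mount path are all hypothetical. The direct Hugging Face download path for very large models is not covered here.

```python
import json

# Hypothetical image that has the `oras` CLI on its PATH.
PULLER_IMAGE = "registry.example.com/tools/oras-puller:latest"


def build_pod_spec(base_image, artifact_ref, model_dir="/workspace/weights"):
    """Return a Pod manifest dict that pulls model files before the runtime starts."""
    volume_name = "model-weights"
    mount = {"name": volume_name, "mountPath": model_dir}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "model-inference-example"},
        "spec": {
            # Init containers run to completion before the main containers start,
            # so the weights are on disk by the time the runtime launches.
            "initContainers": [
                {
                    "name": "model-puller",
                    "image": PULLER_IMAGE,
                    "command": ["oras", "pull", artifact_ref, "-o", model_dir],
                    "volumeMounts": [mount],
                }
            ],
            "containers": [
                {
                    "name": "inference",
                    "image": base_image,
                    "volumeMounts": [mount],
                }
            ],
            # A scratch volume shared by the puller and the runtime.
            "volumes": [{"name": volume_name, "emptyDir": {}}],
        },
    }


if __name__ == "__main__":
    pod = build_pod_spec(
        base_image="registry.example.com/kaito/base:latest",
        artifact_ref="registry.example.com/models/demo-model:v1",
    )
    print(json.dumps(pod, indent=2))
```

Because the weights live in a separate artifact, updating the base image in this arrangement only changes the `image` field of the inference container; the artifact reference stays the same.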
@copilot could you address the comments?
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #1356 +/- ##
=======================================
Coverage 56.60% 56.60%
=======================================
Files 75 75
Lines 8252 8252
=======================================
Hits 4671 4671
Misses 3359 3359
Partials 222 222
Co-authored-by: chewong <10557231+chewong@users.noreply.github.com>
This PR addresses the documentation reorganization and feature documentation requirements outlined in the issue:
Changes Made
1. Proposal Reorganization
- `website/docs/proposals/` to `docs/proposals/` at the repository root to prevent them from being rendered as website pages
- `website/versioned_docs/version-v0.5.0/proposals/` and `website/versioned_docs/version-v0.5.1/proposals/`
- `proposals.md` to use GitHub URLs pointing to the new location (https://github.com/kaito-project/kaito/blob/main/docs/proposals/)
2. Model As OCI Artifacts Feature Documentation
Created a comprehensive new feature page at `website/docs/model-as-oci-artifacts.md` based on the proposal `docs/proposals/20250609-model-as-oci-artifacts.md` that includes:
3. Navigation and Versioning
Screenshot
The new feature page integrates seamlessly into the documentation with comprehensive content and proper navigation:
The page includes detailed sections covering all aspects of the OCI Artifacts approach, from technical challenges to implementation benefits, with a professional layout and table of contents for easy navigation.
Fixes #1355.