chore: Revert the model used in the rag e2e test back to Phi-3 #1342

bangqipropel · 2025-08-01T17:45:19Z

Reason for Change:
Revert the model used in the RAG e2e test back to Phi-3 to avoid the query test failure.

Requirements

added unit tests and e2e tests (if applicable).

Issue Fixed:

Notes for Reviewers:

kaito-pr-agent · 2025-08-01T17:45:50Z

Title

Update E2E Tests and Add NVIDIA A10 GPU Support

Description

Added support for NVIDIA A10 GPU SKUs in Azure
Updated E2E tests to use new GPU SKUs
Changed default Azure region for E2E workflows
Updated AKS cluster VM size for better performance

Changes walkthrough 📝

Relevant files

Enhancement

4 files

azure_sku_handler.go `Added NVIDIA A10 GPU SKUs`	+2/-0
preset_test.go `Updated SKU for E2E tests`	+13/-14
preset_vllm_test.go `Updated SKU for vLLM E2E tests`	+8/-8
rag_test.go `Updated SKU and model for RAG E2E tests`	+44/-7

Miscellaneous

1 file

action.yml `Removed unused environment variable`	+0/-1

Configuration changes

3 files

e2e-workflow.yml `Changed default Azure region`	+1/-1
ragengine-e2e-workflow.yml `Changed default Azure region`	+1/-1
Makefile `Updated AKS cluster VM size`	+2/-2


kaito-pr-agent · 2025-08-01T17:46:16Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

SKU Availability
Ensure that the new SKUs 'Standard_NV36ads_A10_v5' and 'Standard_NV72ads_A10_v5' are available in the target Azure regions and that they meet the necessary requirements for the workloads.

    {SKU: "Standard_NV36ads_A10_v5", GPUCount: 1, GPUMemGB: 24, GPUModel: "NVIDIA A10"},
    {SKU: "Standard_NV72ads_A10_v5", GPUCount: 2, GPUMemGB: 48, GPUModel: "NVIDIA A10"},

Node Count Validation
Verify that the node count specified for the new SKUs is appropriate and sufficient for the expected workload. For example, the change from 'Standard_NC12s_v3' to 'Standard_NV36ads_A10_v5' might require adjustments in the number of nodes.

    uniqueID := fmt.Sprint("preset-falcon-", rand.Intn(1000))
    workspaceObj = utils.GenerateInferenceWorkspaceManifest(uniqueID, namespaceName, "", numOfNode, "Standard_NV36ads_A10_v5",
        &metav1.LabelSelector{
            MatchLabels: map[string]string{"kaito-workspace": "custom-preset-e2e-test-falcon"},
        }, nil, PresetFalcon7BModel, nil, nil, validAdapters, "")

Model Compatibility
Confirm that the new model 'phi-4-mini-instruct' is compatible with the 'Standard_NV36ads_A10_v5' SKU and that the configuration settings in the `createConfigForWorkspace` function are correct for this model.

            uniqueID := fmt.Sprint("preset-phi3-", rand.Intn(1000))
            workspaceObj = utils.GenerateInferenceWorkspaceManifestWithVLLM(uniqueID, namespaceName, "", numOfReplica, "Standard_NV36ads_A10_v5",
                &metav1.LabelSelector{
                    MatchLabels: map[string]string{"kaito-workspace": "rag-e2e-test-phi-4-mini-instruct-vllm"},
                }, nil, PresetPhi4MiniInstructModel, nil, nil, nil, "")

            createAndValidateWorkspace(workspaceObj)
        })
        return workspaceObj
    }

    func createConfigForWorkspace(workspaceObj *kaitov1beta1.Workspace) {
        if workspaceObj.Inference == nil || workspaceObj.Resource.InstanceType == "" {
            return
        }

        // TODO: uncomment the following lines when A10 GPU support is added
        // handler := sku.GetCloudSKUHandler(consts.AzureCloudName)
        // gpuConfig := handler.GetGPUConfigBySKU(workspaceObj.Resource.InstanceType)
        // if gpuConfig == nil || (gpuConfig.GPUCount <= 1 && lo.FromPtr(workspaceObj.Resource.Count) <= 1) {
        //     return
        // }

        By("Creating config file", func() {
            cm := corev1.ConfigMap{
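The commented-out guard in `createConfigForWorkspace` appears to skip config creation when the SKU is unknown, or when the workspace runs on a single GPU and a single node. The following is a minimal standalone sketch of that decision, not the project's implementation: the `gpuConfig` struct, the `azureGPUConfigs` map, and the `needsConfig` helper are hypothetical stand-ins for the `sku` package handler quoted above, and the table holds only the two A10 entries from this PR.

```go
package main

import "fmt"

// gpuConfig is a simplified, hypothetical stand-in for the SKU entries
// quoted in the review (not the real type from pkg/sku).
type gpuConfig struct {
	SKU      string
	GPUCount int
	GPUMemGB int
	GPUModel string
}

// azureGPUConfigs is an assumed lookup table containing only the two
// A10 SKUs mentioned in this PR.
var azureGPUConfigs = map[string]gpuConfig{
	"Standard_NV36ads_A10_v5": {SKU: "Standard_NV36ads_A10_v5", GPUCount: 1, GPUMemGB: 24, GPUModel: "NVIDIA A10"},
	"Standard_NV72ads_A10_v5": {SKU: "Standard_NV72ads_A10_v5", GPUCount: 2, GPUMemGB: 48, GPUModel: "NVIDIA A10"},
}

// needsConfig mirrors the commented-out guard: it reports false when the
// SKU is unknown, or when there is at most one GPU and at most one node.
func needsConfig(instanceType string, nodeCount int) bool {
	cfg, ok := azureGPUConfigs[instanceType]
	if !ok {
		return false // unknown or empty SKU: nothing to configure
	}
	if cfg.GPUCount <= 1 && nodeCount <= 1 {
		return false // single GPU on a single node
	}
	return true
}

func main() {
	fmt.Println(needsConfig("Standard_NV36ads_A10_v5", 1)) // false
	fmt.Println(needsConfig("Standard_NV36ads_A10_v5", 2)) // true
	fmt.Println(needsConfig("Standard_NV72ads_A10_v5", 1)) // true
	fmt.Println(needsConfig("Standard_Unknown", 4))        // false
}
```

Under this reading, a single-node workspace on 'Standard_NV36ads_A10_v5' would not get a config file once the guard is re-enabled, while any multi-node or multi-GPU workspace would.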

kaito-pr-agent · 2025-08-01T17:47:20Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category: General

Suggestion: Validate new SKU details (Impact: Medium)
Verify that the new SKUs `Standard_NV36ads_A10_v5` and `Standard_NV72ads_A10_v5` are correctly supported by the Azure infrastructure and that their specifications (GPU count, memory, model) are accurate.
pkg/sku/azure_sku_handler.go [26-27]

    +{SKU: "Standard_NV36ads_A10_v5", GPUCount: 1, GPUMemGB: 24, GPUModel: "NVIDIA A10"},
    +{SKU: "Standard_NV72ads_A10_v5", GPUCount: 2, GPUMemGB: 48, GPUModel: "NVIDIA A10"},

Suggestion importance[1-10]: 7
Why: The suggestion asks to verify the correctness and accuracy of the new SKUs, which is important but not critical. It does not involve code changes.

Suggestion: Check SKU compatibility (Impact: Medium)
Ensure that the new SKU `Standard_NV36ads_A10_v5` is compatible with the models and configurations being tested.
test/e2e/preset_test.go [96]

    +workspaceObj = utils.GenerateInferenceWorkspaceManifest(uniqueID, namespaceName, "", numOfNode, "Standard_NV36ads_A10_v5",

Suggestion importance[1-10]: 7
Why: The suggestion asks to ensure compatibility, which is important but not critical. It does not involve code changes.

Suggestion: Verify model-SKU fit (Impact: Medium)
Confirm that the new SKU `Standard_NV36ads_A10_v5` supports the Llama 3.1-8B Instruct model requirements.
test/e2e/preset_vllm_test.go [70]

    +workspaceObj = createLlama3_1_8BInstructWorkspaceWithPresetPublicModeAndVLLM(numOfNode, "Standard_NV36ads_A10_v5")

Suggestion importance[1-10]: 7
Why: The suggestion asks to confirm that the new SKU supports the model requirements, which is important but not critical. It does not involve code changes.

Suggestion: Confirm SKU suitability (Impact: Medium)
Ensure that the new SKU `Standard_NV36ads_A10_v5` meets the requirements for the embedding and inference services.
test/rage2e/rag_test.go [271]

    +ragEngineObj = GenerateLocalEmbeddingRAGEngineManifest(uniqueID, namespaceName, "Standard_NV36ads_A10_v5", "BAAI/bge-small-en-v1.5",

Suggestion importance[1-10]: 7
Why: The suggestion asks to ensure that the new SKU meets the requirements for the embedding and inference services, which is important but not critical. It does not involve code changes.
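One part of the "validate new SKU details" suggestion can be checked mechanically: whether each entry's total GPU memory is consistent with its GPU count. The sketch below assumes 24 GB per NVIDIA A10, which matches the two entries quoted above. The `skuSpec` type and `checkA10Spec` function are hypothetical names for illustration and are not part of `pkg/sku`. Regional availability cannot be checked this way and still needs to be confirmed against Azure.

```go
package main

import "fmt"

// skuSpec is a hypothetical copy of the fields shown in the quoted SKU entries.
type skuSpec struct {
	SKU      string
	GPUCount int
	GPUMemGB int
	GPUModel string
}

// a10MemPerGPUGB is the per-GPU memory assumed for an NVIDIA A10.
const a10MemPerGPUGB = 24

// checkA10Spec returns an error when an entry is not an A10 entry, has no
// GPUs, or declares total memory that differs from GPUCount * 24 GB.
func checkA10Spec(s skuSpec) error {
	if s.GPUModel != "NVIDIA A10" {
		return fmt.Errorf("%s: unexpected GPU model %q", s.SKU, s.GPUModel)
	}
	if s.GPUCount < 1 {
		return fmt.Errorf("%s: GPU count must be at least 1, got %d", s.SKU, s.GPUCount)
	}
	if want := s.GPUCount * a10MemPerGPUGB; s.GPUMemGB != want {
		return fmt.Errorf("%s: GPUMemGB is %d, expected %d", s.SKU, s.GPUMemGB, want)
	}
	return nil
}

func main() {
	// The two entries added in pkg/sku/azure_sku_handler.go per the suggestion above.
	specs := []skuSpec{
		{SKU: "Standard_NV36ads_A10_v5", GPUCount: 1, GPUMemGB: 24, GPUModel: "NVIDIA A10"},
		{SKU: "Standard_NV72ads_A10_v5", GPUCount: 2, GPUMemGB: 48, GPUModel: "NVIDIA A10"},
	}
	for _, s := range specs {
		if err := checkA10Spec(s); err != nil {
			fmt.Println("FAIL:", err)
			continue
		}
		fmt.Println("ok:", s.SKU)
	}
}
```

Both entries from the PR pass this check; a typo such as `GPUCount: 2` paired with `GPUMemGB: 24` would be reported.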

Signed-off-by: Bangqi Zhu <bangqizhu@microsoft.com>

bangqipropel requested review from Fei-Guo, chewong and zhuangqh as code owners August 1, 2025 17:45

github-project-automation bot added this to KAITO Roadmap Aug 1, 2025

bangqipropel had a problem deploying to unit-tests August 1, 2025 17:45 — with GitHub Actions Error

bangqipropel had a problem deploying to e2e-test August 1, 2025 17:45 — with GitHub Actions Error

kaito-pr-agent bot added the Review effort 4/5 label Aug 1, 2025

bangqipropel force-pushed the fix_e2e branch from 74cf039 to 980c0c8 Compare August 1, 2025 17:48

bangqipropel had a problem deploying to e2e-test August 1, 2025 17:48 — with GitHub Actions Error

bangqipropel temporarily deployed to unit-tests August 1, 2025 17:48 — with GitHub Actions Inactive

bangqipropel force-pushed the fix_e2e branch 2 times, most recently from 587d73a to 8eed9ef Compare August 2, 2025 05:22

bangqipropel temporarily deployed to unit-tests August 2, 2025 05:22 — with GitHub Actions Inactive

bangqipropel had a problem deploying to e2e-test August 2, 2025 05:22 — with GitHub Actions Failure

bangqipropel had a problem deploying to e2e-test August 2, 2025 05:22 — with GitHub Actions Error

bangqipropel force-pushed the fix_e2e branch from 8eed9ef to edab73e Compare August 2, 2025 06:32

bangqipropel temporarily deployed to unit-tests August 2, 2025 06:32 — with GitHub Actions Inactive

bangqipropel had a problem deploying to e2e-test August 2, 2025 06:32 — with GitHub Actions Failure

bangqipropel temporarily deployed to unit-tests August 2, 2025 06:32 — with GitHub Actions Inactive

bangqipropel had a problem deploying to e2e-test August 2, 2025 06:32 — with GitHub Actions Error

bangqipropel had a problem deploying to e2e-test August 2, 2025 06:51 — with GitHub Actions Error

rag e2e rollback to phi3

4e3b2ba

Signed-off-by: Bangqi Zhu <bangqizhu@microsoft.com>

bangqipropel force-pushed the fix_e2e branch from edab73e to 4e3b2ba Compare August 2, 2025 07:35

bangqipropel temporarily deployed to unit-tests August 2, 2025 07:35 — with GitHub Actions Inactive

bangqipropel had a problem deploying to e2e-test August 2, 2025 07:35 — with GitHub Actions Error

bangqipropel had a problem deploying to e2e-test August 2, 2025 07:35 — with GitHub Actions Failure

bangqipropel changed the title ~~chore: Fix e2e~~ chore:Revert the model used in the rag e2e test back to Phi-3 Aug 2, 2025

bangqipropel changed the title ~~chore:Revert the model used in the rag e2e test back to Phi-3~~ chore: Revert the model used in the rag e2e test back to Phi-3 Aug 2, 2025

bangqipropel temporarily deployed to e2e-test August 2, 2025 07:49 — with GitHub Actions Inactive

zhuangqh approved these changes Aug 2, 2025

zhuangqh merged commit db679e8 into kaito-project:main Aug 2, 2025
15 of 19 checks passed

github-project-automation bot moved this to Done in KAITO Roadmap Aug 2, 2025
