Skip to content

chore: Revert the model used in the rag e2e test back to Phi-3 #1342

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 2, 2025

Conversation

bangqipropel
Copy link
Collaborator

@bangqipropel bangqipropel commented Aug 1, 2025

Reason for Change:
Revert the model used in the rag e2e test back to Phi-3 to avoid the query test failure

Requirements

  • added unit tests and e2e tests (if applicable).

Issue Fixed:

Notes for Reviewers:

Copy link

kaito-pr-agent bot commented Aug 1, 2025

Title

Update E2E Tests and Add NVIDIA A10 GPU Support


Description

  • Added support for NVIDIA A10 GPU SKUs in Azure

  • Updated E2E tests to use new GPU SKUs

  • Changed default Azure region for E2E workflows

  • Updated AKS cluster VM size for better performance


Changes walkthrough 📝

Relevant files
Enhancement
4 files
azure_sku_handler.go
Added NVIDIA A10 GPU SKUs                                                               
+2/-0     
preset_test.go
Updated SKU for E2E tests                                                               
+13/-14 
preset_vllm_test.go
Updated SKU for vLLM E2E tests                                                     
+8/-8     
rag_test.go
Updated SKU and model for RAG E2E tests                                   
+44/-7   
Miscellaneous
1 files
action.yml
Removed unused environment variable                                           
+0/-1     
Configuration changes
3 files
e2e-workflow.yml
Changed default Azure region                                                         
+1/-1     
ragengine-e2e-workflow.yml
Changed default Azure region                                                         
+1/-1     
Makefile
Updated AKS cluster VM size                                                           
+2/-2     

Need help?
  • Type /help how to ... in the comments thread for any questions about PR-Agent usage.
  • Check out the documentation for more information.
  • Copy link

    kaito-pr-agent bot commented Aug 1, 2025

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    SKU Availability

    Ensure that the new SKUs 'Standard_NV36ads_A10_v5' and 'Standard_NV72ads_A10_v5' are available in the target Azure regions and that they meet the necessary requirements for the workloads.

    {SKU: "Standard_NV36ads_A10_v5", GPUCount: 1, GPUMemGB: 24, GPUModel: "NVIDIA A10"},
    {SKU: "Standard_NV72ads_A10_v5", GPUCount: 2, GPUMemGB: 48, GPUModel: "NVIDIA A10"},
    Node Count Validation

    Verify that the node count specified for the new SKUs is appropriate and sufficient for the expected workload. For example, the change from 'Standard_NC12s_v3' to 'Standard_NV36ads_A10_v5' might require adjustments in the number of nodes.

    uniqueID := fmt.Sprint("preset-falcon-", rand.Intn(1000))
    workspaceObj = utils.GenerateInferenceWorkspaceManifest(uniqueID, namespaceName, "", numOfNode, "Standard_NV36ads_A10_v5",
    	&metav1.LabelSelector{
    		MatchLabels: map[string]string{"kaito-workspace": "custom-preset-e2e-test-falcon"},
    	}, nil, PresetFalcon7BModel, nil, nil, validAdapters, "")
    Model Compatibility

    Confirm that the new model 'phi-4-mini-instruct' is compatible with the 'Standard_NV36ads_A10_v5' SKU and that the configuration settings in the createConfigForWorkspace function are correct for this model.

    		uniqueID := fmt.Sprint("preset-phi3-", rand.Intn(1000))
    		workspaceObj = utils.GenerateInferenceWorkspaceManifestWithVLLM(uniqueID, namespaceName, "", numOfReplica, "Standard_NV36ads_A10_v5",
    			&metav1.LabelSelector{
    				MatchLabels: map[string]string{"kaito-workspace": "rag-e2e-test-phi-4-mini-instruct-vllm"},
    			}, nil, PresetPhi4MiniInstructModel, nil, nil, nil, "")
    
    		createAndValidateWorkspace(workspaceObj)
    	})
    	return workspaceObj
    }
    
    func createConfigForWorkspace(workspaceObj *kaitov1beta1.Workspace) {
    	if workspaceObj.Inference == nil || workspaceObj.Resource.InstanceType == "" {
    		return
    	}
    
    	// TODO: uncomment the following lines when A10 GPU support is added
    	// handler := sku.GetCloudSKUHandler(consts.AzureCloudName)
    	// gpuConfig := handler.GetGPUConfigBySKU(workspaceObj.Resource.InstanceType)
    	// if gpuConfig == nil || (gpuConfig.GPUCount <= 1 && lo.FromPtr(workspaceObj.Resource.Count) <= 1) {
    	// 	return
    	// }
    
    	By("Creating config file", func() {
    		cm := corev1.ConfigMap{

    Copy link

    kaito-pr-agent bot commented Aug 1, 2025

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Impact
    General
    Validate new SKU details

    Verify that the new SKUs Standard_NV36ads_A10_v5 and Standard_NV72ads_A10_v5 are
    correctly supported by the Azure infrastructure and that their specifications (GPU
    count, memory, model) are accurate.

    pkg/sku/azure_sku_handler.go [26-27]

    +{SKU: "Standard_NV36ads_A10_v5", GPUCount: 1, GPUMemGB: 24, GPUModel: "NVIDIA A10"},
    +{SKU: "Standard_NV72ads_A10_v5", GPUCount: 2, GPUMemGB: 48, GPUModel: "NVIDIA A10"},
     
    -
    Suggestion importance[1-10]: 7

    __

    Why: The suggestion asks to verify the correctness and accuracy of the new SKUs, which is important but not critical. It does not involve code changes.

    Medium
    Check SKU compatibility

    Ensure that the new SKU Standard_NV36ads_A10_v5 is compatible with the models and
    configurations being tested.

    test/e2e/preset_test.go [96]

    +workspaceObj = utils.GenerateInferenceWorkspaceManifest(uniqueID, namespaceName, "", numOfNode, "Standard_NV36ads_A10_v5",
     
    -
    Suggestion importance[1-10]: 7

    __

    Why: The suggestion asks to ensure compatibility, which is important but not critical. It does not involve code changes.

    Medium
    Verify model-SKU fit

    Confirm that the new SKU Standard_NV36ads_A10_v5 supports the Llama 3.1-8B Instruct
    model requirements.

    test/e2e/preset_vllm_test.go [70]

    +workspaceObj = createLlama3_1_8BInstructWorkspaceWithPresetPublicModeAndVLLM(numOfNode, "Standard_NV36ads_A10_v5")
     
    -
    Suggestion importance[1-10]: 7

    __

    Why: The suggestion asks to confirm that the new SKU supports the model requirements, which is important but not critical. It does not involve code changes.

    Medium
    Confirm SKU suitability

    Ensure that the new SKU Standard_NV36ads_A10_v5 meets the requirements for the
    embedding and inference services.

    test/rage2e/rag_test.go [271]

    +ragEngineObj = GenerateLocalEmbeddingRAGEngineManifest(uniqueID, namespaceName, "Standard_NV36ads_A10_v5", "BAAI/bge-small-en-v1.5",
     
    -
    Suggestion importance[1-10]: 7

    __

    Why: The suggestion asks to ensure that the new SKU meets the requirements for the embedding and inference services, which is important but not critical. It does not involve code changes.

    Medium

    @bangqipropel bangqipropel force-pushed the fix_e2e branch 2 times, most recently from 587d73a to 8eed9ef Compare August 2, 2025 05:22
    Signed-off-by: Bangqi Zhu <bangqizhu@microsoft.com>
    @bangqipropel bangqipropel changed the title chore: Fix e2e chore:Revert the model used in the rag e2e test back to Phi-3 Aug 2, 2025
    @bangqipropel bangqipropel changed the title chore:Revert the model used in the rag e2e test back to Phi-3 chore: Revert the model used in the rag e2e test back to Phi-3 Aug 2, 2025
    @zhuangqh zhuangqh merged commit db679e8 into kaito-project:main Aug 2, 2025
    15 of 19 checks passed
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Projects
    Status: Done
    Development

    Successfully merging this pull request may close these issues.

    2 participants