Skip to content

Conversation

zhuangqh
Copy link
Collaborator

Reason for Change:

Introduce manifest generator to let it easier to be maintained and tested.

zhuangqh added 2 commits July 18, 2025 23:05
Signed-off-by: jerryzhuang <zhuangqhc@gmail.com>
Signed-off-by: jerryzhuang <zhuangqhc@gmail.com>
Copy link

Title

Introduce Manifest Generator and Refactor Inference Generation


Description

  • Introduced manifest generator to simplify maintenance and testing.

  • Refactored GeneratePresetInference to use the new generator.

  • Moved manifest generation logic into separate functions.

  • Removed redundant tests for removed functions.


Changes walkthrough 📝

Relevant files
Enhancement
generator.go
Add manifest generator utilities                                                 

pkg/utils/generator/generator.go

  • Added WorkspaceGeneratorContext and GeneratorContext interfaces.
  • Defined ManifestType and TypedManifestModifier types.
  • Implemented GenerateManifest function.
  • +53/-0   
    Tests
    generator_test.go
    Add manifest generator tests                                                         

    pkg/utils/generator/generator_test.go

    • Added unit tests for GenerateManifest.
    +61/-0   
    manifests_test.go
    Remove outdated manifest tests                                                     

    pkg/workspace/manifests/manifests_test.go

    • Removed tests for old manifest generation functions.
    +0/-239 
    Refactoring
    preset-inferences.go
    Refactor inference generation using manifest generator     

    pkg/workspace/inference/preset-inferences.go

  • Updated GetInferenceImageInfo to return only image pull secrets.
  • Refactored GeneratePresetInference to use the new manifest generator.
  • Added new functions for generating pod specs and modifying them.
  • +219/-98
    manifests.go
    Refactor manifest generation using manifest generator       

    pkg/workspace/manifests/manifests.go

  • Refactored GenerateStatefulSetManifest and GenerateDeploymentManifest
    to use the new manifest generator.
  • Added new functions for setting pod specs and volume claim templates.
  • +57/-138

    Need help?
  • Type /help how to ... in the comments thread for any questions about PR-Agent usage.
  • Check out the documentation for more information.
  • Copy link

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 PR contains tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Refactoring Complexity

    The refactoring introduces multiple new functions and modifies existing ones significantly. Ensure that the new functions are well-tested and that the logic is preserved correctly.

    // resource requirements
    var skuNumGPUs int
    // initially respect the user setting by deploying the model on the same number of nodes as the user requested
    numNodes := *workspaceObj.Resource.Count
    gpuConfig, err := utils.GetGPUConfigBySKU(workspaceObj.Resource.InstanceType)
    if err != nil {
    	gpuConfig, err = utils.TryGetGPUConfigFromNode(ctx, kubeClient, workspaceObj.Status.WorkerNodes)
    	if err != nil {
    		defaultNumGPU := resource.MustParse(model.GetInferenceParameters().GPUCountRequirement)
    		skuNumGPUs = int(defaultNumGPU.Value())
    	}
    }
    if gpuConfig != nil {
    	skuNumGPUs = gpuConfig.GPUCount
    	// Calculate the minimum number of nodes required to satisfy the model's total GPU memory requirement.
    	// The goal is to maximize GPU utilization and not spread the model across too many nodes.
    	totalGPUMemoryRequired := resource.MustParse(model.GetInferenceParameters().TotalGPUMemoryRequirement)
    	totalGPUMemoryPerNode := resource.NewQuantity(int64(gpuConfig.GPUMemGB)*consts.GiBToBytes, resource.BinarySI)
    
    	minimumNodes := 0
    	for ; totalGPUMemoryRequired.Sign() > 0; totalGPUMemoryRequired.Sub(*totalGPUMemoryPerNode) {
    		minimumNodes++
    	}
    	if minimumNodes < numNodes {
    		numNodes = minimumNodes
    	}
    }
    
    gctx := &generator.WorkspaceGeneratorContext{
    	Ctx:        ctx,
    	KubeClient: kubeClient,
    	Workspace:  workspaceObj,
    	Model:      model,
    }
    podOpts := []generator.TypedManifestModifier[generator.WorkspaceGeneratorContext, corev1.PodSpec]{
    	GenerateInferencePodSpec(gpuConfig, skuNumGPUs, numNodes),
    	SetModelDownloadInfo,
    	SetAdapterPuller,
    }
    
    var depObj client.Object
    // For multi-node distributed inference with vLLM, we need to use a StatefulSet instead of a Deployment
    // to ensure pods are created with individual identities (their ordinal indexes) -
    // https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#pod-identity
    runtimeName := v1beta1.GetWorkspaceRuntimeName(workspaceObj)
    if model.SupportDistributedInference() && runtimeName == pkgmodel.RuntimeNameVLLM && numNodes > 1 {
    	podOpts = append(podOpts, SetDistributedInferenceProbe)
    
    	var ssOpts []generator.TypedManifestModifier[generator.WorkspaceGeneratorContext, appsv1.StatefulSet]
    	if checkIfNVMeAvailable(ctx, gpuConfig, kubeClient) {
    		ssOpts = append(ssOpts, manifests.AddStatefulSetVolumeClaimTemplates(GenerateModelWeightsCacheVolume(ctx, workspaceObj, model)))
    	} else {
    		podOpts = append(podOpts, SetDefaultModelWeightsVolume)
    	}
    
    	podSpec, err := generator.GenerateManifest(gctx, podOpts...)
    	if err != nil {
    		return nil, err
    	}
    
    	ssOpts = append(ssOpts,
    		manifests.GenerateStatefulSetManifest(revisionNum, numNodes),
    		manifests.SetStatefulSetPodSpec(podSpec),
    	)
    
    	depObj, err = generator.GenerateManifest(gctx, ssOpts...)
    } else {
    	podOpts = append(podOpts, SetDefaultModelWeightsVolume)
    
    	podSpec, err := generator.GenerateManifest(gctx, podOpts...)
    	if err != nil {
    		return nil, err
    	}
    
    	depObj, err = generator.GenerateManifest(gctx,
    		manifests.GenerateDeploymentManifest(revisionNum, numNodes),
    		manifests.SetDeploymentPodSpec(podSpec),
    	)
    }
    Function Overload

    Functions like GenerateStatefulSetManifest and GenerateDeploymentManifest now return a modifier function. Ensure that this change does not introduce any issues in how these manifests are generated and applied.

    func GenerateStatefulSetManifest(revisionNum string, replicas int) func(*generator.WorkspaceGeneratorContext, *appsv1.StatefulSet) error {
    	return func(ctx *generator.WorkspaceGeneratorContext, ss *appsv1.StatefulSet) error {
    		selector := map[string]string{
    			kaitov1beta1.LabelWorkspaceName: ctx.Workspace.Name,
    		}
    		labelselector := &v1.LabelSelector{
    			MatchLabels: selector,
    		}
    
    		ss.ObjectMeta = v1.ObjectMeta{
    			Name:      ctx.Workspace.Name,
    			Namespace: ctx.Workspace.Namespace,
    			Annotations: map[string]string{
    				kaitov1beta1.WorkspaceRevisionAnnotation: revisionNum,
    			},
    			OwnerReferences: []v1.OwnerReference{
    				*v1.NewControllerRef(ctx.Workspace, kaitov1beta1.GroupVersion.WithKind("Workspace")),
    			},
    		}
    		ss.Spec = appsv1.StatefulSetSpec{
    			Replicas:            lo.ToPtr(int32(replicas)),
    			PodManagementPolicy: appsv1.ParallelPodManagement,
    			PersistentVolumeClaimRetentionPolicy: &appsv1.StatefulSetPersistentVolumeClaimRetentionPolicy{
    				WhenScaled:  appsv1.RetainPersistentVolumeClaimRetentionPolicyType,
    				WhenDeleted: appsv1.DeletePersistentVolumeClaimRetentionPolicyType,
    			},
    			Selector: labelselector,
    			Template: corev1.PodTemplateSpec{
    				ObjectMeta: v1.ObjectMeta{
    					Labels: selector,
    				},
    			},
    		}
    
    		ss.Spec.ServiceName = fmt.Sprintf("%s-headless", ctx.Workspace.Name)
    		return nil
    	}
    Assumption Validation

    The code assumes there is only one container in the pod (// FIXME: assume only one container in the pod). Validate this assumption and handle cases where there might be multiple containers.

    spec.Volumes = append(spec.Volumes, adapterVolume)
    for i := range spec.Containers { // FIXME: assume only one container in the pod
    	spec.Containers[i].VolumeMounts = append(spec.Containers[i].VolumeMounts, adapterVolumeMount)

    Copy link

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Impact
    Possible issue
    Set correct service name

    Ensure the GenerateStatefulSetManifest function sets the ServiceName correctly.

    pkg/workspace/manifests/manifests.go [109]

     func GenerateStatefulSetManifest(revisionNum string, replicas int) func(*generator.WorkspaceGeneratorContext, *appsv1.StatefulSet) error {
    +  return func(ctx *generator.WorkspaceGeneratorContext, ss *appsv1.StatefulSet) error {
    +    // ... (rest of the original logic)
    +    ss.Spec.ServiceName = fmt.Sprintf("%s-headless", ctx.Workspace.Name)
    +    return nil
    +  }
    +}
    Suggestion importance[1-10]: 7

    __

    Why: The suggestion ensures the GenerateStatefulSetManifest function sets the ServiceName correctly. This is a minor improvement that enhances the correctness of the function.

    Medium

    Signed-off-by: zhuangqh <zhuangqhc@gmail.com>

    type TypedManifestModifier[C GeneratorContext, T ManifestType] func(ctx *C, obj *T) error

    func GenerateManifest[C GeneratorContext, T ManifestType](ctx *C, modifiers ...TypedManifestModifier[C, T]) (*T, error) {
    Copy link
    Collaborator

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Great to see the use of generics 👍🏻

    @zhuangqh zhuangqh merged commit aea0a42 into kaito-project:main Jul 22, 2025
    12 checks passed
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Projects
    Status: Done
    Development

    Successfully merging this pull request may close these issues.

    3 participants