Skip to content

Conversation

julienmancuso
Copy link
Contributor

@julienmancuso julienmancuso commented Jun 3, 2025

Overview:

take into account number of workers from config

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

Summary by CodeRabbit

  • New Features

    • Enhanced deployment configuration with service and parent-level overrides for replicas and resource settings.
    • Improved environment variable management including merging and override capabilities for component deployments.
    • Added getter and setter methods for deployment configuration environment variables.
  • Bug Fixes

    • Ensured replica counts are consistently sourced from the parent configuration when specified.
  • Tests

    • Added comprehensive tests for deployment config overrides, environment variable merging, and resource settings.
    • Removed obsolete controller tests and replaced with expanded tests for deployment generation and config handling.
    • Updated test structure for improved coverage and reliability.
  • Refactor

    • Streamlined deployment logic by removing environment variable merging from the controller and consolidating it in deployment generation.

Copy link
Contributor

coderabbitai bot commented Jun 3, 2025

"""

Walkthrough

The changes refactor and relocate logic for merging and overriding deployment specifications and environment variables for DynamoComponentDeployments. Merging and override functions are removed from the controller and implemented in the internal dynamo package. Associated tests are deleted from the controller and reimplemented with expanded coverage in the dynamo package. Getter/setter methods and a new constant for deployment config environment variables are added.

Changes

File(s) Change Summary
.../api/v1alpha1/dynamocomponentdeployment_types.go, ..._test.go Added getter/setter methods for deployment config env var; added corresponding tests.
.../internal/consts/consts.go Added DynamoDeploymentConfigEnvVar constant.
.../internal/controller/dynamographdeployment_controller.go Removed all merging/override logic and helper functions for env vars and deployment config; cleaned imports.
.../internal/controller/dynamographdeployment_controller_test.go Deleted all tests for removed controller logic (env merging, config update, override).
.../internal/dynamo/graph.go Implemented merging/override logic for deployment specs/env vars; added helper functions for config/env merge.
.../internal/dynamo/graph_test.go Added comprehensive tests for new merging/override logic, env var handling, and resource/replica overrides.

Sequence Diagram(s)

sequenceDiagram
    participant ParentCR as DynamoGraphDeployment (Parent CR)
    participant GraphGen as GenerateDynamoComponentsDeployments
    participant CompDep as DynamoComponentDeployment

    ParentCR->>GraphGen: Provide parent spec, services config, ingress spec
    GraphGen->>CompDep: Create initial deployment from service config
    GraphGen->>CompDep: Merge with parent deployment spec (override)
    GraphGen->>CompDep: Merge environment variables (service > parent)
    GraphGen->>CompDep: Update DYN_DEPLOYMENT_CONFIG env var (set port if main)
    GraphGen->>CompDep: Override spec from DYN_DEPLOYMENT_CONFIG (replicas/resources)
    GraphGen->>CompDep: Set GPU scheduling annotations if needed
    GraphGen->>ParentCR: Return map of merged deployments
Loading

Poem

In the warren of code, we hop and we blend,
Merging configs and envs, to a logical end.
Old tests are retired, new ones take the stage,
With carrots of coverage, we bound through each page.
🥕 Hooray for the tidy, the merged, and the bright—
Deployments now shine in the spring morning light!
"""


📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3fff50c and c26aab2.

📒 Files selected for processing (1)
  • deploy/cloud/operator/internal/dynamo/graph.go (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build and Test - vllm
🔇 Additional comments (4)
deploy/cloud/operator/internal/dynamo/graph.go (4)

371-396: LGTM: Well-structured merging and override logic

The implementation correctly:

  • Uses mergo with override semantics for deployment spec merging
  • Merges environment variables with service-specific precedence
  • Updates deployment configuration and applies overrides in the proper sequence
  • Preserves CRD authority for replica values when explicitly set

The error handling and conditional logic are appropriate.


427-450: LGTM: Proper JSON config handling

The function correctly:

  • Only updates configuration for main components
  • Safely navigates the JSON structure before updating
  • Handles JSON marshalling/unmarshalling with proper error handling
  • Updates the port value when the service config exists

The implementation addresses the concern from previous reviews about handling missing port entries appropriately.


452-508: LGTM: Comprehensive resource override implementation

The function properly handles:

  • Parsing deployment configuration with appropriate error handling
  • Conditional resource override for GPU, CPU, Memory, and Custom resources
  • Preserving existing resource specifications when available
  • Setting LWS annotations for GPU scheduling

The resource initialization approach is correct as confirmed in previous review discussions - empty ResourceItem structs with omitempty tags will not cause Kubernetes API validation issues.


510-529: LGTM: Clean environment variable merging

The implementation correctly:

  • Uses a map-based approach for efficient merging
  • Preserves service-specific environment variable precedence over common ones
  • Converts back to slice format as expected by the API

The logic is straightforward and handles the merging semantics properly.


🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (6)
deploy/cloud/operator/internal/consts/consts.go (1)

70-72: Consider aligning the constant name with existing Env-prefixed naming scheme

Most environment-variable constants in this file follow the pattern Env<Something>.
DynamoDeploymentConfigEnvVar is functionally correct, but it slightly breaks the naming consistency and makes “what is an env-var vs. a plain constant” harder to grep for.

-	DynamoDeploymentConfigEnvVar = "DYN_DEPLOYMENT_CONFIG"
+	EnvDynamoDeploymentConfig = "DYN_DEPLOYMENT_CONFIG"

Renaming now will avoid future churn, because this identifier will quickly propagate throughout the codebase.

deploy/cloud/operator/api/v1alpha1/dynamocomponentdeployment_types.go (1)

180-191: Setter overwrites but never removes the env-var

SetDynamoDeploymentConfig(nil) is currently a no-op and leaves the last value in place.
If the caller wants to “clear” the config, there is no API.

Either:

  1. Document that nil is not supported, or
  2. Treat len(config)==0 as a delete request:
 if len(config) == 0 {
     // delete existing entry
     for i, env := range s.Spec.Envs {
         if env.Name == commonconsts.DynamoDeploymentConfigEnvVar {
             s.Spec.Envs = append(s.Spec.Envs[:i], s.Spec.Envs[i+1:]...)
             return
         }
     }
 }
deploy/cloud/operator/api/v1alpha1/dynamocomponentdeployment_types_test.go (2)

131-134: Prefer bytes.Equal for byte-slice comparison

reflect.DeepEqual works but is slower and less explicit than bytes.Equal for raw byte slices.

- if got := s.GetDynamoDeploymentConfig(); !reflect.DeepEqual(got, tt.want) {
+ if got := s.GetDynamoDeploymentConfig(); !bytes.Equal(got, tt.want) {

(or convert both to string and compare).
This keeps intent crystal-clear and avoids accidental misuse when the type changes.


207-209: Test resilience: order-sensitive comparison of Envs slice

The assertion compares the entire slice literally. If unrelated env-vars are appended elsewhere in the code, this test will start failing even though SetDynamoDeploymentConfig still behaves correctly.

A more future-proof check:

var found bool
for _, e := range s.Spec.DynamoComponentDeploymentSharedSpec.Envs {
    if e.Name == commonconsts.DynamoDeploymentConfigEnvVar &&
       e.Value == string(tt.args.config) {
        found = true
        break
    }
}
if !found {
    t.Fatalf("env %s not set to expected value", commonconsts.DynamoDeploymentConfigEnvVar)
}
deploy/cloud/operator/internal/dynamo/graph.go (1)

512-531: mergeEnvs loses deterministic order

Using a map provides O(1) overrides but produces non-deterministic
ordering, which complicates diffing and may surprise users inspecting
the CR. A stable order (specific first, then common sorted) costs very
little:

func mergeEnvs(common, specific []corev1.EnvVar) []corev1.EnvVar {
    seen := map[string]struct{}{}
    merged := make([]corev1.EnvVar, 0, len(common)+len(specific))

    // keep specific order
    for _, e := range specific {
        merged = append(merged, e)
        seen[e.Name] = struct{}{}
    }
    // append remaining common envs alphabetically
    sort.Slice(common, func(i,j int) bool { return common[i].Name < common[j].Name })
    for _, e := range common {
        if _, ok := seen[e.Name]; !ok {
            merged = append(merged, e)
        }
    }
    return merged
}
deploy/cloud/operator/internal/dynamo/graph_test.go (1)

1832-1838: Sort both slices (want & got) to avoid order-dependent failures

mergeEnvs now returns a non-deterministic order (map iteration).
The test sorts only the got slice; if want is declared out of alpha
order the comparison fails spuriously.

- sort.Slice(got, func(i, j int) bool { return got[i].Name < got[j].Name })
+ sort.Slice(got,  func(i, j int) bool { return got[i].Name < got[j].Name })
+ sort.Slice(tt.want, func(i, j int) bool { return tt.want[i].Name < tt.want[j].Name })
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8deb3ea and 3fff50c.

📒 Files selected for processing (7)
  • deploy/cloud/operator/api/v1alpha1/dynamocomponentdeployment_types.go (2 hunks)
  • deploy/cloud/operator/api/v1alpha1/dynamocomponentdeployment_types_test.go (2 hunks)
  • deploy/cloud/operator/internal/consts/consts.go (1 hunks)
  • deploy/cloud/operator/internal/controller/dynamographdeployment_controller.go (1 hunks)
  • deploy/cloud/operator/internal/controller/dynamographdeployment_controller_test.go (0 hunks)
  • deploy/cloud/operator/internal/dynamo/graph.go (3 hunks)
  • deploy/cloud/operator/internal/dynamo/graph_test.go (2 hunks)
💤 Files with no reviewable changes (1)
  • deploy/cloud/operator/internal/controller/dynamographdeployment_controller_test.go
🧰 Additional context used
🧬 Code Graph Analysis (4)
deploy/cloud/operator/api/v1alpha1/dynamocomponentdeployment_types.go (1)
deploy/cloud/operator/internal/consts/consts.go (1)
  • DynamoDeploymentConfigEnvVar (71-71)
deploy/cloud/operator/api/v1alpha1/dynamocomponentdeployment_types_test.go (2)
deploy/cloud/operator/api/v1alpha1/dynamocomponentdeployment_types.go (4)
  • DynamoComponentDeploymentSpec (40-46)
  • DynamoComponentDeploymentStatus (112-118)
  • DynamoComponentDeploymentSharedSpec (52-83)
  • DynamoComponentDeployment (129-135)
deploy/cloud/operator/internal/consts/consts.go (1)
  • DynamoDeploymentConfigEnvVar (71-71)
deploy/cloud/operator/internal/dynamo/graph_test.go (3)
deploy/cloud/operator/api/v1alpha1/dynamocomponentdeployment_types.go (6)
  • DynamoComponentDeploymentSharedSpec (52-83)
  • IngressSpec (98-109)
  • DynamoComponentDeployment (129-135)
  • DynamoComponentDeploymentSpec (40-46)
  • ExternalService (89-92)
  • DynamoComponentDeploymentStatus (112-118)
deploy/cloud/operator/internal/dynamo/graph.go (9)
  • Resources (61-66)
  • DynamoGraphConfig (165-169)
  • ServiceConfig (88-92)
  • Config (77-86)
  • Autoscaling (72-75)
  • DynamoConfig (54-59)
  • GenerateDynamoComponentsDeployments (286-425)
  • ServiceArgs (102-106)
  • SetLwsAnnotations (260-283)
deploy/cloud/operator/internal/consts/consts.go (3)
  • KubeLabelDynamoComponent (47-47)
  • KubeLabelDynamoNamespace (48-48)
  • DynamoDeploymentConfigEnvVar (71-71)
deploy/cloud/operator/internal/dynamo/graph.go (3)
deploy/cloud/operator/api/v1alpha1/dynamocomponentdeployment_types.go (2)
  • DynamoComponentDeploymentSharedSpec (52-83)
  • DynamoComponentDeployment (129-135)
deploy/cloud/operator/internal/consts/consts.go (2)
  • DynamoServicePort (9-9)
  • DynamoDeploymentConfigEnvVar (71-71)
deploy/sdk/src/dynamo/sdk/core/protocol/deployment.py (1)
  • Resources (25-75)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build and Test - vllm
🔇 Additional comments (2)
deploy/cloud/operator/internal/controller/dynamographdeployment_controller.go (1)

137-141: Side-effect of removed merging logic: replicas/resources overrides now only happen in dynamo.GenerateDynamoComponentsDeployments

Just a heads-up: the controller no longer performs any overrides/merging.
If any downstream consumers still rely on the old behaviour (e.g. unit tests, CLI tooling), they must now invoke GenerateDynamoComponentsDeployments instead. Please verify docs and examples accordingly.

deploy/cloud/operator/internal/dynamo/graph.go (1)

371-382: Consider failure-tolerant access to Spec.Services and move env-merge up

  1. If the parent CR omits the Services map (nil), the look-up on
    parentDynamoGraphDeployment.Spec.Services[service.Name] is safe (a
    read on a nil map returns <zero, false>), but the second use just
    below (parentDynamoGraphDeployment.Spec.Services) will panic when
    you attempt to range over or assign.
    A defensive nil check before the whole override-block avoids a future
    foot-gun when the code is refactored:
- if configOverride, ok := parentDynamoGraphDeployment.Spec.Services[service.Name]; ok {
+ svcOverrides := parentDynamoGraphDeployment.Spec.Services
+ if svcOverrides != nil {
+     if configOverride, ok := svcOverrides[service.Name]; ok {
         // ...
     }
 }
  1. mergeEnvs is called after the mergo.Merge.
    If the parent override itself sets Envs, they are overwritten again
    by the later merge. Swapping the two calls guarantees deterministic
    precedence:
- if len(parentDynamoGraphDeployment.Spec.Envs) > 0 {
-     deployment.Spec.Envs = mergeEnvs(parentDynamoGraphDeployment.Spec.Envs, deployment.Spec.Envs)
- }
- err := updateDynDeploymentConfig(deployment, commonconsts.DynamoServicePort)
+ deployment.Spec.Envs = mergeEnvs(parentDynamoGraphDeployment.Spec.Envs,
+                                  deployment.Spec.Envs)
+ err := updateDynDeploymentConfig(deployment, commonconsts.DynamoServicePort)

Would you double-check that swapping the order does not break any of the
new tests? The public contract is unchanged, so the suite should still
pass.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants