Skip to content

Conversation

agourlay
Copy link
Member

@agourlay agourlay commented May 6, 2025

Add API validation for IntegerIndexParams to prevent users from creating integer indexes which do not support lookup nor range to avoid bad performance.

@agourlay agourlay requested review from timvisee and coszio May 6, 2025 09:29
@agourlay agourlay marked this pull request as ready for review May 6, 2025 09:30
Copy link
Contributor

coderabbitai bot commented May 6, 2025

📝 Walkthrough

Walkthrough

This change introduces and integrates validation logic for payload indexing parameters and schema definitions across multiple modules. Validation traits are implemented for several structs and enums, particularly focusing on integer index parameters, payload schema parameters, and field schemas. The validation ensures that certain invalid configurations, such as both "lookup" and "range" capabilities being disabled simultaneously, are detected and reported. Validation attributes are added to relevant struct fields to enable recursive validation. Additionally, a new test is included to verify that the API correctly rejects invalid payload indexing configurations, returning appropriate error messages.

Suggested reviewers

  • timvisee
  • generall
✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
lib/segment/src/data_types/index.rs (1)

72-84: Good validation logic to prevent unusable index configurations

The validation logic correctly prevents users from creating integer indexes where both lookup and range capabilities are disabled simultaneously, which aligns with the PR objective. The error message is clear and descriptive.

One minor suggestion: consider adding a brief comment explaining why this validation is necessary (e.g., "Prevent creation of indexes without any usable operations").

 pub fn validate_integer_index_params(
     lookup: &Option<bool>,
     range: &Option<bool>,
 ) -> Result<(), ValidationErrors> {
+    // Prevent creation of indexes without any usable operations
     if lookup == &Some(false) && range == &Some(false) {
         let mut errors = ValidationErrors::new();
         let error =
             ValidationError::new("the 'lookup' and 'range' capabilities can't be both disabled");
         errors.add("lookup", error);
         return Err(errors);
     }
     Ok(())
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4e0c072 and 627b76e.

📒 Files selected for processing (7)
  • lib/api/build.rs (1 hunks)
  • lib/api/src/grpc/qdrant.rs (3 hunks)
  • lib/api/src/grpc/validate.rs (2 hunks)
  • lib/segment/src/data_types/index.rs (2 hunks)
  • lib/segment/src/types.rs (1 hunks)
  • src/common/update.rs (1 hunks)
  • tests/openapi/test_payload_indexing.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/common/update.rs (2)
lib/segment/src/data_types/index.rs (1)
  • validate (60-69)
lib/api/src/grpc/validate.rs (15)
  • validate (14-14)
  • validate (19-21)
  • validate (29-31)
  • validate (39-41)
  • validate (49-54)
  • validate (58-64)
  • validate (68-74)
  • validate (78-85)
  • validate (89-97)
  • validate (101-112)
  • validate (116-123)
  • validate (127-134)
  • validate (138-145)
  • validate (149-169)
  • validate (173-175)
⏰ Context from checks skipped due to timeout of 90000ms (12)
  • GitHub Check: Basic TLS/HTTPS tests
  • GitHub Check: test-snapshot-operations-s3-minio
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: test-consistency
  • GitHub Check: test-low-resources
  • GitHub Check: integration-tests-consensus
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: integration-tests
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: test-consensus-compose
  • GitHub Check: lint
  • GitHub Check: rust-tests (ubuntu-latest)
🔇 Additional comments (12)
src/common/update.rs (1)

226-228: Appropriate addition of #[validate(nested)] attribute.

Adding the nested validation attribute to the field_schema field ensures that payload schema validation will be recursively performed when validating a CreateFieldIndex. This is essential for enforcing the new validation logic for integer index parameters where both lookup and range capabilities cannot be simultaneously disabled.

lib/api/build.rs (1)

209-210: Proper extension of validation configuration.

Adding empty validation constraints for CreateFieldIndexCollection.field_index_params and PayloadIndexParams.index_params correctly sets up nested validation for these fields. This configuration enables the validation framework to recursively validate the nested IntegerIndexParams struct, ensuring consistent validation at both the API and internal levels.

tests/openapi/test_payload_indexing.py (1)

13-29: Well-structured test for integer index parameters validation.

This test verifies that the API correctly rejects index creation requests where both lookup and range capabilities are disabled for integer fields. The test appropriately checks both the HTTP status code (422) and validates the specific error message.

The test effectively confirms that the validation logic introduced in this PR is working at the API level, preventing users from creating integer indexes that would lead to poor performance.

lib/segment/src/types.rs (2)

1777-1790: Implementation looks good, consider adding a comment

This implementation for the Validate trait correctly delegates validation to the IntegerIndexParams while allowing all other variants to pass validation. This prevents users from creating invalid integer indexes.

Consider adding a brief comment explaining why only IntegerIndexParams requires validation and what constraints it enforces, to make the validation logic clearer for future readers.


1799-1808: Correctly propagates validation to field parameters

This implementation properly handles both simple field types and fields with parameters by delegating validation to the payload schema parameters when necessary.

This ensures that validation rules are applied consistently regardless of how the field schema is defined.

lib/api/src/grpc/validate.rs (2)

5-5: Good dependency inclusion

Importing the validation function from the segment module ensures consistent validation logic across the codebase.


387-402: Type-specific validation pattern is well implemented

The implementation correctly handles all variants of IndexParams, only applying validation to IntegerIndexParams while allowing other types to pass validation without checks.

This selective validation approach is appropriate when specific types have unique constraints that others don't need to enforce.

lib/api/src/grpc/qdrant.rs (3)

868-868: Good addition of validation to PayloadIndexParams struct.

Adding the Validate derive macro enables validation on the struct, which is essential for ensuring that users can't create integer indexes with invalid configurations.


877-877: Properly configured nested validation for index_params.

The #[validate(nested)] attribute ensures validation is applied recursively to the various implementations of IndexParams, particularly the IntegerIndexParams variant. This helps enforce the rule that both "lookup" and "range" capabilities cannot be simultaneously disabled.


4394-4394: Correctly implemented nested validation for field index parameters.

Adding #[validate(nested)] to the field_index_params field ensures that validation is applied recursively to the PayloadIndexParams structure and its fields, maintaining validation integrity throughout the API.

lib/segment/src/data_types/index.rs (2)

3-3: Added validator imports for the validation feature

This imports the necessary traits and types from the validator crate to implement validation logic for the index parameters.


59-70: LGTM: Clean implementation of Validate trait

The validation implementation properly extracts the relevant fields and delegates to a dedicated validation function. This is a good separation of concerns.

Comment on lines +404 to +414
impl Validate for super::qdrant::IntegerIndexParams {
fn validate(&self) -> Result<(), ValidationErrors> {
let super::qdrant::IntegerIndexParams {
lookup,
range,
is_principal: _,
on_disk: _,
} = &self;
validate_integer_index_params(lookup, range)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

❓ Verification inconclusive

Proper implementation of constraint validation

This implementation correctly extracts the relevant fields from IntegerIndexParams and delegates to the shared validation function.

To verify this implementation works as expected, let's see if there are any tests confirming that creating an integer index with both lookup and range set to false fails validation:


🏁 Script executed:

#!/bin/bash
# Search for tests related to IntegerIndexParams validation
rg -A 10 "test.*integer.*index.*params" --glob "*.{rs,py}"

Length of output: 60


Missing IntegerIndexParams Validation Tests

I didn’t find any existing tests that verify IntegerIndexParams fails validation when both lookup and range are false. Please add tests to cover this scenario.

• lib/api/src/grpc/validate.rs (around impl Validate for super::qdrant::IntegerIndexParams)
• New test file under tests/ (e.g. integer_index_params_validation.rs)

#[test]
fn integer_index_params_rejects_empty() {
    let params = IntegerIndexParams {
        lookup: false,
        range: false,
        is_principal: false,
        on_disk: false,
    };
    assert!(params.validate().is_err());
}

@agourlay agourlay merged commit fc64334 into dev May 7, 2025
17 checks passed
@agourlay agourlay deleted the validate-IntegerIndexParams branch May 7, 2025 14:35
generall pushed a commit that referenced this pull request May 22, 2025
* Validate IntegerIndexParams

* focus
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants