Skip to content

Conversation

coszio
Copy link
Contributor

@coszio coszio commented Jul 30, 2025

We had restricted midpoint to be in the range (0,1) for the sake of gauss and exp decay. Linear decay can actually accept in the range [0,1]. This PR enables this possibility

@coszio coszio requested review from generall and mrscoopers July 30, 2025 15:50
Copy link
Contributor

coderabbitai bot commented Jul 30, 2025

📝 Walkthrough

"""

Walkthrough

This change updates documentation and validation logic related to the midpoint parameter in the DecayParamsExpression structure across several files. Documentation comments in protocol buffer definitions, Rust structs, OpenAPI schemas, and markdown documentation were revised to specify that the midpoint should be between 0 and 1. Additionally, the validation logic for the midpoint parameter was adjusted to allow a value of 0 (inclusive) for the linear decay kind, while maintaining stricter bounds (exclusive of 0 and 1) for Gaussian and exponential decay kinds. No public API signatures or exported entity declarations were modified.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~7 minutes

  • Complexity label: Simple
  • Rationale: The majority of changes are documentation updates clarifying the valid range for a parameter. The only logic change is a straightforward adjustment to validation conditions for one function, with clear and limited scope. There are no changes to public interfaces or complex algorithmic modifications.
    """

Note

⚡️ Unit Test Generation is now available in beta!

Learn more here, or try it out under "Finishing Touches" below.


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e6d06e9 and 85a3a1b.

📒 Files selected for processing (1)
  • lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: e2e-tests
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: test-consistency
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: test-consensus-compose
  • GitHub Check: integration-tests-consensus
  • GitHub Check: integration-tests
  • GitHub Check: lint
  • GitHub Check: storage-compat-test
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests (ubuntu-latest)
✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch allow-0-midpoint-for-lindecay

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🔭 Outside diff range comments (1)
docs/grpc/docs.md (1)

2564-2569: Clarify Decay Midpoint Ranges & Fix Table Pipe in gRPC Docs

The code enforces different midpoint ranges for each decay kind—this should be documented explicitly—and the inline |x - target| needs its pipes escaped to avoid breaking the Markdown table.

• File: docs/grpc/docs.md (around lines 2564–2569)
• Update the “midpoint” row as follows:

-| midpoint | [float](#float) | optional | The midpoint of the decay. Should be between 0 and 1. Defaults to 0.5. Output will be this value when `|x - target| == scale`. |
+| midpoint | [float](#float) | optional | The midpoint of the decay. For `lin_decay`, the range is **inclusive** `[0, 1]`; for `exp_decay` and `gauss_decay`, the range is **exclusive** `(0, 1)`. Defaults to `0.5`. Output will be this value when `\|x - target\| == scale`. |

Also ensure the same clarification and pipe‐escaping is applied in the REST schema (lib/api/src/rest/schema.rs), Protobuf comments, and OpenAPI spec so all user‐facing docs stay in sync.

🧹 Nitpick comments (2)
docs/redoc/master/openapi.json (1)

15168-15170: Fix grammar and explicitly state inclusive bounds for midpoint.

A small typo (1.Defaults) slipped in, and the phrase “between 0 and 1” is ambiguous given that only the linear decay allows 0, while Gaussian/exp require > 0. The OpenAPI description should reflect this nuance to avoid confusion.

-            "description": "The midpoint of the decay. Should be between 0 and 1.Defaults to 0.5. Output will be this value when `|x - target| == scale`.",
+            "description": "The midpoint of the decay. For linear decay it must be in the inclusive range [0, 1]; \
+for Gaussian and exponential decay it must be in the range (0, 1]. Defaults to 0.5. Output equals this value when \
+`|x - target| == scale`.",
lib/api/src/grpc/qdrant.rs (1)

5574-5574: Clarify inclusive vs exclusive bounds in the docstring

“Should be between 0 and 1” is ambiguous—readers may assume exclusive bounds, yet the PR intentionally allows 0 (inclusive) for the linear decay kind while keeping exclusive bounds for the other kinds. Consider spelling this out to avoid misunderstanding:

-/// The midpoint of the decay. Should be between 0 and 1. Defaults to 0.5. Output will be this value when `|x - target| == scale`.
+/// The midpoint of the decay.  
+/// • Linear decay: 0 ≤ midpoint ≤ 1  
+/// • Gaussian & exponential decay: 0 < midpoint < 1  
+/// Defaults to 0.5. Output equals this value when `|x - target| == scale`.

This keeps the auto-generated gRPC docs consistent with the refined validation logic introduced in parsed_formula.rs.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 76cc633 and 366fd58.

📒 Files selected for processing (6)
  • docs/grpc/docs.md (1 hunks)
  • docs/redoc/master/openapi.json (1 hunks)
  • lib/api/src/grpc/proto/points.proto (1 hunks)
  • lib/api/src/grpc/qdrant.rs (1 hunks)
  • lib/api/src/rest/schema.rs (1 hunks)
  • lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (7)
📓 Common learnings
Learnt from: coszio
PR: qdrant/qdrant#6154
File: lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs:238-257
Timestamp: 2025-03-13T13:15:16.290Z
Learning: All decay functions in Qdrant (exponential, Gaussian, and linear) have an output range of [1, 0], with 1 being the maximum value when x equals target, and values approaching or reaching 0 as the distance between x and target increases. Lambda parameters are properly validated to ensure decay functions always produce finite results.
Learnt from: coszio
PR: qdrant/qdrant#6154
File: lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs:65-66
Timestamp: 2025-03-13T13:17:00.411Z
Learning: In Qdrant's decay functions implementation, the `target` field serves as the reference point from which decay is measured (the value where the decay function equals its maximum), while the lambda parameter that controls the decay rate is calculated separately from `midpoint` and `scale` parameters. The `target` field is not involved in the lambda calculation.
lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs (3)

Learnt from: coszio
PR: #6154
File: lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs:65-66
Timestamp: 2025-03-13T13:17:00.411Z
Learning: In Qdrant's decay functions implementation, the target field serves as the reference point from which decay is measured (the value where the decay function equals its maximum), while the lambda parameter that controls the decay rate is calculated separately from midpoint and scale parameters. The target field is not involved in the lambda calculation.

Learnt from: coszio
PR: #6154
File: lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs:238-257
Timestamp: 2025-03-13T13:15:16.290Z
Learning: All decay functions in Qdrant (exponential, Gaussian, and linear) have an output range of [1, 0], with 1 being the maximum value when x equals target, and values approaching or reaching 0 as the distance between x and target increases. Lambda parameters are properly validated to ensure decay functions always produce finite results.

Learnt from: generall
PR: #6854
File: lib/segment/src/index/query_estimator.rs:320-327
Timestamp: 2025-07-11T11:35:21.549Z
Learning: In test code for Qdrant's query estimator (lib/segment/src/index/query_estimator.rs), simplified ID resolution logic using id.to_string().parse().unwrap() is acceptable for testing purposes and doesn't need to match production code's id_tracker.internal_id() approach. Test code can use mock implementations that serve the testing goals.

lib/api/src/grpc/proto/points.proto (2)

Learnt from: coszio
PR: #6154
File: lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs:65-66
Timestamp: 2025-03-13T13:17:00.411Z
Learning: In Qdrant's decay functions implementation, the target field serves as the reference point from which decay is measured (the value where the decay function equals its maximum), while the lambda parameter that controls the decay rate is calculated separately from midpoint and scale parameters. The target field is not involved in the lambda calculation.

Learnt from: coszio
PR: #6154
File: lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs:238-257
Timestamp: 2025-03-13T13:15:16.290Z
Learning: All decay functions in Qdrant (exponential, Gaussian, and linear) have an output range of [1, 0], with 1 being the maximum value when x equals target, and values approaching or reaching 0 as the distance between x and target increases. Lambda parameters are properly validated to ensure decay functions always produce finite results.

lib/api/src/grpc/qdrant.rs (2)

Learnt from: coszio
PR: #6154
File: lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs:65-66
Timestamp: 2025-03-13T13:17:00.411Z
Learning: In Qdrant's decay functions implementation, the target field serves as the reference point from which decay is measured (the value where the decay function equals its maximum), while the lambda parameter that controls the decay rate is calculated separately from midpoint and scale parameters. The target field is not involved in the lambda calculation.

Learnt from: coszio
PR: #6154
File: lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs:238-257
Timestamp: 2025-03-13T13:15:16.290Z
Learning: All decay functions in Qdrant (exponential, Gaussian, and linear) have an output range of [1, 0], with 1 being the maximum value when x equals target, and values approaching or reaching 0 as the distance between x and target increases. Lambda parameters are properly validated to ensure decay functions always produce finite results.

lib/api/src/rest/schema.rs (2)

Learnt from: coszio
PR: #6154
File: lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs:65-66
Timestamp: 2025-03-13T13:17:00.411Z
Learning: In Qdrant's decay functions implementation, the target field serves as the reference point from which decay is measured (the value where the decay function equals its maximum), while the lambda parameter that controls the decay rate is calculated separately from midpoint and scale parameters. The target field is not involved in the lambda calculation.

Learnt from: coszio
PR: #6154
File: lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs:238-257
Timestamp: 2025-03-13T13:15:16.290Z
Learning: All decay functions in Qdrant (exponential, Gaussian, and linear) have an output range of [1, 0], with 1 being the maximum value when x equals target, and values approaching or reaching 0 as the distance between x and target increases. Lambda parameters are properly validated to ensure decay functions always produce finite results.

docs/grpc/docs.md (2)

Learnt from: coszio
PR: #6154
File: lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs:65-66
Timestamp: 2025-03-13T13:17:00.411Z
Learning: In Qdrant's decay functions implementation, the target field serves as the reference point from which decay is measured (the value where the decay function equals its maximum), while the lambda parameter that controls the decay rate is calculated separately from midpoint and scale parameters. The target field is not involved in the lambda calculation.

Learnt from: coszio
PR: #6154
File: lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs:238-257
Timestamp: 2025-03-13T13:15:16.290Z
Learning: All decay functions in Qdrant (exponential, Gaussian, and linear) have an output range of [1, 0], with 1 being the maximum value when x equals target, and values approaching or reaching 0 as the distance between x and target increases. Lambda parameters are properly validated to ensure decay functions always produce finite results.

docs/redoc/master/openapi.json (2)

Learnt from: coszio
PR: #6154
File: lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs:65-66
Timestamp: 2025-03-13T13:17:00.411Z
Learning: In Qdrant's decay functions implementation, the target field serves as the reference point from which decay is measured (the value where the decay function equals its maximum), while the lambda parameter that controls the decay rate is calculated separately from midpoint and scale parameters. The target field is not involved in the lambda calculation.

Learnt from: coszio
PR: #6154
File: lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs:238-257
Timestamp: 2025-03-13T13:15:16.290Z
Learning: All decay functions in Qdrant (exponential, Gaussian, and linear) have an output range of [1, 0], with 1 being the maximum value when x equals target, and values approaching or reaching 0 as the distance between x and target increases. Lambda parameters are properly validated to ensure decay functions always produce finite results.

🧬 Code Graph Analysis (1)
lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs (1)
lib/segment/src/common/operation_error.rs (1)
  • validation_error (99-103)
🪛 markdownlint-cli2 (0.17.2)
docs/grpc/docs.md

2568-2568: Table column count
Expected: 4; Actual: 6; Too many cells, extra data will be missing

(MD056, table-column-count)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: storage-compat-test
  • GitHub Check: lint
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: integration-tests-consensus
  • GitHub Check: test-consensus-compose
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: e2e-tests
  • GitHub Check: integration-tests
  • GitHub Check: test-consistency

Comment on lines 638 to 640
optional float scale = 3;
// The midpoint of the decay. Defaults to 0.5. Output will be this value when `|x - target| == scale`.
// The midpoint of the decay. Should be between 0 and 1. Defaults to 0.5. Output will be this value when `|x - target| == scale`.
optional float midpoint = 4;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Midpoint range wording now conflicts with validation logic

Validation still forbids midpoint = 0 for exp_decay and gauss_decay, but this comment implies all decay kinds accept the full [0,1] interval.
Clarify per-kind rules to prevent client-side confusion.

-    // The midpoint of the decay. Should be between 0 and 1. Defaults to 0.5. Output will be this value when `|x - target| == scale`.
+    // The midpoint of the decay.
+    // • lin_decay   : allowed range = [0, 1] (inclusive)
+    // • exp_decay   : allowed range = (0, 1] (0 is invalid)
+    // • gauss_decay : allowed range = (0, 1] (0 is invalid)
+    // Defaults to 0.5. Output equals this value when `|x - target| == scale`.

This restores consistency between the gRPC spec and the server-side validator.

🤖 Prompt for AI Agents
In lib/api/src/grpc/proto/points.proto around lines 638 to 640, the comment for
the midpoint field incorrectly states that the midpoint should be between 0 and
1 for all decay kinds, but validation forbids midpoint = 0 for exp_decay and
gauss_decay. Update the comment to clarify the allowed midpoint range per decay
kind, specifying that for exp_decay and gauss_decay the midpoint cannot be 0, to
align the documentation with the validation logic and avoid client confusion.

Comment on lines 825 to 827
pub scale: Option<f32>,
/// The midpoint of the decay. Defaults to 0.5. Output will be this value when `|x - target| == scale`.
/// The midpoint of the decay. Should be between 0 and 1.Defaults to 0.5. Output will be this value when `|x - target| == scale`.
pub midpoint: Option<f32>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Docstring off by one & missing space – sync with validator

Same mismatch as in the proto plus a minor typo (1.Defaults). Recommend tightening the comment and fixing the spacing.

-    /// The midpoint of the decay. Should be between 0 and 1.Defaults to 0.5. Output will be this value when `|x - target| == scale`.
+    /// The midpoint of the decay.
+    /// 
+    /// • `lin_decay`   – valid range **[0.0, 1.0]** (inclusive)  
+    /// • `exp_decay`   – valid range **(0.0, 1.0]**  
+    /// • `gauss_decay` – valid range **(0.0, 1.0]**  
+    ///
+    /// Defaults to `0.5`. The decay function outputs this value when `|x - target| == scale`.

Keeps the REST schema, OpenAPI generation and runtime checks perfectly aligned.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pub scale: Option<f32>,
/// The midpoint of the decay. Defaults to 0.5. Output will be this value when `|x - target| == scale`.
/// The midpoint of the decay. Should be between 0 and 1.Defaults to 0.5. Output will be this value when `|x - target| == scale`.
pub midpoint: Option<f32>,
pub scale: Option<f32>,
/// The midpoint of the decay.
///
/// • `lin_decay` – valid range **[0.0, 1.0]** (inclusive)
/// • `exp_decay` – valid range **(0.0, 1.0]**
/// • `gauss_decay` – valid range **(0.0, 1.0]**
///
/// Defaults to `0.5`. The decay function outputs this value when `|x - target| == scale`.
pub midpoint: Option<f32>,
🤖 Prompt for AI Agents
In lib/api/src/rest/schema.rs around lines 825 to 827, the docstring for the
midpoint field is missing a space after the period and is slightly misaligned
with the validator comments. Fix the spacing by adding a space after "1." and
tighten the comment wording to clearly state that midpoint should be between 0
and 1, defaults to 0.5, and that output equals this value when |x - target|
equals scale, ensuring it matches the validator and proto comments exactly.

DecayKind::Gauss | DecayKind::Exp => {
if midpoint <= 0.0 || midpoint >= 1.0 {
return Err(OperationError::validation_error(format!(
"Decay midpoint should be between 0.0 and 1.0 (exclusive), got {midpoint}."

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe 0.0 (exclusive) and 1.0 (exclusive) for consistency?

DecayKind::Lin => {
if !(0.0..=1.0).contains(&midpoint) {
return Err(OperationError::validation_error(format!(
"Decay midpoint should be between 0.0 (inclusive) and 1.0 (exclusive), got {midpoint}."

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this Rust expression making 1.0 also acceptable? Do we want to have 1.0 acceptable, though?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, my bad, the text is incorrect

@generall generall merged commit a770158 into dev Aug 1, 2025
16 checks passed
@generall generall deleted the allow-0-midpoint-for-lindecay branch August 1, 2025 19:49
timvisee pushed a commit that referenced this pull request Aug 11, 2025
* allow [0,1] range for midpoint in lindecay

* clippyyyyy

* Better error message
@timvisee timvisee mentioned this pull request Aug 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants