-
Notifications
You must be signed in to change notification settings - Fork 1.8k
[decay formula] allow [0,1] range for midpoint in lin_decay #6959
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 Walkthrough""" WalkthroughThis change updates documentation and validation logic related to the Estimated code review effort🎯 2 (Simple) | ⏱️ ~7 minutes
Note ⚡️ Unit Test Generation is now available in beta!Learn more here, or try it out under "Finishing Touches" below. 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🔭 Outside diff range comments (1)
docs/grpc/docs.md (1)
2564-2569
: Clarify Decay Midpoint Ranges & Fix Table Pipe in gRPC DocsThe code enforces different midpoint ranges for each decay kind—this should be documented explicitly—and the inline
|x - target|
needs its pipes escaped to avoid breaking the Markdown table.• File:
docs/grpc/docs.md
(around lines 2564–2569)
• Update the “midpoint” row as follows:-| midpoint | [float](#float) | optional | The midpoint of the decay. Should be between 0 and 1. Defaults to 0.5. Output will be this value when `|x - target| == scale`. | +| midpoint | [float](#float) | optional | The midpoint of the decay. For `lin_decay`, the range is **inclusive** `[0, 1]`; for `exp_decay` and `gauss_decay`, the range is **exclusive** `(0, 1)`. Defaults to `0.5`. Output will be this value when `\|x - target\| == scale`. |Also ensure the same clarification and pipe‐escaping is applied in the REST schema (
lib/api/src/rest/schema.rs
), Protobuf comments, and OpenAPI spec so all user‐facing docs stay in sync.
🧹 Nitpick comments (2)
docs/redoc/master/openapi.json (1)
15168-15170
: Fix grammar and explicitly state inclusive bounds formidpoint
.A small typo (
1.Defaults
) slipped in, and the phrase “between 0 and 1” is ambiguous given that only the linear decay allows0
, while Gaussian/exp require> 0
. The OpenAPI description should reflect this nuance to avoid confusion.- "description": "The midpoint of the decay. Should be between 0 and 1.Defaults to 0.5. Output will be this value when `|x - target| == scale`.", + "description": "The midpoint of the decay. For linear decay it must be in the inclusive range [0, 1]; \ +for Gaussian and exponential decay it must be in the range (0, 1]. Defaults to 0.5. Output equals this value when \ +`|x - target| == scale`.",lib/api/src/grpc/qdrant.rs (1)
5574-5574
: Clarify inclusive vs exclusive bounds in the docstring“Should be between 0 and 1” is ambiguous—readers may assume exclusive bounds, yet the PR intentionally allows
0
(inclusive) for the linear decay kind while keeping exclusive bounds for the other kinds. Consider spelling this out to avoid misunderstanding:-/// The midpoint of the decay. Should be between 0 and 1. Defaults to 0.5. Output will be this value when `|x - target| == scale`. +/// The midpoint of the decay. +/// • Linear decay: 0 ≤ midpoint ≤ 1 +/// • Gaussian & exponential decay: 0 < midpoint < 1 +/// Defaults to 0.5. Output equals this value when `|x - target| == scale`.This keeps the auto-generated gRPC docs consistent with the refined validation logic introduced in
parsed_formula.rs
.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
docs/grpc/docs.md
(1 hunks)docs/redoc/master/openapi.json
(1 hunks)lib/api/src/grpc/proto/points.proto
(1 hunks)lib/api/src/grpc/qdrant.rs
(1 hunks)lib/api/src/rest/schema.rs
(1 hunks)lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs
(1 hunks)
🧰 Additional context used
🧠 Learnings (7)
📓 Common learnings
Learnt from: coszio
PR: qdrant/qdrant#6154
File: lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs:238-257
Timestamp: 2025-03-13T13:15:16.290Z
Learning: All decay functions in Qdrant (exponential, Gaussian, and linear) have an output range of [1, 0], with 1 being the maximum value when x equals target, and values approaching or reaching 0 as the distance between x and target increases. Lambda parameters are properly validated to ensure decay functions always produce finite results.
Learnt from: coszio
PR: qdrant/qdrant#6154
File: lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs:65-66
Timestamp: 2025-03-13T13:17:00.411Z
Learning: In Qdrant's decay functions implementation, the `target` field serves as the reference point from which decay is measured (the value where the decay function equals its maximum), while the lambda parameter that controls the decay rate is calculated separately from `midpoint` and `scale` parameters. The `target` field is not involved in the lambda calculation.
lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs (3)
Learnt from: coszio
PR: #6154
File: lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs:65-66
Timestamp: 2025-03-13T13:17:00.411Z
Learning: In Qdrant's decay functions implementation, the target
field serves as the reference point from which decay is measured (the value where the decay function equals its maximum), while the lambda parameter that controls the decay rate is calculated separately from midpoint
and scale
parameters. The target
field is not involved in the lambda calculation.
Learnt from: coszio
PR: #6154
File: lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs:238-257
Timestamp: 2025-03-13T13:15:16.290Z
Learning: All decay functions in Qdrant (exponential, Gaussian, and linear) have an output range of [1, 0], with 1 being the maximum value when x equals target, and values approaching or reaching 0 as the distance between x and target increases. Lambda parameters are properly validated to ensure decay functions always produce finite results.
Learnt from: generall
PR: #6854
File: lib/segment/src/index/query_estimator.rs:320-327
Timestamp: 2025-07-11T11:35:21.549Z
Learning: In test code for Qdrant's query estimator (lib/segment/src/index/query_estimator.rs), simplified ID resolution logic using id.to_string().parse().unwrap()
is acceptable for testing purposes and doesn't need to match production code's id_tracker.internal_id()
approach. Test code can use mock implementations that serve the testing goals.
lib/api/src/grpc/proto/points.proto (2)
Learnt from: coszio
PR: #6154
File: lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs:65-66
Timestamp: 2025-03-13T13:17:00.411Z
Learning: In Qdrant's decay functions implementation, the target
field serves as the reference point from which decay is measured (the value where the decay function equals its maximum), while the lambda parameter that controls the decay rate is calculated separately from midpoint
and scale
parameters. The target
field is not involved in the lambda calculation.
Learnt from: coszio
PR: #6154
File: lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs:238-257
Timestamp: 2025-03-13T13:15:16.290Z
Learning: All decay functions in Qdrant (exponential, Gaussian, and linear) have an output range of [1, 0], with 1 being the maximum value when x equals target, and values approaching or reaching 0 as the distance between x and target increases. Lambda parameters are properly validated to ensure decay functions always produce finite results.
lib/api/src/grpc/qdrant.rs (2)
Learnt from: coszio
PR: #6154
File: lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs:65-66
Timestamp: 2025-03-13T13:17:00.411Z
Learning: In Qdrant's decay functions implementation, the target
field serves as the reference point from which decay is measured (the value where the decay function equals its maximum), while the lambda parameter that controls the decay rate is calculated separately from midpoint
and scale
parameters. The target
field is not involved in the lambda calculation.
Learnt from: coszio
PR: #6154
File: lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs:238-257
Timestamp: 2025-03-13T13:15:16.290Z
Learning: All decay functions in Qdrant (exponential, Gaussian, and linear) have an output range of [1, 0], with 1 being the maximum value when x equals target, and values approaching or reaching 0 as the distance between x and target increases. Lambda parameters are properly validated to ensure decay functions always produce finite results.
lib/api/src/rest/schema.rs (2)
Learnt from: coszio
PR: #6154
File: lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs:65-66
Timestamp: 2025-03-13T13:17:00.411Z
Learning: In Qdrant's decay functions implementation, the target
field serves as the reference point from which decay is measured (the value where the decay function equals its maximum), while the lambda parameter that controls the decay rate is calculated separately from midpoint
and scale
parameters. The target
field is not involved in the lambda calculation.
Learnt from: coszio
PR: #6154
File: lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs:238-257
Timestamp: 2025-03-13T13:15:16.290Z
Learning: All decay functions in Qdrant (exponential, Gaussian, and linear) have an output range of [1, 0], with 1 being the maximum value when x equals target, and values approaching or reaching 0 as the distance between x and target increases. Lambda parameters are properly validated to ensure decay functions always produce finite results.
docs/grpc/docs.md (2)
Learnt from: coszio
PR: #6154
File: lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs:65-66
Timestamp: 2025-03-13T13:17:00.411Z
Learning: In Qdrant's decay functions implementation, the target
field serves as the reference point from which decay is measured (the value where the decay function equals its maximum), while the lambda parameter that controls the decay rate is calculated separately from midpoint
and scale
parameters. The target
field is not involved in the lambda calculation.
Learnt from: coszio
PR: #6154
File: lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs:238-257
Timestamp: 2025-03-13T13:15:16.290Z
Learning: All decay functions in Qdrant (exponential, Gaussian, and linear) have an output range of [1, 0], with 1 being the maximum value when x equals target, and values approaching or reaching 0 as the distance between x and target increases. Lambda parameters are properly validated to ensure decay functions always produce finite results.
docs/redoc/master/openapi.json (2)
Learnt from: coszio
PR: #6154
File: lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs:65-66
Timestamp: 2025-03-13T13:17:00.411Z
Learning: In Qdrant's decay functions implementation, the target
field serves as the reference point from which decay is measured (the value where the decay function equals its maximum), while the lambda parameter that controls the decay rate is calculated separately from midpoint
and scale
parameters. The target
field is not involved in the lambda calculation.
Learnt from: coszio
PR: #6154
File: lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs:238-257
Timestamp: 2025-03-13T13:15:16.290Z
Learning: All decay functions in Qdrant (exponential, Gaussian, and linear) have an output range of [1, 0], with 1 being the maximum value when x equals target, and values approaching or reaching 0 as the distance between x and target increases. Lambda parameters are properly validated to ensure decay functions always produce finite results.
🧬 Code Graph Analysis (1)
lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs (1)
lib/segment/src/common/operation_error.rs (1)
validation_error
(99-103)
🪛 markdownlint-cli2 (0.17.2)
docs/grpc/docs.md
2568-2568: Table column count
Expected: 4; Actual: 6; Too many cells, extra data will be missing
(MD056, table-column-count)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: storage-compat-test
- GitHub Check: lint
- GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: integration-tests-consensus
- GitHub Check: test-consensus-compose
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: e2e-tests
- GitHub Check: integration-tests
- GitHub Check: test-consistency
optional float scale = 3; | ||
// The midpoint of the decay. Defaults to 0.5. Output will be this value when `|x - target| == scale`. | ||
// The midpoint of the decay. Should be between 0 and 1. Defaults to 0.5. Output will be this value when `|x - target| == scale`. | ||
optional float midpoint = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Midpoint range wording now conflicts with validation logic
Validation still forbids midpoint = 0
for exp_decay
and gauss_decay
, but this comment implies all decay kinds accept the full [0,1]
interval.
Clarify per-kind rules to prevent client-side confusion.
- // The midpoint of the decay. Should be between 0 and 1. Defaults to 0.5. Output will be this value when `|x - target| == scale`.
+ // The midpoint of the decay.
+ // • lin_decay : allowed range = [0, 1] (inclusive)
+ // • exp_decay : allowed range = (0, 1] (0 is invalid)
+ // • gauss_decay : allowed range = (0, 1] (0 is invalid)
+ // Defaults to 0.5. Output equals this value when `|x - target| == scale`.
This restores consistency between the gRPC spec and the server-side validator.
🤖 Prompt for AI Agents
In lib/api/src/grpc/proto/points.proto around lines 638 to 640, the comment for
the midpoint field incorrectly states that the midpoint should be between 0 and
1 for all decay kinds, but validation forbids midpoint = 0 for exp_decay and
gauss_decay. Update the comment to clarify the allowed midpoint range per decay
kind, specifying that for exp_decay and gauss_decay the midpoint cannot be 0, to
align the documentation with the validation logic and avoid client confusion.
pub scale: Option<f32>, | ||
/// The midpoint of the decay. Defaults to 0.5. Output will be this value when `|x - target| == scale`. | ||
/// The midpoint of the decay. Should be between 0 and 1.Defaults to 0.5. Output will be this value when `|x - target| == scale`. | ||
pub midpoint: Option<f32>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Docstring off by one & missing space – sync with validator
Same mismatch as in the proto plus a minor typo (1.Defaults
). Recommend tightening the comment and fixing the spacing.
- /// The midpoint of the decay. Should be between 0 and 1.Defaults to 0.5. Output will be this value when `|x - target| == scale`.
+ /// The midpoint of the decay.
+ ///
+ /// • `lin_decay` – valid range **[0.0, 1.0]** (inclusive)
+ /// • `exp_decay` – valid range **(0.0, 1.0]**
+ /// • `gauss_decay` – valid range **(0.0, 1.0]**
+ ///
+ /// Defaults to `0.5`. The decay function outputs this value when `|x - target| == scale`.
Keeps the REST schema, OpenAPI generation and runtime checks perfectly aligned.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
pub scale: Option<f32>, | |
/// The midpoint of the decay. Defaults to 0.5. Output will be this value when `|x - target| == scale`. | |
/// The midpoint of the decay. Should be between 0 and 1.Defaults to 0.5. Output will be this value when `|x - target| == scale`. | |
pub midpoint: Option<f32>, | |
pub scale: Option<f32>, | |
/// The midpoint of the decay. | |
/// | |
/// • `lin_decay` – valid range **[0.0, 1.0]** (inclusive) | |
/// • `exp_decay` – valid range **(0.0, 1.0]** | |
/// • `gauss_decay` – valid range **(0.0, 1.0]** | |
/// | |
/// Defaults to `0.5`. The decay function outputs this value when `|x - target| == scale`. | |
pub midpoint: Option<f32>, |
🤖 Prompt for AI Agents
In lib/api/src/rest/schema.rs around lines 825 to 827, the docstring for the
midpoint field is missing a space after the period and is slightly misaligned
with the validator comments. Fix the spacing by adding a space after "1." and
tighten the comment wording to clearly state that midpoint should be between 0
and 1, defaults to 0.5, and that output equals this value when |x - target|
equals scale, ensuring it matches the validator and proto comments exactly.
lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs
Outdated
Show resolved
Hide resolved
DecayKind::Gauss | DecayKind::Exp => { | ||
if midpoint <= 0.0 || midpoint >= 1.0 { | ||
return Err(OperationError::validation_error(format!( | ||
"Decay midpoint should be between 0.0 and 1.0 (exclusive), got {midpoint}." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe 0.0 (exclusive) and 1.0 (exclusive) for consistency?
DecayKind::Lin => { | ||
if !(0.0..=1.0).contains(&midpoint) { | ||
return Err(OperationError::validation_error(format!( | ||
"Decay midpoint should be between 0.0 (inclusive) and 1.0 (exclusive), got {midpoint}." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this Rust expression making 1.0 also acceptable? Do we want to have 1.0 acceptable, though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, my bad, the text is incorrect
* allow [0,1] range for midpoint in lindecay * clippyyyyy * Better error message
We had restricted
midpoint
to be in the range (0,1) for the sake of gauss and exp decay. Linear decay can actually accept in the range [0,1]. This PR enables this possibility