-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Refactor: move HardwareUsage to the Usage map #6532
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors how hardware usage data is represented by moving the HardwareUsage field into a dedicated Usage map, which wraps the hardware measurements.
- Refactored function parameters and response construction to wrap HardwareUsage with a Usage struct.
- Updated gRPC proto definitions and multiple API endpoints to accommodate the new Usage map.
- Modified internal API logic and remote shard processing to extract hardware usage from the Usage wrapper.
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
File | Description |
---|---|
src/tonic/api/update_common.rs | Updated parameter naming and response construction to wrap HardwareUsage in Usage. |
src/tonic/api/query_common.rs | Wrapped hardware usage conversion inside a Usage struct in multiple endpoints. |
src/tonic/api/points_internal_api.rs | Updated usage extraction and response assignment for internal API responses. |
src/tonic/api/points_api.rs | Updated response construction for Points API endpoints to use the new Usage map. |
lib/collection/src/shards/remote_shard.rs | Refactored hardware usage extraction from responses using unwrap_or_default(). |
lib/api/src/grpc/proto/points.proto | Updated proto definitions by replacing HardwareUsage with Usage wrapper. |
Comments suppressed due to low confidence (1)
src/tonic/api/update_common.rs:852
- [nitpick] The parameter name 'usage_hardware' is somewhat redundant since it is immediately wrapped into a 'Usage' struct. Consider renaming it to 'hardware' for conciseness.
pub fn points_operation_response_internal( ... usage_hardware: Option<HardwareUsage>, ... ) -> PointsOperationResponseInternal {
📝 WalkthroughWalkthroughThis change introduces a new wrapper message and struct called Suggested reviewers
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (3)
🚧 Files skipped from review as they are similar to previous changes (3)
⏰ Context from checks skipped due to timeout of 90000ms (13)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
lib/api/src/grpc/qdrant.rs (1)
6567-6570
: Consider enhancing documentation for the Usage structWhile you've added a separator comment, I recommend expanding the documentation to explain:
- The purpose of the Usage wrapper
- Its relationship with HardwareUsage
- The extensibility benefits it provides
-/// --- -/// -/// ## ----------- Measurements collector ---------- +/// Usage wraps various usage metrics for API operations. +/// +/// Currently contains hardware usage metrics, but designed to be +/// extensible for additional usage metrics in the future without +/// breaking API compatibility. +/// +/// ## ----------- Measurements collector ----------
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
lib/api/src/grpc/qdrant.rs
(19 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (13)
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: test-snapshot-operations-s3-minio
- GitHub Check: lint
- GitHub Check: test-low-resources
- GitHub Check: test-consistency
- GitHub Check: Basic TLS/HTTPS tests
- GitHub Check: integration-tests-consensus
- GitHub Check: integration-tests
- GitHub Check: test-consensus-compose
- GitHub Check: storage-compat-test
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: rust-tests (windows-latest)
🔇 Additional comments (3)
lib/api/src/grpc/qdrant.rs (3)
5884-5884
: Change approved: Updated usage field to use the new Usage wrapperThis change replaces
HardwareUsage
with the newUsage
wrapper type, which aligns with the PR objective of making hardware usage information more extensible.
5998-5998
: Consistent type update across response structuresThe change from
Option<HardwareUsage>
toOption<Usage>
has been consistently applied across all relevant response structures, ensuring a uniform API design.Also applies to: 6010-6010, 6022-6022, 6034-6034, 6053-6053, 6065-6065, 6077-6077, 6092-6092, 6128-6128, 6140-6140, 6152-6152, 6164-6164, 6176-6176, 6188-6188, 6220-6220, 6232-6232, 9459-9459
6567-6576
: New Usage wrapper introduces good extensibilityThe new
Usage
struct with itshardware
field effectively encapsulates the existing hardware usage metrics while providing a foundation for adding more usage-related metrics in the future without breaking API compatibility.This follows good API design practices by:
- Maintaining backward compatibility (clients can access hardware usage through the nested field)
- Creating a dedicated namespace for all usage-related metrics
- Allowing for future extensions without structural changes to the response messages
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
src/actix/helpers.rs (3)
36-38
: Consider simplifying Option constructionThe implementation correctly wraps the hardware usage in the new Usage struct. For slightly better readability, consider using
Some(Usage { hardware: hardware_usage })
instead ofOption::from(Usage { hardware: hardware_usage })
.- usage: Option::from(Usage { + usage: Some(Usage { hardware: hardware_usage, }),
55-57
: Consider simplifying Option constructionThe implementation correctly wraps the hardware usage in the new Usage struct. For slightly better readability, consider using
Some(Usage { hardware: hardware_usage })
instead ofOption::from(Usage { hardware: hardware_usage })
.- usage: Option::from(Usage { + usage: Some(Usage { hardware: hardware_usage, }),
77-79
: Consider simplifying Option constructionThe implementation correctly wraps the hardware usage in the new Usage struct. For slightly better readability, consider using
Some(Usage { hardware: hardware_usage })
instead ofOption::from(Usage { hardware: hardware_usage })
.- usage: Option::from(Usage { + usage: Some(Usage { hardware: hardware_usage, }),
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
docs/redoc/master/openapi.json
(1 hunks)lib/api/src/grpc/proto/points_internal_service.proto
(3 hunks)lib/api/src/grpc/qdrant.rs
(21 hunks)lib/api/src/rest/models.rs
(1 hunks)lib/collection/src/shards/remote_shard.rs
(8 hunks)src/actix/helpers.rs
(4 hunks)src/schema_generator.rs
(2 hunks)
✅ Files skipped from review due to trivial changes (3)
- src/schema_generator.rs
- lib/api/src/grpc/proto/points_internal_service.proto
- docs/redoc/master/openapi.json
🚧 Files skipped from review as they are similar to previous changes (2)
- lib/collection/src/shards/remote_shard.rs
- lib/api/src/grpc/qdrant.rs
⏰ Context from checks skipped due to timeout of 90000ms (13)
- GitHub Check: test-low-resources
- GitHub Check: test-consistency
- GitHub Check: integration-tests
- GitHub Check: test-consensus-compose
- GitHub Check: Basic TLS/HTTPS tests
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: integration-tests-consensus
- GitHub Check: test-snapshot-operations-s3-minio
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: storage-compat-test
- GitHub Check: lint
🔇 Additional comments (2)
lib/api/src/rest/models.rs (1)
48-58
: Well-designed abstraction layer for extensibilityThe refactoring introduces a new
Usage
wrapper struct that will make adding future usage metrics beyond hardware easier. This follows good design practices by creating a dedicated namespace for all usage-related information while maintaining compatibility with existing code.The implementation correctly preserves all necessary attributes (serialization, schemas, anonymization) and properly handles null values with
skip_serializing_if
.src/actix/helpers.rs (1)
8-8
: Import updated correctlyThe import statement has been properly updated to include the new
Usage
struct alongside the existingHardwareUsage
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
docs/grpc/docs.md (1)
174-174
: Fix indentation for the new TOC entry.The added
- [Usage](#qdrant-Usage)
line uses 4 spaces of indent but the style in this file expects 2 spaces for nested list items. Please reduce the indent to 2 spaces for consistency.🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
174-174: Unordered list indentation
Expected: 2; Actual: 4(MD007, ul-indent)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/grpc/docs.md
(19 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
docs/grpc/docs.md
174-174: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
3057-3057: Link fragments should be valid
null
(MD051, link-fragments)
⏰ Context from checks skipped due to timeout of 90000ms (13)
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: test-snapshot-operations-s3-minio
- GitHub Check: test-low-resources
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: storage-compat-test
- GitHub Check: test-consistency
- GitHub Check: Basic TLS/HTTPS tests
- GitHub Check: test-consensus-compose
- GitHub Check: integration-tests-consensus
- GitHub Check: integration-tests
- GitHub Check: lint
🔇 Additional comments (14)
docs/grpc/docs.md (14)
2390-2390
: Docs correctly include the newusage
field.The
CountResponse
now documents theoptional Usage usage
field as intended.
2588-2588
: Docs correctly include the newusage
field.The
DiscoverBatchResponse
now documents theoptional Usage usage
field as intended.
2649-2649
: Docs correctly include the newusage
field.The
DiscoverResponse
now documents theoptional Usage usage
field as intended.
3011-3011
: Docs correctly include the newusage
field.The
GetResponse
now documents theoptional Usage usage
field as intended.
3539-3539
: Docs correctly include the newusage
field.The
PointsOperationResponse
now documents theoptional Usage usage
field as intended.
3860-3860
: Docs correctly include the newusage
field.The
QueryBatchResponse
now documents theoptional Usage usage
field as intended.
3877-3877
: Docs correctly include the newusage
field.The
QueryGroupsResponse
now documents theoptional Usage usage
field as intended.
3954-3954
: Docs correctly include the newusage
field.The
QueryResponse
now documents theoptional Usage usage
field as intended.
4284-4284
: Docs correctly include the newusage
field.The
ScrollResponse
now documents theoptional Usage usage
field as intended.
4319-4319
: Docs correctly include the newusage
field.The
SearchBatchResponse
now documents theoptional Usage usage
field as intended.
4336-4336
: Docs correctly include the newusage
field.The
SearchGroupsResponse
now documents theoptional Usage usage
field as intended.
4371-4371
: Docs correctly include the newusage
field.The
SearchMatrixOffsetsResponse
now documents theoptional Usage usage
field as intended.
4420-4420
: Docs correctly include the newusage
field.The
SearchMatrixPairsResponse
now documents theoptional Usage usage
field as intended.
4535-4535
: Docs correctly include the newusage
field.The
SearchResponse
now documents theoptional Usage usage
field as intended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
docs/grpc/docs.md (2)
174-174
: Fix list indentation for new Table of Contents entry.
The new- [Usage](#qdrant-Usage)
item is indented by 4 spaces but should use 2 spaces to align with existing list entries (MD007).🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
174-174: Unordered list indentation
Expected: 2; Actual: 4(MD007, ul-indent)
3050-3058
: Enhance the Usage section with context.
You've correctly added the<a name="qdrant-Usage"></a>
anchor and the### Usage
header with the hardware field table. Consider adding a brief introductory line explaining thatUsage
is a wrapper for hardware metrics and why this change was made, to improve clarity for readers.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/grpc/docs.md
(19 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
docs/grpc/docs.md
174-174: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: integration-tests-consensus
- GitHub Check: rust-tests (windows-latest)
🔇 Additional comments (1)
docs/grpc/docs.md (1)
2390-2390
: Verify all response message tables include the newusage
field consistently.
You've added| usage | [Usage](#qdrant-Usage) | optional | |
to a number of response message tables. Please confirm that this field is present in every relevant response type in the gRPC docs, and update any that may be missing (e.g., in pagination or batch endpoints).Also applies to: 2588-2588, 2649-2649, 3011-3011, 3539-3539, 3861-3861, 3878-3878, 3956-3956, 4024-4024, 4041-4041, 4141-4141, 4285-4285, 4320-4320, 4337-4337, 4372-4372, 4421-4421, 4536-4536
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
docs/grpc/docs.md (3)
269-269
: Fix list indentation in Table of ContentsThe new
- [Usage](#qdrant-Usage)
entry is indented by 4 spaces, but top-level items use 2 spaces. Please adjust to match MD007:- - [Usage](#qdrant-Usage) + - [Usage](#qdrant-Usage)🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
269-269: Unordered list indentation
Expected: 2; Actual: 4(MD007, ul-indent)
4754-4765
: Refactor ASCII separators inUsage
sectionThe decorative ASCII lines under
### Usage
break consistency with the rest of the docs and may confuse readers:--------------------------------------------- ----------- Measurements collector ---------- ---------------------------------------------Consider replacing them with a brief descriptive sentence or removing them entirely, for example:
- --------------------------------------------- - ----------- Measurements collector ---------- - --------------------------------------------- + Wrapper message for usage metrics.🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
4758-4758: Heading style
Expected: atx; Actual: setext(MD003, heading-style)
4758-4758
: Use ATX-style heading instead of setextThe line
----------- Measurements collector ----------
uses setext underlining, triggering MD003. Either remove this line or convert it to plain text or an ATX-style heading:- ----------- Measurements collector ---------- + _Measurements collector for hardware usage metrics._🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
4758-4758: Heading style
Expected: atx; Actual: setext(MD003, heading-style)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/grpc/docs.md
(19 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
docs/grpc/docs.md
269-269: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
4758-4758: Heading style
Expected: atx; Actual: setext
(MD003, heading-style)
⏰ Context from checks skipped due to timeout of 90000ms (13)
- GitHub Check: Basic TLS/HTTPS tests
- GitHub Check: test-snapshot-operations-s3-minio
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: test-low-resources
- GitHub Check: test-consistency
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: test-consensus-compose
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: integration-tests-consensus
- GitHub Check: integration-tests
- GitHub Check: storage-compat-test
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: lint
🔇 Additional comments (1)
docs/grpc/docs.md (1)
2391-2391
: Approve addition ofusage
wrapper fieldsThe addition of
| usage | [Usage](#qdrant-Usage) | optional |
to the response message tables is consistent and correctly reflects the refactor to wrap
HardwareUsage
in the newUsage
message.Also applies to: 2590-2590, 2650-2650, 3012-3012, 3531-3531, 3853-3853, 3870-3870, 3947-3947, 4016-4016, 4033-4033, 4133-4133, 4278-4278, 4313-4313, 4330-4330, 4364-4364, 4414-4414, 4529-4529
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (9)
tools/generate_grpc_docs.dockerfile (3)
7-10
: Consolidate file removals to reduce layers.
You can use brace expansion or a single pattern to eliminate the four.proto
files in one go:-RUN rm collections_internal_service.proto \ - points_internal_service.proto \ - shard_snapshots_service.proto \ - raft_service.proto +RUN rm {collections_internal_service,points_internal_service,shard_snapshots_service,raft_service}.proto
12-17
: Streamline filtering withsed
.
Chaininggrep -v
is less efficient and harder to maintain. A singlesed
command can drop all unwanted lines:-RUN grep -v 'collections_internal_service.proto' qdrant.proto \ - | grep -v 'points_internal_service.proto' \ - | grep -v 'shard_snapshots_service.proto' \ - | grep -v 'raft_service.proto' \ - > qdrant.proto.tmp \ - && mv qdrant.proto.tmp qdrant.proto +RUN sed -E '/collections_internal_service.proto|points_internal_service.proto|shard_snapshots_service.proto|raft_service.proto/d' \ + qdrant.proto > qdrant.proto.tmp \ + && mv qdrant.proto.tmp qdrant.proto
19-20
: Declare/out
as a volume.
Since--doc_out=/out
writes generated docs there, explicitly declaring a volume ensures the directory is exposed for mounting and persists output:VOLUME /out
lib/segment/src/index/field_index/geo_index/mod.rs (1)
748-756
:RamMmap
builder dereferencesBox
– avoid double mmap load
finalize()
creates anMmap
index, immediately moves it intoopen_mmap
, and then callsindex.load()
again:let GeoMapIndex::Mmap(index) = builder.finalize()? else { … }; let mut index = GeoMapIndex::Immutable(ImmutableGeoMapIndex::open_mmap(*index)); index.load()?; // <-- second pass
MmapGeoMapIndex::new
already callsload
, so the extraindex.load()
duplicates work (opens JSON, mmap-s, counts ones, etc.).
Not critical but wastes a few milliseconds in tests.Optional improvement:
-let mut index = GeoMapIndex::Immutable(ImmutableGeoMapIndex::open_mmap(*index)); -index.load()?; -Ok(index) +let index = GeoMapIndex::Immutable(ImmutableGeoMapIndex::open_mmap(*index)); +Ok(index) // ImmutableGeoMapIndex::open_mmap already has all metadata; + // callers can invoke `load()` only if they really need it.lib/segment/src/index/field_index/geo_index/immutable_geo_index.rs (5)
51-53
: Fix minor typo in inline commentThere is an extra space in the word “storage” (
s torage
).
Correcting small typos keeps the codebase professional and easier to scan.- // Backing s torage, source of state, persists deletions + // Backing storage, source of state, persists deletions
55-58
: DeriveDebug
(and possiblyClone
) forStorage
enum
Storage
is frequently pattern-matched and occasionally logged.
Adding#[derive(Debug)]
(and optionallyClone
) makes troubleshooting easier without affecting behaviour.-enum Storage { +#[derive(Debug)] +enum Storage { RocksDb(DatabaseColumnScheduledDeleteWrapper), Mmap(Box<MmapGeoMapIndex>), }
83-93
: Avoid an extra heap allocation inopen_mmap
open_mmap
receives a value, allocates aBox
, and then immediately stores it inStorage::Mmap
.
Passing aBox<MmapGeoMapIndex>
from the caller avoids this extra allocation and keeps ownership semantics clear.- pub fn open_mmap(index: MmapGeoMapIndex) -> Self { + pub fn open_mmap(index: Box<MmapGeoMapIndex>) -> Self { ... - storage: Storage::Mmap(Box::new(index)), + storage: Storage::Mmap(index),
164-167
: Doc-comment wording / typo“Loads in-memory index fmmap RocksDB storage” contains a typo and still mentions RocksDB.
Recommend rephrasing for clarity.- /// Loads in-memory index fmmap RocksDB storage. + /// Loads the in-memory index from the mmap-backed storage.
183-203
: Pre-allocateAHashSet
capacity & preserve orderingWhen constructing each
AHashSet
, you already know the number of IDs
(ids_end - ids_start
). Supplying this as the initial capacity avoids
rehashing.
Additionally, relying on the order ofindex.points_map
for later
binary searches assumes it is sorted; consider enforcing this with
debug_assert!(is_sorted_by_key(…))
to guard against future regressions.- ( - hash, - index.points_map_ids[ids_start as usize..ids_end as usize] - .iter() - .copied() - // Filter deleted points - .filter(|id| !index.deleted.get(*id as usize).unwrap_or_default()) - .collect(), - ) + { + let slice = &index.points_map_ids[ids_start as usize..ids_end as usize]; + let mut set = + AHashSet::with_capacity(slice.len()); + set.extend( + slice.iter() + .copied() + .filter(|id| !index.deleted.get(*id as usize).unwrap_or_default()), + ); + (hash, set) + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
clippy.toml
(1 hunks)lib/segment/src/index/field_index/full_text_index/immutable_text_index.rs
(1 hunks)lib/segment/src/index/field_index/geo_index/immutable_geo_index.rs
(3 hunks)lib/segment/src/index/field_index/geo_index/mmap_geo_index.rs
(2 hunks)lib/segment/src/index/field_index/geo_index/mod.rs
(21 hunks)lib/segment/src/index/field_index/map_index/immutable_map_index.rs
(1 hunks)lib/segment/src/index/field_index/map_index/mod.rs
(3 hunks)lib/segment/src/index/field_index/map_index/mutable_map_index.rs
(1 hunks)lib/segment/src/index/field_index/mmap_point_to_values.rs
(1 hunks)lib/segment/src/index/field_index/numeric_index/immutable_numeric_index.rs
(1 hunks)lib/segment/src/index/field_index/numeric_index/mod.rs
(4 hunks)lib/segment/src/index/field_index/numeric_index/mutable_numeric_index.rs
(1 hunks)lib/segment/src/index/field_index/numeric_index/tests.rs
(1 hunks)lib/segment/src/types.rs
(1 hunks)tools/generate_grpc_docs.dockerfile
(1 hunks)tools/generate_grpc_docs.dockerfile.sh
(1 hunks)
✅ Files skipped from review due to trivial changes (10)
- lib/segment/src/index/field_index/numeric_index/mod.rs
- lib/segment/src/index/field_index/map_index/mutable_map_index.rs
- clippy.toml
- lib/segment/src/index/field_index/full_text_index/immutable_text_index.rs
- lib/segment/src/index/field_index/map_index/mod.rs
- lib/segment/src/index/field_index/numeric_index/immutable_numeric_index.rs
- lib/segment/src/index/field_index/map_index/immutable_map_index.rs
- tools/generate_grpc_docs.dockerfile.sh
- lib/segment/src/index/field_index/numeric_index/mutable_numeric_index.rs
- lib/segment/src/types.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
lib/segment/src/index/field_index/geo_index/mod.rs (3)
lib/segment/src/index/field_index/geo_index/immutable_geo_index.rs (8)
open_rocksdb
(64-78)load
(98-103)open_mmap
(83-93)points_count
(284-286)points_values_count
(288-290)max_values_per_point
(292-294)values_count
(313-317)from
(30-41)lib/segment/src/index/field_index/geo_index/mmap_geo_index.rs (11)
is_on_disk
(395-397)load
(197-245)new
(86-195)std
(112-112)std
(124-124)std
(150-150)std
(174-174)points_count
(374-378)points_values_count
(380-382)max_values_per_point
(384-386)values_count
(268-274)lib/segment/src/index/field_index/geo_index/mutable_geo_index.rs (7)
load
(73-116)new
(54-63)new
(198-208)points_count
(153-155)points_values_count
(157-159)max_values_per_point
(161-163)values_count
(221-226)
🪛 Hadolint (2.12.0)
tools/generate_grpc_docs.dockerfile
[error] 5-5: Use COPY instead of ADD for files and folders
(DL3020)
⏰ Context from checks skipped due to timeout of 90000ms (13)
- GitHub Check: test-snapshot-operations-s3-minio
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: Basic TLS/HTTPS tests
- GitHub Check: test-low-resources
- GitHub Check: test-consistency
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: test-consensus-compose
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: integration-tests
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: lint
- GitHub Check: integration-tests-consensus
- GitHub Check: storage-compat-test
🔇 Additional comments (7)
lib/segment/src/index/field_index/numeric_index/tests.rs (1)
349-353
: Method renaming looks good.The changes correctly update the test to use the renamed
db_wrapper()
method instead of the previousget_db_wrapper()
method, maintaining consistent API naming throughout the codebase. This aligns with the method renaming pattern mentioned in the summary.tools/generate_grpc_docs.dockerfile (1)
3-3
: WORKDIR is correctly set.
Pointing to/protos
aligns with where the proto files will reside and be processed.lib/segment/src/index/field_index/mmap_point_to_values.rs (1)
381-391
:iter()
returnsOption
that is neverNone
– consider flattening the API
get_values(idx)
always yieldsSome(iterator)
for everyidx < self.len()
because:
iter()
only iterates within0..self.len()
.get_range()
returns a zero‐length range (start = 0, count = 0) for points that have no data, soget_values()
wraps an empty iterator inSome(..)
.This means the
Option
layer is redundant and forces every caller to unwrap (or pattern-match) needlessly.-pub fn iter(&self) -> impl Iterator< - Item = ( - PointOffsetType, - Option<impl Iterator<Item = T::Referenced<'_>> + '_>, - ), -> + Clone { - (0..self.len() as PointOffsetType).map(|idx| (idx, self.get_values(idx))) +pub fn iter( + &self, +) -> impl Iterator< + Item = ( + PointOffsetType, + impl Iterator<Item = T::Referenced<'_>> + '_, + ), + > + Clone { + (0..self.len() as PointOffsetType) + .map(|idx| (idx, self.get_values(idx).into_iter().flatten())) }Benefits
• Call-sites become simpler (for (id, vals) in ptv.iter() { … }
).
• No extraOption
allocation/branching at runtime.
• Makes the intent (“each point always has an iterator, though it might be empty”) explicit.[ suggest_optional_refactor ]
lib/segment/src/index/field_index/geo_index/mmap_geo_index.rs (2)
29-44
:Counts
/PointKeyValue
markedCopy
– validate mmap-safetyDeriving
Copy
is convenient but keep in mind that these structs are backed by memory-mapped pages.
Copying many tiny structs is fine, yet stale copies can be held after the underlying mmap is mutated (e.g. when rebuilding the index in-place). If you later introduce in-place updates, silent duplication might become a foot-gun.Nothing to change right now, just double-check that:
Counts
andPointKeyValue
are indeed immutable after the index is built.- No code path mutates a copied instance expecting the change to propagate back to the mmap.
If both are guaranteed, the
Copy
derive is safe.
63-74
: Visibility relaxed topub(super)
– keep encapsulation in mindMaking the internal mmap slices publicly accessible inside the module hierarchy is required for the new immutable-from-mmap flow – looks good.
Please consider adding a brief module-level comment documenting which sub-modules are allowed to mutate these fields to avoid accidental write access from unrelated code.
lib/segment/src/index/field_index/geo_index/mod.rs (1)
57-66
:new_mmap
now returns eitherMmap
orImmutable
– audit downstream call-sitesSwitching the returned enum variant depending on
is_on_disk
is sensible, but it is a silent breaking change for any code that previously assumedGeoMapIndex::Mmap
when callingnew_mmap(..., /*is_on_disk=*/false)
.Please grep the codebase (services, tests, benchmarks) for pattern
new_mmap(
and check that every caller:
- Treats both
Immutable
andMmap
variants correctly, or- Passes
is_on_disk = true
when it truly needs the on-disk behaviour.A small helper like
is_mmap_variant()
or updating the doc-comment ofnew_mmap
would make the new contract explicit.lib/segment/src/index/field_index/geo_index/immutable_geo_index.rs (1)
95-104
: Clean dispatch inload
Nice abstraction – delegating to
load_rocksdb
/load_mmap
keeps the public
API compact while hiding backend details. 👍
Could you please rebase this PR onto the latest dev? |
It seems your rebase went wrong, the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
tools/generate_grpc_docs.dockerfile (2)
1-1
: Pin base image version for reproducibility.Using
latest
can introduce breaking changes when the image is updated. Please pin to a stable tag or digest, for example:-FROM pseudomuto/protoc-gen-doc:latest +FROM pseudomuto/protoc-gen-doc:v1.0.0
5-5
: Use COPY instead of ADD for clarity and best practice.
ADD
has extra behaviors (e.g., auto-extracting archives) that aren't required here. Replace with:-ADD lib/api/src/grpc/proto/ . +COPY lib/api/src/grpc/proto/ .🧰 Tools
🪛 Hadolint (2.12.0)
[error] 5-5: Use COPY instead of ADD for files and folders
(DL3020)
🧹 Nitpick comments (1)
tools/generate_grpc_docs.dockerfile (1)
7-17
: Merge RUN steps and optimize proto filtering.Combining the removal and filtering steps into a single
RUN
reduces image layers and simplifies the commands. Also, usegrep -E
(orgrep -Ev
) to exclude multiple patterns in one go:-RUN rm collections_internal_service.proto \ - points_internal_service.proto \ - shard_snapshots_service.proto \ - raft_service.proto - -RUN grep -v 'collections_internal_service.proto' qdrant.proto \ - | grep -v 'points_internal_service.proto' \ - | grep -v 'shard_snapshots_service.proto' \ - | grep -v 'raft_service.proto' \ - > qdrant.proto.tmp \ - && mv qdrant.proto.tmp qdrant.proto +RUN rm collections_internal_service.proto points_internal_service.proto \ + shard_snapshots_service.proto raft_service.proto && \ + grep -Ev 'collections_internal_service\.proto|points_internal_service\.proto|shard_snapshots_service\.proto|raft_service\.proto' \ + qdrant.proto > qdrant.proto.tmp && \ + mv qdrant.proto.tmp qdrant.proto
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
lib/api/src/rest/models.rs
(1 hunks)src/actix/helpers.rs
(4 hunks)src/tonic/api/points_internal_api.rs
(4 hunks)tools/generate_grpc_docs.dockerfile
(1 hunks)tools/generate_grpc_docs.dockerfile.sh
(1 hunks)
✅ Files skipped from review due to trivial changes (2)
- tools/generate_grpc_docs.dockerfile.sh
- src/actix/helpers.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- src/tonic/api/points_internal_api.rs
- lib/api/src/rest/models.rs
🧰 Additional context used
🪛 Hadolint (2.12.0)
tools/generate_grpc_docs.dockerfile
[error] 5-5: Use COPY instead of ADD for files and folders
(DL3020)
⏰ Context from checks skipped due to timeout of 90000ms (11)
- GitHub Check: storage-compat-test
- GitHub Check: lint
- GitHub Check: test-low-resources
- GitHub Check: test-consistency
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: integration-tests
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: test-snapshot-operations-s3-minio
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: integration-tests-consensus
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
lib/api/src/grpc/ops.rs (1)
25-31
: Consider simplifying the is_empty implementationThe current implementation uses pattern matching to destructure the
Usage
struct, but since there's only one field being accessed, you could simplify this to:impl Usage { pub fn is_empty(&self) -> bool { - let Usage { hardware } = self; - - hardware.is_none() + self.hardware.is_none() } }This is just a minor stylistic suggestion that makes the code slightly more concise while preserving functionality.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
lib/api/src/grpc/ops.rs
(2 hunks)src/tonic/api/points_api.rs
(3 hunks)src/tonic/api/query_common.rs
(16 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- src/tonic/api/points_api.rs
- src/tonic/api/query_common.rs
⏰ Context from checks skipped due to timeout of 90000ms (12)
- GitHub Check: Basic TLS/HTTPS tests
- GitHub Check: test-snapshot-operations-s3-minio
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: test-low-resources
- GitHub Check: test-consistency
- GitHub Check: test-consensus-compose
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: integration-tests
- GitHub Check: integration-tests-consensus
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: lint
🔇 Additional comments (1)
lib/api/src/grpc/ops.rs (1)
33-36
: LGTM! Good utility function implementationThe
usage_or_none
function is well-implemented, converting anOption<HardwareUsage>
into anOption<Usage>
while correctly preserving the semantics that an empty value should be represented asNone
. This aligns with the PR objective of making hardware usage more extendable.
--------- Co-authored-by: jojii <jojii@gmx.net>
The hardware usage was moved into dedicated map in order to make it eventually extendable