Skip to content

Conversation

n0x29a
Copy link
Contributor

@n0x29a n0x29a commented May 13, 2025

The hardware usage was moved into dedicated map in order to make it eventually extendable

@n0x29a n0x29a requested a review from Copilot May 13, 2025 13:36
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors how hardware usage data is represented by moving the HardwareUsage field into a dedicated Usage map, which wraps the hardware measurements.

  • Refactored function parameters and response construction to wrap HardwareUsage with a Usage struct.
  • Updated gRPC proto definitions and multiple API endpoints to accommodate the new Usage map.
  • Modified internal API logic and remote shard processing to extract hardware usage from the Usage wrapper.

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/tonic/api/update_common.rs Updated parameter naming and response construction to wrap HardwareUsage in Usage.
src/tonic/api/query_common.rs Wrapped hardware usage conversion inside a Usage struct in multiple endpoints.
src/tonic/api/points_internal_api.rs Updated usage extraction and response assignment for internal API responses.
src/tonic/api/points_api.rs Updated response construction for Points API endpoints to use the new Usage map.
lib/collection/src/shards/remote_shard.rs Refactored hardware usage extraction from responses using unwrap_or_default().
lib/api/src/grpc/proto/points.proto Updated proto definitions by replacing HardwareUsage with Usage wrapper.
Comments suppressed due to low confidence (1)

src/tonic/api/update_common.rs:852

  • [nitpick] The parameter name 'usage_hardware' is somewhat redundant since it is immediately wrapped into a 'Usage' struct. Consider renaming it to 'hardware' for conciseness.
pub fn points_operation_response_internal( ... usage_hardware: Option<HardwareUsage>, ... ) -> PointsOperationResponseInternal {

Copy link
Contributor

coderabbitai bot commented May 13, 2025

📝 Walkthrough

Walkthrough

This change introduces a new wrapper message and struct called Usage to encapsulate hardware usage metrics in both gRPC and REST APIs. The usage field in numerous response messages and data structures is refactored from directly using HardwareUsage to using the new Usage wrapper, which contains an optional hardware field. All relevant code, documentation, OpenAPI schemas, and tests are updated to reflect this new structure. Helper functions and methods for wrapping, unwrapping, and checking the presence of usage data are added, and the conversion logic between internal and external response types is adjusted accordingly. Supporting scripts and Dockerfiles for documentation generation are also introduced.

Suggested reviewers

  • JojiiOfficial
  • agourlay

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ca9aed1 and 6c7c323.

📒 Files selected for processing (3)
  • lib/api/src/grpc/ops.rs (2 hunks)
  • src/tonic/api/points_api.rs (3 hunks)
  • src/tonic/api/query_common.rs (16 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/tonic/api/points_api.rs
  • src/tonic/api/query_common.rs
  • lib/api/src/grpc/ops.rs
⏰ Context from checks skipped due to timeout of 90000ms (13)
  • GitHub Check: lint
  • GitHub Check: Basic TLS/HTTPS tests
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: test-low-resources
  • GitHub Check: test-consistency
  • GitHub Check: test-snapshot-operations-s3-minio
  • GitHub Check: integration-tests
  • GitHub Check: test-consensus-compose
  • GitHub Check: integration-tests-consensus
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: storage-compat-test
  • GitHub Check: rust-tests (windows-latest)
✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@n0x29a n0x29a requested review from JojiiOfficial and agourlay May 13, 2025 13:52
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
lib/api/src/grpc/qdrant.rs (1)

6567-6570: Consider enhancing documentation for the Usage struct

While you've added a separator comment, I recommend expanding the documentation to explain:

  1. The purpose of the Usage wrapper
  2. Its relationship with HardwareUsage
  3. The extensibility benefits it provides
-/// ---
-///
-/// ## ----------- Measurements collector ----------
+/// Usage wraps various usage metrics for API operations.
+/// 
+/// Currently contains hardware usage metrics, but designed to be
+/// extensible for additional usage metrics in the future without
+/// breaking API compatibility.
+/// 
+/// ## ----------- Measurements collector ----------
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 830ff21 and 5f74d43.

📒 Files selected for processing (1)
  • lib/api/src/grpc/qdrant.rs (19 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (13)
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: test-snapshot-operations-s3-minio
  • GitHub Check: lint
  • GitHub Check: test-low-resources
  • GitHub Check: test-consistency
  • GitHub Check: Basic TLS/HTTPS tests
  • GitHub Check: integration-tests-consensus
  • GitHub Check: integration-tests
  • GitHub Check: test-consensus-compose
  • GitHub Check: storage-compat-test
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests (windows-latest)
🔇 Additional comments (3)
lib/api/src/grpc/qdrant.rs (3)

5884-5884: Change approved: Updated usage field to use the new Usage wrapper

This change replaces HardwareUsage with the new Usage wrapper type, which aligns with the PR objective of making hardware usage information more extensible.


5998-5998: Consistent type update across response structures

The change from Option<HardwareUsage> to Option<Usage> has been consistently applied across all relevant response structures, ensuring a uniform API design.

Also applies to: 6010-6010, 6022-6022, 6034-6034, 6053-6053, 6065-6065, 6077-6077, 6092-6092, 6128-6128, 6140-6140, 6152-6152, 6164-6164, 6176-6176, 6188-6188, 6220-6220, 6232-6232, 9459-9459


6567-6576: New Usage wrapper introduces good extensibility

The new Usage struct with its hardware field effectively encapsulates the existing hardware usage metrics while providing a foundation for adding more usage-related metrics in the future without breaking API compatibility.

This follows good API design practices by:

  1. Maintaining backward compatibility (clients can access hardware usage through the nested field)
  2. Creating a dedicated namespace for all usage-related metrics
  3. Allowing for future extensions without structural changes to the response messages

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
src/actix/helpers.rs (3)

36-38: Consider simplifying Option construction

The implementation correctly wraps the hardware usage in the new Usage struct. For slightly better readability, consider using Some(Usage { hardware: hardware_usage }) instead of Option::from(Usage { hardware: hardware_usage }).

-        usage: Option::from(Usage {
+        usage: Some(Usage {
             hardware: hardware_usage,
         }),

55-57: Consider simplifying Option construction

The implementation correctly wraps the hardware usage in the new Usage struct. For slightly better readability, consider using Some(Usage { hardware: hardware_usage }) instead of Option::from(Usage { hardware: hardware_usage }).

-            usage: Option::from(Usage {
+            usage: Some(Usage {
                 hardware: hardware_usage,
             }),

77-79: Consider simplifying Option construction

The implementation correctly wraps the hardware usage in the new Usage struct. For slightly better readability, consider using Some(Usage { hardware: hardware_usage }) instead of Option::from(Usage { hardware: hardware_usage }).

-        usage: Option::from(Usage {
+        usage: Some(Usage {
             hardware: hardware_usage,
         }),
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5f74d43 and 0b061a3.

📒 Files selected for processing (7)
  • docs/redoc/master/openapi.json (1 hunks)
  • lib/api/src/grpc/proto/points_internal_service.proto (3 hunks)
  • lib/api/src/grpc/qdrant.rs (21 hunks)
  • lib/api/src/rest/models.rs (1 hunks)
  • lib/collection/src/shards/remote_shard.rs (8 hunks)
  • src/actix/helpers.rs (4 hunks)
  • src/schema_generator.rs (2 hunks)
✅ Files skipped from review due to trivial changes (3)
  • src/schema_generator.rs
  • lib/api/src/grpc/proto/points_internal_service.proto
  • docs/redoc/master/openapi.json
🚧 Files skipped from review as they are similar to previous changes (2)
  • lib/collection/src/shards/remote_shard.rs
  • lib/api/src/grpc/qdrant.rs
⏰ Context from checks skipped due to timeout of 90000ms (13)
  • GitHub Check: test-low-resources
  • GitHub Check: test-consistency
  • GitHub Check: integration-tests
  • GitHub Check: test-consensus-compose
  • GitHub Check: Basic TLS/HTTPS tests
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: integration-tests-consensus
  • GitHub Check: test-snapshot-operations-s3-minio
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: storage-compat-test
  • GitHub Check: lint
🔇 Additional comments (2)
lib/api/src/rest/models.rs (1)

48-58: Well-designed abstraction layer for extensibility

The refactoring introduces a new Usage wrapper struct that will make adding future usage metrics beyond hardware easier. This follows good design practices by creating a dedicated namespace for all usage-related information while maintaining compatibility with existing code.

The implementation correctly preserves all necessary attributes (serialization, schemas, anonymization) and properly handles null values with skip_serializing_if.

src/actix/helpers.rs (1)

8-8: Import updated correctly

The import statement has been properly updated to include the new Usage struct alongside the existing HardwareUsage.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
docs/grpc/docs.md (1)

174-174: Fix indentation for the new TOC entry.

The added - [Usage](#qdrant-Usage) line uses 4 spaces of indent but the style in this file expects 2 spaces for nested list items. Please reduce the indent to 2 spaces for consistency.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

174-174: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cdfcdb2 and 0f11ffb.

📒 Files selected for processing (1)
  • docs/grpc/docs.md (19 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
docs/grpc/docs.md

174-174: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


3057-3057: Link fragments should be valid
null

(MD051, link-fragments)

⏰ Context from checks skipped due to timeout of 90000ms (13)
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: test-snapshot-operations-s3-minio
  • GitHub Check: test-low-resources
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: storage-compat-test
  • GitHub Check: test-consistency
  • GitHub Check: Basic TLS/HTTPS tests
  • GitHub Check: test-consensus-compose
  • GitHub Check: integration-tests-consensus
  • GitHub Check: integration-tests
  • GitHub Check: lint
🔇 Additional comments (14)
docs/grpc/docs.md (14)

2390-2390: Docs correctly include the new usage field.

The CountResponse now documents the optional Usage usage field as intended.


2588-2588: Docs correctly include the new usage field.

The DiscoverBatchResponse now documents the optional Usage usage field as intended.


2649-2649: Docs correctly include the new usage field.

The DiscoverResponse now documents the optional Usage usage field as intended.


3011-3011: Docs correctly include the new usage field.

The GetResponse now documents the optional Usage usage field as intended.


3539-3539: Docs correctly include the new usage field.

The PointsOperationResponse now documents the optional Usage usage field as intended.


3860-3860: Docs correctly include the new usage field.

The QueryBatchResponse now documents the optional Usage usage field as intended.


3877-3877: Docs correctly include the new usage field.

The QueryGroupsResponse now documents the optional Usage usage field as intended.


3954-3954: Docs correctly include the new usage field.

The QueryResponse now documents the optional Usage usage field as intended.


4284-4284: Docs correctly include the new usage field.

The ScrollResponse now documents the optional Usage usage field as intended.


4319-4319: Docs correctly include the new usage field.

The SearchBatchResponse now documents the optional Usage usage field as intended.


4336-4336: Docs correctly include the new usage field.

The SearchGroupsResponse now documents the optional Usage usage field as intended.


4371-4371: Docs correctly include the new usage field.

The SearchMatrixOffsetsResponse now documents the optional Usage usage field as intended.


4420-4420: Docs correctly include the new usage field.

The SearchMatrixPairsResponse now documents the optional Usage usage field as intended.


4535-4535: Docs correctly include the new usage field.

The SearchResponse now documents the optional Usage usage field as intended.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
docs/grpc/docs.md (2)

174-174: Fix list indentation for new Table of Contents entry.
The new - [Usage](#qdrant-Usage) item is indented by 4 spaces but should use 2 spaces to align with existing list entries (MD007).

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

174-174: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


3050-3058: Enhance the Usage section with context.
You've correctly added the <a name="qdrant-Usage"></a> anchor and the ### Usage header with the hardware field table. Consider adding a brief introductory line explaining that Usage is a wrapper for hardware metrics and why this change was made, to improve clarity for readers.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0f11ffb and b6aff10.

📒 Files selected for processing (1)
  • docs/grpc/docs.md (19 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
docs/grpc/docs.md

174-174: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: integration-tests-consensus
  • GitHub Check: rust-tests (windows-latest)
🔇 Additional comments (1)
docs/grpc/docs.md (1)

2390-2390: Verify all response message tables include the new usage field consistently.
You've added | usage | [Usage](#qdrant-Usage) | optional | | to a number of response message tables. Please confirm that this field is present in every relevant response type in the gRPC docs, and update any that may be missing (e.g., in pagination or batch endpoints).

Also applies to: 2588-2588, 2649-2649, 3011-3011, 3539-3539, 3861-3861, 3878-3878, 3956-3956, 4024-4024, 4041-4041, 4141-4141, 4285-4285, 4320-4320, 4337-4337, 4372-4372, 4421-4421, 4536-4536

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
docs/grpc/docs.md (3)

269-269: Fix list indentation in Table of Contents

The new - [Usage](#qdrant-Usage) entry is indented by 4 spaces, but top-level items use 2 spaces. Please adjust to match MD007:

-    - [Usage](#qdrant-Usage)
+  - [Usage](#qdrant-Usage)
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

269-269: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


4754-4765: Refactor ASCII separators in Usage section

The decorative ASCII lines under ### Usage break consistency with the rest of the docs and may confuse readers:

---------------------------------------------
----------- Measurements collector ----------
---------------------------------------------

Consider replacing them with a brief descriptive sentence or removing them entirely, for example:

- ---------------------------------------------
- ----------- Measurements collector ----------
- ---------------------------------------------
+ Wrapper message for usage metrics.
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

4758-4758: Heading style
Expected: atx; Actual: setext

(MD003, heading-style)


4758-4758: Use ATX-style heading instead of setext

The line ----------- Measurements collector ---------- uses setext underlining, triggering MD003. Either remove this line or convert it to plain text or an ATX-style heading:

- ----------- Measurements collector ----------
+ _Measurements collector for hardware usage metrics._
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

4758-4758: Heading style
Expected: atx; Actual: setext

(MD003, heading-style)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3ffd10b and 8f01508.

📒 Files selected for processing (1)
  • docs/grpc/docs.md (19 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
docs/grpc/docs.md

269-269: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


4758-4758: Heading style
Expected: atx; Actual: setext

(MD003, heading-style)

⏰ Context from checks skipped due to timeout of 90000ms (13)
  • GitHub Check: Basic TLS/HTTPS tests
  • GitHub Check: test-snapshot-operations-s3-minio
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: test-low-resources
  • GitHub Check: test-consistency
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: test-consensus-compose
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: integration-tests-consensus
  • GitHub Check: integration-tests
  • GitHub Check: storage-compat-test
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: lint
🔇 Additional comments (1)
docs/grpc/docs.md (1)

2391-2391: Approve addition of usage wrapper fields

The addition of

| usage | [Usage](#qdrant-Usage) | optional |

to the response message tables is consistent and correctly reflects the refactor to wrap HardwareUsage in the new Usage message.

Also applies to: 2590-2590, 2650-2650, 3012-3012, 3531-3531, 3853-3853, 3870-3870, 3947-3947, 4016-4016, 4033-4033, 4133-4133, 4278-4278, 4313-4313, 4330-4330, 4364-4364, 4414-4414, 4529-4529

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (9)
tools/generate_grpc_docs.dockerfile (3)

7-10: Consolidate file removals to reduce layers.
You can use brace expansion or a single pattern to eliminate the four .proto files in one go:

-RUN rm collections_internal_service.proto \
-       points_internal_service.proto \
-       shard_snapshots_service.proto \
-       raft_service.proto
+RUN rm {collections_internal_service,points_internal_service,shard_snapshots_service,raft_service}.proto

12-17: Streamline filtering with sed.
Chaining grep -v is less efficient and harder to maintain. A single sed command can drop all unwanted lines:

-RUN grep -v 'collections_internal_service.proto' qdrant.proto \
-    | grep -v 'points_internal_service.proto' \
-    | grep -v 'shard_snapshots_service.proto' \
-    | grep -v 'raft_service.proto' \
-    > qdrant.proto.tmp \
-    && mv qdrant.proto.tmp qdrant.proto
+RUN sed -E '/collections_internal_service.proto|points_internal_service.proto|shard_snapshots_service.proto|raft_service.proto/d' \
+    qdrant.proto > qdrant.proto.tmp \
+    && mv qdrant.proto.tmp qdrant.proto

19-20: Declare /out as a volume.
Since --doc_out=/out writes generated docs there, explicitly declaring a volume ensures the directory is exposed for mounting and persists output:

VOLUME /out
lib/segment/src/index/field_index/geo_index/mod.rs (1)

748-756: RamMmap builder dereferences Box – avoid double mmap load

finalize() creates an Mmap index, immediately moves it into open_mmap, and then calls index.load() again:

let GeoMapIndex::Mmap(index) = builder.finalize()? else {};
let mut index = GeoMapIndex::Immutable(ImmutableGeoMapIndex::open_mmap(*index));
index.load()?;                // <-- second pass

MmapGeoMapIndex::new already calls load, so the extra index.load() duplicates work (opens JSON, mmap-s, counts ones, etc.).
Not critical but wastes a few milliseconds in tests.

Optional improvement:

-let mut index = GeoMapIndex::Immutable(ImmutableGeoMapIndex::open_mmap(*index));
-index.load()?;
-Ok(index)
+let index = GeoMapIndex::Immutable(ImmutableGeoMapIndex::open_mmap(*index));
+Ok(index)          // ImmutableGeoMapIndex::open_mmap already has all metadata;
+                   // callers can invoke `load()` only if they really need it.
lib/segment/src/index/field_index/geo_index/immutable_geo_index.rs (5)

51-53: Fix minor typo in inline comment

There is an extra space in the word “storage” (s torage).
Correcting small typos keeps the codebase professional and easier to scan.

-    // Backing s torage, source of state, persists deletions
+    // Backing storage, source of state, persists deletions

55-58: Derive Debug (and possibly Clone) for Storage enum

Storage is frequently pattern-matched and occasionally logged.
Adding #[derive(Debug)] (and optionally Clone) makes troubleshooting easier without affecting behaviour.

-enum Storage {
+#[derive(Debug)]
+enum Storage {
     RocksDb(DatabaseColumnScheduledDeleteWrapper),
     Mmap(Box<MmapGeoMapIndex>),
 }

83-93: Avoid an extra heap allocation in open_mmap

open_mmap receives a value, allocates a Box, and then immediately stores it in Storage::Mmap.
Passing a Box<MmapGeoMapIndex> from the caller avoids this extra allocation and keeps ownership semantics clear.

-    pub fn open_mmap(index: MmapGeoMapIndex) -> Self {
+    pub fn open_mmap(index: Box<MmapGeoMapIndex>) -> Self {
 ...
-            storage: Storage::Mmap(Box::new(index)),
+            storage: Storage::Mmap(index),

164-167: Doc-comment wording / typo

“Loads in-memory index fmmap RocksDB storage” contains a typo and still mentions RocksDB.
Recommend rephrasing for clarity.

-    /// Loads in-memory index fmmap RocksDB storage.
+    /// Loads the in-memory index from the mmap-backed storage.

183-203: Pre-allocate AHashSet capacity & preserve ordering

When constructing each AHashSet, you already know the number of IDs
(ids_end - ids_start). Supplying this as the initial capacity avoids
rehashing.
Additionally, relying on the order of index.points_map for later
binary searches assumes it is sorted; consider enforcing this with
debug_assert!(is_sorted_by_key(…)) to guard against future regressions.

-                (
-                    hash,
-                    index.points_map_ids[ids_start as usize..ids_end as usize]
-                        .iter()
-                        .copied()
-                        // Filter deleted points
-                        .filter(|id| !index.deleted.get(*id as usize).unwrap_or_default())
-                        .collect(),
-                )
+                {
+                    let slice = &index.points_map_ids[ids_start as usize..ids_end as usize];
+                    let mut set =
+                        AHashSet::with_capacity(slice.len());
+                    set.extend(
+                        slice.iter()
+                            .copied()
+                            .filter(|id| !index.deleted.get(*id as usize).unwrap_or_default()),
+                    );
+                    (hash, set)
+                }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8f01508 and c846793.

📒 Files selected for processing (16)
  • clippy.toml (1 hunks)
  • lib/segment/src/index/field_index/full_text_index/immutable_text_index.rs (1 hunks)
  • lib/segment/src/index/field_index/geo_index/immutable_geo_index.rs (3 hunks)
  • lib/segment/src/index/field_index/geo_index/mmap_geo_index.rs (2 hunks)
  • lib/segment/src/index/field_index/geo_index/mod.rs (21 hunks)
  • lib/segment/src/index/field_index/map_index/immutable_map_index.rs (1 hunks)
  • lib/segment/src/index/field_index/map_index/mod.rs (3 hunks)
  • lib/segment/src/index/field_index/map_index/mutable_map_index.rs (1 hunks)
  • lib/segment/src/index/field_index/mmap_point_to_values.rs (1 hunks)
  • lib/segment/src/index/field_index/numeric_index/immutable_numeric_index.rs (1 hunks)
  • lib/segment/src/index/field_index/numeric_index/mod.rs (4 hunks)
  • lib/segment/src/index/field_index/numeric_index/mutable_numeric_index.rs (1 hunks)
  • lib/segment/src/index/field_index/numeric_index/tests.rs (1 hunks)
  • lib/segment/src/types.rs (1 hunks)
  • tools/generate_grpc_docs.dockerfile (1 hunks)
  • tools/generate_grpc_docs.dockerfile.sh (1 hunks)
✅ Files skipped from review due to trivial changes (10)
  • lib/segment/src/index/field_index/numeric_index/mod.rs
  • lib/segment/src/index/field_index/map_index/mutable_map_index.rs
  • clippy.toml
  • lib/segment/src/index/field_index/full_text_index/immutable_text_index.rs
  • lib/segment/src/index/field_index/map_index/mod.rs
  • lib/segment/src/index/field_index/numeric_index/immutable_numeric_index.rs
  • lib/segment/src/index/field_index/map_index/immutable_map_index.rs
  • tools/generate_grpc_docs.dockerfile.sh
  • lib/segment/src/index/field_index/numeric_index/mutable_numeric_index.rs
  • lib/segment/src/types.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
lib/segment/src/index/field_index/geo_index/mod.rs (3)
lib/segment/src/index/field_index/geo_index/immutable_geo_index.rs (8)
  • open_rocksdb (64-78)
  • load (98-103)
  • open_mmap (83-93)
  • points_count (284-286)
  • points_values_count (288-290)
  • max_values_per_point (292-294)
  • values_count (313-317)
  • from (30-41)
lib/segment/src/index/field_index/geo_index/mmap_geo_index.rs (11)
  • is_on_disk (395-397)
  • load (197-245)
  • new (86-195)
  • std (112-112)
  • std (124-124)
  • std (150-150)
  • std (174-174)
  • points_count (374-378)
  • points_values_count (380-382)
  • max_values_per_point (384-386)
  • values_count (268-274)
lib/segment/src/index/field_index/geo_index/mutable_geo_index.rs (7)
  • load (73-116)
  • new (54-63)
  • new (198-208)
  • points_count (153-155)
  • points_values_count (157-159)
  • max_values_per_point (161-163)
  • values_count (221-226)
🪛 Hadolint (2.12.0)
tools/generate_grpc_docs.dockerfile

[error] 5-5: Use COPY instead of ADD for files and folders

(DL3020)

⏰ Context from checks skipped due to timeout of 90000ms (13)
  • GitHub Check: test-snapshot-operations-s3-minio
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: Basic TLS/HTTPS tests
  • GitHub Check: test-low-resources
  • GitHub Check: test-consistency
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: test-consensus-compose
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: integration-tests
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: lint
  • GitHub Check: integration-tests-consensus
  • GitHub Check: storage-compat-test
🔇 Additional comments (7)
lib/segment/src/index/field_index/numeric_index/tests.rs (1)

349-353: Method renaming looks good.

The changes correctly update the test to use the renamed db_wrapper() method instead of the previous get_db_wrapper() method, maintaining consistent API naming throughout the codebase. This aligns with the method renaming pattern mentioned in the summary.

tools/generate_grpc_docs.dockerfile (1)

3-3: WORKDIR is correctly set.
Pointing to /protos aligns with where the proto files will reside and be processed.

lib/segment/src/index/field_index/mmap_point_to_values.rs (1)

381-391: iter() returns Option that is never None – consider flattening the API

get_values(idx) always yields Some(iterator) for every idx < self.len() because:

  1. iter() only iterates within 0..self.len().
  2. get_range() returns a zero‐length range (start = 0, count = 0) for points that have no data, so get_values() wraps an empty iterator in Some(..).

This means the Option layer is redundant and forces every caller to unwrap (or pattern-match) needlessly.

-pub fn iter(&self) -> impl Iterator<
-    Item = (
-        PointOffsetType,
-        Option<impl Iterator<Item = T::Referenced<'_>> + '_>,
-    ),
-> + Clone {
-    (0..self.len() as PointOffsetType).map(|idx| (idx, self.get_values(idx)))
+pub fn iter(
+    &self,
+) -> impl Iterator<
+       Item = (
+           PointOffsetType,
+           impl Iterator<Item = T::Referenced<'_>> + '_,
+       ),
+   > + Clone {
+    (0..self.len() as PointOffsetType)
+        .map(|idx| (idx, self.get_values(idx).into_iter().flatten()))
 }

Benefits
• Call-sites become simpler (for (id, vals) in ptv.iter() { … }).
• No extra Option allocation/branching at runtime.
• Makes the intent (“each point always has an iterator, though it might be empty”) explicit.

[ suggest_optional_refactor ]

lib/segment/src/index/field_index/geo_index/mmap_geo_index.rs (2)

29-44: Counts / PointKeyValue marked Copy – validate mmap-safety

Deriving Copy is convenient but keep in mind that these structs are backed by memory-mapped pages.
Copying many tiny structs is fine, yet stale copies can be held after the underlying mmap is mutated (e.g. when rebuilding the index in-place). If you later introduce in-place updates, silent duplication might become a foot-gun.

Nothing to change right now, just double-check that:

  1. Counts and PointKeyValue are indeed immutable after the index is built.
  2. No code path mutates a copied instance expecting the change to propagate back to the mmap.

If both are guaranteed, the Copy derive is safe.


63-74: Visibility relaxed to pub(super) – keep encapsulation in mind

Making the internal mmap slices publicly accessible inside the module hierarchy is required for the new immutable-from-mmap flow – looks good.

Please consider adding a brief module-level comment documenting which sub-modules are allowed to mutate these fields to avoid accidental write access from unrelated code.

lib/segment/src/index/field_index/geo_index/mod.rs (1)

57-66: new_mmap now returns either Mmap or Immutable – audit downstream call-sites

Switching the returned enum variant depending on is_on_disk is sensible, but it is a silent breaking change for any code that previously assumed GeoMapIndex::Mmap when calling new_mmap(..., /*is_on_disk=*/false).

Please grep the codebase (services, tests, benchmarks) for pattern new_mmap( and check that every caller:

  1. Treats both Immutable and Mmap variants correctly, or
  2. Passes is_on_disk = true when it truly needs the on-disk behaviour.

A small helper like is_mmap_variant() or updating the doc-comment of new_mmap would make the new contract explicit.

lib/segment/src/index/field_index/geo_index/immutable_geo_index.rs (1)

95-104: Clean dispatch in load

Nice abstraction – delegating to load_rocksdb / load_mmap keeps the public
API compact while hiding backend details. 👍

@n0x29a n0x29a requested a review from JojiiOfficial May 14, 2025 10:59
@JojiiOfficial
Copy link
Contributor

Could you please rebase this PR onto the latest dev?

@agourlay
Copy link
Member

agourlay commented May 14, 2025

It seems your rebase went wrong, the dev commits are on top of your branch instead of being the start of the branch.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
tools/generate_grpc_docs.dockerfile (2)

1-1: Pin base image version for reproducibility.

Using latest can introduce breaking changes when the image is updated. Please pin to a stable tag or digest, for example:

-FROM pseudomuto/protoc-gen-doc:latest
+FROM pseudomuto/protoc-gen-doc:v1.0.0

5-5: Use COPY instead of ADD for clarity and best practice.

ADD has extra behaviors (e.g., auto-extracting archives) that aren't required here. Replace with:

-ADD lib/api/src/grpc/proto/ .
+COPY lib/api/src/grpc/proto/ .
🧰 Tools
🪛 Hadolint (2.12.0)

[error] 5-5: Use COPY instead of ADD for files and folders

(DL3020)

🧹 Nitpick comments (1)
tools/generate_grpc_docs.dockerfile (1)

7-17: Merge RUN steps and optimize proto filtering.

Combining the removal and filtering steps into a single RUN reduces image layers and simplifies the commands. Also, use grep -E (or grep -Ev) to exclude multiple patterns in one go:

-RUN rm collections_internal_service.proto \
-       points_internal_service.proto \
-       shard_snapshots_service.proto \
-       raft_service.proto
-
-RUN grep -v 'collections_internal_service.proto' qdrant.proto \
-    | grep -v 'points_internal_service.proto' \
-    | grep -v 'shard_snapshots_service.proto' \
-    | grep -v 'raft_service.proto' \
-    > qdrant.proto.tmp \
-    && mv qdrant.proto.tmp qdrant.proto
+RUN rm collections_internal_service.proto points_internal_service.proto \
+       shard_snapshots_service.proto raft_service.proto && \
+    grep -Ev 'collections_internal_service\.proto|points_internal_service\.proto|shard_snapshots_service\.proto|raft_service\.proto' \
+        qdrant.proto > qdrant.proto.tmp && \
+    mv qdrant.proto.tmp qdrant.proto
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 19661dc and 86470d1.

📒 Files selected for processing (5)
  • lib/api/src/rest/models.rs (1 hunks)
  • src/actix/helpers.rs (4 hunks)
  • src/tonic/api/points_internal_api.rs (4 hunks)
  • tools/generate_grpc_docs.dockerfile (1 hunks)
  • tools/generate_grpc_docs.dockerfile.sh (1 hunks)
✅ Files skipped from review due to trivial changes (2)
  • tools/generate_grpc_docs.dockerfile.sh
  • src/actix/helpers.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/tonic/api/points_internal_api.rs
  • lib/api/src/rest/models.rs
🧰 Additional context used
🪛 Hadolint (2.12.0)
tools/generate_grpc_docs.dockerfile

[error] 5-5: Use COPY instead of ADD for files and folders

(DL3020)

⏰ Context from checks skipped due to timeout of 90000ms (11)
  • GitHub Check: storage-compat-test
  • GitHub Check: lint
  • GitHub Check: test-low-resources
  • GitHub Check: test-consistency
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: integration-tests
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: test-snapshot-operations-s3-minio
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: integration-tests-consensus

@n0x29a n0x29a requested a review from generall May 14, 2025 16:09
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
lib/api/src/grpc/ops.rs (1)

25-31: Consider simplifying the is_empty implementation

The current implementation uses pattern matching to destructure the Usage struct, but since there's only one field being accessed, you could simplify this to:

impl Usage {
    pub fn is_empty(&self) -> bool {
-        let Usage { hardware } = self;
-
-        hardware.is_none()
+        self.hardware.is_none()
    }
}

This is just a minor stylistic suggestion that makes the code slightly more concise while preserving functionality.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cdb4c07 and ca9aed1.

📒 Files selected for processing (3)
  • lib/api/src/grpc/ops.rs (2 hunks)
  • src/tonic/api/points_api.rs (3 hunks)
  • src/tonic/api/query_common.rs (16 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/tonic/api/points_api.rs
  • src/tonic/api/query_common.rs
⏰ Context from checks skipped due to timeout of 90000ms (12)
  • GitHub Check: Basic TLS/HTTPS tests
  • GitHub Check: test-snapshot-operations-s3-minio
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: test-low-resources
  • GitHub Check: test-consistency
  • GitHub Check: test-consensus-compose
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: integration-tests
  • GitHub Check: integration-tests-consensus
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: lint
🔇 Additional comments (1)
lib/api/src/grpc/ops.rs (1)

33-36: LGTM! Good utility function implementation

The usage_or_none function is well-implemented, converting an Option<HardwareUsage> into an Option<Usage> while correctly preserving the semantics that an empty value should be represented as None. This aligns with the PR objective of making hardware usage more extendable.

@n0x29a n0x29a merged commit 73e7b17 into dev May 14, 2025
17 checks passed
@n0x29a n0x29a deleted the field-usage branch May 14, 2025 19:01
generall pushed a commit that referenced this pull request May 22, 2025
---------

Co-authored-by: jojii <jojii@gmx.net>
@coderabbitai coderabbitai bot mentioned this pull request May 28, 2025
9 tasks
@coderabbitai coderabbitai bot mentioned this pull request Jun 17, 2025
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants