@generall generall commented Jul 10, 2025

  • indexing_threshold: 20k -> 10k
  • default max_segment_size: 200mb -> 256mb per CPU core
  • Default number of segments: 1 -> 0.5 per CPU
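For a sense of scale on the first bullet: the OpenAPI description quoted later in this thread says the threshold is measured in kilobytes and that 1 kB corresponds to one vector of size 256. A minimal sketch of that arithmetic, assuming 4-byte float components (an assumption consistent with that note, not something stated in this PR); the function name is hypothetical:

```python
BYTES_PER_KB = 1024
BYTES_PER_COMPONENT = 4  # assumption: 32-bit float vector components


def vectors_at_threshold(threshold_kb: int, dim: int) -> int:
    """Approximate number of dim-sized vectors whose raw size reaches threshold_kb."""
    bytes_per_vector = dim * BYTES_PER_COMPONENT
    return (threshold_kb * BYTES_PER_KB) // bytes_per_vector


# Old and new thresholds for 256-dimensional vectors.
old_count = vectors_at_threshold(20_000, 256)
new_count = vectors_at_threshold(10_000, 256)
print(old_count, new_count)
```

Under these assumptions, higher-dimensional vectors reach the threshold with proportionally fewer points.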

@generall generall requested a review from timvisee July 10, 2025 08:49
@timvisee timvisee left a comment

Untested, but these adjustments look sane to me.

@timvisee timvisee changed the title from "adjsut default values for better performance" to "Adjust default values for better performance" Jul 10, 2025

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2bd218f and c03b7b3.

📒 Files selected for processing (1)
  • docs/redoc/master/openapi.json (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: test-snapshot-operations-s3-minio
  • GitHub Check: Basic TLS/HTTPS tests
  • GitHub Check: test-low-resources
  • GitHub Check: test-consistency
  • GitHub Check: test-consensus-compose
  • GitHub Check: integration-tests
  • GitHub Check: integration-tests-consensus
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: lint
  • GitHub Check: storage-compat-test

Comment on lines +7250 to 7252
"description": "Maximum size (in kilobytes) of vectors allowed for plain index, exceeding this threshold will enable vector indexing\n\nDefault value is 10,000, based on experiments and observations.\n\nTo disable vector indexing, set to `0`.\n\nNote: 1kB = 1 vector of size 256.",
"default": null,
"type": "integer",
⚠️ Potential issue

default field contradicts the new description

The description states “Default value is 10,000”, yet the "default" property is still null. Tools that generate clients or perform validation will treat null as the canonical default, not the textual explanation.

-            "default": null,
+            "default": 10000,

Align the JSON schema with the documented behavior to avoid silent misconfigurations.
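One way to catch this kind of drift mechanically is to compare the number mentioned in the description with the `default` field. The following is a minimal sketch, not part of the project's tooling; the helper names and the "Default value is N" pattern are assumptions based on the hunk above:

```python
import json
import re


def documented_default(description: str):
    """Extract N from a 'Default value is N' sentence, or None if absent."""
    match = re.search(r"Default value is (\d[\d,]*)", description)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


def default_mismatch(prop: dict) -> bool:
    """True when the description names a default that differs from 'default'."""
    documented = documented_default(prop.get("description", ""))
    return documented is not None and prop.get("default") != documented


# Shortened copy of the property under review (description abbreviated).
snippet = json.loads(
    '{"description": "Default value is 10,000, based on experiments and observations.",'
    ' "default": null, "type": "integer"}'
)
print(default_mismatch(snippet))
```

Whether `null` is intentional here (for example, to mean "use the server-side default") is not settled in this thread, so a check like this would only flag the property for a human to review.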

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
  "description": "Maximum size (in kilobytes) of vectors allowed for plain index, exceeding this threshold will enable vector indexing\n\nDefault value is 10,000, based on experiments and observations.\n\nTo disable vector indexing, set to `0`.\n\nNote: 1kB = 1 vector of size 256.",
- "default": null,
+ "default": 10000,
  "type": "integer",
🤖 Prompt for AI Agents
In docs/redoc/master/openapi.json around lines 7250 to 7252, the "default" field
is set to null, which contradicts the description stating the default value is
10,000. Update the "default" property to 10000 to match the documented default
value and ensure consistency between the schema and its description.

@agourlay

Are we expecting an improvement on the CI benchmark after this PR?

@generall

> Are we expecting an improvement on the CI benchmark after this PR?

No, the CI benchmarks use explicitly configured values.

@coszio coszio left a comment

I think the PR description is misleading. The default number of segments becomes half of the available CPUs, so that there are 2 CPUs per segment.
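As a rough illustration of that reading, the default segment count could be derived from the CPU count as below. This is a sketch only: the function name is hypothetical, and the integer rounding and the minimum of one segment are assumptions, not taken from the PR's code.

```python
def default_segment_count(num_cpus: int) -> int:
    """Half the available CPUs, i.e. roughly 2 CPUs per segment."""
    # Assumption: round down, but never go below one segment.
    return max(1, num_cpus // 2)


for cpus in (1, 2, 8, 16):
    print(cpus, default_segment_count(cpus))
```

With 8 CPUs this gives 4 segments where the previous default of 1 per CPU would give 8.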

@generall generall merged commit ee9b88c into dev Jul 10, 2025
18 checks passed
@generall generall deleted the better-defaults branch July 10, 2025 14:50
generall added a commit that referenced this pull request Jul 17, 2025
* adjsut default values for better performance

* upd openapi