Fix broken shard key mapping storage, keep shard key numbers #5838
Our shard key mapping storage for user defined sharding appears to be broken.
We support two types of custom shard keys, a string and a number. The shard key numbers are not persisted properly and are converted into shard key strings on load. This breaks user defined sharding because the strings and numbers are not interchangeable.
This changes the format in which we persist shard keys to fix this very problem. The new format uses better typing, so that numbers and strings can properly be distinguished.
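To illustrate why the typing matters, here is a minimal sketch in plain Rust. The `ShardKey` enum, the helper functions, and the `keyword:`/`number:` tag syntax are all hypothetical and only serve the illustration; they are not the actual on-disk format or the types used in this PR.

```rust
/// Hypothetical stand-in for a user-defined shard key.
#[derive(Debug, Clone, PartialEq)]
enum ShardKey {
    Keyword(String),
    Number(u64),
}

/// Untyped encoding: every key is written as a bare string, so the type is dropped.
fn encode_untyped(key: &ShardKey) -> String {
    match key {
        ShardKey::Keyword(s) => s.clone(),
        ShardKey::Number(n) => n.to_string(),
    }
}

/// Loading an untyped value can only ever produce a string key.
fn decode_untyped(raw: &str) -> ShardKey {
    ShardKey::Keyword(raw.to_string())
}

/// Typed encoding: a tag records whether the key was a string or a number.
/// The tag syntax is made up for this example.
fn encode_typed(key: &ShardKey) -> String {
    match key {
        ShardKey::Keyword(s) => format!("keyword:{s}"),
        ShardKey::Number(n) => format!("number:{n}"),
    }
}

/// Returns `None` if the tag is unknown or the number does not parse.
fn decode_typed(raw: &str) -> Option<ShardKey> {
    if let Some(rest) = raw.strip_prefix("keyword:") {
        Some(ShardKey::Keyword(rest.to_string()))
    } else if let Some(rest) = raw.strip_prefix("number:") {
        rest.parse::<u64>().ok().map(ShardKey::Number)
    } else {
        None
    }
}

fn main() {
    let key = ShardKey::Number(3);

    // Untyped round trip: the number comes back as the string "3".
    let lossy = decode_untyped(&encode_untyped(&key));
    assert_eq!(lossy, ShardKey::Keyword("3".to_string()));
    assert_ne!(lossy, key);

    // Typed round trip: the number survives as a number.
    let typed = decode_typed(&encode_typed(&key));
    assert_eq!(typed, Some(key));
}
```

The untyped round trip is the failure mode described above: a lookup by the number `3` no longer matches the loaded key `"3"`.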
More specifically, we now have two formats:

- Old format: the original format from when shard key mappings were implemented.
- New format: a more robust format that properly persists shard key numbers.

Both formats can be stored and loaded from disk. The same `shard_key_mapping.json` file is used. Currently this PR still prefers to store in the old format, but the new format is chosen if any shard key numbers are used. This ensures backwards compatibility.

In Qdrant 1.14 we change to always prefer the new format. In a later version the old format can be entirely removed if we so desire.
We used a `SaveOnDisk<ShardKeyMapping>` structure to handle persisting. I've replaced that with a specialized `SaveOnDiskShardKeyMappingWrapper` structure that acts in the same way, but takes care of loading/persisting in both formats.

Example
For example, if I now create a collection and add shard keys `"1"` and `"2"`, we still persist in the old format. If I now add the shard key number `3`, it automagically changes into the new format.

Again, this PR supports both formats interchangeably.
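The format selection in this walkthrough can be sketched roughly as follows. This is a simplified illustration of the rule as described (old format unless any shard key number is present); `ShardKey`, `StorageFormat`, and `choose_format` are hypothetical names and not the code in this PR.

```rust
/// Hypothetical stand-in for a user-defined shard key.
#[derive(Debug, Clone, PartialEq)]
enum ShardKey {
    Keyword(String),
    Number(u64),
}

/// Which on-disk format would be written (names are assumptions).
#[derive(Debug, PartialEq)]
enum StorageFormat {
    Old,
    New,
}

/// Prefer the old format for backwards compatibility, but switch to the
/// new format as soon as any shard key number is present.
fn choose_format(keys: &[ShardKey]) -> StorageFormat {
    if keys.iter().any(|k| matches!(k, ShardKey::Number(_))) {
        StorageFormat::New
    } else {
        StorageFormat::Old
    }
}

fn main() {
    // Shard keys "1" and "2" are strings, so the old format is kept.
    let mut keys = vec![
        ShardKey::Keyword("1".to_string()),
        ShardKey::Keyword("2".to_string()),
    ];
    assert_eq!(choose_format(&keys), StorageFormat::Old);

    // Adding the shard key number 3 switches to the new format.
    keys.push(ShardKey::Number(3));
    assert_eq!(choose_format(&keys), StorageFormat::New);
}
```

Under this rule, collections that only ever use string shard keys keep writing a file that older versions can read.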
Tasks
All Submissions:
- Did you create your branch from `dev`?

New Feature Submissions:
- Did you run the `cargo +nightly fmt --all` command prior to submission?
- Did you run the `cargo clippy --all --all-features` command?