Skip to content

Conversation

joamaki
Copy link
Contributor

@joamaki joamaki commented Nov 28, 2023

Allow use of indexers that are not fixed-size or which don't terminate the key by checking that the key
length matches exactly with the used search prefix. For unique indexes this is as simple as checking
the key lengths in the iterator and for multi-index we need to extract the length of the secondary key
and compare against that.

E.g. now struct{A, B string} with multi-index on B will be indexed internally with concatenated key B + A + <len>, where <len> is 16-bit unsigned big endian int (the KeySet limits key sizes to 16-bits). A Get against a multi-index will
return an iterator which will check that the node's key matches with the query, for example Get("foo") against
{"foo", "bar"} and {"foob", "ar"} (assuming no terminating \0) will return an iterator which has been
seeked to key "foobar3". The iterator will extract length (3) and compare against search key length (len("foo"))
and since length matches the object is returned. Next the iterator would advance to "foobar4" which would be
skipped since 4 != len("foo").

In effect this allows statedb to work with indexers that do not terminate the keys, but at the performance cost of
potentially traversing over non-matching tree nodes. The actual length extraction and comparison adds no measurable
time to the iteration (the Get benchmark that was added showed no difference).

This PR additionally extends the tests to cover queries against multi-indexes with overlapping prefixes.

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Nov 28, 2023
@joamaki joamaki force-pushed the pr/joamaki/statedb-unique-iterator branch 2 times, most recently from 7915ee8 to c114322 Compare November 28, 2023 13:06
@joamaki joamaki added the release-note/misc This PR makes changes that have no direct user impact. label Nov 28, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Nov 28, 2023
Copy link
Member

@bimmlerd bimmlerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approach LGTM, nits

@joamaki joamaki force-pushed the pr/joamaki/statedb-unique-iterator branch 2 times, most recently from 2d32b2f to 3aa0b4d Compare November 28, 2023 13:59
@joamaki joamaki requested a review from bimmlerd November 28, 2023 14:00
@joamaki joamaki force-pushed the pr/joamaki/statedb-unique-iterator branch from 3aa0b4d to 8c4dff4 Compare November 28, 2023 14:16
Copy link
Member

@bimmlerd bimmlerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the intention to then generally drop termination etc from the indexing functions or is this purely a safeguard against mistakes in those functions?

@joamaki
Copy link
Contributor Author

joamaki commented Nov 29, 2023

Is the intention to then generally drop termination etc from the indexing functions or is this purely a safeguard against mistakes in those functions?

This is really just a safeguard. We want the termination when possible as that makes the traversal of the index efficient (we won't encounter unmatching nodes).

@joamaki joamaki marked this pull request as ready for review November 29, 2023 10:15
@joamaki joamaki requested a review from a team as a code owner November 29, 2023 10:15
@joamaki
Copy link
Contributor Author

joamaki commented Nov 29, 2023

/test

Signed-off-by: Jussi Maki <jussi@isovalent.com>
To implement special casing for querying unique indexes we need
to know whether the index is unique or not. To avoid extra lookups
store the uniqueness information in the index radix tree.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
Previously statedb required the indexers to either terminate or return fixed-size
keys in order to exactly match with just prefix searching the trees. This is a
fairly large footgun.

This commit adds checks to the iterators to make sure the key matches exactly and is
not a longer key that shares the prefix. For indexers that don't terminate the key,
this may result in wasting bit of time by traversing non-matching nodes.

This was tested by "breaking" index.String by removing termination and validating
that the regression test passes.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
For checking that the extra key length checks in iteration do not
add any significant cost.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
Add a test case for the database seek and iteration part of the
query API handler.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
@joamaki joamaki force-pushed the pr/joamaki/statedb-unique-iterator branch from 8c4dff4 to da7bd80 Compare November 29, 2023 14:10
@joamaki
Copy link
Contributor Author

joamaki commented Nov 29, 2023

/test

@joamaki joamaki enabled auto-merge November 29, 2023 14:10
@joamaki joamaki added this pull request to the merge queue Nov 30, 2023
Merged via the queue into cilium:main with commit 262e1ae Nov 30, 2023
@joamaki joamaki deleted the pr/joamaki/statedb-unique-iterator branch November 30, 2023 10:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note/misc This PR makes changes that have no direct user impact.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants