taniabogatsch
Contributor

Overview

#15092 made it possible for ART PRIMARY/UNIQUE indexes to hold up to two row IDs per (unique) leaf.

Currently, the conflict manager (and ART::VerifyLeaf) still assume that there is never more than one row ID per conflict hit. This assumption no longer holds, so in some cases we now have to register conflict hits for up to two row IDs. Later in the execution, we need to determine which of the two possible row IDs is visible to the transaction (and the conflict manager). Related issue: https://github.com/duckdblabs/duckdb-internal/issues/4924.

As far as I can tell, this mostly kept working because we scan row IDs into a vector, which preserves order, so newer row IDs (more likely the ones visible to the current transaction) overwrote older row IDs.
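To illustrate the overwrite behavior described above, here is a minimal sketch, not the actual DuckDB code. `ScanHit` and `RegisterSingleHits` are hypothetical names, and the sketch assumes that scan results arrive as (key index, row ID) pairs in scan order, with the newer row ID last.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

using row_t = int64_t;

// Hypothetical stand-in for a scan result: (key index, row ID) in scan order.
using ScanHit = std::pair<std::size_t, row_t>;

// Single-slot bookkeeping: each conflicting key keeps exactly one row ID.
// If a leaf yields two row IDs for the same key, the later one silently
// replaces the earlier one.
std::unordered_map<std::size_t, row_t> RegisterSingleHits(const std::vector<ScanHit> &hits) {
    std::unordered_map<std::size_t, row_t> conflicts;
    for (const auto &hit : hits) {
        assert(hit.second >= 0); // this sketch only uses non-negative row IDs
        conflicts[hit.first] = hit.second; // last write wins
    }
    return conflicts;
}
```

With hits `{0, 10}, {0, 42}, {1, 7}`, key 0 ends up mapped to 42 and row ID 10 is lost. That only gives the right answer if the newer row ID happens to be the visible one.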

Changes in this PR

  • Exposed CanFetch on DataTable, LocalStorage, and RowGroupCollection to determine whether a row ID is visible to a transaction or present in the local storage.
  • The conflict manager now has the capacity to register a secondary hit for a conflict.
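How the two changes fit together can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `ConflictHit`, `CanFetchFn`, and `ResolveVisible` are hypothetical names, and the visibility check is modeled as a plain callback rather than the real CanFetch signature.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

using row_t = int64_t;

// Hypothetical conflict record: a unique leaf can yield one or two row IDs.
struct ConflictHit {
    row_t primary;
    row_t secondary;    // only meaningful if has_secondary is true
    bool has_secondary;
};

// Hypothetical stand-in for a CanFetch-style check: true if the row ID is
// visible to the current transaction.
using CanFetchFn = std::function<bool(row_t)>;

// Writes the visible row ID to `result` and returns true, or returns false
// if neither registered row ID is visible.
bool ResolveVisible(const ConflictHit &hit, const CanFetchFn &can_fetch, row_t &result) {
    assert(static_cast<bool>(can_fetch)); // the callback must be set
    if (can_fetch(hit.primary)) {
        result = hit.primary;
        return true;
    }
    if (hit.has_secondary && can_fetch(hit.secondary)) {
        result = hit.secondary;
        return true;
    }
    return false;
}
```

The point of the sketch is that the decision no longer depends on scan order: both row IDs are kept, and visibility is checked explicitly for each.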

Also, I've refactored basically the entire conflict manager. That refactoring includes removing the ManagedSelection and a lot of code paths. I think the number of lines only went up because I've added the CanFetch stuff, a lot of comments, and some formatting. 😅

This PR is a minor follow-up to #18015; next up is turning the row IDs into an unordered set instead of a vector.

@taniabogatsch taniabogatsch requested a review from Tishj July 9, 2025 12:11
@Mytherin
Collaborator

Looks good from my side - @Tishj can you do a pass?

Contributor

@Tishj Tishj left a comment

LGTM

@Mytherin Mytherin merged commit 66a80b2 into duckdb:main Jul 10, 2025
53 checks passed
@taniabogatsch taniabogatsch deleted the conflict-manager branch July 10, 2025 11:48
github-actions bot pushed a commit to duckdb/duckdb-r that referenced this pull request Jul 24, 2025
Two-rowID-leaf support in the conflict manager and general refactoring (duckdb/duckdb#18194)
github-actions bot added a commit to duckdb/duckdb-r that referenced this pull request Jul 24, 2025
Two-rowID-leaf support in the conflict manager and general refactoring (duckdb/duckdb#18194)

Co-authored-by: krlmlr <krlmlr@users.noreply.github.com>