Skip to content

Conversation

Mytherin
Copy link
Collaborator

Fixes #16836

This regression was caused by #15737

Effectively that change introduced an optimization for dictionary-compressed data where the validity data would be read directly from the dictionary - instead of being read from the separate validity data. This is possible because dictionary-compressed data stores validity data (at offset 0 in the dictionary).

However, when doing an UPDATE, we would not rewrite the dictionary data when changing only the validity - which would then cause the dictionary column to no longer contain the new (updated) validity data. The fix here is to also rewrite the main column data when updating the validity data.

Note that we currently do this for all primitive types - we could limit this to compression methods (like dictionary) that need this - but we can leave that for a future PR. (CC @Tishj).

…lues when reading files created by older versions of DuckDB
@duckdb-draftbot duckdb-draftbot marked this pull request as draft March 26, 2025 19:11
@Mytherin Mytherin marked this pull request as ready for review March 26, 2025 19:11
@Mytherin
Copy link
Collaborator Author

On second thought - I've reworked the PR to revert #15737 instead. While we can fix this on the writing side, when reading files created by older versions of DuckDB we would still produce incorrect results without reverting that PR. In effect - we can no longer rely on dictionary compression correctly containing NULL values due to updates potentially invalidating them - we will need to wait for DICT_FSST for that optimization.

@Mytherin Mytherin merged commit 2130141 into duckdb:v1.2-histrionicus Mar 27, 2025
49 of 50 checks passed
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request Apr 8, 2025
Fix duckdb/duckdb#16836: rewrite main column data in case of an update that only modifies the validity (duckdb/duckdb#16851)
@Mytherin Mytherin deleted the issue16836 branch June 12, 2025 15:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant