Mytherin
Collaborator

When reading long chains of RleBpDecoder data, NextCounts can be called many times. Instead of checking whether enough buffer is available before every byte, we can do a single batched check when we are certain that enough bytes are available. The varint being read is at most 5 bytes long (anything longer does not fit into a uint32_t), so one call to check_available covers the whole read. This improves performance by ~10% when reading data with many small batches.
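
The idea can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual DuckDB code: `ByteReader`, `DecodeVarint`, `ReadVarint` and `MAX_VARINT32_BYTES` are hypothetical names, and the fallback to per-byte checks near the end of the buffer is an assumption about how the tail would be handled.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical minimal buffer view (an assumption for illustration,
// not DuckDB's ByteBuffer).
struct ByteReader {
    const uint8_t *ptr;
    size_t len;

    // Throws if fewer than n bytes remain.
    void CheckAvailable(size_t n) const {
        if (n > len) {
            throw std::runtime_error("Out of buffer");
        }
    }
    // Reads one byte without any bounds check.
    uint8_t ReadUnchecked() {
        uint8_t b = *ptr;
        ptr++;
        len--;
        return b;
    }
    // Reads one byte, checking bounds first.
    uint8_t ReadChecked() {
        CheckAvailable(1);
        return ReadUnchecked();
    }
};

// A varint that fits in uint32_t needs at most ceil(32 / 7) = 5 bytes.
static constexpr size_t MAX_VARINT32_BYTES = 5;

template <bool CHECKED>
uint32_t DecodeVarint(ByteReader &reader) {
    uint32_t result = 0;
    for (size_t i = 0; i < MAX_VARINT32_BYTES; i++) {
        uint8_t byte = CHECKED ? reader.ReadChecked() : reader.ReadUnchecked();
        // The low 7 bits carry payload; the high bit marks continuation.
        result |= static_cast<uint32_t>(byte & 0x7F) << (7 * i);
        if ((byte & 0x80) == 0) {
            return result;
        }
    }
    throw std::runtime_error("Varint does not fit in uint32_t");
}

uint32_t ReadVarint(ByteReader &reader) {
    if (reader.len >= MAX_VARINT32_BYTES) {
        // Fast path: this single check covers every byte the loop can read.
        return DecodeVarint<false>(reader);
    }
    // Slow path (assumed): near the end of the buffer, check every byte.
    return DecodeVarint<true>(reader);
}
```

Because the loop never reads more than 5 bytes, a single size comparison up front is sufficient on the fast path; only the last few bytes of a buffer pay for per-byte checks in this sketch.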

@Mytherin Mytherin merged commit 1584c22 into duckdb:main Feb 11, 2025
47 checks passed
Antonov548 added a commit to Antonov548/duckdb-r that referenced this pull request Feb 27, 2025
Parquet reader: batch check if buffer is available in RLEBpDecoder (duckdb/duckdb#16185)
Unvendor ICU (duckdb/duckdb#16176)
krlmlr pushed a commit to duckdb/duckdb-r that referenced this pull request Mar 5, 2025
Parquet reader: batch check if buffer is available in RLEBpDecoder (duckdb/duckdb#16185)
Unvendor ICU (duckdb/duckdb#16176)
@Mytherin Mytherin deleted the uncheckednextcounts branch April 2, 2025 09:25
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 15, 2025
Parquet reader: batch check if buffer is available in RLEBpDecoder (duckdb/duckdb#16185)
Unvendor ICU (duckdb/duckdb#16176)
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 17, 2025
Parquet reader: batch check if buffer is available in RLEBpDecoder (duckdb/duckdb#16185)
Unvendor ICU (duckdb/duckdb#16176)
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 18, 2025
Parquet reader: batch check if buffer is available in RLEBpDecoder (duckdb/duckdb#16185)
Unvendor ICU (duckdb/duckdb#16176)