Storage: make ROW_GROUP_SIZE configurable
#14406
Merged
This PR makes the `ROW_GROUP_SIZE` of DuckDB's storage format configurable using the `ROW_GROUP_SIZE` parameter, which can be passed in when attaching a database. If none is specified, the default row group size (`122880`) is chosen, which is also the current row group size.

The parameter sets the target row group size used when ingesting data into a table. Note that it only influences new data written to a database; existing row groups are not rewritten. A database can also be attached with a different row group size than the one it was previously written with. The row group size does not need to be fixed within the same database file, even for row groups within the same table, so a single table may hold row groups written with different sizes.
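As a rough illustration of the behavior described above, the following sketch attaches a database with a non-default row group size, writes a table, then re-attaches with the default and appends more rows to the same table. The file name `file.db`, the alias `db1`, the table `tbl`, and the value `204800` are illustrative assumptions, not taken from the PR.

```sql
-- Attach (creating the file if needed) with a non-default target row group size.
-- 'file.db', the alias db1, and 204800 are hypothetical values for illustration.
ATTACH 'file.db' AS db1 (ROW_GROUP_SIZE 204800);

-- New data is written using the 204800-row target.
CREATE TABLE db1.tbl AS SELECT i FROM range(1000000) t(i);
DETACH db1;

-- Re-attach without the parameter: the default of 122880 applies to new writes.
ATTACH 'file.db' AS db1;

-- Rows appended now are written with the default target, so the same table
-- ends up holding row groups written under both settings.
INSERT INTO db1.tbl SELECT i FROM range(1000000) t(i);
```

Row groups written during the first attach are left as they are; only the appended rows use the new target size.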
Backwards Compatibility
Previous versions of DuckDB can read files with varying row group sizes. However, they do not support updating or deleting rows in tables with row group sizes larger than `122880`, as the version manager and update manager are hard-coded to support only up to `122880` rows.