Skip to content

Conversation

Mytherin
Copy link
Collaborator

@Mytherin Mytherin commented Jul 24, 2025

Currently during a checkpoint we rewrite all metadata associated with all tables. For larger tables, the metadata can be relatively large. This causes checkpoints on larger databases to always take a base amount of time - regardless of what has actually changed.

The WAL mitigates this to some extend by buffering changes - but rewriting metadata is still wasteful:

  • It causes checkpoints to take an increasing amount of time based on database size, regardless of change size
  • It causes checkpoints to write a lot of data to the database, even if little changed

This PR partially mitigates this issue by re-using table metadata for tables that are fully unchanged. During checkpointing, we check if any changes have been made to a table. If no changes have been made, the entire checkpoint for that table is short-circuited and we write a pointer to the previous metadata.

Note that this is not entirely free. We still need to read parts of the previous metadata in order to figure out which blocks are used by the previous metadata. This is so that we can correctly label these blocks as still in-use, so that they do not get accidentally garbage collected.

Performance

Consider a database storing TPC-H SF100. The biggest table (lineitem) has 600M rows.

CALL dbgen(sf=100);

Now insert a single row into a tiny table (nation) and checkpoint. We measure two times: the time of the CHECKPOINT command, and the full runtime of opening the database, running the insert + checkpoint, and closing the database.

INSERT INTO nation VALUES (26, 'New Country', 0, '');
CHECKPOINT;
Operation v1.3.2 New
Checkpoint 0.53s 0.07s
Full 0.64s 0.1s

Note that there is still a reasonably-sized cost here. The main remaining cost is that we are not serializing the exact blocks the row groups refer to, meaning we need to do a partial deserialization to figure this out. If we were to serialize the exact blocks of each row group we would be able to reduce the checkpoint time by another ~2.5X based on rudimentary experiments

@Mytherin Mytherin merged commit df7f4fa into duckdb:main Jul 24, 2025
55 checks passed
Mytherin added a commit that referenced this pull request Jul 24, 2025
…18395)

Follow-up from #18390

This PR implements metadata re-use at the row group level, so if we are
e.g. appending to a large table we no longer rewrite the metadata of
unchanged row-groups, and instead refer to the existing metadata
on-disk. In addition, this PR also performs a few fixes where we would
eagerly load columns unnecessarily.



### Performance

Consider a database storing TPC-H SF100. The biggest table (lineitem)
has 600M rows.

```sql
CALL dbgen(sf=100);
```


Now insert a single row into the largest table (`lineitem`) and
checkpoint. We measure two times: the time of the `CHECKPOINT` command,
and the full runtime of opening the database, running the insert +
checkpoint, and closing the database.

```sql
INSERT INTO lineitem VALUES (600000001, 0, 0, 0, 0, 0, 0, 0, '', '', DATE '2000-01-01', DATE '2000-01-01', DATE '2000-01-01', '', '', '');
CHECKPOINT;
```

| Operation  | v1.3.2 |  New  |
|------------|--------|-------|
| Checkpoint | 0.5s   | 0.11s |
| Full       | 0.63s  | 0.13s |
krlmlr added a commit to krlmlr/duckdb-r that referenced this pull request Jul 26, 2025
Re-use table metadata when table is not altered during checkpoint (duckdb/duckdb#18390)
krlmlr added a commit to krlmlr/duckdb-r that referenced this pull request Jul 26, 2025
Re-use table metadata when table is not altered during checkpoint (duckdb/duckdb#18390)
krlmlr added a commit to krlmlr/duckdb-r that referenced this pull request Jul 26, 2025
Re-use table metadata when table is not altered during checkpoint (duckdb/duckdb#18390)
@Mytherin Mytherin deleted the metadatareuse branch September 3, 2025 07:06
Tmonster added a commit to Tmonster/duckdb-r that referenced this pull request Sep 8, 2025
bump iceberg to latest main
[chore] Fix amalgamation build in progress_bar (duckdb/duckdb#18910)
Bump inet & aws (duckdb/duckdb#18899)
fix: refine query ETA display and Kalman filter stability (duckdb/duckdb#18880)
Bump httpfs to v1.4-andium branch (duckdb/duckdb#18898)
Encryption now encoded as a bit, centralizing in set/getter (duckdb/duckdb#18897)
Add callback for when an extension fails to load, and also log this (duckdb/duckdb#18894)
Keep base data scan state alive in ColumnData::Update call (duckdb/duckdb#18893)
Expected errors 2053 (duckdb/duckdb#18892)
fixing auto-specifying ciphers and remove double storage (duckdb/duckdb#18891)
Add rowsort to upsert_default.test (duckdb/duckdb#18890)
bump aws and iceberg (duckdb/duckdb#18889)
[chore] Bump config test/configs/compressed_in_memory.json to new format (duckdb/duckdb#18888)
[Dev] Fix footgun in `string_t::SetSizeAndFinalize` (duckdb/duckdb#18885)
Json: no reinterpret<size_t*> (duckdb/duckdb#18886)
[C API] Result schema of prepared statements (duckdb/duckdb#18779)
Add `COPY (FORMAT BLOB)` to Andium too :^) (duckdb/duckdb#18884)
Avoid automatically checkpointing if the database instance has been invalidated (duckdb/duckdb#18881)
Update spatial+vss+sqlsmith in preparation for v1.4 (duckdb/duckdb#18882)
Internal duckdb/duckdb#5796: Window Progress (duckdb/duckdb#18860)
[Test] Small fixes to concurrent attach/detach test (duckdb/duckdb#18862)
Update ducdkb iceberg hash (duckdb/duckdb#18873)
Storage fuzzing + several fixes (duckdb/duckdb#18876)
Bump mbedtls to v3.6.4 (duckdb/duckdb#18871)
[minor] Incompatible DB error message: add newline (duckdb/duckdb#18861)
Bump & remove patches for delta, avro, excel, encodings, fts  (duckdb/duckdb#18869)
Add a FORCE_DEBUG flag to force `-DDEBUG`, similar to FORCE_ASSERT (duckdb/duckdb#18872)
Expected errors 2053 (duckdb/duckdb#18864)
update duckdb azure extension ref for 1.4.0 (duckdb/duckdb#18868)
Hold segment lock during GetColumnSegmentInfo (duckdb/duckdb#18859)
Centralize attached database paths in a DatabaseFilePathManager which is shared across databases created through the same DBInstanceCache (duckdb/duckdb#18857)
Add more encryption modes CTR and CBC (duckdb/duckdb#18619)
Bump Ducklake (duckdb/duckdb#18825)
No more `wal_encryption` flag (duckdb/duckdb#18851)
Fix `NULL` path for `json_each`/`json_tree` (duckdb/duckdb#18852)
Make ATTACH OR REPLACE atomic, keep list of used databases in MetaTransaction (duckdb/duckdb#18850)
WAL <> DB File Match Fixes (duckdb/duckdb#18849)
Add test_env to unit tester (duckdb/duckdb#18847)
Merge ossivalis into main (duckdb/duckdb#18844)
Bump MySQL/Postgres/SQLite (duckdb/duckdb#18848)
Add OnBeginExtensionLoad callback (duckdb/duckdb#18842)
Ignore null verification for statistics on structs (duckdb/duckdb#18813)
Document storage version flag in CLI + minor rendering fix (duckdb/duckdb#18841)
Add the `VARIANT` LogicalType (duckdb/duckdb#18609)
Avoid printing '99 hours', given in most cases that means estimate is… (duckdb/duckdb#18839)
Don't notify Py pkg when override git describe is set (duckdb/duckdb#18843)
Add support for reading/writing native parquet geometry types (duckdb/duckdb#18832)
Fix/run function in transaction (duckdb/duckdb#18741)
Avoid expensive checkpoints and write amplification by appending row groups, and limiting vacuum operations for the last number of row groups (duckdb/duckdb#18829)
fix: silence warnings about signed/unsigned conversions. (duckdb/duckdb#18835)
fix: sanitize input for enable_logging (duckdb/duckdb#18830)
Ensure a WAL file matches the DB file and checkpoint iteration (duckdb/duckdb#18823)
Fix: Preserve database configuration flags for tab completion in DuckDB shell (duckdb/duckdb#18482)
Extensions.yml: Pass down save_cache to inner workflows (duckdb/duckdb#18828)
Fix format-fix runs on Linux (duckdb/duckdb#18827)
Re-add accidentally removed check if copy_from is supported (duckdb/duckdb#18824)
[Fix] Bug in fixed-size buffer when throwing out-of-memory (duckdb/duckdb#18769)
For BC reasons - keep VARINT as alias for BIGNUM (duckdb/duckdb#18821)
Add callback to get a list of copy options, use this to provide suggestions and to erase options from import that are only used during exporting (duckdb/duckdb#18812)
fix: Add COLLATE NOCASE support to strpos function (duckdb/duckdb#18819)
[CI] install  libcurl4-openssl-dev with apt-get (duckdb/duckdb#18811)
Provide failing file name in Parquet reader error messages (duckdb/duckdb#18814)
Test runner: Expand '{UUID}' into a random UUID (duckdb/duckdb#18809)
Expected errors 2053 (duckdb/duckdb#18810)
Add support for non-aggregate window functions (duckdb/duckdb#18788)
Fix some unindented interactions between `EMPTY_RESULT_PULLUP` and `MATERIALIZED` CTEs (duckdb/duckdb#18805)
Typed macro parameters (duckdb/duckdb#18786)
Internal duckdb/duckdb#3273: Hashed Sort Callbacks (duckdb/duckdb#18796)
[chore] Fixup tidy-check on src/logging/log_manager.cpp by passing const & (duckdb/duckdb#18801)
Fixup progress_bar: avoid converting doubles into int32_t unchecked (duckdb/duckdb#18800)
Issue duckdb/duckdb#18767: Ignore Timestamp Offsets (duckdb/duckdb#18794)
bump httpfs so it includes curl option (duckdb/duckdb#18691)
Improve autocomplete suggestions (duckdb/duckdb#18773)
Remove everything python-package related (duckdb/duckdb#18789)
Support expressions as COPY file target (duckdb/duckdb#18795)
fix: coalesce query progress updates to reduce terminal writes (duckdb/duckdb#18672)
Task Scheduler: track exact task count, and re-signal on dequeue failure if there are tasks left (duckdb/duckdb#18792)
Internal duckdb/duckdb#3273: Parallel Window Masks (duckdb/duckdb#18731)
fix: improve speed of GetValue() for STRUCT type (duckdb/duckdb#18785)
Add `memory_limit` parameter to `benchmark_runner`/`test_runner.py` (duckdb/duckdb#18790)
Treat ENABLE_EXTENSION_AUTOINSTALL as the BOOL that it is (duckdb/duckdb#18778)
Move row id logic to separate RowIdColumnData class instead of inlining it into the RowGroup (duckdb/duckdb#18780)
Improve error messages for merge / vector reference (duckdb/duckdb#18777)
Use microsecond resolution for printing the current timestamp (duckdb/duckdb#18776)
Add `file_size_bytes` (de-)serialization (duckdb/duckdb#18775)
Propagate `DUCKDB_*_VERSION` in extensions and tests (duckdb/duckdb#18774)
[Test Fix] Forward output to file (duckdb/duckdb#18772)
[CI] Adjust test configs post logger PR (duckdb/duckdb#18771)
Revert "Use 1-based indexing for SQL-based JSON array extraction" (duckdb/duckdb#18758)
Make `duckdb_log` return a TIMESTAMP_TZ (duckdb/duckdb#18768)
Fix Path Typo in Extension's CMake Warning Message (duckdb/duckdb#18766)
Fix index resolution when querying table with index via view (duckdb/duckdb#18319)
Fix radix partitioning with more than 10 bits (duckdb/duckdb#18761)
Add support for auto-globbing within a directory: if no matches are found for a specific path, we retry with `/**/*.[ext]` appended (duckdb/duckdb#18760)
Refactor read_blob and read_text to use MultiFileFunction. (duckdb/duckdb#18706)
Add missing expected errors to the test cases (next chunk) (duckdb/duckdb#18753)
Minor logging fixes and more benchmarking (duckdb/duckdb#18755)
Extensions.yml should also check converted_to_draft (duckdb/duckdb#18754)
[Profiling] Add Profiling to Write Function (duckdb/duckdb#18724)
Fixing lazy polars execution on query result (duckdb/duckdb#18749)
Remove separate WAL encryption flag (duckdb/duckdb#18750)
Add leak suppressions to nightly runs (duckdb/duckdb#18748)
Append using a SQL query, instead of directly appending to a base table, and support user-provided queries through the QueryAppender (duckdb/duckdb#18738)
removed placeholder client directories for node and jdbc, its been > 1 yr (duckdb/duckdb#18757)
Add missing expected errors to the test cases (duckdb/duckdb#18746)
Add OS X notarization for DuckDB CLI and libduckdb.dylib (duckdb/duckdb#18747)
Use correct type for pushing collations in subqueries (duckdb/duckdb#18744)
Merge ossivalis into main (duckdb/duckdb#18719)
Secrets: if serialization_type is not specified, assume it's a key value secret (duckdb/duckdb#18743)
[C API] Function to set a copy callback for bind data (duckdb/duckdb#18739)
fix timetravel for default tables (duckdb/duckdb#18240)
[unittest] SkipLoggingSameError() to make unittester report one failure per case (duckdb/duckdb#18270)
Use 1-based indexing for SQL-based JSON array extraction (duckdb/duckdb#18735)
Add (CSV) file logger (duckdb/duckdb#17692)
feat: enhance .tables command with schema disambiguation and filtering (duckdb/duckdb#18641)
Internal duckdb/duckdb#5669: Loop Join Thresholds (duckdb/duckdb#18733)
Fix PIVOT in multiple statements (duckdb/duckdb#18729)
Minor fixes for other catalogs - mostly checking `IsDuckTable()` for unsupported operations (duckdb/duckdb#18720)
Added support for blob<->uuid conversions (duckdb/duckdb#18027)
#Fix 18558: add row_group scan fast path (duckdb/duckdb#18686)
Improved grammar generation script (duckdb/duckdb#18716)
Correctly throw an error when too few columns are supplied in MERGE INTO INSERT (duckdb/duckdb#18715)
[Profiling] Add Profiling to Read Function (duckdb/duckdb#18661)
Fix issue with materialized CTE optimization in flatten_dependent_join (duckdb/duckdb#18714)
Add Option to Allocate Using an Arena in `string_t` (duckdb/duckdb#17992)
Internal duckdb/duckdb#3273: Hashed Sort States (duckdb/duckdb#18690)
Python-style positional/named arguments for macro's (duckdb/duckdb#18684)
[Fix] Correctly handle table and index chunks in WAL replay buffering (duckdb/duckdb#18700)
Make ART construction iterative via ARTBuilder (duckdb/duckdb#18702)
Correctly handle collations for IN (subquery) (duckdb/duckdb#18698)
Hold row group lock for entire call of MoveToCollection (duckdb/duckdb#18694)
Expected errors 2053 (duckdb/duckdb#18695)
Issue duckdb/duckdb#18457: DateTrunc Simplification Warnings (duckdb/duckdb#18687)
[Python SQLLogicTest] Add `test/sql/pragma/profiling/test_profiling_all.test` to the SKIPPED_TESTS set (duckdb/duckdb#18689)
Make sure parse errors are wrapped in ErrorData (duckdb/duckdb#18682)
Internal duckdb/duckdb#5366: Window State Arguments  (duckdb/duckdb#18676)
Expected errors 2053 (duckdb/duckdb#14213)
Add `date_trunc()` simplification rules (duckdb/duckdb#18457)
Fix the issue where delta_for isn't used in bitpacking when for is unavailable (duckdb/duckdb#18616)
fix error message related to wrong memory unit (duckdb/duckdb#18671)
Grab lock and double-check that column is not loaded in MoveToCollection (duckdb/duckdb#18677)
Correctly allocate uncompressed string data in ZSTD for many giant strings (duckdb/duckdb#18678)
Internal duckdb/duckdb#5662: IEJoin Test Plans (duckdb/duckdb#18680)
[ Python SQLLogic Tester ] Add `MERGE_INTO` to `statement.type` enum in `result.py` (duckdb/duckdb#18675)
Internal duckdb/duckdb#5366: Window Interrupt Arguments (duckdb/duckdb#18651)
Correctly set weights in reservoir sample when switch to slow sampling (duckdb/duckdb#18563)
[Dev] Add script to create patch from changes in an extension repository (duckdb/duckdb#18620)
Python test runner: Fix hash comparison error output (duckdb/duckdb#18626)
[CI] skip building encodings extension in InvokeCI (duckdb/duckdb#18655)
CLI: Make ETA more of an estimate, and support large_row_rendering for footers (duckdb/duckdb#18656)
Merge ossivalis into main (duckdb/duckdb#18644)
Python test runner: Fix result check for `COPY ... RETURN_STATS` queries (duckdb/duckdb#18625)
Add 1.4 release codename (duckdb/duckdb#18652)
Change arrow() to export record batch reader (duckdb/duckdb#18642)
bump spatial (on main) (duckdb/duckdb#18197)
bump avro to v1.4 (duckdb/duckdb#18434)
Make more configs into generic settings (duckdb/duckdb#18592)
Add "Hash Zero" verification CI run (duckdb/duckdb#18623)
feat: add ETA to progress bar in DuckDB CLI (duckdb/duckdb#18575)
wrap httplib ::max() call in `WIN_32` check (duckdb/duckdb#18590)
[ART] ART::Erase refactoring (duckdb/duckdb#18595)
[CSV Sniffer] Fix type detection issue with union and empty columns (duckdb/duckdb#18606)
Add Field IDS to multi file reader for positional deletes (duckdb/duckdb#18617)
Re-add `hugeint` to `__internal_compress_string` (duckdb/duckdb#18622)
Adjust filter pushdown to latest polars release (duckdb/duckdb#18624)
parquet/parquet_multi_file_info.cpp: fix move from stack (duckdb/duckdb#18634)
Issue duckdb/duckdb#18631: Streaming Windowed Quantile (duckdb/duckdb#18636)
Fix serialization backwards compatability for varargs functions (duckdb/duckdb#18596)
[Profiling] Add client context into more read functions (duckdb/duckdb#18514)
[CI] Don't zip and upload Code Coverage tests results when Code Coverage got cancelled (duckdb/duckdb#18607)
[Test] Fix test case and a benchmark (duckdb/duckdb#18610)
Update README.md (duckdb/duckdb#18614)
correctly setting log transaction id in ThreadContext (duckdb/duckdb#18536)
[Fix] Hidden test failure in test_struct_update.test (duckdb/duckdb#18598)
Increment storage version to enable `DICT_FSST` in benchmark file (duckdb/duckdb#18588)
fix hidden merge conflict (duckdb/duckdb#18589)
Adds a function for updating and adding values in a struct (duckdb/duckdb#15533)
Pushdown filters on coalesced outer join keys compared for equality under the join condition (duckdb/duckdb#18169)
fix: libduckdb.so missing soversion (duckdb/duckdb#18305)
String dictionary hash cache (duckdb/duckdb#18580)
Force `LIST`/`ARRAY` child vectors on a Parquet single page (duckdb/duckdb#18578)
fix: use thousands separator and decimal for row counts in`duckbox` output format (duckdb/duckdb#18564)
Flip left/right delim join based on cardinalities (duckdb/duckdb#18552)
[Fix] Adjust shrink threshold back to original count > SHRINK_THRESHOLD (duckdb/duckdb#18582)
[CSV Sniffer] Fixing bug of not properly setting skipped rows from sniffer (duckdb/duckdb#18555)
fix: add formatting to explain row counts (duckdb/duckdb#18566)
Delete FUNDING.json
Update FUNDING.json
Create FUNDING.json
[Indexes] Buffer-managed indexes part 3: segment handle for Node48 and Node256 (duckdb/duckdb#18567)
Rename the Varint type to Bignum (duckdb/duckdb#18547)
Add compile option standalone-debug for clang (duckdb/duckdb#17433)
Fixing compilation with -std=cpp23 (duckdb/duckdb#18557)
[easy] [no-op] Minor optimization on iterator lookup (duckdb/duckdb#15349)
optimize/parquet: generate movable types for parquet (duckdb/duckdb#18510)
Check if `heap_block_ids` is empty before getting start/end when destroying chunks in `TupleDataCollection` (duckdb/duckdb#18556)
Implement special-case `VARCHAR` to `JSON[]` casts and vice versa (duckdb/duckdb#18541)
[ART] Node::Free refactoring (duckdb/duckdb#18544)
[Fix] Follow-up PR to only delete unique row IDs (duckdb/duckdb#18545)
Restore missing `test/configs/small_block_size.json` file (duckdb/duckdb#18507)
Unittester: Add the `--sort-style` parameter that allows a fallback comparison where results are sorted according to a given sort-style (duckdb/duckdb#18542)
Allow overriding openssl version for FIPS compliance (duckdb/duckdb#18499)
fix: improve handling variant nulls and nested types (duckdb/duckdb#18538)
Add support for explicit clean-up routine in test config, and exit multi-statement execution when an error is encountered (duckdb/duckdb#18539)
Use global index, not local id when creating filters in `MultiFileColumnMapper` (duckdb/duckdb#18537)
Add `StatementVerifier` for `EXPLAIN` (duckdb/duckdb#18529)
Add CAPI to retrieve client context for table functions (duckdb/duckdb#18520)
fix: support both field orders for variant struct (duckdb/duckdb#18532)
[Varint] Negation, Subtraction and Over/under-flow checking (duckdb/duckdb#18477)
ALP test: skip TPC-DS 67 - it is not consistent with floating point numbers (duckdb/duckdb#18528)
Consistently detect JSON schema indepent of number of threads (duckdb/duckdb#18522)
Internal duckdb/duckdb#16560: Numeric TRUNC Precision (duckdb/duckdb#18511)
Dynamically determine dictionary size limit in Parquet writer (if unset) (duckdb/duckdb#18356)
Fix incorrect character encoding in GetLastErrorAsString on Windows (duckdb/duckdb#18431)
Fix: Write the salt together with the HT offset when determining the value for key comparison (duckdb/duckdb#18374)
When tracking evicted_data_per_tag, track actual size on disk after temp file compression (duckdb/duckdb#18521)
Adding WITH ORDINALITY to DuckDB (duckdb/duckdb#16581)
ParserException for Pragma with named parameters (duckdb/duckdb#18506)
Temporarily excluding `Build Pyodide wheel` for Python 3.11 because it fails to build `WASM` wheels  (duckdb/duckdb#18508)
Remove `immediate_transaction_mode` from DB config options (duckdb/duckdb#18516)
Allow expressions to be used in ATTACH / COPY options (duckdb/duckdb#18515)
Fix several bugs/fuzzer issues (duckdb/duckdb#18503)
Fix: Remove overly strict assertion on empty string value (duckdb/duckdb#18504)
Change UNICODE to UTF8 (duckdb/duckdb#17586)
Merge ossivalis (duckdb/duckdb#18502)
fix: add missing space in AttachInfo::ToString() (duckdb/duckdb#18500)
julia: config improvements (duckdb/duckdb#17585)
[Profiling] Add client context into read functions (duckdb/duckdb#18438)
Fix accidental internal exception in type transformation (duckdb/duckdb#18492)
add delta linux back to ci (duckdb/duckdb#18491)
Change ctrl-a/ctrl-e to move to start/end of line, not buffer (duckdb/duckdb#18490)
Unify `ON CONFLICT` and `MERGE INTO` (duckdb/duckdb#18480)
Internal duckdb/duckdb#5384: Window Sorting Polish (duckdb/duckdb#18484)
re-nable extensions in invokeci (duckdb/duckdb#18476)
SUM and + Operator for Varints (duckdb/duckdb#18424)
Internal duckdb/duckdb#5366: WindowDeltaScanner  (duckdb/duckdb#18468)
Merge ossivalis (duckdb/duckdb#18456)
Bump postgres to latest main (duckdb/duckdb#18464)
Internal duckdb/duckdb#5385: WindowMergeSortTree Sort Update (duckdb/duckdb#18461)
Add support for generic settings, and move many settings over to generic settings (duckdb/duckdb#18447)
Buffer index appends during WAL replay (duckdb/duckdb#18313)
Internal duckdb/duckdb#5384: WindowDistinctAggregator Sort Update  (duckdb/duckdb#18442)
Add support for "template" types (duckdb/duckdb#18410)
Update pyodide build to 0.28.0 (duckdb/duckdb#18446)
Parquet: add row-group ordinal during writing encryption (duckdb/duckdb#18433)
Include pyodide build configuration (duckdb/duckdb#18183)
[Fix] Block size nightly (duckdb/duckdb#18425)
Internal duckdb/duckdb#5368: WindowNaiveAggregator Sort Update  (duckdb/duckdb#18409)
Internal duckdb/duckdb#5367: SortedAggregateFunction Sort Update  (duckdb/duckdb#18408)
Refactor extension CI to use extension-ci-tools (duckdb/duckdb#18361)
Correctly fetch only base column data in ColumnData::FetchUpdateData (duckdb/duckdb#18423)
feat: remove anything following `?` in database name (duckdb/duckdb#18417)
Merge `v1.3-ossivalis` in `main` (duckdb/duckdb#18401)
Add support for table_constraints of AdbcConnectionGetObjects() (duckdb/duckdb#18181)
Add DuckLake back in (duckdb/duckdb#18405)
Internal duckdb/duckdb#5294: TIME_NS C API (duckdb/duckdb#18215)
Remove incorrect assertion (duckdb/duckdb#18404)
[ Python SQLLogic Tester ] Add `MERGE_INTO` statement to duckdb python (duckdb/duckdb#18402)
CI: Add separate job for discussion mirroring (duckdb/duckdb#18407)
Wrap runner.ExecuteFile, otherwise cleanup is not properly performed (duckdb/duckdb#18400)
Internal duckdb/duckdb#3273: Window Hashed Sort (duckdb/duckdb#18337)
Store extra metadata blocks in RowGroupPointer, and only flush dirty Metadata blocks (duckdb/duckdb#18398)
CI: Fix Discussion mirroring (duckdb/duckdb#18397)
Record whether or not cross products are implicit or not, and use this for converting queries back to SQL (duckdb/duckdb#18394)
Correct and consistent integer arithmetic error messages (duckdb/duckdb#18393)
Re-use metadata of unaltered row groups when checkpointing a table (duckdb/duckdb#18395)
Approx database count system function (duckdb/duckdb#18392)
Re-use table metadata when table is not altered during checkpoint (duckdb/duckdb#18390)
Bump httpfs (duckdb/duckdb#18388)
Uncomment skipped decimal REE tests (duckdb/duckdb#18372)
Re-enable but deprecate CORE_EXTENSIONS in CMakeLists.txt (duckdb/duckdb#18377)
Add missing ninja to workflow file (duckdb/duckdb#18373)
Merge `v1.3-ossivalis` into `main` (duckdb/duckdb#18364)
Pass `AttachOptions` to `attach` method, and turn `StorageExtensionInfo` into an `optional_ptr` (duckdb/duckdb#18368)
Split up out-of-tree extensions into separate files, and allow out-of-tree extensions to be built using BUILD_EXTENSIONS={ext_name} (duckdb/duckdb#18357)
Python external dispatch param fixes (duckdb/duckdb#18359)
Revert "[unittest] - fix doubled error headers on `Unexpected failure`" (duckdb/duckdb#18355)
Add support for checkpointing in-memory tables (duckdb/duckdb#18348)
[C API] Expose expressions and use them in scalar function binding (duckdb/duckdb#18142)
Extend PEG parser grammar (duckdb/duckdb#18221)
[unittest] - fix doubled error headers on `Unexpected failure` (duckdb/duckdb#18314)
Fix condition indexes in join filter pushdown (duckdb/duckdb#18341)
download Real Nest data in quiet mode (duckdb/duckdb#18346)
Fix debug error in join order optimizer (duckdb/duckdb#18344)
Aarch64 backport (duckdb/duckdb#18345)
add the from-table-function as parameter to copy-from-bind (duckdb/duckdb#18004)
feat: making Parquet write RowGroup.total_compressed_size (duckdb/duckdb#18307)
Make storage-version a test parameter (duckdb/duckdb#18324)
New Arrow C-API (duckdb/duckdb#18246)
feat: Parquet extension add row_group_compressed_size (duckdb/duckdb#18294)
Merge ossivalis into main (duckdb/duckdb#18272)
SHOW TABLES FROM <qualified_name> (duckdb/duckdb#18179)
Add target for installing Python deps. (duckdb/duckdb#18285)
Use `FromEpochSeconds` instead of `FromTimeT` in `FileSystem::GetLastModifiedTime` (duckdb/duckdb#18281)
[Fix] Adjust test to run with different block sizes (duckdb/duckdb#18277)
Use DuckDB cast infrastructure in fmt for new uhugeint/hugeint code (duckdb/duckdb#18275)
Use set for row ID scanning during index scans (duckdb/duckdb#18274)
Add support for RETURNING to MERGE INTO (duckdb/duckdb#18271)
Support HUGEINT in printf and format (duckdb/duckdb#13277)
Expanded autocomplete suggestions (duckdb/duckdb#18243)
[Parquet] Add read support for the `VARIANT` LogicalType (with shredded encoding) (duckdb/duckdb#18224)
Reduce copy in Vector::Reinterpret (duckdb/duckdb#18264)
Fixes for gcc 15 (duckdb/duckdb#18261)
Fix dictionary-related assertions (duckdb/duckdb#18260)
Allow for static libs from extension dependencies to be bundled (duckdb/duckdb#18226)
disable WebAssembly duckdb-wasm builds job in NightlyTests triggered by 'workflow_dispatch' event (duckdb/duckdb#18129)
Bunch of loosely connected test/CI fixes (duckdb/duckdb#18254)
update run_extension_medata_tests.sh (duckdb/duckdb#17976)
fixes for some minor llvm 20 complaints (duckdb/duckdb#18257)
Fix integer overflow in sequence vector (duckdb/duckdb#18245)
Add type safety to `FlatVector::GetData<T>`, `ConstantVector::GetData<T>` and `UnifiedVectorFormat::GetData<T>` (duckdb/duckdb#18256)
Slightly higher memory limit for test (duckdb/duckdb#18235)
Improve descriptions of thresholds vars affecting join algorithm selection (duckdb/duckdb#17377)
Add support for geoarrow encoded geometries in geoparquet files. (duckdb/duckdb#17942)
Dictionary functions (duckdb/duckdb#18127)
Better `NULL` handling in `TupleDataLayout` (duckdb/duckdb#18069)
Track `DataChunk` memory usage in various places (duckdb/duckdb#18191)
[Parquet] Add read support for the `VARIANT` LogicalType (duckdb/duckdb#18187)
Bugduckdb/duckdb#18163 Fix STDDEV_SAMP undeterminism (duckdb/duckdb#18210)
Internal duckdb/duckdb#5264: NLJ Not Distinct (duckdb/duckdb#18216)
Improve Parquet reader `NULL` statistics and compress all-`NULL` columns using `CompressedMaterialization` (duckdb/duckdb#18217)
Get type of encoded `SortKey` from `TupleDataLayout` (duckdb/duckdb#18218)
ci(pyodide): enable WASM exceptions on the latest pyodide build (duckdb/duckdb#18173)
Temporary file encryption (duckdb/duckdb#18208)
More internal-linkage (duckdb/duckdb#18177)
Two-rowID-leaf support in the conflict manager and general refactoring (duckdb/duckdb#18194)
[Parquet][Dev] Update the vendored `parquet.thrift` to `3ce0760` (duckdb/duckdb#18195)
Parquet reader logging (duckdb/duckdb#18172)
Merge `v1.3-ossivalis` into `main` (duckdb/duckdb#18188)
[Profiling] Move the client context into more write functions (duckdb/duckdb#17875)
Check if `GetLastSegment` is not `nullptr` in `ColumnData::RevertAppend` (duckdb/duckdb#18171)
Reduce lock contention for the instance cache (duckdb/duckdb#18079)
fix bug with allowed_paths (duckdb/duckdb#18176)
Avoid `realloc` in CSV writer (duckdb/duckdb#18174)
fix typo (duckdb/duckdb#18165)
Resolve some small build issues (duckdb/duckdb#18162)
Implement `replace_type` function (duckdb/duckdb#18077)
Issue duckdb/duckdb#17683: TIME_NS Compilation (duckdb/duckdb#18053)
Add support for AdbcConnectionGetObjects(table_type) (duckdb/duckdb#18066)
Detect when updates have no effect, and skip performing the actual updates if we encounter these nop updates (duckdb/duckdb#18144)
Add support for `MERGE INTO` (duckdb/duckdb#18135)
Improve sort key comparison performance (duckdb/duckdb#18131)
set ::error:: annotations for test runners (duckdb/duckdb#18072)
Internal duckdb/duckdb#3273: Window Task Generation (duckdb/duckdb#18113)
Update description of 'arrow_lossless_conversion' (duckdb/duckdb#18046)
[chore] Merge v1.3-ossivalis on main (duckdb/duckdb#18109)
ci: build duckdb against the latest emscripten (duckdb/duckdb#18110)
Don't throw `InternalException` in `Sort::Sink` (duckdb/duckdb#18105)
TPC-DS: Use BIGINT fields (duckdb/duckdb#18098)
[CI] don't run jobs on draft PRs (duckdb/duckdb#18016)
Fix  correlated subquery unnest fail (duckdb/duckdb#18092)
[CSV Reader] Prohibit options delim and sep in same read_csv call (duckdb/duckdb#18096)
Add start/end offset percentage options to Python test runner (duckdb/duckdb#18091)
Avoid running DraftPR.yml until timeout if token is missing (duckdb/duckdb#18090)
Unittest: Configure skip error messages (duckdb/duckdb#18087)
Switch to Optional for type hints in polars lazy dataframe function (duckdb/duckdb#18078)
Issue duckdb/duckdb#18071: Temporal inf -inf (duckdb/duckdb#18083)
Fix some scaling issues (duckdb/duckdb#17985)
Unittester: add `on_new_connection` + `on_load` + `skip_tests` options (duckdb/duckdb#18042)
Use `timestamp_t` instead of `time_t` for file last modified time (duckdb/duckdb#18037)
Add support for class-based expression iteration (duckdb/duckdb#18070)
fix star expr exclude error (duckdb/duckdb#18063)
Adding WAL encryption (duckdb/duckdb#17955)
Avoid adding commands read from a file to the shell history (duckdb/duckdb#18057)
Remove match-case statements from polars_io.py (duckdb/duckdb#18052)
Merge ossivalis into main (duckdb/duckdb#18036)
Add ppc64le spin-wait instruction (duckdb/duckdb#17837)
Unittest: Add skip_compiled option that can be used to skip built-in C++ tests (duckdb/duckdb#18034)
[Explain] Add the YAML format for EXPLAIN statements (duckdb/duckdb#17572)
Remove Linux (32 Bit) job (duckdb/duckdb#18012)
[Chore] Minor conflict manager refactoring (duckdb/duckdb#18015)
Fix duckdb/duckdb#18007: correctly execute expressions with pivot operator (duckdb/duckdb#18020)
c-api to copy vector with selection (duckdb/duckdb#17870)
Add support to produce Polars Lazy Dataframes (duckdb/duckdb#17947)
Implement consumption and production of Arrow Binary View (duckdb/duckdb#17975)
Rework extension loading to go through thread-safe ExtensionManager (duckdb/duckdb#17994)
Issue duckdb/duckdb#5123: make_timestamp_ms (duckdb/duckdb#17908)
Flag to disable database invalidation (duckdb/duckdb#17938)
[Fix] Reset profiling info before preparing a query (duckdb/duckdb#17940)
Issue duckdb/duckdb#5144: AsOf Join Threshold (duckdb/duckdb#17979)
[CI] Skip some workflows when updating out of tree extensions SHA (duckdb/duckdb#17949)
Merge v1.3-ossivalis into main (duckdb/duckdb#17973)
[nested] Allow fixed-size arrays to be unnested (duckdb/duckdb#17968)
Unit Tester Configuration (duckdb/duckdb#17972)
[Nested] Optimize structs in `LIST_VALUE` (duckdb/duckdb#17169)
Enable building spatial and encodings extensions (duckdb/duckdb#17960)
[Nested] Add `struct_position` and `struct_contains` functions (duckdb/duckdb#17819)
Visual Studio 17 (2022) fixes (duckdb/duckdb#17948)
[CI Nightly Fix] Skip logging test if not standard block size (duckdb/duckdb#17957)
Add v1.3-ossivalis to Cross version workflow (duckdb/duckdb#17906)
Unittester failures summary (duckdb/duckdb#16833)
Block based encryption (duckdb/duckdb#17275)
Do not dispatch JDBC/ODBC jobs in release CI runs (duckdb/duckdb#17937)
fix use after free in adbc on invalid stmt (duckdb/duckdb#17927)
Fix empty BP block when writing parquet (duckdb/duckdb#17929)
Leverage `VectorType` in `ColumnDataCollection` (duckdb/duckdb#17881)
Merge v1.3 into main (duckdb/duckdb#17897)
Make CTE Materialization the Default Instead of Inlining (duckdb/duckdb#17459)
Use an arena linked list for the physical operator children (duckdb/duckdb#17748)
Reword GenAI policy (duckdb/duckdb#17895)
Issue duckdb/duckdb#17861: FILL Argument Types (duckdb/duckdb#17888)
Update function descriptions and examples for list, array, lambda functions (duckdb/duckdb#17886)
Add GenAI policy (duckdb/duckdb#17882)
Issue duckdb/duckdb#17849: Test FILL Duplicates (duckdb/duckdb#17869)
Add STRUCT to MAP cast function (duckdb/duckdb#17799)
Issue duckdb/duckdb#17040: FILL Secondary Sorts (duckdb/duckdb#17821)
Issue duckdb/duckdb#17153: Window Order Columns (duckdb/duckdb#17835)
julia: add missing methods from C-API (duckdb/duckdb#17733)
Function Serialization: adapt to removal of overloads by explicitly casting if argument types have changed (duckdb/duckdb#17864)
[Indexes] Buffer-managed indexes part 2: segment handle for base nodes (duckdb/duckdb#17828)
duckdb/duckdb#17853 Enable flexible page sizes and update Android NDK to r27 in workflow. (duckdb/duckdb#17854)
Internal duckdb/duckdb#4991: Remove Epoch_MS(MS) (duckdb/duckdb#17816)
Add `duckdb_type` column to parquet_schema (duckdb/duckdb#17852)
Merge v1.3 into main (duckdb/duckdb#17851)
Fix ICE with Windows ARM64 (duckdb/duckdb#17844)
fix: escape using_columns on JoinRef::ToString (duckdb/duckdb#17839)
Merge130 (duckdb/duckdb#17833)
Replace string for const data ptr in encryption api (duckdb/duckdb#17825)
Pushdown pivot filter (duckdb/duckdb#17801)
Merge v1.3 into main (duckdb/duckdb#17806)
Add qualified parameter to Python GetTableNames API (duckdb/duckdb#17797)
Fix propagatesNullValues for case expr (duckdb/duckdb#17796)
[Profiling] Propagate the ClientContext into file handle writes (duckdb/duckdb#17754)
Ensure we use the same layout in `RadixPartitionedHashTable` and `GroupedAggregateHashTable` (duckdb/duckdb#17790)
[Julia] api docs improvements  (duckdb/duckdb#15645)
[Indexes] Buffer-managed indexes part 1: segment handles (duckdb/duckdb#17758)
Mark Upper/LowerComparisonType as const (duckdb/duckdb#17773)
Support glibc 2.28 environments (duckdb/duckdb#17776)
Pass `ExtensionLoader` when loading extensions, change extension entry function (duckdb/duckdb#17772)
Expose file_size_bytes and footer_size in parquet_file_metadata (duckdb/duckdb#17750)
[CAPI] Expose ErrorData (duckdb/duckdb#17722)
Rename decorator from test_nulls to null_test_parameters (duckdb/duckdb#17760)
re-add httpfs apply_patches (duckdb/duckdb#17755)
Deprecate windows-2019 runners  (duckdb/duckdb#17745)
csv_scanner: correct code comment (duckdb/duckdb#17735)
Adding additional authenticated data for encryption (duckdb/duckdb#17508)
[SQLLogicTester] Introduce `reset label <query label>` in the tester (duckdb/duckdb#17729)
Fix windows-2025 build errors (duckdb/duckdb#17726)
Aggregation performance (duckdb/duckdb#17718)
fix linux extension ci (duckdb/duckdb#17720)
Correctly setting the delim offset (duckdb/duckdb#17716)
Sorting followup (duckdb/duckdb#17717)
Revert "set default for MAIN_BRANCH_VERSIONING to false" (duckdb/duckdb#17708)
ClientBufferManager wrapper to access the client context in the buffer manager (duckdb/duckdb#17699)
Issue duckdb/duckdb#17040: FILL Window Function  (duckdb/duckdb#17686)
Merge v1.3-ossivalis into main (duckdb/duckdb#17690)
New Sorting Implementation (duckdb/duckdb#17584)
Output hashes in unittest and fix order (duckdb/duckdb#17664)
Enable profiling output for all operator types (duckdb/duckdb#17665)
[C API] Expose duckdb_scalar_function_bind_get_extra_info (duckdb/duckdb#17666)
Add rowsort in generate_series test duckdb/duckdb#43 (duckdb/duckdb#17675)
bump DuckDB_jll to v1.3.0 (duckdb/duckdb#17677)
C API tidying (duckdb/duckdb#17623)
fix extension troubleshooting link (duckdb/duckdb#17616)
Move query profiler's EndQuery after commit/rollback (duckdb/duckdb#17595)
Update function descriptions and examples (duckdb/duckdb#17132)
Add support for ToSqlString for union types (duckdb/duckdb#17513)
Remove redundant code path in the ConflictManager (duckdb/duckdb#17562)
change exception type to not be an internal exception (duckdb/duckdb#17551)
Python package devexp improvements (duckdb/duckdb#17483)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant