CLI: Make ETA more of an estimate, and support large_row_rendering for footers #18656
Hi @Mytherin,

Welcome back! Thanks so much for the polish on this feature; it really looks great. You’re absolutely right that the estimates were overly precise. I had originally kept them that way mainly to avoid changing the formatting of the total elapsed time. It looks so nice now.

Looking at the diff, I do have a small concern about the decision to remove the separate thread. With the current approach, the Kalman filter only updates when query progress is reported through the callback. If a query's progress stalls (e.g., due to remote I/O or memory allocation/swapping), the Kalman filter won’t update, and therefore the estimate won’t either. In the graphs I attached to the original PR, you can see moments where the query paused and the estimated completion time increased accordingly.

One benefit of the separate thread was that it periodically updated the filter with new time (without calling …). This is a nit, but one that will be noticeable if queries stall often.

Rusty
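To make the stall concern concrete, here is a minimal sketch of a scalar Kalman filter over the progress rate. It is not DuckDB's estimator: the class name `RateFilter`, the noise parameters, and the choice to filter the rate directly are all assumptions made for illustration. The point it shows is that if `observe` is only called when progress changes, the ETA freezes during a stall, whereas feeding it the same progress value at a later time lowers the estimated rate and lengthens the ETA.

```python
import math
from dataclasses import dataclass


@dataclass
class RateFilter:
    """Hypothetical scalar Kalman filter over the progress rate (fraction/second)."""

    rate: float = 0.0          # estimated progress per second
    var: float = 1.0           # variance of the rate estimate
    process_var: float = 1e-4  # assumed drift of the true rate per update (Q)
    meas_var: float = 1e-2     # assumed noise of one rate measurement (R)
    last_t: float = 0.0        # time of the previous observation, in seconds
    last_p: float = 0.0        # progress at the previous observation, in [0, 1]

    def observe(self, t: float, progress: float) -> None:
        """Fold in one observation; progress may be unchanged since the last call."""
        dt = t - self.last_t
        if dt <= 0:
            return
        measured_rate = (progress - self.last_p) / dt
        self.var += self.process_var                      # predict step
        gain = self.var / (self.var + self.meas_var)
        self.rate += gain * (measured_rate - self.rate)   # update step
        self.var *= 1.0 - gain
        self.last_t, self.last_p = t, progress

    def eta(self) -> float:
        """Estimated seconds remaining, or infinity if no forward rate is known."""
        if self.rate <= 0:
            return math.inf
        return (1.0 - self.last_p) / self.rate


if __name__ == "__main__":
    f = RateFilter()
    for t in range(1, 6):            # steady progress: 10% per second
        f.observe(float(t), 0.1 * t)
    print("ETA after steady progress:", f.eta())
    for t in range(6, 11):           # stall: time passes, progress stays at 50%
        f.observe(float(t), 0.5)
    print("ETA after a 5 s stall with time-only updates:", f.eta())
```

Without the second loop, the ETA stays at whatever the last progress callback produced, which is the behavior described above.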
I don't think the extra complexity is warranted for solving that edge case. DuckDB almost always makes small I/O requests which are unlikely to take more than a second even in the absolute worst case. This would be better solved by e.g. making the main thread no longer do I/O requests and moving those to separate I/O threads as part of adding async I/O. The progress bar is not the place to add such complexity imo.
Hi @Mytherin,

I ran some measurements of query progress on a few example queries on my laptop and thought it might be helpful to share what I observed. On my setup, a standard 2024 M3 MacBook Air with 24GB of RAM, DuckDB may take longer than a second to process I/O requests (or, more accurately, to provide query progress callbacks), and I can show a few examples that don’t even involve network access.

I logged the elapsed query time alongside the % progress reported in the progress callback handler. I noticed that there are periods where the progress isn’t updated, and these gaps can often exceed a second. During these times, the ETA continues to display but doesn’t get refreshed while progress is stalled. This is what motivated the thread that updates the ETA. If you want to measure the progress yourself, you could use my diff here:

I wanted to share this in case it’s useful for exploring ways to make the progress updates smoother.

The input data is:

Query:

copy(
select * from '/Users/rusty/Development/discreture/example.csv' union all
select * from '/Users/rusty/Development/discreture/example.csv' order by 1)
to 'example2.parquet';

The graphs below show gaps between progress updates of 20 seconds and of more than 80 seconds, during which the ETA bar is stalled. Here is the attached data with the logged progress updates for that query:

Testing a simpler query over the same data, you can again see stalls of up to a second at peak. Maybe these are flushes of row groups?

copy(
select * from '/Users/rusty/Development/discreture/example.csv'
) to 'example.parquet';

Here is one with just a sort that causes progress update gaps of 20 seconds:

copy(
select * from '/Users/rusty/Development/discreture/example.csv' order by 1
) to 'example.parquet';

While this may not convince you, I'd encourage you to try things out a bit more; you may see the same behavior.

Rusty
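For anyone who wants to check a progress log of their own for stalls like these, a small sketch follows. It assumes the log has already been read into `(elapsed_seconds, progress_percent)` pairs; the function name `update_gaps` and the sample values are hypothetical and are not the data attached above.

```python
from typing import Iterable, List, Tuple


def update_gaps(
    samples: Iterable[Tuple[float, float]], min_gap: float = 1.0
) -> List[Tuple[float, float, float]]:
    """Return (start, end, duration) for each pause between consecutive
    progress callbacks that lasted longer than min_gap seconds."""
    gaps = []
    prev_t = None
    for t, _percent in samples:
        if prev_t is not None and t - prev_t > min_gap:
            gaps.append((prev_t, t, t - prev_t))
        prev_t = t
    return gaps


if __name__ == "__main__":
    # Made-up illustrative log: one long pause between 0.5 s and 21.0 s.
    log = [(0.0, 0.0), (0.5, 2.0), (21.0, 2.5), (21.4, 40.0)]
    for start, end, duration in update_gaps(log):
        print(f"no progress callback from {start}s to {end}s ({duration}s)")
```

Sorting the result by duration gives the worst stalls first, which is the figure that matters when judging whether the ETA display looks frozen.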
Thanks for the investigation - as mentioned before I think regardless of what the result here is, the progress bar is just not the right place to put this complexity.
Hi @Mytherin,

I’m glad we’re on the same page that there’s room for improvement here. As for where this issue gets addressed, I’m happy to leave that decision up to you; I don’t mind where the implementation lives. I thought the progress bar was a reasonable spot because it already tracks when it was last updated and flushed to the screen, and it’s tied to the ETA estimator. Not all progress bars need an estimate (the Jupyter ones, for example, don't).

If you’re open to making the progress callbacks periodic and independent of the executor thread, so that the same progress value could be reported more than once, the estimator could be adapted to handle that. I’m not sure how that would work if the main executor thread is blocked (due to I/O or memory allocation), but as always, anything’s possible if you're willing to make the changes.

I’m not trying to argue about style here; I just wanted to point out that there is some substance to the progress updates feeling jumpy. The period between updates can't easily be bounded by some threshold. It could also be that the progress estimators for some of the operators aren't very accurate, and maybe this jumpiness is a result of that inaccuracy or unpredictability.
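As a rough illustration of what "periodic and independent of the executor thread" could look like, here is a sketch in Python rather than DuckDB's C++. Everything in it is hypothetical: `PeriodicReporter`, its methods, and the callback signature are invented for this example and do not correspond to any DuckDB API. A background thread re-delivers the most recent progress value on a fixed interval, so the consumer keeps receiving timestamps even when the value has not changed.

```python
import threading
import time
from typing import Callable


class PeriodicReporter:
    """Hypothetical helper that re-delivers the latest progress value on a timer."""

    def __init__(self, callback: Callable[[float, float], None], interval: float = 0.5):
        self._callback = callback    # called as callback(elapsed_seconds, progress)
        self._interval = interval
        self._progress = 0.0
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._started_at = time.monotonic()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def report(self, progress: float) -> None:
        """Called by the (hypothetical) executor whenever it has a new value."""
        with self._lock:
            self._progress = progress

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()

    def _run(self) -> None:
        # Event.wait returns False on timeout and True once stop() has been called.
        while not self._stop.wait(self._interval):
            with self._lock:
                progress = self._progress
            self._callback(time.monotonic() - self._started_at, progress)


if __name__ == "__main__":
    reporter = PeriodicReporter(lambda t, p: print(f"{t:.2f}s -> {p:.0%}"), interval=0.1)
    reporter.start()
    reporter.report(0.25)
    time.sleep(0.35)   # simulated stall: the executor reports nothing new
    reporter.stop()
```

The ticker only reads a value under a lock, so it keeps running while the reporting thread is blocked; whether that holds inside a real engine depends on where the progress value is computed.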
CLI: Make ETA more of an estimate, and support large_row_rendering for footers (duckdb/duckdb#18656) Merge ossivalis into main (duckdb/duckdb#18644) Python test runner: Fix result check for `COPY ... RETURN_STATS` queries (duckdb/duckdb#18625) Add 1.4 release codename (duckdb/duckdb#18652) Change arrow() to export record batch reader (duckdb/duckdb#18642)
bump iceberg to latest main [chore] Fix amalgamation build in progress_bar (duckdb/duckdb#18910) Bump inet & aws (duckdb/duckdb#18899) fix: refine query ETA display and Kalman filter stability (duckdb/duckdb#18880) Bump httpfs to v1.4-andium branch (duckdb/duckdb#18898) Encryption now encoded as a bit, centralizing in set/getter (duckdb/duckdb#18897) Add callback for when an extension fails to load, and also log this (duckdb/duckdb#18894) Keep base data scan state alive in ColumnData::Update call (duckdb/duckdb#18893) Expected errors 2053 (duckdb/duckdb#18892) fixing auto-specifying ciphers and remove double storage (duckdb/duckdb#18891) Add rowsort to upsert_default.test (duckdb/duckdb#18890) bump aws and iceberg (duckdb/duckdb#18889) [chore] Bump config test/configs/compressed_in_memory.json to new format (duckdb/duckdb#18888) [Dev] Fix footgun in `string_t::SetSizeAndFinalize` (duckdb/duckdb#18885) Json: no reinterpret<size_t*> (duckdb/duckdb#18886) [C API] Result schema of prepared statements (duckdb/duckdb#18779) Add `COPY (FORMAT BLOB)` to Andium too :^) (duckdb/duckdb#18884) Avoid automatically checkpointing if the database instance has been invalidated (duckdb/duckdb#18881) Update spatial+vss+sqlsmith in preparation for v1.4 (duckdb/duckdb#18882) Internal duckdb/duckdb#5796: Window Progress (duckdb/duckdb#18860) [Test] Small fixes to concurrent attach/detach test (duckdb/duckdb#18862) Update ducdkb iceberg hash (duckdb/duckdb#18873) Storage fuzzing + several fixes (duckdb/duckdb#18876) Bump mbedtls to v3.6.4 (duckdb/duckdb#18871) [minor] Incompatible DB error message: add newline (duckdb/duckdb#18861) Bump & remove patches for delta, avro, excel, encodings, fts (duckdb/duckdb#18869) Add a FORCE_DEBUG flag to force `-DDEBUG`, similar to FORCE_ASSERT (duckdb/duckdb#18872) Expected errors 2053 (duckdb/duckdb#18864) update duckdb azure extension ref for 1.4.0 (duckdb/duckdb#18868) Hold segment lock during GetColumnSegmentInfo (duckdb/duckdb#18859) Centralize 
attached database paths in a DatabaseFilePathManager which is shared across databases created through the same DBInstanceCache (duckdb/duckdb#18857) Add more encryption modes CTR and CBC (duckdb/duckdb#18619) Bump Ducklake (duckdb/duckdb#18825) No more `wal_encryption` flag (duckdb/duckdb#18851) Fix `NULL` path for `json_each`/`json_tree` (duckdb/duckdb#18852) Make ATTACH OR REPLACE atomic, keep list of used databases in MetaTransaction (duckdb/duckdb#18850) WAL <> DB File Match Fixes (duckdb/duckdb#18849) Add test_env to unit tester (duckdb/duckdb#18847) Merge ossivalis into main (duckdb/duckdb#18844) Bump MySQL/Postgres/SQLite (duckdb/duckdb#18848) Add OnBeginExtensionLoad callback (duckdb/duckdb#18842) Ignore null verification for statistics on structs (duckdb/duckdb#18813) Document storage version flag in CLI + minor rendering fix (duckdb/duckdb#18841) Add the `VARIANT` LogicalType (duckdb/duckdb#18609) Avoid printing '99 hours', given in most cases that means estimate is… (duckdb/duckdb#18839) Don't notify Py pkg when override git describe is set (duckdb/duckdb#18843) Add support for reading/writing native parquet geometry types (duckdb/duckdb#18832) Fix/run function in transaction (duckdb/duckdb#18741) Avoid expensive checkpoints and write amplification by appending row groups, and limiting vacuum operations for the last number of row groups (duckdb/duckdb#18829) fix: silence warnings about signed/unsigned conversions. 
(duckdb/duckdb#18835) fix: sanitize input for enable_logging (duckdb/duckdb#18830) Ensure a WAL file matches the DB file and checkpoint iteration (duckdb/duckdb#18823) Fix: Preserve database configuration flags for tab completion in DuckDB shell (duckdb/duckdb#18482) Extensions.yml: Pass down save_cache to inner workflows (duckdb/duckdb#18828) Fix format-fix runs on Linux (duckdb/duckdb#18827) Re-add accidentally removed check if copy_from is supported (duckdb/duckdb#18824) [Fix] Bug in fixed-size buffer when throwing out-of-memory (duckdb/duckdb#18769) For BC reasons - keep VARINT as alias for BIGNUM (duckdb/duckdb#18821) Add callback to get a list of copy options, use this to provide suggestions and to erase options from import that are only used during exporting (duckdb/duckdb#18812) fix: Add COLLATE NOCASE support to strpos function (duckdb/duckdb#18819) [CI] install libcurl4-openssl-dev with apt-get (duckdb/duckdb#18811) Provide failing file name in Parquet reader error messages (duckdb/duckdb#18814) Test runner: Expand '{UUID}' into a random UUID (duckdb/duckdb#18809) Expected errors 2053 (duckdb/duckdb#18810) Add support for non-aggregate window functions (duckdb/duckdb#18788) Fix some unindented interactions between `EMPTY_RESULT_PULLUP` and `MATERIALIZED` CTEs (duckdb/duckdb#18805) Typed macro parameters (duckdb/duckdb#18786) Internal duckdb/duckdb#3273: Hashed Sort Callbacks (duckdb/duckdb#18796) [chore] Fixup tidy-check on src/logging/log_manager.cpp by passing const & (duckdb/duckdb#18801) Fixup progress_bar: avoid converting doubles into int32_t unchecked (duckdb/duckdb#18800) Issue duckdb/duckdb#18767: Ignore Timestamp Offsets (duckdb/duckdb#18794) bump httpfs so it includes curl option (duckdb/duckdb#18691) Improve autocomplete suggestions (duckdb/duckdb#18773) Remove everything python-package related (duckdb/duckdb#18789) Support expressions as COPY file target (duckdb/duckdb#18795) fix: coalesce query progress updates to reduce terminal writes 
(duckdb/duckdb#18672) Task Scheduler: track exact task count, and re-signal on dequeue failure if there are tasks left (duckdb/duckdb#18792) Internal duckdb/duckdb#3273: Parallel Window Masks (duckdb/duckdb#18731) fix: improve speed of GetValue() for STRUCT type (duckdb/duckdb#18785) Add `memory_limit` parameter to `benchmark_runner`/`test_runner.py` (duckdb/duckdb#18790) Treat ENABLE_EXTENSION_AUTOINSTALL as the BOOL that it is (duckdb/duckdb#18778) Move row id logic to separate RowIdColumnData class instead of inlining it into the RowGroup (duckdb/duckdb#18780) Improve error messages for merge / vector reference (duckdb/duckdb#18777) Use microsecond resolution for printing the current timestamp (duckdb/duckdb#18776) Add `file_size_bytes` (de-)serialization (duckdb/duckdb#18775) Propagate `DUCKDB_*_VERSION` in extensions and tests (duckdb/duckdb#18774) [Test Fix] Forward output to file (duckdb/duckdb#18772) [CI] Adjust test configs post logger PR (duckdb/duckdb#18771) Revert "Use 1-based indexing for SQL-based JSON array extraction" (duckdb/duckdb#18758) Make `duckdb_log` return a TIMESTAMP_TZ (duckdb/duckdb#18768) Fix Path Typo in Extension's CMake Warning Message (duckdb/duckdb#18766) Fix index resolution when querying table with index via view (duckdb/duckdb#18319) Fix radix partitioning with more than 10 bits (duckdb/duckdb#18761) Add support for auto-globbing within a directory: if no matches are found for a specific path, we retry with `/**/*.[ext]` appended (duckdb/duckdb#18760) Refactor read_blob and read_text to use MultiFileFunction. 
(duckdb/duckdb#18706) Add missing expected errors to the test cases (next chunk) (duckdb/duckdb#18753) Minor logging fixes and more benchmarking (duckdb/duckdb#18755) Extensions.yml should also check converted_to_draft (duckdb/duckdb#18754) [Profiling] Add Profiling to Write Function (duckdb/duckdb#18724) Fixing lazy polars execution on query result (duckdb/duckdb#18749) Remove separate WAL encryption flag (duckdb/duckdb#18750) Add leak suppressions to nightly runs (duckdb/duckdb#18748) Append using a SQL query, instead of directly appending to a base table, and support user-provided queries through the QueryAppender (duckdb/duckdb#18738) removed placeholder client directories for node and jdbc, its been > 1 yr (duckdb/duckdb#18757) Add missing expected errors to the test cases (duckdb/duckdb#18746) Add OS X notarization for DuckDB CLI and libduckdb.dylib (duckdb/duckdb#18747) Use correct type for pushing collations in subqueries (duckdb/duckdb#18744) Merge ossivalis into main (duckdb/duckdb#18719) Secrets: if serialization_type is not specified, assume it's a key value secret (duckdb/duckdb#18743) [C API] Function to set a copy callback for bind data (duckdb/duckdb#18739) fix timetravel for default tables (duckdb/duckdb#18240) [unittest] SkipLoggingSameError() to make unittester report one failure per case (duckdb/duckdb#18270) Use 1-based indexing for SQL-based JSON array extraction (duckdb/duckdb#18735) Add (CSV) file logger (duckdb/duckdb#17692) feat: enhance .tables command with schema disambiguation and filtering (duckdb/duckdb#18641) Internal duckdb/duckdb#5669: Loop Join Thresholds (duckdb/duckdb#18733) Fix PIVOT in multiple statements (duckdb/duckdb#18729) Minor fixes for other catalogs - mostly checking `IsDuckTable()` for unsupported operations (duckdb/duckdb#18720) Added support for blob<->uuid conversions (duckdb/duckdb#18027) #Fix 18558: add row_group scan fast path (duckdb/duckdb#18686) Improved grammar generation script (duckdb/duckdb#18716) 
Correctly throw an error when too few columns are supplied in MERGE INTO INSERT (duckdb/duckdb#18715) [Profiling] Add Profiling to Read Function (duckdb/duckdb#18661) Fix issue with materialized CTE optimization in flatten_dependent_join (duckdb/duckdb#18714) Add Option to Allocate Using an Arena in `string_t` (duckdb/duckdb#17992) Internal duckdb/duckdb#3273: Hashed Sort States (duckdb/duckdb#18690) Python-style positional/named arguments for macro's (duckdb/duckdb#18684) [Fix] Correctly handle table and index chunks in WAL replay buffering (duckdb/duckdb#18700) Make ART construction iterative via ARTBuilder (duckdb/duckdb#18702) Correctly handle collations for IN (subquery) (duckdb/duckdb#18698) Hold row group lock for entire call of MoveToCollection (duckdb/duckdb#18694) Expected errors 2053 (duckdb/duckdb#18695) Issue duckdb/duckdb#18457: DateTrunc Simplification Warnings (duckdb/duckdb#18687) [Python SQLLogicTest] Add `test/sql/pragma/profiling/test_profiling_all.test` to the SKIPPED_TESTS set (duckdb/duckdb#18689) Make sure parse errors are wrapped in ErrorData (duckdb/duckdb#18682) Internal duckdb/duckdb#5366: Window State Arguments (duckdb/duckdb#18676) Expected errors 2053 (duckdb/duckdb#14213) Add `date_trunc()` simplification rules (duckdb/duckdb#18457) Fix the issue where delta_for isn't used in bitpacking when for is unavailable (duckdb/duckdb#18616) fix error message related to wrong memory unit (duckdb/duckdb#18671) Grab lock and double-check that column is not loaded in MoveToCollection (duckdb/duckdb#18677) Correctly allocate uncompressed string data in ZSTD for many giant strings (duckdb/duckdb#18678) Internal duckdb/duckdb#5662: IEJoin Test Plans (duckdb/duckdb#18680) [ Python SQLLogic Tester ] Add `MERGE_INTO` to `statement.type` enum in `result.py` (duckdb/duckdb#18675) Internal duckdb/duckdb#5366: Window Interrupt Arguments (duckdb/duckdb#18651) Correctly set weights in reservoir sample when switch to slow sampling (duckdb/duckdb#18563) [Dev] 
Add script to create patch from changes in an extension repository (duckdb/duckdb#18620) Python test runner: Fix hash comparison error output (duckdb/duckdb#18626) [CI] skip building encodings extension in InvokeCI (duckdb/duckdb#18655) CLI: Make ETA more of an estimate, and support large_row_rendering for footers (duckdb/duckdb#18656) Merge ossivalis into main (duckdb/duckdb#18644) Python test runner: Fix result check for `COPY ... RETURN_STATS` queries (duckdb/duckdb#18625) Add 1.4 release codename (duckdb/duckdb#18652) Change arrow() to export record batch reader (duckdb/duckdb#18642) bump spatial (on main) (duckdb/duckdb#18197) bump avro to v1.4 (duckdb/duckdb#18434) Make more configs into generic settings (duckdb/duckdb#18592) Add "Hash Zero" verification CI run (duckdb/duckdb#18623) feat: add ETA to progress bar in DuckDB CLI (duckdb/duckdb#18575) wrap httplib ::max() call in `WIN_32` check (duckdb/duckdb#18590) [ART] ART::Erase refactoring (duckdb/duckdb#18595) [CSV Sniffer] Fix type detection issue with union and empty columns (duckdb/duckdb#18606) Add Field IDS to multi file reader for positional deletes (duckdb/duckdb#18617) Re-add `hugeint` to `__internal_compress_string` (duckdb/duckdb#18622) Adjust filter pushdown to latest polars release (duckdb/duckdb#18624) parquet/parquet_multi_file_info.cpp: fix move from stack (duckdb/duckdb#18634) Issue duckdb/duckdb#18631: Streaming Windowed Quantile (duckdb/duckdb#18636) Fix serialization backwards compatability for varargs functions (duckdb/duckdb#18596) [Profiling] Add client context into more read functions (duckdb/duckdb#18514) [CI] Don't zip and upload Code Coverage tests results when Code Coverage got cancelled (duckdb/duckdb#18607) [Test] Fix test case and a benchmark (duckdb/duckdb#18610) Update README.md (duckdb/duckdb#18614) correctly setting log transaction id in ThreadContext (duckdb/duckdb#18536) [Fix] Hidden test failure in test_struct_update.test (duckdb/duckdb#18598) Increment storage version 
to enable `DICT_FSST` in benchmark file (duckdb/duckdb#18588) fix hidden merge conflict (duckdb/duckdb#18589) Adds a function for updating and adding values in a struct (duckdb/duckdb#15533) Pushdown filters on coalesced outer join keys compared for equality under the join condition (duckdb/duckdb#18169) fix: libduckdb.so missing soversion (duckdb/duckdb#18305) String dictionary hash cache (duckdb/duckdb#18580) Force `LIST`/`ARRAY` child vectors on a Parquet single page (duckdb/duckdb#18578) fix: use thousands separator and decimal for row counts in`duckbox` output format (duckdb/duckdb#18564) Flip left/right delim join based on cardinalities (duckdb/duckdb#18552) [Fix] Adjust shrink threshold back to original count > SHRINK_THRESHOLD (duckdb/duckdb#18582) [CSV Sniffer] Fixing bug of not properly setting skipped rows from sniffer (duckdb/duckdb#18555) fix: add formatting to explain row counts (duckdb/duckdb#18566) Delete FUNDING.json Update FUNDING.json Create FUNDING.json [Indexes] Buffer-managed indexes part 3: segment handle for Node48 and Node256 (duckdb/duckdb#18567) Rename the Varint type to Bignum (duckdb/duckdb#18547) Add compile option standalone-debug for clang (duckdb/duckdb#17433) Fixing compilation with -std=cpp23 (duckdb/duckdb#18557) [easy] [no-op] Minor optimization on iterator lookup (duckdb/duckdb#15349) optimize/parquet: generate movable types for parquet (duckdb/duckdb#18510) Check if `heap_block_ids` is empty before getting start/end when destroying chunks in `TupleDataCollection` (duckdb/duckdb#18556) Implement special-case `VARCHAR` to `JSON[]` casts and vice versa (duckdb/duckdb#18541) [ART] Node::Free refactoring (duckdb/duckdb#18544) [Fix] Follow-up PR to only delete unique row IDs (duckdb/duckdb#18545) Restore missing `test/configs/small_block_size.json` file (duckdb/duckdb#18507) Unittester: Add the `--sort-style` parameter that allows a fallback comparison where results are sorted according to a given sort-style (duckdb/duckdb#18542) 
Allow overriding openssl version for FIPS compliance (duckdb/duckdb#18499) fix: improve handling variant nulls and nested types (duckdb/duckdb#18538) Add support for explicit clean-up routine in test config, and exit multi-statement execution when an error is encountered (duckdb/duckdb#18539) Use global index, not local id when creating filters in `MultiFileColumnMapper` (duckdb/duckdb#18537) Add `StatementVerifier` for `EXPLAIN` (duckdb/duckdb#18529) Add CAPI to retrieve client context for table functions (duckdb/duckdb#18520) fix: support both field orders for variant struct (duckdb/duckdb#18532) [Varint] Negation, Subtraction and Over/under-flow checking (duckdb/duckdb#18477) ALP test: skip TPC-DS 67 - it is not consistent with floating point numbers (duckdb/duckdb#18528) Consistently detect JSON schema indepent of number of threads (duckdb/duckdb#18522) Internal duckdb/duckdb#16560: Numeric TRUNC Precision (duckdb/duckdb#18511) Dynamically determine dictionary size limit in Parquet writer (if unset) (duckdb/duckdb#18356) Fix incorrect character encoding in GetLastErrorAsString on Windows (duckdb/duckdb#18431) Fix: Write the salt together with the HT offset when determining the value for key comparison (duckdb/duckdb#18374) When tracking evicted_data_per_tag, track actual size on disk after temp file compression (duckdb/duckdb#18521) Adding WITH ORDINALITY to DuckDB (duckdb/duckdb#16581) ParserException for Pragma with named parameters (duckdb/duckdb#18506) Temporarily excluding `Build Pyodide wheel` for Python 3.11 because it fails to build `WASM` wheels (duckdb/duckdb#18508) Remove `immediate_transaction_mode` from DB config options (duckdb/duckdb#18516) Allow expressions to be used in ATTACH / COPY options (duckdb/duckdb#18515) Fix several bugs/fuzzer issues (duckdb/duckdb#18503) Fix: Remove overly strict assertion on empty string value (duckdb/duckdb#18504) Change UNICODE to UTF8 (duckdb/duckdb#17586) Merge ossivalis (duckdb/duckdb#18502) fix: add missing 
space in AttachInfo::ToString() (duckdb/duckdb#18500) julia: config improvements (duckdb/duckdb#17585) [Profiling] Add client context into read functions (duckdb/duckdb#18438) Fix accidental internal exception in type transformation (duckdb/duckdb#18492) add delta linux back to ci (duckdb/duckdb#18491) Change ctrl-a/ctrl-e to move to start/end of line, not buffer (duckdb/duckdb#18490) Unify `ON CONFLICT` and `MERGE INTO` (duckdb/duckdb#18480) Internal duckdb/duckdb#5384: Window Sorting Polish (duckdb/duckdb#18484) re-nable extensions in invokeci (duckdb/duckdb#18476) SUM and + Operator for Varints (duckdb/duckdb#18424) Internal duckdb/duckdb#5366: WindowDeltaScanner (duckdb/duckdb#18468) Merge ossivalis (duckdb/duckdb#18456) Bump postgres to latest main (duckdb/duckdb#18464) Internal duckdb/duckdb#5385: WindowMergeSortTree Sort Update (duckdb/duckdb#18461) Add support for generic settings, and move many settings over to generic settings (duckdb/duckdb#18447) Buffer index appends during WAL replay (duckdb/duckdb#18313) Internal duckdb/duckdb#5384: WindowDistinctAggregator Sort Update (duckdb/duckdb#18442) Add support for "template" types (duckdb/duckdb#18410) Update pyodide build to 0.28.0 (duckdb/duckdb#18446) Parquet: add row-group ordinal during writing encryption (duckdb/duckdb#18433) Include pyodide build configuration (duckdb/duckdb#18183) [Fix] Block size nightly (duckdb/duckdb#18425) Internal duckdb/duckdb#5368: WindowNaiveAggregator Sort Update (duckdb/duckdb#18409) Internal duckdb/duckdb#5367: SortedAggregateFunction Sort Update (duckdb/duckdb#18408) Refactor extension CI to use extension-ci-tools (duckdb/duckdb#18361) Correctly fetch only base column data in ColumnData::FetchUpdateData (duckdb/duckdb#18423) feat: remove anything following `?` in database name (duckdb/duckdb#18417) Merge `v1.3-ossivalis` in `main` (duckdb/duckdb#18401) Add support for table_constraints of AdbcConnectionGetObjects() (duckdb/duckdb#18181) Add DuckLake back in 
(duckdb/duckdb#18405) Internal duckdb/duckdb#5294: TIME_NS C API (duckdb/duckdb#18215) Remove incorrect assertion (duckdb/duckdb#18404) [ Python SQLLogic Tester ] Add `MERGE_INTO` statement to duckdb python (duckdb/duckdb#18402) CI: Add separate job for discussion mirroring (duckdb/duckdb#18407) Wrap runner.ExecuteFile, otherwise cleanup is not properly performed (duckdb/duckdb#18400) Internal duckdb/duckdb#3273: Window Hashed Sort (duckdb/duckdb#18337) Store extra metadata blocks in RowGroupPointer, and only flush dirty Metadata blocks (duckdb/duckdb#18398) CI: Fix Discussion mirroring (duckdb/duckdb#18397) Record whether or not cross products are implicit or not, and use this for converting queries back to SQL (duckdb/duckdb#18394) Correct and consistent integer arithmetic error messages (duckdb/duckdb#18393) Re-use metadata of unaltered row groups when checkpointing a table (duckdb/duckdb#18395) Approx database count system function (duckdb/duckdb#18392) Re-use table metadata when table is not altered during checkpoint (duckdb/duckdb#18390) Bump httpfs (duckdb/duckdb#18388) Uncomment skipped decimal REE tests (duckdb/duckdb#18372) Re-enable but deprecate CORE_EXTENSIONS in CMakeLists.txt (duckdb/duckdb#18377) Add missing ninja to workflow file (duckdb/duckdb#18373) Merge `v1.3-ossivalis` into `main` (duckdb/duckdb#18364) Pass `AttachOptions` to `attach` method, and turn `StorageExtensionInfo` into an `optional_ptr` (duckdb/duckdb#18368) Split up out-of-tree extensions into separate files, and allow out-of-tree extensions to be built using BUILD_EXTENSIONS={ext_name} (duckdb/duckdb#18357) Python external dispatch param fixes (duckdb/duckdb#18359) Revert "[unittest] - fix doubled error headers on `Unexpected failure`" (duckdb/duckdb#18355) Add support for checkpointing in-memory tables (duckdb/duckdb#18348) [C API] Expose expressions and use them in scalar function binding (duckdb/duckdb#18142) Extend PEG parser grammar (duckdb/duckdb#18221) [unittest] - fix 
doubled error headers on `Unexpected failure` (duckdb/duckdb#18314) Fix condition indexes in join filter pushdown (duckdb/duckdb#18341) download Real Nest data in quiet mode (duckdb/duckdb#18346) Fix debug error in join order optimizer (duckdb/duckdb#18344) Aarch64 backport (duckdb/duckdb#18345) add the from-table-function as parameter to copy-from-bind (duckdb/duckdb#18004) feat: making Parquet write RowGroup.total_compressed_size (duckdb/duckdb#18307) Make storage-version a test parameter (duckdb/duckdb#18324) New Arrow C-API (duckdb/duckdb#18246) feat: Parquet extension add row_group_compressed_size (duckdb/duckdb#18294) Merge ossivalis into main (duckdb/duckdb#18272) SHOW TABLES FROM <qualified_name> (duckdb/duckdb#18179) Add target for installing Python deps. (duckdb/duckdb#18285) Use `FromEpochSeconds` instead of `FromTimeT` in `FileSystem::GetLastModifiedTime` (duckdb/duckdb#18281) [Fix] Adjust test to run with different block sizes (duckdb/duckdb#18277) Use DuckDB cast infrastructure in fmt for new uhugeint/hugeint code (duckdb/duckdb#18275) Use set for row ID scanning during index scans (duckdb/duckdb#18274) Add support for RETURNING to MERGE INTO (duckdb/duckdb#18271) Support HUGEINT in printf and format (duckdb/duckdb#13277) Expanded autocomplete suggestions (duckdb/duckdb#18243) [Parquet] Add read support for the `VARIANT` LogicalType (with shredded encoding) (duckdb/duckdb#18224) Reduce copy in Vector::Reinterpret (duckdb/duckdb#18264) Fixes for gcc 15 (duckdb/duckdb#18261) Fix dictionary-related assertions (duckdb/duckdb#18260) Allow for static libs from extension dependencies to be bundled (duckdb/duckdb#18226) disable WebAssembly duckdb-wasm builds job in NightlyTests triggered by 'workflow_dispatch' event (duckdb/duckdb#18129) Bunch of loosely connected test/CI fixes (duckdb/duckdb#18254) update run_extension_medata_tests.sh (duckdb/duckdb#17976) fixes for some minor llvm 20 complaints (duckdb/duckdb#18257) Fix integer overflow in sequence vector 
(duckdb/duckdb#18245) Add type safety to `FlatVector::GetData<T>`, `ConstantVector::GetData<T>` and `UnifiedVectorFormat::GetData<T>` (duckdb/duckdb#18256) Slightly higher memory limit for test (duckdb/duckdb#18235) Improve descriptions of thresholds vars affecting join algorithm selection (duckdb/duckdb#17377) Add support for geoarrow encoded geometries in geoparquet files. (duckdb/duckdb#17942) Dictionary functions (duckdb/duckdb#18127) Better `NULL` handling in `TupleDataLayout` (duckdb/duckdb#18069) Track `DataChunk` memory usage in various places (duckdb/duckdb#18191) [Parquet] Add read support for the `VARIANT` LogicalType (duckdb/duckdb#18187) Bugduckdb/duckdb#18163 Fix STDDEV_SAMP undeterminism (duckdb/duckdb#18210) Internal duckdb/duckdb#5264: NLJ Not Distinct (duckdb/duckdb#18216) Improve Parquet reader `NULL` statistics and compress all-`NULL` columns using `CompressedMaterialization` (duckdb/duckdb#18217) Get type of encoded `SortKey` from `TupleDataLayout` (duckdb/duckdb#18218) ci(pyodide): enable WASM exceptions on the latest pyodide build (duckdb/duckdb#18173) Temporary file encryption (duckdb/duckdb#18208) More internal-linkage (duckdb/duckdb#18177) Two-rowID-leaf support in the conflict manager and general refactoring (duckdb/duckdb#18194) [Parquet][Dev] Update the vendored `parquet.thrift` to `3ce0760` (duckdb/duckdb#18195) Parquet reader logging (duckdb/duckdb#18172) Merge `v1.3-ossivalis` into `main` (duckdb/duckdb#18188) [Profiling] Move the client context into more write functions (duckdb/duckdb#17875) Check if `GetLastSegment` is not `nullptr` in `ColumnData::RevertAppend` (duckdb/duckdb#18171) Reduce lock contention for the instance cache (duckdb/duckdb#18079) fix bug with allowed_paths (duckdb/duckdb#18176) Avoid `realloc` in CSV writer (duckdb/duckdb#18174) fix typo (duckdb/duckdb#18165) Resolve some small build issues (duckdb/duckdb#18162) Implement `replace_type` function (duckdb/duckdb#18077) Issue duckdb/duckdb#17683: TIME_NS 
Compilation (duckdb/duckdb#18053) Add support for AdbcConnectionGetObjects(table_type) (duckdb/duckdb#18066) Detect when updates have no effect, and skip performing the actual updates if we encounter these nop updates (duckdb/duckdb#18144) Add support for `MERGE INTO` (duckdb/duckdb#18135) Improve sort key comparison performance (duckdb/duckdb#18131) set ::error:: annotations for test runners (duckdb/duckdb#18072) Internal duckdb/duckdb#3273: Window Task Generation (duckdb/duckdb#18113) Update description of 'arrow_lossless_conversion' (duckdb/duckdb#18046) [chore] Merge v1.3-ossivalis on main (duckdb/duckdb#18109) ci: build duckdb against the latest emscripten (duckdb/duckdb#18110) Don't throw `InternalException` in `Sort::Sink` (duckdb/duckdb#18105) TPC-DS: Use BIGINT fields (duckdb/duckdb#18098) [CI] don't run jobs on draft PRs (duckdb/duckdb#18016) Fix correlated subquery unnest fail (duckdb/duckdb#18092) [CSV Reader] Prohibit options delim and sep in same read_csv call (duckdb/duckdb#18096) Add start/end offset percentage options to Python test runner (duckdb/duckdb#18091) Avoid running DraftPR.yml until timeout if token is missing (duckdb/duckdb#18090) Unittest: Configure skip error messages (duckdb/duckdb#18087) Switch to Optional for type hints in polars lazy dataframe function (duckdb/duckdb#18078) Issue duckdb/duckdb#18071: Temporal inf -inf (duckdb/duckdb#18083) Fix some scaling issues (duckdb/duckdb#17985) Unittester: add `on_new_connection` + `on_load` + `skip_tests` options (duckdb/duckdb#18042) Use `timestamp_t` instead of `time_t` for file last modified time (duckdb/duckdb#18037) Add support for class-based expression iteration (duckdb/duckdb#18070) fix star expr exclude error (duckdb/duckdb#18063) Adding WAL encryption (duckdb/duckdb#17955) Avoid adding commands read from a file to the shell history (duckdb/duckdb#18057) Remove match-case statements from polars_io.py (duckdb/duckdb#18052) Merge ossivalis into main (duckdb/duckdb#18036) Add ppc64le 
spin-wait instruction (duckdb/duckdb#17837) Unittest: Add skip_compiled option that can be used to skip built-in C++ tests (duckdb/duckdb#18034) [Explain] Add the YAML format for EXPLAIN statements (duckdb/duckdb#17572) Remove Linux (32 Bit) job (duckdb/duckdb#18012) [Chore] Minor conflict manager refactoring (duckdb/duckdb#18015) Fix duckdb/duckdb#18007: correctly execute expressions with pivot operator (duckdb/duckdb#18020) c-api to copy vector with selection (duckdb/duckdb#17870) Add support to produce Polars Lazy Dataframes (duckdb/duckdb#17947) Implement consumption and production of Arrow Binary View (duckdb/duckdb#17975) Rework extension loading to go through thread-safe ExtensionManager (duckdb/duckdb#17994) Issue duckdb/duckdb#5123: make_timestamp_ms (duckdb/duckdb#17908) Flag to disable database invalidation (duckdb/duckdb#17938) [Fix] Reset profiling info before preparing a query (duckdb/duckdb#17940) Issue duckdb/duckdb#5144: AsOf Join Threshold (duckdb/duckdb#17979) [CI] Skip some workflows when updating out of tree extensions SHA (duckdb/duckdb#17949) Merge v1.3-ossivalis into main (duckdb/duckdb#17973) [nested] Allow fixed-size arrays to be unnested (duckdb/duckdb#17968) Unit Tester Configuration (duckdb/duckdb#17972) [Nested] Optimize structs in `LIST_VALUE` (duckdb/duckdb#17169) Enable building spatial and encodings extensions (duckdb/duckdb#17960) [Nested] Add `struct_position` and `struct_contains` functions (duckdb/duckdb#17819) Visual Studio 17 (2022) fixes (duckdb/duckdb#17948) [CI Nightly Fix] Skip logging test if not standard block size (duckdb/duckdb#17957) Add v1.3-ossivalis to Cross version workflow (duckdb/duckdb#17906) Unittester failures summary (duckdb/duckdb#16833) Block based encryption (duckdb/duckdb#17275) Do not dispatch JDBC/ODBC jobs in release CI runs (duckdb/duckdb#17937) fix use after free in adbc on invalid stmt (duckdb/duckdb#17927) Fix empty BP block when writing parquet (duckdb/duckdb#17929) Leverage `VectorType` in 
`ColumnDataCollection` (duckdb/duckdb#17881) Merge v1.3 into main (duckdb/duckdb#17897) Make CTE Materialization the Default Instead of Inlining (duckdb/duckdb#17459) Use an arena linked list for the physical operator children (duckdb/duckdb#17748) Reword GenAI policy (duckdb/duckdb#17895) Issue duckdb/duckdb#17861: FILL Argument Types (duckdb/duckdb#17888) Update function descriptions and examples for list, array, lambda functions (duckdb/duckdb#17886) Add GenAI policy (duckdb/duckdb#17882) Issue duckdb/duckdb#17849: Test FILL Duplicates (duckdb/duckdb#17869) Add STRUCT to MAP cast function (duckdb/duckdb#17799) Issue duckdb/duckdb#17040: FILL Secondary Sorts (duckdb/duckdb#17821) Issue duckdb/duckdb#17153: Window Order Columns (duckdb/duckdb#17835) julia: add missing methods from C-API (duckdb/duckdb#17733) Function Serialization: adapt to removal of overloads by explicitly casting if argument types have changed (duckdb/duckdb#17864) [Indexes] Buffer-managed indexes part 2: segment handle for base nodes (duckdb/duckdb#17828) duckdb/duckdb#17853 Enable flexible page sizes and update Android NDK to r27 in workflow. 
(duckdb/duckdb#17854) Internal duckdb/duckdb#4991: Remove Epoch_MS(MS) (duckdb/duckdb#17816) Add `duckdb_type` column to parquet_schema (duckdb/duckdb#17852) Merge v1.3 into main (duckdb/duckdb#17851) Fix ICE with Windows ARM64 (duckdb/duckdb#17844) fix: escape using_columns on JoinRef::ToString (duckdb/duckdb#17839) Merge130 (duckdb/duckdb#17833) Replace string for const data ptr in encryption api (duckdb/duckdb#17825) Pushdown pivot filter (duckdb/duckdb#17801) Merge v1.3 into main (duckdb/duckdb#17806) Add qualified parameter to Python GetTableNames API (duckdb/duckdb#17797) Fix propagatesNullValues for case expr (duckdb/duckdb#17796) [Profiling] Propagate the ClientContext into file handle writes (duckdb/duckdb#17754) Ensure we use the same layout in `RadixPartitionedHashTable` and `GroupedAggregateHashTable` (duckdb/duckdb#17790) [Julia] api docs improvements (duckdb/duckdb#15645) [Indexes] Buffer-managed indexes part 1: segment handles (duckdb/duckdb#17758) Mark Upper/LowerComparisonType as const (duckdb/duckdb#17773) Support glibc 2.28 environments (duckdb/duckdb#17776) Pass `ExtensionLoader` when loading extensions, change extension entry function (duckdb/duckdb#17772) Expose file_size_bytes and footer_size in parquet_file_metadata (duckdb/duckdb#17750) [CAPI] Expose ErrorData (duckdb/duckdb#17722) Rename decorator from test_nulls to null_test_parameters (duckdb/duckdb#17760) re-add httpfs apply_patches (duckdb/duckdb#17755) Deprecate windows-2019 runners (duckdb/duckdb#17745) csv_scanner: correct code comment (duckdb/duckdb#17735) Adding additional authenticated data for encryption (duckdb/duckdb#17508) [SQLLogicTester] Introduce `reset label <query label>` in the tester (duckdb/duckdb#17729) Fix windows-2025 build errors (duckdb/duckdb#17726) Aggregation performance (duckdb/duckdb#17718) fix linux extension ci (duckdb/duckdb#17720) Correctly setting the delim offset (duckdb/duckdb#17716) Sorting followup (duckdb/duckdb#17717) Revert "set default for 
MAIN_BRANCH_VERSIONING to false" (duckdb/duckdb#17708) ClientBufferManager wrapper to access the client context in the buffer manager (duckdb/duckdb#17699) Issue duckdb/duckdb#17040: FILL Window Function (duckdb/duckdb#17686) Merge v1.3-ossivalis into main (duckdb/duckdb#17690) New Sorting Implementation (duckdb/duckdb#17584) Output hashes in unittest and fix order (duckdb/duckdb#17664) Enable profiling output for all operator types (duckdb/duckdb#17665) [C API] Expose duckdb_scalar_function_bind_get_extra_info (duckdb/duckdb#17666) Add rowsort in generate_series test duckdb/duckdb#43 (duckdb/duckdb#17675) bump DuckDB_jll to v1.3.0 (duckdb/duckdb#17677) C API tidying (duckdb/duckdb#17623) fix extension troubleshooting link (duckdb/duckdb#17616) Move query profiler's EndQuery after commit/rollback (duckdb/duckdb#17595) Update function descriptions and examples (duckdb/duckdb#17132) Add support for ToSqlString for union types (duckdb/duckdb#17513) Remove redundant code path in the ConflictManager (duckdb/duckdb#17562) change exception type to not be an internal exception (duckdb/duckdb#17551) Python package devexp improvements (duckdb/duckdb#17483)
Follow-ups to #18575 and #18564
Render ETA as actual estimate
Currently the ETA is rendered as a very precise value, which is hard to read, and overly precise in most cases given the fact that it is an estimate. Adding it in front of the progress bar also does not look very good imo.
This PR reworks the ETA to be added to the end of the progress bar instead, and to use a less granular estimate.
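As a rough illustration of what a "less granular" estimate could look like, here is a minimal sketch. `RenderCoarseETA` and its thresholds are hypothetical and are not taken from the actual shell code; only the `>99 hours remaining` cap mirrors the text in this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: formats "<n> <unit>(s) remaining" with simple pluralization.
static std::string FormatRemaining(uint64_t amount, const std::string &unit) {
	return std::to_string(amount) + " " + unit + (amount == 1 ? "" : "s") + " remaining";
}

// Hypothetical sketch: render a remaining-time estimate (in seconds) using only
// the largest fitting unit, instead of an exact hh:mm:ss-style value.
// The thresholds below are illustrative assumptions.
std::string RenderCoarseETA(double seconds_remaining) {
	assert(seconds_remaining >= 0);
	// Cap very long estimates; checked on the double to avoid overflow on conversion.
	if (seconds_remaining >= 100.0 * 3600.0) {
		return ">99 hours remaining";
	}
	const uint64_t seconds = static_cast<uint64_t>(seconds_remaining);
	if (seconds < 60) {
		return FormatRemaining(seconds, "second");
	}
	if (seconds < 3600) {
		return FormatRemaining(seconds / 60, "minute");
	}
	return FormatRemaining(seconds / 3600, "hour");
}
```

Truncating to a single unit throws away precision on purpose: an estimate that jumps between `2 minutes` and `3 minutes` is easier to read than one that flickers between exact second counts.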
Here are some example estimates:
The final elapsed time is still shown in the old exact notation:
We also fix an issue where large cross-products/self-joins would keep on showing a "fast" estimate instead of converging towards a more realistic (very slow) one, e.g. the following query (that will realistically never finish as it is generating 100M * 100M rows) would previously keep on showing an estimate of a few minutes - it now (correctly) converges to `>99 hours remaining`.
We also do a bunch of clean-up:
- `std::vector` over `duckdb::vector`
- and clean-up a bunch of naming/code
Don't set thousand/decimal separator by default, and add large footer rendering
#18564 set the thousand/decimal separators by default, instead of leaving them uninitialized. This PR reverts to the old behavior. If enabled, the separators are still used when rendering the row/column counts. Instead, we add support for `large_number_rendering` to the duckbox mode, which is enabled by default, e.g. some example renderings:
These also use syntax highlighting now.
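To make the idea of a readable large-number rendering concrete, here is a minimal sketch. `RenderLargeNumber`, the unit names, and the two-decimal format are assumptions for illustration only; the actual output format used by the shell is not shown in this description.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical sketch: summarize a large count (e.g. a row count in a footer)
// as a short readable string. Units and precision are illustrative assumptions.
std::string RenderLargeNumber(uint64_t value) {
	struct Unit {
		uint64_t scale;
		const char *name;
	};
	// Ordered from largest to smallest so the first match is the best fit.
	static const Unit units[] = {
	    {1000000000000ULL, "trillion"}, {1000000000ULL, "billion"}, {1000000ULL, "million"}};
	for (const auto &unit : units) {
		if (value >= unit.scale) {
			char buffer[64];
			std::snprintf(buffer, sizeof(buffer), "%.2f %s",
			              static_cast<double>(value) / static_cast<double>(unit.scale), unit.name);
			return buffer;
		}
	}
	// Small values are left as plain digits.
	return std::to_string(value);
}
```

Under these assumptions, a footer could show the exact count alongside the summary, e.g. `1234567 rows` together with `1.23 million`.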