@Dtenwolde Dtenwolde commented Jul 11, 2025

This PR extends the grammar used by the autocomplete functions to work for (almost) all SQL queries in the test folder and the out-of-tree test folders. I realise it has become quite a large PR, but it does make the PEG parser more complete and moves us toward an extensible parser.

It adds support for:

  • Unicode spaces: These were initially not filtered out and caused the PEG parser to crash. They are now replaced by normal spaces if they are detected. This is similar in behaviour to the PostgreSQL parser (const string &sql_ref = StripUnicodeSpaces(sql, clean_sql) ? clean_sql : sql;)
  • Dollar-quoted strings.
  • New syntax
  • Better keyword handling (more in line with the PostgreSQL parser)
  • Tests for autocomplete on ALTER columns.
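
To make the Unicode-space handling above concrete, here is a minimal sketch of the idea. It is not DuckDB's StripUnicodeSpaces: the function name ReplaceUnicodeSpaces, the helper UnicodeSpaceLength, and the exact set of code points are assumptions chosen for illustration, and unlike a real implementation it does not skip over quoted strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the byte length of a UTF-8 encoded Unicode space starting at sql[pos],
// or 0 if there is none. The code points covered here are an assumption.
static std::size_t UnicodeSpaceLength(const std::string &sql, std::size_t pos) {
    auto byte = [&](std::size_t i) -> unsigned char {
        return i < sql.size() ? static_cast<unsigned char>(sql[i]) : 0;
    };
    unsigned char b0 = byte(pos), b1 = byte(pos + 1), b2 = byte(pos + 2);
    if (b0 == 0xC2 && b1 == 0xA0) return 2;                             // U+00A0
    if (b0 == 0xE2 && b1 == 0x80 && b2 >= 0x80 && b2 <= 0x8A) return 3; // U+2000..U+200A
    if (b0 == 0xE2 && b1 == 0x80 && b2 == 0xAF) return 3;               // U+202F
    if (b0 == 0xE2 && b1 == 0x81 && b2 == 0x9F) return 3;               // U+205F
    if (b0 == 0xE3 && b1 == 0x80 && b2 == 0x80) return 3;               // U+3000
    return 0;
}

// Writes a copy of sql into clean_sql with every Unicode space replaced by ' '.
// Returns true if at least one replacement happened (hypothetical helper).
bool ReplaceUnicodeSpaces(const std::string &sql, std::string &clean_sql) {
    clean_sql.clear();
    bool replaced = false;
    for (std::size_t pos = 0; pos < sql.size();) {
        std::size_t length = UnicodeSpaceLength(sql, pos);
        if (length > 0) {
            clean_sql += ' ';
            pos += length;
            replaced = true;
        } else {
            clean_sql += sql[pos];
            pos++;
        }
    }
    assert(clean_sql.size() <= sql.size()); // replacements only ever shrink the text
    return replaced;
}
```

A caller could then pick the cleaned string only when something changed, in the same shape as the line quoted above: const std::string &sql_ref = ReplaceUnicodeSpaces(sql, clean_sql) ? clean_sql : sql;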

New syntax

  • MERGE INTO
  • LAMBDA
  • ATTACH
  • CREATE INDEX
  • Loading and installing extensions
  • More elaborate type support, such as IntervalType
  • Vacuum

Keywords

The old PEG parser still relied on the PostgreSQL keyword .list files to determine the category of the keyword. For the PEG parser, I copied these .list files and placed them in the keywords folder. Based on the file they are in, they are categorized in a keyword_map that is used during parsing. This removes the dependency on the PostgreSQL parser, but it adds the complexity that there are now two of these sets of keywords (for PEG parser and PostgreSQL parser).

Normally, unquoted table and schema names are not allowed to be reserved keywords, except when the catalog name is specified before the schema name, or the schema name before the table name. For instance, AND is a reserved keyword, so in each pair below the qualified form is accepted while the unqualified form is not:
USE catalog_name.and
USE and

FROM schema_name.and
FROM and

To handle the difference between these cases, and still keep the table and schema name suggestions, I added RESERVED_TABLE_NAME and RESERVED_SCHEMA_NAME to the SuggestionState. In the IdentifierMatcher, these can then match reserved keywords, whereas TABLE_NAME and SCHEMA_NAME cannot. The grammar rule related to this is:
QualifiedName <- (CatalogQualification ReservedSchemaQualification ReservedTableName) / (SchemaQualification ReservedTableName) / (Identifier / StringLiteral)

When providing suggestions, we then still have the context that we want to provide a suggestion for either a table name or schema name.

The keyword_map is automatically generated by the extension/autocomplete/inline_grammar.py script, which generates the entire grammar. I also added a script, scripts/build_peg_grammar.sh, which generates both the inlined_grammar.hpp and inlined_grammar.gram files.

Changes to keywords

I moved some of the keywords, such as MAP, out of the func_name_keywords.list file. MAP is categorized as both a column_name and a func_name in the PostgreSQL parser. In the grammar file it is categorized as: PG_KEYWORD("map", MAP, TYPE_FUNC_NAME_KEYWORD).

My confusion lies in the following query: CREATE TABLE MAP(i BIGINT);, which is a valid query. However, a table name may only be (1) an Identifier (no category of keyword), (2) an unreserved keyword, or (3) a column_name keyword. Seeing as map is categorized as TYPE_FUNC_NAME_KEYWORD, I don't understand how this query passes with the PostgreSQL parser, but it does. To solve this in the PEG parser, I removed MAP from the func_name_keywords.list file, but I don't think this is the best solution. Happy to hear feedback regarding this.

Failing test cases:

The following test cases are excluded from the new PEG parser:

  1. PREPARE s1 AS SELECT ?1
    Reason for Exclusion: This query relies on a parsing behavior where a number following a ? parameter is treated as a column alias.
    Explanation: The standard parser tokenizes ?1 as two separate tokens: a positional parameter (?) and an integer literal (1). The literal is then interpreted as an alias, making the query equivalent to SELECT ? AS "1". This behavior is specific to the interaction between the tokenizer and the standard SQL parser, but to me it seems more like a bug: normally only $1 or ? should be accepted.

  2. SET s3_use_ssl=${DUCKDB_S3_USE_SSL}
    This seems like a weird interaction where we are missing '' around the ${} and somehow the SQLLogicTest framework can handle this, but trying this in the CLI seems to cause a syntax error.

Script changes

I also extended the scripts/test_peg_parser.py script that tests the PEG parser in two ways:

  1. By making it multithreaded when the --no-exit flag is set (because I was tired of waiting around for it to finish).
  2. By adding an --include-extensions flag, which downloads the test files for all the out-of-tree extensions and stores them under extension-test-files in the project root directory (to avoid downloading these files every time).

Dtenwolde added 30 commits June 20, 2025 14:15
… ColId, FuncName, TypeName, ColLabel and NonReservedWord rules (not used yet)
@Mytherin
Collaborator

Thanks for the PR! Looks good - some comments:

> To handle the difference between these cases, and still keep the table and schema name suggestions, I added RESERVED_TABLE_NAME and RESERVED_SCHEMA_NAME to the SuggestionState. In the IdentifierMatcher, these can then match reserved keywords, whereas TABLE_NAME and SCHEMA_NAME cannot. The grammar rule related to this is:

Table and schema names can of course always be reserved - they just need to be quoted. I don't think having the auto-complete not quote reserved keywords in this edge case is worth the added complexity of adding these extra suggestion states.

> I moved some of the keywords, such as MAP, out of the func_name_keywords.list file. MAP is categorized as both a column_name and a func_name in the PostgreSQL parser. In the grammar file it is categorized as: PG_KEYWORD("map", MAP, TYPE_FUNC_NAME_KEYWORD).
>
> My confusion lies in the following query: CREATE TABLE MAP(i BIGINT);, which is a valid query. However, a table name may only be (1) an Identifier (no category of keyword), (2) an unreserved keyword, or (3) a column_name keyword. Seeing as map is categorized as TYPE_FUNC_NAME_KEYWORD, I don't understand how this query passes with the PostgreSQL parser, but it does. To solve this in the PEG parser, I removed MAP from the func_name_keywords.list file, but I don't think this is the best solution. Happy to hear feedback regarding this.

Keywords can be either:

  • Unreserved
  • Reserved
  • A combination of one or two of col_name, func_name and type_name - this is essentially "partially reserved, partially unreserved", where the keyword type determines where the keyword is unreserved

The enum value used is not really relevant - it is not used for parsing and is only metadata that is returned by e.g. duckdb_keywords. What is relevant is the position in the .list files.

> SET s3_use_ssl=${DUCKDB_S3_USE_SSL}
> This seems like a weird interaction where we are missing '' around the ${} and somehow the SQLLogicTest framework can handle this, but trying this in the CLI seems to cause a syntax error.

This is not a SQL query that is passed directly to DuckDB - environment variables are replaced in the SQLLogicTest runner with their value.
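
As a rough illustration of that substitution step, the sketch below expands ${NAME} placeholders before a statement would be handed to the database. This is not the SQLLogicTest runner's actual code: ExpandEnvVariables and EnvLookup are hypothetical names, and leaving unknown variables untouched is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <string>

// Maps a variable name to its value, or nullptr when the variable is not set.
using EnvLookup = std::function<const char *(const std::string &)>;

// Replaces every ${NAME} in line with the value returned by lookup.
// Placeholders whose variable is unset are copied through unchanged (assumption).
std::string ExpandEnvVariables(const std::string &line, const EnvLookup &lookup) {
    std::string result;
    std::size_t pos = 0;
    while (pos < line.size()) {
        std::size_t start = line.find("${", pos);
        if (start == std::string::npos) break;
        std::size_t end = line.find('}', start + 2);
        if (end == std::string::npos) break;
        assert(end >= start + 2);
        std::string name = line.substr(start + 2, end - start - 2);
        const char *value = lookup(name);
        result.append(line, pos, start - pos); // text before the placeholder
        if (value) {
            result += value;
        } else {
            result.append(line, start, end - start + 1);
        }
        pos = end + 1;
    }
    result.append(line, pos, std::string::npos); // remaining text
    return result;
}

// Convenience overload that reads from the process environment.
std::string ExpandEnvVariables(const std::string &line) {
    return ExpandEnvVariables(line, [](const std::string &name) -> const char * {
        return std::getenv(name.c_str());
    });
}
```

With DUCKDB_S3_USE_SSL set to false, the statement above would reach the parser as SET s3_use_ssl=false, which is why the raw ${...} form never needs to be valid SQL.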

Mytherin added a commit that referenced this pull request Jul 16, 2025
This PR introduces autocomplete suggestions for: 
- Pragma
- Settings (Using `SET` syntax)
- Scalar functions
- Table functions

This resolves some open TODOs. 

The PR doesn't introduce any changes to the grammar and is separate from
#18221

To provide these suggestions, I needed a way to get a complete list of
each type of entry. My current approach adds a new method to the
`Catalog` class (`GetAllEntries`). This works, but I don't want to
convolute the class with unneeded methods, so happy to hear feedback
regarding this :)
@duckdb-draftbot duckdb-draftbot marked this pull request as draft July 17, 2025 07:31
@Dtenwolde
Contributor Author

I updated the PR with the following:

  • Keywords can now be in multiple categories. This complicates the logic for IdentifierMatcher, because we now need to check not only whether a keyword is part of a banned category, but also whether it is part of an allowed category.
  • Support MERGE INTO RETURNING syntax
  • Create a ReservedIdentifierMatcher which basically allows anything to be matched against it (similar to ColLabel in PostgreSQL grammar)
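
To illustrate the multi-category check described in the first bullet, here is a small sketch under my own assumptions. KeywordMapSketch, MatchesIdentifier and CanBeTableName are hypothetical names, the bit flags are an illustrative encoding, and the precedence rule (an allowed category wins over a banned one) is my reading of the description rather than the PR's actual IdentifierMatcher.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string>
#include <unordered_map>

// Category bits; a keyword may carry more than one (e.g. col_name and func_name).
constexpr std::uint8_t RESERVED = 1;
constexpr std::uint8_t UNRESERVED = 2;
constexpr std::uint8_t COL_NAME = 4;
constexpr std::uint8_t FUNC_NAME = 8;
constexpr std::uint8_t TYPE_NAME = 16;

class KeywordMapSketch {
public:
    // Adds a category to a keyword; repeated calls accumulate categories.
    void Add(const std::string &keyword, std::uint8_t category) {
        assert(category != 0);
        categories[Lower(keyword)] |= category;
    }
    // Returns the category bits of word, or 0 if it is not a keyword.
    std::uint8_t Lookup(const std::string &word) const {
        auto entry = categories.find(Lower(word));
        return entry == categories.end() ? std::uint8_t(0) : entry->second;
    }

private:
    static std::string Lower(std::string text) {
        for (auto &c : text) {
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        }
        return text;
    }
    std::unordered_map<std::string, std::uint8_t> categories;
};

// A word matches if it is no keyword at all, if any of its categories is allowed,
// or if none of its categories is banned.
bool MatchesIdentifier(const KeywordMapSketch &map, const std::string &word,
                       std::uint8_t allowed, std::uint8_t banned) {
    std::uint8_t found = map.Lookup(word);
    if (found == 0) return true;
    if (found & allowed) return true;
    return (found & banned) == 0;
}

// Table names: identifiers, unreserved keywords, or col_name keywords.
bool CanBeTableName(const KeywordMapSketch &map, const std::string &word) {
    return MatchesIdentifier(map, word, UNRESERVED | COL_NAME,
                             RESERVED | FUNC_NAME | TYPE_NAME);
}
```

Under this reading, MAP registered as both col_name and func_name is accepted as a table name because one of its categories is allowed, while AND registered as reserved is rejected.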

@Dtenwolde Dtenwolde marked this pull request as ready for review July 21, 2025 12:30
Collaborator

@Mytherin Mytherin left a comment

Thanks! Looks good - some comments from my side but none are blockers at this point - more thinking forward about how we're going to integrate this into the final parser.


namespace duckdb {
void KeywordHelper::InitializeKeywordMaps() { // Renamed for clarity
if (initialized) {
Collaborator

This needs to use std::call_once, otherwise this is prone to race conditions
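
A minimal sketch of what the std::call_once variant could look like, replacing the plain bool initialized check. KeywordHelperSketch, its members, and the keyword list are hypothetical stand-ins, not the PR's KeywordHelper.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <unordered_set>

class KeywordHelperSketch {
public:
    static KeywordHelperSketch &Instance() {
        static KeywordHelperSketch instance; // function-local statics are initialized thread-safely since C++11
        return instance;
    }

    bool IsReserved(const std::string &keyword) {
        // Exactly one caller runs the initializer; concurrent callers block until it is done.
        std::call_once(init_flag, [this] { InitializeKeywordMaps(); });
        return reserved_keywords.count(keyword) > 0;
    }

    int InitializationCount() const {
        return initialization_count;
    }

private:
    KeywordHelperSketch() = default;

    void InitializeKeywordMaps() {
        assert(initialization_count == 0); // std::call_once guarantees a single run
        reserved_keywords = {"and", "from", "select"}; // illustrative subset only
        ++initialization_count;
    }

    std::once_flag init_flag;
    std::unordered_set<std::string> reserved_keywords;
    int initialization_count = 0;
};
```

With a bare if (initialized) check, two threads can both observe false and populate the maps concurrently; std::once_flag removes that window without requiring an explicit lock on every lookup.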

@@ -3,68 +3,599 @@
namespace duckdb {

const char INLINED_PEG_GRAMMAR[] = {
"UnreservedKeyword <- 'ABORT'i /\n"
Collaborator

Is this necessary in the grammar? Can we leave this out given that we're checking this at the code-level instead?

private:
KeywordHelper();
bool initialized;
case_insensitive_set_t reserved_keyword_map;
Collaborator

For now this is fine - but we can't use globals in the final design. Globals can't be extended by extensions. Extensions can be loaded per DuckDB instance - and different DuckDB instances can have a different set of extensions loaded. The current design will therefore prohibit extensions from registering additional keywords.

I think this needs to live in the DatabaseInstance or DBConfig somehow in the future, i.e. similar to where the parser extensions currently live.
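
To sketch the direction suggested here, the example below keeps the keyword set as per-instance state instead of a global, so that two instances can see different keywords. DatabaseInstanceSketch and RegisterKeyword are hypothetical names for illustration only and do not correspond to existing DuckDB APIs.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Stand-in for a per-database object that owns its own keyword set.
class DatabaseInstanceSketch {
public:
    DatabaseInstanceSketch() : keywords {"and", "from", "select"} {
    }

    // What an extension loaded into this instance might call (hypothetical).
    void RegisterKeyword(const std::string &keyword) {
        assert(!keyword.empty());
        keywords.insert(keyword);
    }

    bool IsKeyword(const std::string &word) const {
        return keywords.count(word) > 0;
    }

private:
    std::unordered_set<std::string> keywords; // instance state, not a global
};
```

An extension loaded into one instance can then register a keyword without affecting another instance in the same process, which a process-wide keyword map cannot express.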

@Mytherin Mytherin merged commit 566e5fd into duckdb:main Jul 22, 2025
55 checks passed
@Mytherin
Collaborator

Thanks!

@Dtenwolde Dtenwolde deleted the extend_peg_parser_grammar_keywords branch July 22, 2025 13:14
krlmlr added a commit to krlmlr/duckdb-r that referenced this pull request Jul 26, 2025
[C API] Expose expressions and use them in scalar function binding (duckdb/duckdb#18142)
Extend PEG parser grammar (duckdb/duckdb#18221)
[unittest] - fix doubled error headers on `Unexpected failure` (duckdb/duckdb#18314)
Tmonster added a commit to Tmonster/duckdb-r that referenced this pull request Sep 8, 2025
Extend PEG parser grammar (duckdb/duckdb#18221)
[unittest] - fix doubled error headers on `Unexpected failure` (duckdb/duckdb#18314)
Fix condition indexes in join filter pushdown (duckdb/duckdb#18341)
download Real Nest data in quiet mode (duckdb/duckdb#18346)
Fix debug error in join order optimizer (duckdb/duckdb#18344)
Aarch64 backport (duckdb/duckdb#18345)
add the from-table-function as parameter to copy-from-bind (duckdb/duckdb#18004)
feat: making Parquet write RowGroup.total_compressed_size (duckdb/duckdb#18307)
Make storage-version a test parameter (duckdb/duckdb#18324)
New Arrow C-API (duckdb/duckdb#18246)
feat: Parquet extension add row_group_compressed_size (duckdb/duckdb#18294)
Merge ossivalis into main (duckdb/duckdb#18272)
SHOW TABLES FROM <qualified_name> (duckdb/duckdb#18179)
Add target for installing Python deps. (duckdb/duckdb#18285)
Use `FromEpochSeconds` instead of `FromTimeT` in `FileSystem::GetLastModifiedTime` (duckdb/duckdb#18281)
[Fix] Adjust test to run with different block sizes (duckdb/duckdb#18277)
Use DuckDB cast infrastructure in fmt for new uhugeint/hugeint code (duckdb/duckdb#18275)
Use set for row ID scanning during index scans (duckdb/duckdb#18274)
Add support for RETURNING to MERGE INTO (duckdb/duckdb#18271)
Support HUGEINT in printf and format (duckdb/duckdb#13277)
Expanded autocomplete suggestions (duckdb/duckdb#18243)
[Parquet] Add read support for the `VARIANT` LogicalType (with shredded encoding) (duckdb/duckdb#18224)
Reduce copy in Vector::Reinterpret (duckdb/duckdb#18264)
Fixes for gcc 15 (duckdb/duckdb#18261)
Fix dictionary-related assertions (duckdb/duckdb#18260)
Allow for static libs from extension dependencies to be bundled (duckdb/duckdb#18226)
disable WebAssembly duckdb-wasm builds job in NightlyTests triggered by 'workflow_dispatch' event (duckdb/duckdb#18129)
Bunch of loosely connected test/CI fixes (duckdb/duckdb#18254)
update run_extension_medata_tests.sh (duckdb/duckdb#17976)
fixes for some minor llvm 20 complaints (duckdb/duckdb#18257)
Fix integer overflow in sequence vector (duckdb/duckdb#18245)
Add type safety to `FlatVector::GetData<T>`, `ConstantVector::GetData<T>` and `UnifiedVectorFormat::GetData<T>` (duckdb/duckdb#18256)
Slightly higher memory limit for test (duckdb/duckdb#18235)
Improve descriptions of thresholds vars affecting join algorithm selection (duckdb/duckdb#17377)
Add support for geoarrow encoded geometries in geoparquet files. (duckdb/duckdb#17942)
Dictionary functions (duckdb/duckdb#18127)
Better `NULL` handling in `TupleDataLayout` (duckdb/duckdb#18069)
Track `DataChunk` memory usage in various places (duckdb/duckdb#18191)
[Parquet] Add read support for the `VARIANT` LogicalType (duckdb/duckdb#18187)
Bug duckdb/duckdb#18163: Fix STDDEV_SAMP nondeterminism (duckdb/duckdb#18210)
Internal duckdb/duckdb#5264: NLJ Not Distinct (duckdb/duckdb#18216)
Improve Parquet reader `NULL` statistics and compress all-`NULL` columns using `CompressedMaterialization` (duckdb/duckdb#18217)
Get type of encoded `SortKey` from `TupleDataLayout` (duckdb/duckdb#18218)
ci(pyodide): enable WASM exceptions on the latest pyodide build (duckdb/duckdb#18173)
Temporary file encryption (duckdb/duckdb#18208)
More internal-linkage (duckdb/duckdb#18177)
Two-rowID-leaf support in the conflict manager and general refactoring (duckdb/duckdb#18194)
[Parquet][Dev] Update the vendored `parquet.thrift` to `3ce0760` (duckdb/duckdb#18195)
Parquet reader logging (duckdb/duckdb#18172)
Merge `v1.3-ossivalis` into `main` (duckdb/duckdb#18188)
[Profiling] Move the client context into more write functions (duckdb/duckdb#17875)
Check if `GetLastSegment` is not `nullptr` in `ColumnData::RevertAppend` (duckdb/duckdb#18171)
Reduce lock contention for the instance cache (duckdb/duckdb#18079)
fix bug with allowed_paths (duckdb/duckdb#18176)
Avoid `realloc` in CSV writer (duckdb/duckdb#18174)
fix typo (duckdb/duckdb#18165)
Resolve some small build issues (duckdb/duckdb#18162)
Implement `replace_type` function (duckdb/duckdb#18077)
Issue duckdb/duckdb#17683: TIME_NS Compilation (duckdb/duckdb#18053)
Add support for AdbcConnectionGetObjects(table_type) (duckdb/duckdb#18066)
Detect when updates have no effect, and skip performing the actual updates if we encounter these nop updates (duckdb/duckdb#18144)
Add support for `MERGE INTO` (duckdb/duckdb#18135)
Improve sort key comparison performance (duckdb/duckdb#18131)
set ::error:: annotations for test runners (duckdb/duckdb#18072)
Internal duckdb/duckdb#3273: Window Task Generation (duckdb/duckdb#18113)
Update description of 'arrow_lossless_conversion' (duckdb/duckdb#18046)
[chore] Merge v1.3-ossivalis on main (duckdb/duckdb#18109)
ci: build duckdb against the latest emscripten (duckdb/duckdb#18110)
Don't throw `InternalException` in `Sort::Sink` (duckdb/duckdb#18105)
TPC-DS: Use BIGINT fields (duckdb/duckdb#18098)
[CI] don't run jobs on draft PRs (duckdb/duckdb#18016)
Fix correlated subquery unnest fail (duckdb/duckdb#18092)
[CSV Reader] Prohibit options delim and sep in same read_csv call (duckdb/duckdb#18096)
Add start/end offset percentage options to Python test runner (duckdb/duckdb#18091)
Avoid running DraftPR.yml until timeout if token is missing (duckdb/duckdb#18090)
Unittest: Configure skip error messages (duckdb/duckdb#18087)
Switch to Optional for type hints in polars lazy dataframe function (duckdb/duckdb#18078)
Issue duckdb/duckdb#18071: Temporal inf -inf (duckdb/duckdb#18083)
Fix some scaling issues (duckdb/duckdb#17985)
Unittester: add `on_new_connection` + `on_load` + `skip_tests` options (duckdb/duckdb#18042)
Use `timestamp_t` instead of `time_t` for file last modified time (duckdb/duckdb#18037)
Add support for class-based expression iteration (duckdb/duckdb#18070)
fix star expr exclude error (duckdb/duckdb#18063)
Adding WAL encryption (duckdb/duckdb#17955)
Avoid adding commands read from a file to the shell history (duckdb/duckdb#18057)
Remove match-case statements from polars_io.py (duckdb/duckdb#18052)
Merge ossivalis into main (duckdb/duckdb#18036)
Add ppc64le spin-wait instruction (duckdb/duckdb#17837)
Unittest: Add skip_compiled option that can be used to skip built-in C++ tests (duckdb/duckdb#18034)
[Explain] Add the YAML format for EXPLAIN statements (duckdb/duckdb#17572)
Remove Linux (32 Bit) job (duckdb/duckdb#18012)
[Chore] Minor conflict manager refactoring (duckdb/duckdb#18015)
Fix duckdb/duckdb#18007: correctly execute expressions with pivot operator (duckdb/duckdb#18020)
c-api to copy vector with selection (duckdb/duckdb#17870)
Add support to produce Polars Lazy Dataframes (duckdb/duckdb#17947)
Implement consumption and production of Arrow Binary View (duckdb/duckdb#17975)
Rework extension loading to go through thread-safe ExtensionManager (duckdb/duckdb#17994)
Issue duckdb/duckdb#5123: make_timestamp_ms (duckdb/duckdb#17908)
Flag to disable database invalidation (duckdb/duckdb#17938)
[Fix] Reset profiling info before preparing a query (duckdb/duckdb#17940)
Issue duckdb/duckdb#5144: AsOf Join Threshold (duckdb/duckdb#17979)
[CI] Skip some workflows when updating out of tree extensions SHA (duckdb/duckdb#17949)
Merge v1.3-ossivalis into main (duckdb/duckdb#17973)
[nested] Allow fixed-size arrays to be unnested (duckdb/duckdb#17968)
Unit Tester Configuration (duckdb/duckdb#17972)
[Nested] Optimize structs in `LIST_VALUE` (duckdb/duckdb#17169)
Enable building spatial and encodings extensions (duckdb/duckdb#17960)
[Nested] Add `struct_position` and `struct_contains` functions (duckdb/duckdb#17819)
Visual Studio 17 (2022) fixes (duckdb/duckdb#17948)
[CI Nightly Fix] Skip logging test if not standard block size (duckdb/duckdb#17957)
Add v1.3-ossivalis to Cross version workflow (duckdb/duckdb#17906)
Unittester failures summary (duckdb/duckdb#16833)
Block based encryption (duckdb/duckdb#17275)
Do not dispatch JDBC/ODBC jobs in release CI runs (duckdb/duckdb#17937)
fix use after free in adbc on invalid stmt (duckdb/duckdb#17927)
Fix empty BP block when writing parquet (duckdb/duckdb#17929)
Leverage `VectorType` in `ColumnDataCollection` (duckdb/duckdb#17881)
Merge v1.3 into main (duckdb/duckdb#17897)
Make CTE Materialization the Default Instead of Inlining (duckdb/duckdb#17459)
Use an arena linked list for the physical operator children (duckdb/duckdb#17748)
Reword GenAI policy (duckdb/duckdb#17895)
Issue duckdb/duckdb#17861: FILL Argument Types (duckdb/duckdb#17888)
Update function descriptions and examples for list, array, lambda functions (duckdb/duckdb#17886)
Add GenAI policy (duckdb/duckdb#17882)
Issue duckdb/duckdb#17849: Test FILL Duplicates (duckdb/duckdb#17869)
Add STRUCT to MAP cast function (duckdb/duckdb#17799)
Issue duckdb/duckdb#17040: FILL Secondary Sorts (duckdb/duckdb#17821)
Issue duckdb/duckdb#17153: Window Order Columns (duckdb/duckdb#17835)
julia: add missing methods from C-API (duckdb/duckdb#17733)
Function Serialization: adapt to removal of overloads by explicitly casting if argument types have changed (duckdb/duckdb#17864)
[Indexes] Buffer-managed indexes part 2: segment handle for base nodes (duckdb/duckdb#17828)
duckdb/duckdb#17853 Enable flexible page sizes and update Android NDK to r27 in workflow. (duckdb/duckdb#17854)
Internal duckdb/duckdb#4991: Remove Epoch_MS(MS) (duckdb/duckdb#17816)
Add `duckdb_type` column to parquet_schema (duckdb/duckdb#17852)
Merge v1.3 into main (duckdb/duckdb#17851)
Fix ICE with Windows ARM64 (duckdb/duckdb#17844)
fix: escape using_columns on JoinRef::ToString (duckdb/duckdb#17839)
Merge130 (duckdb/duckdb#17833)
Replace string for const data ptr in encryption api (duckdb/duckdb#17825)
Pushdown pivot filter (duckdb/duckdb#17801)
Merge v1.3 into main (duckdb/duckdb#17806)
Add qualified parameter to Python GetTableNames API (duckdb/duckdb#17797)
Fix propagatesNullValues for case expr (duckdb/duckdb#17796)
[Profiling] Propagate the ClientContext into file handle writes (duckdb/duckdb#17754)
Ensure we use the same layout in `RadixPartitionedHashTable` and `GroupedAggregateHashTable` (duckdb/duckdb#17790)
[Julia] api docs improvements (duckdb/duckdb#15645)
[Indexes] Buffer-managed indexes part 1: segment handles (duckdb/duckdb#17758)
Mark Upper/LowerComparisonType as const (duckdb/duckdb#17773)
Support glibc 2.28 environments (duckdb/duckdb#17776)
Pass `ExtensionLoader` when loading extensions, change extension entry function (duckdb/duckdb#17772)
Expose file_size_bytes and footer_size in parquet_file_metadata (duckdb/duckdb#17750)
[CAPI] Expose ErrorData (duckdb/duckdb#17722)
Rename decorator from test_nulls to null_test_parameters (duckdb/duckdb#17760)
re-add httpfs apply_patches (duckdb/duckdb#17755)
Deprecate windows-2019 runners (duckdb/duckdb#17745)
csv_scanner: correct code comment (duckdb/duckdb#17735)
Adding additional authenticated data for encryption (duckdb/duckdb#17508)
[SQLLogicTester] Introduce `reset label <query label>` in the tester (duckdb/duckdb#17729)
Fix windows-2025 build errors (duckdb/duckdb#17726)
Aggregation performance (duckdb/duckdb#17718)
fix linux extension ci (duckdb/duckdb#17720)
Correctly setting the delim offset (duckdb/duckdb#17716)
Sorting followup (duckdb/duckdb#17717)
Revert "set default for MAIN_BRANCH_VERSIONING to false" (duckdb/duckdb#17708)
ClientBufferManager wrapper to access the client context in the buffer manager (duckdb/duckdb#17699)
Issue duckdb/duckdb#17040: FILL Window Function (duckdb/duckdb#17686)
Merge v1.3-ossivalis into main (duckdb/duckdb#17690)
New Sorting Implementation (duckdb/duckdb#17584)
Output hashes in unittest and fix order (duckdb/duckdb#17664)
Enable profiling output for all operator types (duckdb/duckdb#17665)
[C API] Expose duckdb_scalar_function_bind_get_extra_info (duckdb/duckdb#17666)
Add rowsort in generate_series test duckdb/duckdb#43 (duckdb/duckdb#17675)
bump DuckDB_jll to v1.3.0 (duckdb/duckdb#17677)
C API tidying (duckdb/duckdb#17623)
fix extension troubleshooting link (duckdb/duckdb#17616)
Move query profiler's EndQuery after commit/rollback (duckdb/duckdb#17595)
Update function descriptions and examples (duckdb/duckdb#17132)
Add support for ToSqlString for union types (duckdb/duckdb#17513)
Remove redundant code path in the ConflictManager (duckdb/duckdb#17562)
change exception type to not be an internal exception (duckdb/duckdb#17551)
Python package devexp improvements (duckdb/duckdb#17483)