Mytherin
Collaborator

#10321 added the create_sort_key method, which takes any input type and creates a binary-comparable BLOB according to a set of ordering constraints (e.g. ASCENDING/DESCENDING/NULLS FIRST/NULLS LAST). This PR adds the inverse - DecodeSortKey - albeit not as a scalar function but only for internal usage. This function takes any binary blob created by the create_sort_key method and reconstructs the original values that were ingested.
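
To make the idea of a binary-comparable key and its inverse concrete, here is a minimal Python sketch for a single signed 64-bit integer column. The byte layout (one validity byte followed by a sign-biased big-endian payload) and the function names `encode_sort_key` / `decode_sort_key` are assumptions chosen for illustration; they are not DuckDB's actual format or API.

```python
import struct
from typing import Optional


def encode_sort_key(value: Optional[int], descending: bool = False,
                    nulls_first: bool = False) -> bytes:
    """Encode a signed 64-bit integer (or None) so that plain byte-wise
    comparison of the results matches the requested ordering.
    Illustrative layout only, not DuckDB's real encoding."""
    # The leading validity byte places NULLs before or after all valid values.
    null_byte, valid_byte = (0x00, 0x01) if nulls_first else (0x01, 0x00)
    if value is None:
        return bytes([null_byte])
    # Adding 2^63 maps the signed range onto the unsigned range, so that
    # big-endian bytes compare in numeric order.
    payload = struct.pack(">Q", value + (1 << 63))
    if descending:
        # Inverting every bit reverses the byte-wise order.
        payload = bytes(b ^ 0xFF for b in payload)
    return bytes([valid_byte]) + payload


def decode_sort_key(key: bytes, descending: bool = False,
                    nulls_first: bool = False) -> Optional[int]:
    """Inverse of encode_sort_key; must be called with the same modifiers."""
    null_byte = 0x00 if nulls_first else 0x01
    if key[0] == null_byte:
        return None
    payload = key[1:]
    if descending:
        payload = bytes(b ^ 0xFF for b in payload)
    return struct.unpack(">Q", payload)[0] - (1 << 63)


if __name__ == "__main__":
    values = [5, None, -3, 0]
    keys = sorted(encode_sort_key(v, descending=True, nulls_first=True) for v in values)
    print([decode_sort_key(k, descending=True, nulls_first=True) for k in keys])
```

Sorting the encoded keys with an ordinary byte comparison and decoding them afterwards yields the values in the requested order, which is the property the PR relies on.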

This method has multiple uses - but for this PR the primary use case is for this combination of functions to work as a fallback mechanism for types we do not specialize on. As DuckDB supports arbitrarily composable types (in the form of lists, structs and maps, which can all be nested), we cannot specialize on every type - yet there are functions that need to work on every type (e.g. MIN/MAX). By using the sort keys, we only need to implement these functions for the BLOB type - after which we can support all types through e.g. decode_sort_key(FUNCTION(create_sort_key(...))). This allows us to more easily add support for all types in a reasonably efficient manner (although not as efficient as specializing directly, of course).
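
The decode_sort_key(FUNCTION(create_sort_key(...))) pattern can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the type descriptors (`"int"` or `("list", child)`), the marker bytes, and the helper names `encode`, `decode` and `via_sort_keys` are all hypothetical, NULL handling is omitted, and the layout is not DuckDB's real one. The point is only that the aggregate itself (`min` or `max` here) needs to understand nothing but byte strings.

```python
import struct


def encode(value, typ) -> bytes:
    """Encode an int or an arbitrarily nested list of ints as an
    order-preserving byte string (illustrative layout, no NULL support)."""
    if typ == "int":
        # Sign-biased big-endian: byte order equals numeric order.
        return struct.pack(">Q", value + (1 << 63))
    _, child = typ
    out = bytearray()
    for item in value:
        # 0x01 marks "another element follows".
        out += b"\x01" + encode(item, child)
    # 0x00 terminates the list, so a shorter prefix sorts first.
    out += b"\x00"
    return bytes(out)


def decode(key: bytes, typ, pos: int = 0):
    """Inverse of encode; returns (value, position after the value)."""
    if typ == "int":
        (raw,) = struct.unpack_from(">Q", key, pos)
        return raw - (1 << 63), pos + 8
    _, child = typ
    items = []
    while key[pos] == 0x01:
        item, pos = decode(key, child, pos + 1)
        items.append(item)
    return items, pos + 1  # skip the 0x00 terminator


def via_sort_keys(blob_fn, values, typ):
    """Run a function that only understands byte strings over typed values."""
    best_key = blob_fn(encode(v, typ) for v in values)
    value, _ = decode(best_key, typ)
    return value


if __name__ == "__main__":
    list_of_int = ("list", "int")
    data = [[3, 4, 5], [1, 2, 3], [1, 2]]
    print(via_sort_keys(min, data, list_of_int))
    print(via_sort_keys(max, data, list_of_int))
```

Because only one byte string has to be kept per running minimum or maximum, the state of such an aggregate stays small regardless of how deeply the input type is nested.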

In this PR we switch the fallback implementation of MIN/MAX to use the new sort keys. The old fallback implementation constructed a Vector in the aggregate state that would hold only a single element, and used vector operations (e.g. VectorOperations::Copy) to assign elements to the aggregate. The sort key implementation is significantly more performant, as the vector methods are not designed to operate on individual elements. In addition, the new fallback is far more memory efficient, particularly when running the min/max operations on many groups, as vectors are designed to hold large arrays rather than a single value per group.

Below are some benchmarks running min/max over structs/lists. We can see the large increase in performance, especially as the number of groups increases. The old implementation would also be killed by the OOM killer when running with many groups - a problem the new implementation does not suffer from.

Ungrouped
CREATE TABLE structs AS SELECT {'i': i} s FROM range(100000000) t(i);
SELECT MIN(s), MAX(s) FROM structs;

CREATE TABLE lists AS SELECT [i, i + 1, i + 2] l FROM range(100000000) t(i);
SELECT MIN(l), MAX(l) FROM lists;
| Type | v1.0.0 | New |
|---|---|---|
| Structs | 1.50s | 0.44s |
| Lists | 1.64s | 1.48s |

10K Groups
CREATE TABLE structs AS SELECT i%10000 AS grp, {'i': i} s FROM range(100000000) t(i);
SELECT grp, MIN(s), MAX(s) FROM structs GROUP BY grp;

CREATE TABLE lists AS SELECT i%10000 AS grp, [i, i + 1, i + 2] l FROM range(100000000) t(i);
SELECT grp, MIN(l), MAX(l) FROM lists GROUP BY grp;
| Type | v1.0.0 | New |
|---|---|---|
| Structs | 4.4s | 0.44s |
| Lists | 9.4s | 2.1s |

10M Groups
CREATE TABLE structs AS SELECT i%10000000 AS grp, {'i': i} s FROM range(100000000) t(i);
SELECT grp, MIN(s), MAX(s) FROM structs GROUP BY grp;

CREATE TABLE lists AS SELECT i%10000000 AS grp, [i, i + 1, i + 2] l FROM range(100000000) t(i);
SELECT grp, MIN(l), MAX(l) FROM lists GROUP BY grp;
| Type | v1.0.0 | New |
|---|---|---|
| Structs | KILLED | 2.0s |
| Lists | KILLED | 64s |

CC @lnkuiper @hawkfish

Contributor

@hawkfish hawkfish left a comment

Looks good, but had to ask about TIME_TZ.

@Mytherin Mytherin merged commit 3641078 into duckdb:feature Jun 13, 2024
@Mytherin Mytherin deleted the deserializesortkey branch June 27, 2024 13:53
renovate bot referenced this pull request in d-issy/dotfiles Sep 9, 2024
This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [duckdb/duckdb](https://redirect.github.com/duckdb/duckdb) | minor | `v1.0.0` -> `v1.1.0` |

---

### Release Notes

<details>
<summary>duckdb/duckdb (duckdb/duckdb)</summary>

### [`v1.1.0`](https://redirect.github.com/duckdb/duckdb/releases/tag/v1.1.0): DuckDB 1.1.0 "Eatoni"

[Compare Source](https://redirect.github.com/duckdb/duckdb/compare/v1.0.0...v1.1.0)

This release of DuckDB is named "Eatoni" after Eaton's pintail (Anas
eatoni) from the southern Indian Ocean.

Please also refer to the announcement blog post:
https://duckdb.org/2024/09/09/announcing-duckdb-110

#### What's Changed

- Add feature changes back in by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/11146](https://redirect.github.com/duckdb/duckdb/pull/11146)
- Make `MultiFileReader` filename configurable by
[@&#8203;lnkuiper](https://redirect.github.com/lnkuiper) in
[https://github.com/duckdb/duckdb/pull/11178](https://redirect.github.com/duckdb/duckdb/pull/11178)
- \[Dev] Fix compilation issues on `feature` by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/11082](https://redirect.github.com/duckdb/duckdb/pull/11082)
- add query() and query_table() functions by
[@&#8203;chrisiou](https://redirect.github.com/chrisiou) in
[https://github.com/duckdb/duckdb/pull/10586](https://redirect.github.com/duckdb/duckdb/pull/10586)
- \[Block Size] Move the block allocation size into the block manager by
[@&#8203;taniabogatsch](https://redirect.github.com/taniabogatsch) in
[https://github.com/duckdb/duckdb/pull/11176](https://redirect.github.com/duckdb/duckdb/pull/11176)
- LIMIT pushdown below PROJECT by
[@&#8203;jeewonhh](https://redirect.github.com/jeewonhh) in
[https://github.com/duckdb/duckdb/pull/11112](https://redirect.github.com/duckdb/duckdb/pull/11112)
- BUGFIX: IN () filter with one argument should translate to = filter.
by [@&#8203;Tmonster](https://redirect.github.com/Tmonster) in
[https://github.com/duckdb/duckdb/pull/11473](https://redirect.github.com/duckdb/duckdb/pull/11473)
- Regression Script should calculate micro benchmark differences with
the correct base branch by
[@&#8203;Tmonster](https://redirect.github.com/Tmonster) in
[https://github.com/duckdb/duckdb/pull/11762](https://redirect.github.com/duckdb/duckdb/pull/11762)
- Pushdown filters on window partitions by
[@&#8203;Tmonster](https://redirect.github.com/Tmonster) in
[https://github.com/duckdb/duckdb/pull/10932](https://redirect.github.com/duckdb/duckdb/pull/10932)
- Arrow ListView Type by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/10766](https://redirect.github.com/duckdb/duckdb/pull/10766)
- Add scalar function support to the C API by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/11786](https://redirect.github.com/duckdb/duckdb/pull/11786)
- Add TopN optimization in physical plan mapping by
[@&#8203;kryonix](https://redirect.github.com/kryonix) in
[https://github.com/duckdb/duckdb/pull/11290](https://redirect.github.com/duckdb/duckdb/pull/11290)
- Join-dependent filter derivation by
[@&#8203;lnkuiper](https://redirect.github.com/lnkuiper) in
[https://github.com/duckdb/duckdb/pull/11272](https://redirect.github.com/duckdb/duckdb/pull/11272)
- Implement `ROW_GROUPS_PER_FILE` for Parquet by
[@&#8203;lnkuiper](https://redirect.github.com/lnkuiper) in
[https://github.com/duckdb/duckdb/pull/11249](https://redirect.github.com/duckdb/duckdb/pull/11249)
- Prefer Final projected columns on probe side if cardinalities are
similar by [@&#8203;Tmonster](https://redirect.github.com/Tmonster) in
[https://github.com/duckdb/duckdb/pull/11109](https://redirect.github.com/duckdb/duckdb/pull/11109)
- Propagate unused columns to distinct on by
[@&#8203;Tmonster](https://redirect.github.com/Tmonster) in
[https://github.com/duckdb/duckdb/pull/11006](https://redirect.github.com/duckdb/duckdb/pull/11006)
- Separate eviction queues by `FileBufferType` by
[@&#8203;lnkuiper](https://redirect.github.com/lnkuiper) in
[https://github.com/duckdb/duckdb/pull/11417](https://redirect.github.com/duckdb/duckdb/pull/11417)
- Disable false positive for vector size nightly in test by
[@&#8203;taniabogatsch](https://redirect.github.com/taniabogatsch) in
[https://github.com/duckdb/duckdb/pull/11953](https://redirect.github.com/duckdb/duckdb/pull/11953)
- Rework jemalloc extension by
[@&#8203;lnkuiper](https://redirect.github.com/lnkuiper) in
[https://github.com/duckdb/duckdb/pull/11891](https://redirect.github.com/duckdb/duckdb/pull/11891)
- Tweak jemalloc config by
[@&#8203;lnkuiper](https://redirect.github.com/lnkuiper) in
[https://github.com/duckdb/duckdb/pull/12034](https://redirect.github.com/duckdb/duckdb/pull/12034)
- Httpfs test to nightly by
[@&#8203;carlopi](https://redirect.github.com/carlopi) in
[https://github.com/duckdb/duckdb/pull/12196](https://redirect.github.com/duckdb/duckdb/pull/12196)
- Removed three reinterpret casts and some rewriting by
[@&#8203;taniabogatsch](https://redirect.github.com/taniabogatsch) in
[https://github.com/duckdb/duckdb/pull/12200](https://redirect.github.com/duckdb/duckdb/pull/12200)
- Begin Profiling Rework to move towards Modularity by
[@&#8203;maiadegraaf](https://redirect.github.com/maiadegraaf) in
[https://github.com/duckdb/duckdb/pull/11101](https://redirect.github.com/duckdb/duckdb/pull/11101)
- \[CLI] Add highlighting + limited auto-complete for shell dot commands
by [@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12201](https://redirect.github.com/duckdb/duckdb/pull/12201)
- Skip test to fix block size nightly and add more explicit error
checking by
[@&#8203;taniabogatsch](https://redirect.github.com/taniabogatsch) in
[https://github.com/duckdb/duckdb/pull/12211](https://redirect.github.com/duckdb/duckdb/pull/12211)
- Remove BLOCK_ALLOC_SIZE from the column segment files by
[@&#8203;taniabogatsch](https://redirect.github.com/taniabogatsch) in
[https://github.com/duckdb/duckdb/pull/11474](https://redirect.github.com/duckdb/duckdb/pull/11474)
- \[Julia] - Added optional `schema` input argument to `DuckDB.Appender`
constructor by [@&#8203;curtd](https://redirect.github.com/curtd) in
[https://github.com/duckdb/duckdb/pull/12174](https://redirect.github.com/duckdb/duckdb/pull/12174)
- Fix Mark Index in the Bound Join Ref by
[@&#8203;pdet](https://redirect.github.com/pdet) in
[https://github.com/duckdb/duckdb/pull/12263](https://redirect.github.com/duckdb/duckdb/pull/12263)
- Fix for CI Regression Failure by
[@&#8203;maiadegraaf](https://redirect.github.com/maiadegraaf) in
[https://github.com/duckdb/duckdb/pull/12273](https://redirect.github.com/duckdb/duckdb/pull/12273)
- 🦆 by [@&#8203;samansmink](https://redirect.github.com/samansmink) in
[https://github.com/duckdb/duckdb/pull/12303](https://redirect.github.com/duckdb/duckdb/pull/12303)
- Disable `JEMALLOC_RETAIN` by
[@&#8203;lnkuiper](https://redirect.github.com/lnkuiper) in
[https://github.com/duckdb/duckdb/pull/12185](https://redirect.github.com/duckdb/duckdb/pull/12185)
- Enforce compression extensions for CSV Files by
[@&#8203;pdet](https://redirect.github.com/pdet) in
[https://github.com/duckdb/duckdb/pull/11903](https://redirect.github.com/duckdb/duckdb/pull/11903)
- Make spuriously failing test more robust by
[@&#8203;lnkuiper](https://redirect.github.com/lnkuiper) in
[https://github.com/duckdb/duckdb/pull/12306](https://redirect.github.com/duckdb/duckdb/pull/12306)
- Add new extensions to issue template by
[@&#8203;szarnyasg](https://redirect.github.com/szarnyasg) in
[https://github.com/duckdb/duckdb/pull/12313](https://redirect.github.com/duckdb/duckdb/pull/12313)
- \[Fix] Block size nightly run by
[@&#8203;taniabogatsch](https://redirect.github.com/taniabogatsch) in
[https://github.com/duckdb/duckdb/pull/12283](https://redirect.github.com/duckdb/duckdb/pull/12283)
- Spell Check | Nothing Major | Corrected base_scanner.cpp by
[@&#8203;nj7](https://redirect.github.com/nj7) in
[https://github.com/duckdb/duckdb/pull/12282](https://redirect.github.com/duckdb/duckdb/pull/12282)
- add duckdb_bind_timestamp_tz function to C API by
[@&#8203;karlseguin](https://redirect.github.com/karlseguin) in
[https://github.com/duckdb/duckdb/pull/12151](https://redirect.github.com/duckdb/duckdb/pull/12151)
- \[Python] Add some date/datetime functions to pyspark api by
[@&#8203;mariotaddeucci](https://redirect.github.com/mariotaddeucci) in
[https://github.com/duckdb/duckdb/pull/12075](https://redirect.github.com/duckdb/duckdb/pull/12075)
- Fixes to Windows workflow and ubuntu\_18 action by
[@&#8203;carlopi](https://redirect.github.com/carlopi) in
[https://github.com/duckdb/duckdb/pull/12308](https://redirect.github.com/duckdb/duckdb/pull/12308)
- \[Extension Dev] Forward declare re2 in `hive_partitioning.hpp` by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12302](https://redirect.github.com/duckdb/duckdb/pull/12302)
- add expected errors to test/sql/copy/per_thread_output.test by
[@&#8203;hmeriann](https://redirect.github.com/hmeriann) in
[https://github.com/duckdb/duckdb/pull/12280](https://redirect.github.com/duckdb/duckdb/pull/12280)
- Issue
[#&#8203;12287](https://redirect.github.com/duckdb/duckdb/issues/12287):
ICU Strptime Lists by
[@&#8203;hawkfish](https://redirect.github.com/hawkfish) in
[https://github.com/duckdb/duckdb/pull/12295](https://redirect.github.com/duckdb/duckdb/pull/12295)
- Issue
[#&#8203;12171](https://redirect.github.com/duckdb/duckdb/issues/12171):
Streaming Window FILTER by
[@&#8203;hawkfish](https://redirect.github.com/hawkfish) in
[https://github.com/duckdb/duckdb/pull/12250](https://redirect.github.com/duckdb/duckdb/pull/12250)
- \[Python] Update the Connection wrapper generation, now generates c++
code by [@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12216](https://redirect.github.com/duckdb/duckdb/pull/12216)
- Use iterator buffer position when storing buffer handles by
[@&#8203;pdet](https://redirect.github.com/pdet) in
[https://github.com/duckdb/duckdb/pull/12315](https://redirect.github.com/duckdb/duckdb/pull/12315)
- Bump Julia client to v0.10.3 by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12323](https://redirect.github.com/duckdb/duckdb/pull/12323)
- Fix
[#&#8203;12286](https://redirect.github.com/duckdb/duckdb/issues/12286)
- in the MetadataManager, prefer to allocate new blocks if the next free
block id is smaller than the currently used metadata block by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12318](https://redirect.github.com/duckdb/duckdb/pull/12318)
- \[Fix] Only read file size if file handle still exists by
[@&#8203;taniabogatsch](https://redirect.github.com/taniabogatsch) in
[https://github.com/duckdb/duckdb/pull/12319](https://redirect.github.com/duckdb/duckdb/pull/12319)
- Add support for APPEND argument to hive partitioned write by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12262](https://redirect.github.com/duckdb/duckdb/pull/12262)
- Remove all reinterpret casts from the transformer by
[@&#8203;taniabogatsch](https://redirect.github.com/taniabogatsch) in
[https://github.com/duckdb/duckdb/pull/12320](https://redirect.github.com/duckdb/duckdb/pull/12320)
- Additional check for overlapping CTE names by
[@&#8203;lnkuiper](https://redirect.github.com/lnkuiper) in
[https://github.com/duckdb/duckdb/pull/12305](https://redirect.github.com/duckdb/duckdb/pull/12305)
- \[Dev] `STANDARD_VECTOR_SIZE` and `BLOCK_ALLOC_SIZE` can now be set
through the Makefile by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12164](https://redirect.github.com/duckdb/duckdb/pull/12164)
- \[Upsert] Fix issue with lambdas in `DO UPDATE SET` expressions by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/11866](https://redirect.github.com/duckdb/duckdb/pull/11866)
- \[Python] Fix scoping issue for `pandas_analyze_sample` setting by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/11706](https://redirect.github.com/duckdb/duckdb/pull/11706)
- Support REGEX matches expected error message by
[@&#8203;hmeriann](https://redirect.github.com/hmeriann) in
[https://github.com/duckdb/duckdb/pull/12327](https://redirect.github.com/duckdb/duckdb/pull/12327)
- Allow run_fuzzer to reduce multi statements. by
[@&#8203;Tmonster](https://redirect.github.com/Tmonster) in
[https://github.com/duckdb/duckdb/pull/12278](https://redirect.github.com/duckdb/duckdb/pull/12278)
- Fix
[#&#8203;12328](https://redirect.github.com/duckdb/duckdb/issues/12328)
- when flattening STRUCT vectors with NULL values, we need to flatten
the children recursively as well by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12332](https://redirect.github.com/duckdb/duckdb/pull/12332)
- Make `dbgen` generate data in parallel by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12337](https://redirect.github.com/duckdb/duckdb/pull/12337)
- dbgen: skip parallel generation if DUCKDB_NO_THREADS is set by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12341](https://redirect.github.com/duckdb/duckdb/pull/12341)
- Add prefix prefix_front_back. to get prefix_front\_ and prefix_back\_
by [@&#8203;liujiayi771](https://redirect.github.com/liujiayi771) in
[https://github.com/duckdb/duckdb/pull/12344](https://redirect.github.com/duckdb/duckdb/pull/12344)
- Issue
[#&#8203;12171](https://redirect.github.com/duckdb/duckdb/issues/12171):
Streaming Windowed DISTINCT by
[@&#8203;hawkfish](https://redirect.github.com/hawkfish) in
[https://github.com/duckdb/duckdb/pull/12311](https://redirect.github.com/duckdb/duckdb/pull/12311)
- Update README by
[@&#8203;szarnyasg](https://redirect.github.com/szarnyasg) in
[https://github.com/duckdb/duckdb/pull/12357](https://redirect.github.com/duckdb/duckdb/pull/12357)
- \[CSV Reader] \[Skip Option] Tests and fixes by
[@&#8203;pdet](https://redirect.github.com/pdet) in
[https://github.com/duckdb/duckdb/pull/12213](https://redirect.github.com/duckdb/duckdb/pull/12213)
- Adjust BM25 score in FTS extension to prevent negative scores by
[@&#8203;lnkuiper](https://redirect.github.com/lnkuiper) in
[https://github.com/duckdb/duckdb/pull/12356](https://redirect.github.com/duckdb/duckdb/pull/12356)
- Fix typos by
[@&#8203;szarnyasg](https://redirect.github.com/szarnyasg) in
[https://github.com/duckdb/duckdb/pull/12360](https://redirect.github.com/duckdb/duckdb/pull/12360)
- Fix
[#&#8203;12293](https://redirect.github.com/duckdb/duckdb/issues/12293)
- accept NULL values in generate_series with timestamp by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12367](https://redirect.github.com/duckdb/duckdb/pull/12367)
- Fix
[#&#8203;12335](https://redirect.github.com/duckdb/duckdb/issues/12335):
avoid calling fsync when writing Parquet files, instead just close the
file by [@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12371](https://redirect.github.com/duckdb/duckdb/pull/12371)
- Fix parameters passed down to other workflows in OnTag.yml by
[@&#8203;carlopi](https://redirect.github.com/carlopi) in
[https://github.com/duckdb/duckdb/pull/12369](https://redirect.github.com/duckdb/duckdb/pull/12369)
- \[Python] Fixes for the SQLLogicTest runner implementation by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12372](https://redirect.github.com/duckdb/duckdb/pull/12372)
- Bump julia to v1.0.0 by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12379](https://redirect.github.com/duckdb/duckdb/pull/12379)
- Fix
[#&#8203;11921](https://redirect.github.com/duckdb/duckdb/issues/11921)
- varchar -> timestamp casts are not invertible by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12376](https://redirect.github.com/duckdb/duckdb/pull/12376)
- Upgrade utf8proc - and move our custom extensions out of utf8proc
itself by [@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12373](https://redirect.github.com/duckdb/duckdb/pull/12373)
- change max_queries number back to 2000 by
[@&#8203;Tmonster](https://redirect.github.com/Tmonster) in
[https://github.com/duckdb/duckdb/pull/12375](https://redirect.github.com/duckdb/duckdb/pull/12375)
- Remove sqlsmith extension by
[@&#8203;Tmonster](https://redirect.github.com/Tmonster) in
[https://github.com/duckdb/duckdb/pull/12300](https://redirect.github.com/duckdb/duckdb/pull/12300)
- Reorder semi and anti joins. by
[@&#8203;Tmonster](https://redirect.github.com/Tmonster) in
[https://github.com/duckdb/duckdb/pull/11815](https://redirect.github.com/duckdb/duckdb/pull/11815)
- Issue
[#&#8203;12351](https://redirect.github.com/duckdb/duckdb/issues/12351):
implicit cast to `TIMESTAMP_MS`, `TIMESTAMP_S`, `TIMESTAMP_NS` from
`DATE` values by
[@&#8203;akoshchiy](https://redirect.github.com/akoshchiy) in
[https://github.com/duckdb/duckdb/pull/12352](https://redirect.github.com/duckdb/duckdb/pull/12352)
- Issue
[#&#8203;10023](https://redirect.github.com/duckdb/duckdb/issues/10023):
Approx_Count_Distinct Memory Usage by
[@&#8203;hawkfish](https://redirect.github.com/hawkfish) in
[https://github.com/duckdb/duckdb/pull/12355](https://redirect.github.com/duckdb/duckdb/pull/12355)
- Fix a small typo in dev instructions for swift setup by
[@&#8203;gjmwoods](https://redirect.github.com/gjmwoods) in
[https://github.com/duckdb/duckdb/pull/12383](https://redirect.github.com/duckdb/duckdb/pull/12383)
- Release lock before returning `BufferHandle` in
`StandardBufferManager::Pin` by
[@&#8203;lnkuiper](https://redirect.github.com/lnkuiper) in
[https://github.com/duckdb/duckdb/pull/12391](https://redirect.github.com/duckdb/duckdb/pull/12391)
- Remote attach autoload by
[@&#8203;carlopi](https://redirect.github.com/carlopi) in
[https://github.com/duckdb/duckdb/pull/12393](https://redirect.github.com/duckdb/duckdb/pull/12393)
- Add JSON type to Parquet reader/writer by
[@&#8203;lnkuiper](https://redirect.github.com/lnkuiper) in
[https://github.com/duckdb/duckdb/pull/12222](https://redirect.github.com/duckdb/duckdb/pull/12222)
- Add `RETURN_FILES` parameter to `COPY TO` by
[@&#8203;lnkuiper](https://redirect.github.com/lnkuiper) in
[https://github.com/duckdb/duckdb/pull/12220](https://redirect.github.com/duckdb/duckdb/pull/12220)
- Updated JoinHashTable to use linear probing to resolve hash collisions
by [@&#8203;gropaul](https://redirect.github.com/gropaul) in
[https://github.com/duckdb/duckdb/pull/11472](https://redirect.github.com/duckdb/duckdb/pull/11472)
- \[Benchmark Runner] Add `--disable-timeout` flag by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12387](https://redirect.github.com/duckdb/duckdb/pull/12387)
- Don't replace unicode spaces within `$$` quotes in query strings by
[@&#8203;lnkuiper](https://redirect.github.com/lnkuiper) in
[https://github.com/duckdb/duckdb/pull/12405](https://redirect.github.com/duckdb/duckdb/pull/12405)
- \[Python] Fix fatal exception caused by empty Pandas Categorical
objects. by [@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12370](https://redirect.github.com/duckdb/duckdb/pull/12370)
- Release CSV Blocks when acquiring new blocks if single threaded by
[@&#8203;pdet](https://redirect.github.com/pdet) in
[https://github.com/duckdb/duckdb/pull/12409](https://redirect.github.com/duckdb/duckdb/pull/12409)
- Add support for prefetching multiple adjacent blocks in a single
batched read when attaching to remote databases by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12413](https://redirect.github.com/duckdb/duckdb/pull/12413)
- MatchRegex() fixed to do not return false positive result by
[@&#8203;hmeriann](https://redirect.github.com/hmeriann) in
[https://github.com/duckdb/duckdb/pull/12396](https://redirect.github.com/duckdb/duckdb/pull/12396)
- Expected errors 2053 by
[@&#8203;hmeriann](https://redirect.github.com/hmeriann) in
[https://github.com/duckdb/duckdb/pull/12392](https://redirect.github.com/duckdb/duckdb/pull/12392)
- \[C-API] Catch exception in `duckdb_execute_prepared` by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12414](https://redirect.github.com/duckdb/duckdb/pull/12414)
- Combining LIST_CONCAT and CONCAT binding by
[@&#8203;maiadegraaf](https://redirect.github.com/maiadegraaf) in
[https://github.com/duckdb/duckdb/pull/12317](https://redirect.github.com/duckdb/duckdb/pull/12317)
- \[Appender] Add `AppendDefault` by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/11905](https://redirect.github.com/duckdb/duckdb/pull/11905)
- \[Python Dev] Push CTE internally for every (python) replacement scan
that occurred. by [@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12161](https://redirect.github.com/duckdb/duckdb/pull/12161)
- Improve compiler compatibility by
[@&#8203;krlmlr](https://redirect.github.com/krlmlr) in
[https://github.com/duckdb/duckdb/pull/12401](https://redirect.github.com/duckdb/duckdb/pull/12401)
- Write zero-length list offsets for NULL values when serializing
vectors by [@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12423](https://redirect.github.com/duckdb/duckdb/pull/12423)
- Get column statistics if Logical Get has a statistics function by
[@&#8203;jeewonhh](https://redirect.github.com/jeewonhh) in
[https://github.com/duckdb/duckdb/pull/12424](https://redirect.github.com/duckdb/duckdb/pull/12424)
- jemalloc: Identify GNU source code properly by
[@&#8203;lnkuiper](https://redirect.github.com/lnkuiper) in
[https://github.com/duckdb/duckdb/pull/12420](https://redirect.github.com/duckdb/duckdb/pull/12420)
- Avoid parallelizing LIMIT clauses when the query plan is simple by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12433](https://redirect.github.com/duckdb/duckdb/pull/12433)
- Prefetch metadata blocks for remote files by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12437](https://redirect.github.com/duckdb/duckdb/pull/12437)
- \[Jupyter] Remove width limit on the BoxRenderer config by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12443](https://redirect.github.com/duckdb/duckdb/pull/12443)
- Revert
[#&#8203;10865](https://redirect.github.com/duckdb/duckdb/issues/10865)
by [@&#8203;carlopi](https://redirect.github.com/carlopi) in
[https://github.com/duckdb/duckdb/pull/12426](https://redirect.github.com/duckdb/duckdb/pull/12426)
- inline delta by
[@&#8203;samansmink](https://redirect.github.com/samansmink) in
[https://github.com/duckdb/duckdb/pull/12435](https://redirect.github.com/duckdb/duckdb/pull/12435)
- Account for *tagged* dollar-quoted strings when stripping unicode
spaces by [@&#8203;lnkuiper](https://redirect.github.com/lnkuiper) in
[https://github.com/duckdb/duckdb/pull/12421](https://redirect.github.com/duckdb/duckdb/pull/12421)
- Work-around for broken github windows runner by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12447](https://redirect.github.com/duckdb/duckdb/pull/12447)
- Prevents clearing of the types of the LogicalExecute operator by
[@&#8203;NiclasHaderer](https://redirect.github.com/NiclasHaderer) in
[https://github.com/duckdb/duckdb/pull/12436](https://redirect.github.com/duckdb/duckdb/pull/12436)
- Add support for BEGIN TRANSACTION READ ONLY by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12202](https://redirect.github.com/duckdb/duckdb/pull/12202)
- Make `range` and `generate_series` table in-out functions, and fix
several issues with table in-out functions by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12431](https://redirect.github.com/duckdb/duckdb/pull/12431)
- Issue
[#&#8203;12412](https://redirect.github.com/duckdb/duckdb/issues/12412):
AsOf Filter Push by
[@&#8203;hawkfish](https://redirect.github.com/hawkfish) in
[https://github.com/duckdb/duckdb/pull/12448](https://redirect.github.com/duckdb/duckdb/pull/12448)
- \[Fix] Block Size Nightly by
[@&#8203;taniabogatsch](https://redirect.github.com/taniabogatsch) in
[https://github.com/duckdb/duckdb/pull/12427](https://redirect.github.com/duckdb/duckdb/pull/12427)
- \[ART] Remove Flatten and template key generation by
[@&#8203;taniabogatsch](https://redirect.github.com/taniabogatsch) in
[https://github.com/duckdb/duckdb/pull/12428](https://redirect.github.com/duckdb/duckdb/pull/12428)
- \[Python] Clean up internals of `execute` / `executemany` by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12434](https://redirect.github.com/duckdb/duckdb/pull/12434)
- By default attach remote databases as READ_ONLY by
[@&#8203;carlopi](https://redirect.github.com/carlopi) in
[https://github.com/duckdb/duckdb/pull/12461](https://redirect.github.com/duckdb/duckdb/pull/12461)
- Fix
[#&#8203;11837](https://redirect.github.com/duckdb/duckdb/issues/11837):
use internal physical type for FIRST/LAST/ANY_VALUE instead of logical
type by [@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12462](https://redirect.github.com/duckdb/duckdb/pull/12462)
- Issue
[#&#8203;12464](https://redirect.github.com/duckdb/duckdb/issues/12464):
Windowed Order By All by
[@&#8203;hawkfish](https://redirect.github.com/hawkfish) in
[https://github.com/duckdb/duckdb/pull/12470](https://redirect.github.com/duckdb/duckdb/pull/12470)
- Specialize `list_value` for primitive types for significantly improved
performance by [@&#8203;Mytherin](https://redirect.github.com/Mytherin)
in
[https://github.com/duckdb/duckdb/pull/12468](https://redirect.github.com/duckdb/duckdb/pull/12468)
- \[Dev] Remove dead code from `PhysicalBatchCopyToFile` by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12459](https://redirect.github.com/duckdb/duckdb/pull/12459)
- Disable Windows extensions CI until Github actions runners are fixed
by [@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12479](https://redirect.github.com/duckdb/duckdb/pull/12479)
- \[Fix] access_mode now lives in AttachOptions by
[@&#8203;taniabogatsch](https://redirect.github.com/taniabogatsch) in
[https://github.com/duckdb/duckdb/pull/12482](https://redirect.github.com/duckdb/duckdb/pull/12482)
- Internal
[#&#8203;2186](https://redirect.github.com/duckdb/duckdb/issues/2186):
Nanosecond Functionality by
[@&#8203;hawkfish](https://redirect.github.com/hawkfish) in
[https://github.com/duckdb/duckdb/pull/12440](https://redirect.github.com/duckdb/duckdb/pull/12440)
- \[C-API] Fix leak in `duckdb_create_config` by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12465](https://redirect.github.com/duckdb/duckdb/pull/12465)
- \[Python] No longer scan the entire frame lineage in a replacement
scan, added option to disable (python) replacements entirely by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12425](https://redirect.github.com/duckdb/duckdb/pull/12425)
- throw binder error for comment on system catalog by
[@&#8203;samansmink](https://redirect.github.com/samansmink) in
[https://github.com/duckdb/duckdb/pull/12486](https://redirect.github.com/duckdb/duckdb/pull/12486)
- Parquet reader performance by
[@&#8203;lnkuiper](https://redirect.github.com/lnkuiper) in
[https://github.com/duckdb/duckdb/pull/12478](https://redirect.github.com/duckdb/duckdb/pull/12478)
- Operators the Optimizer can skip by
[@&#8203;Tmonster](https://redirect.github.com/Tmonster) in
[https://github.com/duckdb/duckdb/pull/12489](https://redirect.github.com/duckdb/duckdb/pull/12489)
- Fixes clang conversion warnings by
[@&#8203;TinyTinni](https://redirect.github.com/TinyTinni) in
[https://github.com/duckdb/duckdb/pull/12467](https://redirect.github.com/duckdb/duckdb/pull/12467)
- Avoid creating internal schemas as non-internal when reading old
database files by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12456](https://redirect.github.com/duckdb/duckdb/pull/12456)
- Allow parquet encryption/decryption keys to be passed in as base64
encoded strings by
[@&#8203;elefeint](https://redirect.github.com/elefeint) in
[https://github.com/duckdb/duckdb/pull/12445](https://redirect.github.com/duckdb/duckdb/pull/12445)
- \[Block Size] Introducing CompressionInfo by
[@&#8203;taniabogatsch](https://redirect.github.com/taniabogatsch) in
[https://github.com/duckdb/duckdb/pull/12481](https://redirect.github.com/duckdb/duckdb/pull/12481)
- add the number of filtered files to explain by
[@&#8203;samansmink](https://redirect.github.com/samansmink) in
[https://github.com/duckdb/duckdb/pull/12488](https://redirect.github.com/duckdb/duckdb/pull/12488)
- Implement Map Type Detection for JSON Reader by
[@&#8203;ZiyaZa](https://redirect.github.com/ZiyaZa) in
[https://github.com/duckdb/duckdb/pull/11285](https://redirect.github.com/duckdb/duckdb/pull/11285)
- \[Dev] Remove busy-spin from `ClientContext::ExecuteTaskInternal` by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12483](https://redirect.github.com/duckdb/duckdb/pull/12483)
- Pluggable collations by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12492](https://redirect.github.com/duckdb/duckdb/pull/12492)
- \[Dev] Don't fail `make generate-files` if the python code generation
fails by [@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12500](https://redirect.github.com/duckdb/duckdb/pull/12500)
- Optimize `EXTRACT(year/month/day FROM date/timestamp)` by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12499](https://redirect.github.com/duckdb/duckdb/pull/12499)
- \[Fix] Remove BLOCK_ALLOC_SIZE in the single file block manager by
[@&#8203;taniabogatsch](https://redirect.github.com/taniabogatsch) in
[https://github.com/duckdb/duckdb/pull/12502](https://redirect.github.com/duckdb/duckdb/pull/12502)
- Revert Windows CI fixes by
[@&#8203;carlopi](https://redirect.github.com/carlopi) in
[https://github.com/duckdb/duckdb/pull/12510](https://redirect.github.com/duckdb/duckdb/pull/12510)
- Fix
[#&#8203;12467](https://redirect.github.com/duckdb/duckdb/issues/12467)
changes to covariance calculation by
[@&#8203;carlopi](https://redirect.github.com/carlopi) in
[https://github.com/duckdb/duckdb/pull/12515](https://redirect.github.com/duckdb/duckdb/pull/12515)
- \[Python] Fix reading strided `datetime` and `timedelta` columns by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12519](https://redirect.github.com/duckdb/duckdb/pull/12519)
- Add method for decoding sort keys, and use this in min/max for
arbitrary types by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12520](https://redirect.github.com/duckdb/duckdb/pull/12520)
- Reduce allocations & use predication in ColumnSegment::FilterSelection
by [@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12521](https://redirect.github.com/duckdb/duckdb/pull/12521)
- Skip only built-in optimizers by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12522](https://redirect.github.com/duckdb/duckdb/pull/12522)
- Improve min/max performance for strings and fallback types by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12524](https://redirect.github.com/duckdb/duckdb/pull/12524)
- Move arg_min/arg_max to use sort keys by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12525](https://redirect.github.com/duckdb/duckdb/pull/12525)
- Move FIRST/LAST/ANY_VALUE to use sort keys by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12526](https://redirect.github.com/duckdb/duckdb/pull/12526)
- CMake: use GNUInstallDirs as defaults for
INSTALL\_{BIN,LIB,INCLUDE}\_DIR by
[@&#8203;paparodeo](https://redirect.github.com/paparodeo) in
[https://github.com/duckdb/duckdb/pull/12509](https://redirect.github.com/duckdb/duckdb/pull/12509)
- More formatting and fix to stddev by
[@&#8203;carlopi](https://redirect.github.com/carlopi) in
[https://github.com/duckdb/duckdb/pull/12516](https://redirect.github.com/duckdb/duckdb/pull/12516)
- Linux Extensions CI: Attempt at fix missing dependencies by
[@&#8203;carlopi](https://redirect.github.com/carlopi) in
[https://github.com/duckdb/duckdb/pull/12429](https://redirect.github.com/duckdb/duckdb/pull/12429)
- Fix checkouts by
[@&#8203;carlopi](https://redirect.github.com/carlopi) in
[https://github.com/duckdb/duckdb/pull/12366](https://redirect.github.com/duckdb/duckdb/pull/12366)
- Etag if none match for extension install by
[@&#8203;carlopi](https://redirect.github.com/carlopi) in
[https://github.com/duckdb/duckdb/pull/12333](https://redirect.github.com/duckdb/duckdb/pull/12333)
- \[Block Size] FixedSizeAllocator, MetadataManager, PartialBlockManager
by [@&#8203;taniabogatsch](https://redirect.github.com/taniabogatsch) in
[https://github.com/duckdb/duckdb/pull/12514](https://redirect.github.com/duckdb/duckdb/pull/12514)
- \[Python] Skip the PandasAnalyzer if dtype is `'string'` by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12511](https://redirect.github.com/duckdb/duckdb/pull/12511)
- \[StreamQueryResult] Batched variant of the StreamQueryResult
collector by [@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/11494](https://redirect.github.com/duckdb/duckdb/pull/11494)
- Move many tests to slow by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12534](https://redirect.github.com/duckdb/duckdb/pull/12534)
- Add support for `arg_min(ANY, ANY)` by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12532](https://redirect.github.com/duckdb/duckdb/pull/12532)
- Avoid overriding types in PrepareTypeForCast when not required by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12539](https://redirect.github.com/duckdb/duckdb/pull/12539)
- Support all types in `histogram` function by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12538](https://redirect.github.com/duckdb/duckdb/pull/12538)
- \[Python] Remove busy-spin during execution by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12512](https://redirect.github.com/duckdb/duckdb/pull/12512)
- \[Block Size] String space constant by
[@&#8203;taniabogatsch](https://redirect.github.com/taniabogatsch) in
[https://github.com/duckdb/duckdb/pull/12537](https://redirect.github.com/duckdb/duckdb/pull/12537)
- Use string_t instead of std::string in histogram by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12545](https://redirect.github.com/duckdb/duckdb/pull/12545)
- Add support for binned histograms by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12548](https://redirect.github.com/duckdb/duckdb/pull/12548)
- \[Upsert] Fix RETURNING for `DO NOTHING` by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12554](https://redirect.github.com/duckdb/duckdb/pull/12554)
- Build Android Binaries by
[@&#8203;hannes](https://redirect.github.com/hannes) in
[https://github.com/duckdb/duckdb/pull/12550](https://redirect.github.com/duckdb/duckdb/pull/12550)
- \[CI] Remove pyarrow version lock by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12566](https://redirect.github.com/duckdb/duckdb/pull/12566)
- \[Dev] Change tests: np.NaN -> np.nan by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12565](https://redirect.github.com/duckdb/duckdb/pull/12565)
- Internal
[#&#8203;2017](https://redirect.github.com/duckdb/duckdb/issues/2017):
DECIMAL Downcast Rounding by
[@&#8203;hawkfish](https://redirect.github.com/hawkfish) in
[https://github.com/duckdb/duckdb/pull/12036](https://redirect.github.com/duckdb/duckdb/pull/12036)
- Issue
[#&#8203;12204](https://redirect.github.com/duckdb/duckdb/issues/12204):
Summarize Temporal Quantiles by
[@&#8203;hawkfish](https://redirect.github.com/hawkfish) in
[https://github.com/duckdb/duckdb/pull/12297](https://redirect.github.com/duckdb/duckdb/pull/12297)
- Internal
[#&#8203;2186](https://redirect.github.com/duckdb/duckdb/issues/2186):
Nanosecond StrTimeFormat by
[@&#8203;hawkfish](https://redirect.github.com/hawkfish) in
[https://github.com/duckdb/duckdb/pull/12551](https://redirect.github.com/duckdb/duckdb/pull/12551)
- Add support for `equi_width_bins` function to compute histogram
boundaries by [@&#8203;Mytherin](https://redirect.github.com/Mytherin)
in
[https://github.com/duckdb/duckdb/pull/12574](https://redirect.github.com/duckdb/duckdb/pull/12574)
- add support for casting 'yes'/'no' strings to boolean values by
[@&#8203;chrisiou](https://redirect.github.com/chrisiou) in
[https://github.com/duckdb/duckdb/pull/12501](https://redirect.github.com/duckdb/duckdb/pull/12501)
- Julia: Add chunked results with Tables.partitions() by
[@&#8203;frankier](https://redirect.github.com/frankier) in
[https://github.com/duckdb/duckdb/pull/12395](https://redirect.github.com/duckdb/duckdb/pull/12395)
- \[PySpark] - Allow spark session range by
[@&#8203;mariotaddeucci](https://redirect.github.com/mariotaddeucci) in
[https://github.com/duckdb/duckdb/pull/12346](https://redirect.github.com/duckdb/duckdb/pull/12346)
- \[PySpark] Implement subset drop duplicates by
[@&#8203;mariotaddeucci](https://redirect.github.com/mariotaddeucci) in
[https://github.com/duckdb/duckdb/pull/12348](https://redirect.github.com/duckdb/duckdb/pull/12348)
- ICU noaccent collation by
[@&#8203;tiagokepe](https://redirect.github.com/tiagokepe) in
[https://github.com/duckdb/duckdb/pull/12170](https://redirect.github.com/duckdb/duckdb/pull/12170)
- Implement Brotli compression for Parquet reading & writing by
[@&#8203;hannes](https://redirect.github.com/hannes) in
[https://github.com/duckdb/duckdb/pull/12103](https://redirect.github.com/duckdb/duckdb/pull/12103)
- \[FriendlySQL] Unpacked COLUMNS() Expression by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/11872](https://redirect.github.com/duckdb/duckdb/pull/11872)
- \[PySpark] Implement UDFRegistration.register method on PySpark api by
[@&#8203;mariotaddeucci](https://redirect.github.com/mariotaddeucci) in
[https://github.com/duckdb/duckdb/pull/12179](https://redirect.github.com/duckdb/duckdb/pull/12179)
- \[Python] Don't use `np.NaN`, deprecated alias starting with NumPy 2.0
by [@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12583](https://redirect.github.com/duckdb/duckdb/pull/12583)
- Add `bind_expression` callback to scalar function, and use it to turn
`typeof` into a `BoundConstantExpression` by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12580](https://redirect.github.com/duckdb/duckdb/pull/12580)
- Add `can_cast_implicitly` scalar function by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12581](https://redirect.github.com/duckdb/duckdb/pull/12581)
- Add support for `histogram` and `histogram_values` table macro, and
add support for default table macros (similar to how we support default
macros) by [@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12590](https://redirect.github.com/duckdb/duckdb/pull/12590)
- build: swap libclang for cxxheaderparser by
[@&#8203;Mause](https://redirect.github.com/Mause) in
[https://github.com/duckdb/duckdb/pull/12567](https://redirect.github.com/duckdb/duckdb/pull/12567)
- \[C-API] Add `table_description` struct to query various information
about the table. by [@&#8203;Tishj](https://redirect.github.com/Tishj)
in
[https://github.com/duckdb/duckdb/pull/12460](https://redirect.github.com/duckdb/duckdb/pull/12460)
- Change new micro benchmark script to only look for `.benchmark` files
by [@&#8203;maiadegraaf](https://redirect.github.com/maiadegraaf) in
[https://github.com/duckdb/duckdb/pull/12598](https://redirect.github.com/duckdb/duckdb/pull/12598)
- Add HTTP error code to extension install failures by
[@&#8203;carlopi](https://redirect.github.com/carlopi) in
[https://github.com/duckdb/duckdb/pull/12608](https://redirect.github.com/duckdb/duckdb/pull/12608)
- Separate WAL write from commit, and allow writing to the WAL without
holding the transaction lock by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12261](https://redirect.github.com/duckdb/duckdb/pull/12261)
- Add `OwningStringMap` - and rework `histogram` and `mode` functions to
use this by [@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12601](https://redirect.github.com/duckdb/duckdb/pull/12601)
- Feature
[#&#8203;1272](https://redirect.github.com/duckdb/duckdb/issues/1272):
Window Executor State by
[@&#8203;hawkfish](https://redirect.github.com/hawkfish) in
[https://github.com/duckdb/duckdb/pull/12573](https://redirect.github.com/duckdb/duckdb/pull/12573)
- Add support for any type to `mode` aggregate by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12619](https://redirect.github.com/duckdb/duckdb/pull/12619)
- WAL - when dropping a table, also delete any transaction local storage
associated with that table by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12603](https://redirect.github.com/duckdb/duckdb/pull/12603)
- \[Python] Allow Generators to be passed where List is expected by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12602](https://redirect.github.com/duckdb/duckdb/pull/12602)
- VectorOperations::Copy - fast path when copying an aligned flat
validity mask into a flat vector by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12618](https://redirect.github.com/duckdb/duckdb/pull/12618)
- Move android CI to only run during nightly CI triggers by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12622](https://redirect.github.com/duckdb/duckdb/pull/12622)
- Add initial support for GeoParquet + Bump spatial by
[@&#8203;Maxxen](https://redirect.github.com/Maxxen) in
[https://github.com/duckdb/duckdb/pull/12503](https://redirect.github.com/duckdb/duckdb/pull/12503)
- Issue
[#&#8203;12600](https://redirect.github.com/duckdb/duckdb/issues/12600):
Streaming Positive LAG by
[@&#8203;hawkfish](https://redirect.github.com/hawkfish) in
[https://github.com/duckdb/duckdb/pull/12609](https://redirect.github.com/duckdb/duckdb/pull/12609)
- Feature
[#&#8203;1272](https://redirect.github.com/duckdb/duckdb/issues/1272):
Window Group Preparation by
[@&#8203;hawkfish](https://redirect.github.com/hawkfish) in
[https://github.com/duckdb/duckdb/pull/12628](https://redirect.github.com/duckdb/duckdb/pull/12628)
- Minor window improvements by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12617](https://redirect.github.com/duckdb/duckdb/pull/12617)
- Merge feature into main by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12633](https://redirect.github.com/duckdb/duckdb/pull/12633)
- Refactor `quantile` aggregate - clean up code & support
`quantile_disc`/`median` for all types by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12630](https://redirect.github.com/duckdb/duckdb/pull/12630)
- Feature 1272: Window Payload Preallocation by
[@&#8203;hawkfish](https://redirect.github.com/hawkfish) in
[https://github.com/duckdb/duckdb/pull/12629](https://redirect.github.com/duckdb/duckdb/pull/12629)
- \[ART] Configurable index scan threshold by
[@&#8203;taniabogatsch](https://redirect.github.com/taniabogatsch) in
[https://github.com/duckdb/duckdb/pull/12635](https://redirect.github.com/duckdb/duckdb/pull/12635)
- Subtract start offset for when fetching array child segment by
[@&#8203;Maxxen](https://redirect.github.com/Maxxen) in
[https://github.com/duckdb/duckdb/pull/12639](https://redirect.github.com/duckdb/duckdb/pull/12639)
- Remove custom logic to detect main vs feature by
[@&#8203;carlopi](https://redirect.github.com/carlopi) in
[https://github.com/duckdb/duckdb/pull/12643](https://redirect.github.com/duckdb/duckdb/pull/12643)
- Do not quote fields with space in the CSV output mode by
[@&#8203;szarnyasg](https://redirect.github.com/szarnyasg) in
[https://github.com/duckdb/duckdb/pull/12644](https://redirect.github.com/duckdb/duckdb/pull/12644)
- Use lowercase in 'html' output mode by
[@&#8203;szarnyasg](https://redirect.github.com/szarnyasg) in
[https://github.com/duckdb/duckdb/pull/12612](https://redirect.github.com/duckdb/duckdb/pull/12612)
- Internal
[#&#8203;2361](https://redirect.github.com/duckdb/duckdb/issues/2361):
Window ROWS Overflow by
[@&#8203;hawkfish](https://redirect.github.com/hawkfish) in
[https://github.com/duckdb/duckdb/pull/12652](https://redirect.github.com/duckdb/duckdb/pull/12652)
- Quantile: Fix variable used only in D_ASSERT by
[@&#8203;carlopi](https://redirect.github.com/carlopi) in
[https://github.com/duckdb/duckdb/pull/12642](https://redirect.github.com/duckdb/duckdb/pull/12642)
- Skip pytorch test, it fails spuriously in CI by
[@&#8203;carlopi](https://redirect.github.com/carlopi) in
[https://github.com/duckdb/duckdb/pull/12645](https://redirect.github.com/duckdb/duckdb/pull/12645)
- Add `histogram_exact` function that adds values to bins only if they
match exactly, and add `other` column that contains values that do not
fit in any bin by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12650](https://redirect.github.com/duckdb/duckdb/pull/12650)
- Add operator hook for sink progress by
[@&#8203;Maxxen](https://redirect.github.com/Maxxen) in
[https://github.com/duckdb/duckdb/pull/12637](https://redirect.github.com/duckdb/duckdb/pull/12637)
- Regression workflow on newly introduced benchmarks: remove for now by
[@&#8203;carlopi](https://redirect.github.com/carlopi) in
[https://github.com/duckdb/duckdb/pull/12659](https://redirect.github.com/duckdb/duckdb/pull/12659)
- Fix
[#&#8203;12646](https://redirect.github.com/duckdb/duckdb/issues/12646)
- allow SQL value functions in HAVING by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12654](https://redirect.github.com/duckdb/duckdb/pull/12654)
- Add != operators on string_t and interval_t by
[@&#8203;carlopi](https://redirect.github.com/carlopi) in
[https://github.com/duckdb/duckdb/pull/12658](https://redirect.github.com/duckdb/duckdb/pull/12658)
- fix: improve C scalar functions API by
[@&#8203;rustyconover](https://redirect.github.com/rustyconover) in
[https://github.com/duckdb/duckdb/pull/12663](https://redirect.github.com/duckdb/duckdb/pull/12663)
- Add `approx_top_k` aggregate based on the (Filtered) Space-Saving
algorithm, and use it in histogram by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12653](https://redirect.github.com/duckdb/duckdb/pull/12653)
- Fix std::sort requirements, from greater_equal to greater by
[@&#8203;carlopi](https://redirect.github.com/carlopi) in
[https://github.com/duckdb/duckdb/pull/12669](https://redirect.github.com/duckdb/duckdb/pull/12669)
- fix(parquet): two-complement zeroes check on FIXED_BYTE_ARRAY encoded
DECIMAL
([#&#8203;12621](https://redirect.github.com/duckdb/duckdb/issues/12621))
by [@&#8203;fedefrancescon](https://redirect.github.com/fedefrancescon)
in
[https://github.com/duckdb/duckdb/pull/12655](https://redirect.github.com/duckdb/duckdb/pull/12655)
- \[CSV Reader] Reorder of Columns for CSV Scans on multiple files. by
[@&#8203;pdet](https://redirect.github.com/pdet) in
[https://github.com/duckdb/duckdb/pull/12288](https://redirect.github.com/duckdb/duckdb/pull/12288)
- \[CSV] \[Bug-Fix] Fix for issue related with single-threaded execution
and null padding. by [@&#8203;pdet](https://redirect.github.com/pdet) in
[https://github.com/duckdb/duckdb/pull/12679](https://redirect.github.com/duckdb/duckdb/pull/12679)
- \[Block Size] String block limit and a few other places by
[@&#8203;taniabogatsch](https://redirect.github.com/taniabogatsch) in
[https://github.com/duckdb/duckdb/pull/12671](https://redirect.github.com/duckdb/duckdb/pull/12671)
- Rework arena allocator allocation policy - and increase pivot
threshold by [@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12690](https://redirect.github.com/duckdb/duckdb/pull/12690)
- Julia - Fix Base.isopen(db::DB) in
[https://github.com/duckdb/duckdb/pull/12700](https://redirect.github.com/duckdb/duckdb/pull/12700)
- \[CLI] Limit history size to 100MB, and avoid writing invalid UTF8 to
the CLI history by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12677](https://redirect.github.com/duckdb/duckdb/pull/12677)
- Add configurable thresholds for using nested loop join and merge join
by [@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12689](https://redirect.github.com/duckdb/duckdb/pull/12689)
- Prevent unnecessary usage of `std::string` in `list` aggregate - and
use more efficient `memcpy` for batched copy by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12694](https://redirect.github.com/duckdb/duckdb/pull/12694)
- Don't load spatial unless geoparquet metadata is present by
[@&#8203;Maxxen](https://redirect.github.com/Maxxen) in
[https://github.com/duckdb/duckdb/pull/12692](https://redirect.github.com/duckdb/duckdb/pull/12692)
- Serialization: add CustomData and better support for integrating with
extensions by [@&#8203;jeewonhh](https://redirect.github.com/jeewonhh)
in
[https://github.com/duckdb/duckdb/pull/12681](https://redirect.github.com/duckdb/duckdb/pull/12681)
- Removing ODBC driver by
[@&#8203;hannes](https://redirect.github.com/hannes) in
[https://github.com/duckdb/duckdb/pull/12706](https://redirect.github.com/duckdb/duckdb/pull/12706)
- Support thousand separator for floating point numbers by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12717](https://redirect.github.com/duckdb/duckdb/pull/12717)
- \[Python] Use non-owning references to hold created cursors by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12711](https://redirect.github.com/duckdb/duckdb/pull/12711)
- LIST(VARCHAR) - reduce memory usage by avoiding allocation of nullmask
for string data, and allocate larger initial batches by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12705](https://redirect.github.com/duckdb/duckdb/pull/12705)
- \[CSV] Bug fix for race condition in single-threaded multifile reader
+ properly print paths on union_by_name errors. by
[@&#8203;pdet](https://redirect.github.com/pdet) in
[https://github.com/duckdb/duckdb/pull/12697](https://redirect.github.com/duckdb/duckdb/pull/12697)
- Issue template: Add ODBC and Node (neo) clients by
[@&#8203;szarnyasg](https://redirect.github.com/szarnyasg) in
[https://github.com/duckdb/duckdb/pull/12714](https://redirect.github.com/duckdb/duckdb/pull/12714)
- Shell: add .sql suffix to temporary file created with \e by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12723](https://redirect.github.com/duckdb/duckdb/pull/12723)
- Partitioned write - keep only up until 100 files open, when this limit
is exceeded close the file and create a new file if more data for this
partition appears by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12708](https://redirect.github.com/duckdb/duckdb/pull/12708)
- Change setting types to fix warnings by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12724](https://redirect.github.com/duckdb/duckdb/pull/12724)
- Avoid unnecessarily copying child expression when binding COLLATE
statements by [@&#8203;Mytherin](https://redirect.github.com/Mytherin)
in
[https://github.com/duckdb/duckdb/pull/12725](https://redirect.github.com/duckdb/duckdb/pull/12725)
- Support for variadic arguments in scalar UDFs in the C API by
[@&#8203;taniabogatsch](https://redirect.github.com/taniabogatsch) in
[https://github.com/duckdb/duckdb/pull/12678](https://redirect.github.com/duckdb/duckdb/pull/12678)
- \[Relation API] Don't push DISTINCT modifier for EXCEPT/INTERSECT ALL
by [@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12599](https://redirect.github.com/duckdb/duckdb/pull/12599)
- Builds for Windows on ARM64 by
[@&#8203;hannes](https://redirect.github.com/hannes) in
[https://github.com/duckdb/duckdb/pull/12586](https://redirect.github.com/duckdb/duckdb/pull/12586)
- Rework `union_by_name` so that files are no longer kept open by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12730](https://redirect.github.com/duckdb/duckdb/pull/12730)
- Fix
[#&#8203;12729](https://redirect.github.com/duckdb/duckdb/issues/12729):
early-out when checking for perfect hash joins when running on empty
tables by [@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12731](https://redirect.github.com/duckdb/duckdb/pull/12731)
- CLI: Replace \n with \r\n again in history again by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12735](https://redirect.github.com/duckdb/duckdb/pull/12735)
- Fix
[#&#8203;11228](https://redirect.github.com/duckdb/duckdb/issues/11228)
- add support for unsigned integers in printf/format by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12736](https://redirect.github.com/duckdb/duckdb/pull/12736)
- Various CI fixes by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12737](https://redirect.github.com/duckdb/duckdb/pull/12737)
- Add repeat(LIST\[], INT) that allows repetition of lists similar to
how this is allowed in Python by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12738](https://redirect.github.com/duckdb/duckdb/pull/12738)
- \[Python] Add missing options to `read_json` method by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12732](https://redirect.github.com/duckdb/duckdb/pull/12732)
- Add support for fetching cardinality estimation and stats through a
multifilelist by
[@&#8203;samansmink](https://redirect.github.com/samansmink) in
[https://github.com/duckdb/duckdb/pull/12740](https://redirect.github.com/duckdb/duckdb/pull/12740)
- Fixes warnings detected by cppcheck by
[@&#8203;carlopi](https://redirect.github.com/carlopi) in
[https://github.com/duckdb/duckdb/pull/12745](https://redirect.github.com/duckdb/duckdb/pull/12745)
- \[Arrow] Add `ArrowQueryResult` by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12496](https://redirect.github.com/duckdb/duckdb/pull/12496)
- \[Dev] StreamQueryResult internals cleanup by
[@&#8203;Tishj](https://redirect.github.com/Tishj) in
[https://github.com/duckdb/duckdb/pull/12636](https://redirect.github.com/duckdb/duckdb/pull/12636)
- ALP/ALPRD: correctly skip when we are skipping fewer values than in a
vector by [@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12753](https://redirect.github.com/duckdb/duckdb/pull/12753)
- Maintain prepared statement parameter types explicitly instead of
converting into literals by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12759](https://redirect.github.com/duckdb/duckdb/pull/12759)
- CLI .changes: use sqlite3\_changes64 and sqlite3\_totalchanges64 to
prevent overflows by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12761](https://redirect.github.com/duckdb/duckdb/pull/12761)
- Fix
[#&#8203;12569](https://redirect.github.com/duckdb/duckdb/issues/12569):
avoid truncating zeros that matter in format function by
[@&#8203;Mytherin](https://redirect.github.com/Mytherin) in
[https://github.com/duckdb/duckdb/pull/12762](https://redirect.github.com/duckdb/duckdb/pull/12762)
- Fix
[#&#8203;12418](https://redirect.github.com/duckdb/duckdb/issues/12418):
Remove .lint command in SQLite shell by
[@&#8203;Mytherin](https://redire

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Enabled.

♻ **Rebasing**: Whenever PR is behind base branch, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR was generated by [Mend Renovate](https://mend.io/renovate/).
View the [repository job
log](https://developer.mend.io/github/d-issy/dotfiles).


Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Mytherin added a commit that referenced this pull request Sep 23, 2024
This PR adds the `SMALLER_BINARY` compilation flag that can be used to
reduce the binary size. In many places in DuckDB, we generate code using
templating to speed up execution, by e.g. providing specialized
implementations for many different primitive types, or providing
specialized implementations based on common compressed vector types. In
many of these cases, we also have a generic fallback implementation that
can be used (albeit with a small performance loss).

This PR hides many of these specialized implementations behind a
`#ifndef DUCKDB_SMALL_BINARY` compilation flag. This allows builds where
binary size is important to omit the extra generated code and always
select the fallback implementation.
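
As a rough illustration of this pattern (a toy sketch, not DuckDB's
actual executor code), the block below guards a specialized fast path
with the macro named above and otherwise falls through to a generic
implementation. `ToyVector`, `Sum`, and `SumGeneric` are hypothetical
names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy column (hypothetical, not a DuckDB type): values plus an optional
// selection vector. An empty selection means the column is "flat",
// i.e. row i lives at values[i].
struct ToyVector {
	std::vector<int64_t> values;
	std::vector<std::size_t> selection;
	bool IsFlat() const { return selection.empty(); }
	std::size_t Count() const { return IsFlat() ? values.size() : selection.size(); }
};

// Generic fallback: handles flat and non-flat input through one indirection.
inline int64_t SumGeneric(const ToyVector &v) {
	int64_t sum = 0;
	for (std::size_t i = 0; i < v.Count(); i++) {
		std::size_t idx = v.IsFlat() ? i : v.selection[i];
		assert(idx < v.values.size());
		sum += v.values[idx];
	}
	return sum;
}

inline int64_t Sum(const ToyVector &v) {
#ifndef DUCKDB_SMALL_BINARY
	// Specialized fast path: tight loop over flat data.
	// Compiled out entirely when DUCKDB_SMALL_BINARY is defined.
	if (v.IsFlat()) {
		int64_t sum = 0;
		for (int64_t x : v.values) {
			sum += x;
		}
		return sum;
	}
#endif
	return SumGeneric(v);
}
```

Both builds return the same results; compiling with
`-DDUCKDB_SMALL_BINARY` only removes the fast path, so every call goes
through `SumGeneric`.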

The goal is not to sacrifice any functionality, so that the versions
remain fully compatible, but rather to achieve the same functionality in
less code. In many cases the [sort key
functionality](#12520) can be used
to support operations on all types relatively efficiently without adding
specialized code for each type. Since we already have this code path to
deal with arbitrary types, using it everywhere is often as
simple as pushing an `#ifdef` around the specialized implementations.
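
To make the sort key idea concrete, here is a minimal sketch for a single type. It assumes a toy encoding for `int64_t` only (flip the sign bit, store big-endian) and is not DuckDB's actual `create_sort_key` format. `ToyEncodeSortKey`, `ToyDecodeSortKey` and `MinViaSortKeys` are hypothetical names. The point is that `MIN` only has to compare byte strings, and the winning key is decoded back at the end.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using SortKey = std::array<uint8_t, 8>;

// Toy encoder (assumption, not DuckDB's format): flip the sign bit and store
// the value big-endian, so bytewise order matches numeric order.
inline SortKey ToyEncodeSortKey(int64_t value) {
    uint64_t bits = static_cast<uint64_t>(value) ^ (uint64_t(1) << 63);
    SortKey key{};
    for (int i = 0; i < 8; i++) {
        key[i] = static_cast<uint8_t>(bits >> (56 - 8 * i));
    }
    return key;
}

// Inverse of the toy encoder: rebuild the integer from the key bytes.
inline int64_t ToyDecodeSortKey(const SortKey &key) {
    uint64_t bits = 0;
    for (int i = 0; i < 8; i++) {
        bits = (bits << 8) | key[i];
    }
    return static_cast<int64_t>(bits ^ (uint64_t(1) << 63));
}

// MIN implemented purely on keys: decode(min(encode(x) for each x)).
inline int64_t MinViaSortKeys(const std::vector<int64_t> &values) {
    assert(!values.empty());
    SortKey best = ToyEncodeSortKey(values[0]);
    for (size_t i = 1; i < values.size(); i++) {
        // std::array compares lexicographically, byte by byte.
        best = std::min(best, ToyEncodeSortKey(values[i]));
    }
    return ToyDecodeSortKey(best);
}
```

The comparison logic in `MinViaSortKeys` never inspects the original type, which is what lets one blob-based implementation stand in for many type-specific ones.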

This PR also has the side effect of improving testing of the
fallback implementations, since with the flag enabled the fallback
implementation is always used. In the process I already found two issues
with the fallback implementation (one that is addressed in this PR, one
that still needs to be addressed).

This PR moves the following implementations behind the
`DUCKDB_SMALL_BINARY` flag:

* Specialized code for flat vectors in `UnaryExecutor/BinaryExecutor`
and `DistinctFrom`
* Specialized code for executing `BETWEEN` in the `WHERE` clause
(`ExpressionExecutor::Select`)
* Specialized implementations of the `binned_histogram`, `histogram`,
`arg_min/arg_max`, and `mode` aggregates (the fallback implementation is
used instead)
* All specialized `window` functions (the regular aggregate
implementations are used instead when these aggregates appear in window
functions)

As a result, binary size is reduced by ~20%. We also add a nightly CI
run that executes all unit tests with the `SMALLER_BINARY` flag.

The performance impact of the smaller binary is heavily
workload-dependent. For TPC-H, there is little performance impact. Simple
aggregates have the potential to be more affected. For example - running
an ungrouped `arg_min(BIGINT, BIGINT)` is slowed down by ~3x. Removing
the specialized `window` functions might be even more impactful when
they are used. That said, the significant decrease in binary size is
likely worth it in the majority of cases.