-
Notifications
You must be signed in to change notification settings - Fork 2.5k
Add SMALLER_BINARY
flag to reduce binary size
#14057
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… generic implementation
This reverts commit 714ab2e.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very cool, thanks a lot.
Main comment would be whether this (possibly without the changes to tests, just to be on the safe side) can be already considered for inclusion in main
branch, adding (if not already there) a big "EXPERIMENTAL" warning on the setting.
That would allow opt-in for this already in v1.1.1
.
Another comment, that I am not 100% sure it makes sense, but I think it's worth raising, is whether it could make sense to add another layer so that changes in the code that go together are behind a common ifdef.
One pattern could be having:
#ifdef SMALLER_BINARY
#define SMALLER_BINARY_DISTINCT_ON
#endif
#ifdef SMALLER_BINARY_DISTICT_ON
//
#endif
or equivalent for SMALLER_BINARY_HISTOGRAM
, SMALLER_BINARY_ARG_MIN_MAX
, ....
Idea would be that behind the generic SMALLER_BINARY
the whole set of changes are enabled, but more granular choices are possible for debugging purposes (say to track down which changes are connected to a given bug) or even for allowing people distributing DuckDB to decide (or even measure) what are the tradeoffs.
Another minor benefit is that we will be keeping related changes somewhat connected.
Example of use: say for example at some point some differences are detected between code-path SMALLER_BINARY=1
and the default with no SMALLER_BINARY, having the possibility at the [C]Make level to decide more granularly what path to select can be handy in determining which areas of the code influence a change in result.
This proposed chance, if it does makes sense, can also be considered at a later point.
Let's keep this in |
This PR adds the
SMALLER_BINARY
compilation flag that can be used to reduce the binary size. In many places in DuckDB, we generate code using templating to speed up execution, by e.g. providing specialized implementations for many different primitive types, or providing specialized implementations based on common compressed vector types. In many of these cases, we also have a generic fallback implementation that can be used (albeit with a small performance loss).This PR hides many of these specialized implementations behind a
#ifndef DUCKDB_SMALL_BINARY
compilation flag. This allows builds where binary size is important choose to omit this extra generated code and always select the fallback implementation.The goal is to not sacrifice any functionality so that the versions are fully compatible - but to rather achieve the same functionality in less code. In many cases the sort key functionality can be used to relatively efficiently support operations on all types without adding specialized code for all types. Since we already have this code path to deal with arbitrary types, using it in all cases is in many cases as simple as pushing an
#ifdef
around the specialized implementations.This PR also has a side effect in that it improves testing of the fallback implementations - since the fallback implementation is now always used. In this process I already found two issues with the fallback implementation (one that is addressed in this PR, one that still needs to be addressed).
This PR moves the following implementations behind the
DUCKDB_SMALL_BINARY
flag:UnaryExecutor/BinaryExecutor
andDistinctFrom
BETWEEN
in theWHERE
clause (ExpressionExecutor::Select
)binned_histogram
,histogram
,arg_min/arg_max
,mode
aggregateswindow
functions - instead falling back to the regular aggregate implementations when these aggregates are used in window functionsThis has as effect that binary size is reduced by ~20%. We also add a CI run that runs all unit tests using the
SMALLER_BINARY
flag in the nightly CI.The performance impact of the smaller binary is heavily workload dependent. For TPC-H, there is little performance impact. Simple aggregates have the potential to be more affected. For example - running an ungrouped
arg_min(BIGINT, BIGINT)
is slowed down by ~3x. Removing the specializedwindow
functions might be even more impactful when they are used. That said, the significant decrease in binary size is likely worth it in the majority of cases.