Lambda performance revamp #9395
Merged
This PR refactors the lambda function execution code. Most notably, it removes all calls to Flatten and other 'relics' from the initial implementation. The changes significantly increase performance and make the code more structured and readable. This should make the lambdas efficient enough to be used directly in cases where unnesting is the current alternative. They can now scale to lists with a combined (over the whole table) element count of over 10M.
This also paves the way for an efficient future list_reduce implementation (@maiadegraaf, @Maxxen).
Benchmark numbers
Here are the benchmark numbers for 10K values. The benchmarks included in this PR run with 100K values. I took the benchmarks from those that @cryoEncryp used in his PR.
This PR
Feature branch
SQL