Skip to content

Conversation

taniabogatsch
Copy link
Contributor

Fixes https://github.com/duckdblabs/duckdb-internal/issues/1450.

We allow subqueries in WINDOW expressions, so the following query returns an expected result.

statement ok
CREATE TABLE tbl AS SELECT 42 AS i;

query I
SELECT SUM(i) OVER (ROWS BETWEEN (SELECT UNNEST([1])) PRECEDING AND 1 FOLLOWING) FROM tbl;
----
42

However, we do not allow UNNEST to be an expression outside a subquery in WINDOW expressions. Previously, queries of the following nature would throw an InternalException. This PR ensures that we now throw the expected binder exception.

statement ok
CREATE TABLE tbl AS SELECT 42 AS i;

statement error
SELECT SUM(i) OVER (ROWS BETWEEN UNNEST([1]) PRECEDING AND 1 FOLLOWING) FROM tbl;
----
UNNEST not supported here

Relevant changes

  • I moved inside_window from the BaseSelectBinder into the ExpressionBinder, to access it without type casting of binders in macro binding code-paths.
  • I added two new functions to the ExpressionBinder to (1) detect UNNEST expressions (WindowContainsUnnest), and to (2) bind a WINDOW child expression with extra checks (BindWindowChild).
  • I also did some code tidying, as my IDE complained about shadowed variables.

@taniabogatsch taniabogatsch requested a review from hawkfish March 19, 2024 10:25
@github-actions github-actions bot marked this pull request as draft March 29, 2024 10:56
@taniabogatsch taniabogatsch marked this pull request as ready for review March 29, 2024 11:27
@Mytherin
Copy link
Collaborator

Mytherin commented Apr 2, 2024

Thanks for the PR!

I'm a bit worried about the added calls to WindowContainsUnnest. It seems to be adding a good amount of code complexity, and there are performance ramifications to (repeatedly) doing a full expression tree traversal during binding. Since we are already doing a full expression tree traversal during the actual bind this could lead to O(n^2) binding performance in the size of the expression tree. Of course often binding performance is not critical - but quadratic performance is always best avoided.

I've tried a simpler change that uses the existing inside_window property here and it seems to solve the issue as well (at least all of the tests seem to pass). Since you have looked at this more perhaps I'm missing something and this doesn't fully solve the problem?

@github-actions github-actions bot marked this pull request as draft April 2, 2024 16:38
@taniabogatsch taniabogatsch marked this pull request as ready for review April 2, 2024 16:39
@taniabogatsch
Copy link
Contributor Author

@Mytherin, thanks for your review and the great suggestion! I've reverted the changes and added your change. I've only kept the other tidy changes. Let's see what CI has to say. :)

@github-actions github-actions bot marked this pull request as draft April 3, 2024 05:40
@Mytherin Mytherin marked this pull request as ready for review April 3, 2024 05:40
@Mytherin Mytherin merged commit dfaac68 into duckdb:main Apr 3, 2024
@Mytherin
Copy link
Collaborator

Mytherin commented Apr 3, 2024

Thanks!

@taniabogatsch taniabogatsch deleted the fix-fuzzer branch April 5, 2024 08:21
github-actions bot pushed a commit to duckdb/duckdb-r that referenced this pull request Apr 5, 2024
Merge pull request duckdb/duckdb#11247 from taniabogatsch/fix-fuzzer
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants