Skip to content

Conversation

bradynwalsh
Copy link
Contributor

Closes #17417

This issue appears to actually be a regression of issue #9308 (with the existing test for that issue not picking up the regression as the optimizer no longer makes an explicit ANTI-JOIN for the WHERE NOT EXISTS query).

While the fix in #9654 could be re-applied (and if this PR is deemed not suitable, should be) to remove the pruning optimization altogether for ANTI-JOINS with always true filters, this PR aims to keep the optimization and handle the case where the join is on an empty result set.

In particular, this PR adds a helper function to BaseStatistics to determine whether the stats match an empty result set (using the min/max set by CreateEmpty) and correctly returns the LHS of the ANTI-JOIN if the RHS is empty (and keeps the current behaviour of returning an empty result otherwise)

While this function is only currently used for NumericStats, I've also implemented it for the other BaseStatistics implementations, to avoid this being regressed again in future if statistics propagation (and in particular, filter pruning) occur for non-numeric types.

@bradynwalsh bradynwalsh force-pushed the fix/17417-empty-anti-join branch from 68ab1ce to 22fabcc Compare May 11, 2025 04:07
@duckdb-draftbot duckdb-draftbot marked this pull request as draft May 11, 2025 04:08
@bradynwalsh bradynwalsh marked this pull request as ready for review May 11, 2025 04:08
@TheoristCoder
Copy link

Thanks for your fixing!

@bradynwalsh bradynwalsh force-pushed the fix/17417-empty-anti-join branch from 22fabcc to 5891777 Compare May 17, 2025 07:05
@duckdb-draftbot duckdb-draftbot marked this pull request as draft May 17, 2025 07:05
@bradynwalsh bradynwalsh marked this pull request as ready for review May 17, 2025 07:11
@TheoristCoder
Copy link

Dear Developers, thanks for your fixing! Could you please review this pull request and merge it? Thanks! I am currently working on extensively testing DuckDB, and I also want to express my gratitude to your fixing!

@coracuity
Copy link

also chiming in with enthusiasm for this being merged, and thanks to the dev who fixed it!!

Copy link
Contributor

@Tmonster Tmonster left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I think you can change the base branch to v1.3-Ossivalis so that this gets merged into the next bug fix release

@bradynwalsh bradynwalsh changed the base branch from main to v1.3-ossivalis June 10, 2025 15:20
@bradynwalsh
Copy link
Contributor Author

Thanks for the review, changed the base branch to v1.3-Ossivalis!

@krlmlr
Copy link
Collaborator

krlmlr commented Jun 29, 2025

@Mytherin: Is this likely to make it into 1.3.2? I want to run revdepchecks for the R package, and this (and #17530, but not ready for review) seem to be the only affected open PRs.

@Mytherin Mytherin merged commit 6267d4e into duckdb:v1.3-ossivalis Jun 30, 2025
50 checks passed
@Mytherin
Copy link
Collaborator

Thanks!

github-actions bot pushed a commit to duckdb/duckdb-r that referenced this pull request Jun 30, 2025
fix statistics propagation for anti-joins on empty tables (duckdb/duckdb#17439)
Run Python workflow against both Python 3.9 and 3.13 on PR to ensure … (duckdb/duckdb#18080)
Main.yml: Move very long job from debug to release with `-DDEBUG` and `FORCE_ASSERT` (duckdb/duckdb#18081)
github-actions bot added a commit to duckdb/duckdb-r that referenced this pull request Jun 30, 2025
fix statistics propagation for anti-joins on empty tables (duckdb/duckdb#17439)
Run Python workflow against both Python 3.9 and 3.13 on PR to ensure … (duckdb/duckdb#18080)
Main.yml: Move very long job from debug to release with `-DDEBUG` and `FORCE_ASSERT` (duckdb/duckdb#18081)

Co-authored-by: krlmlr <krlmlr@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

A wrong anti-join result when joining with an empty table
7 participants