Skip to content

Conversation

hannes
Copy link
Member

@hannes hannes commented Nov 10, 2023

Previously, we silently ignored the ALL modifier for EXCEPT and INTERSECT. With this PR, we implement support using window functions.

@github-actions github-actions bot marked this pull request as draft November 10, 2023 11:58
@hannes hannes marked this pull request as ready for review November 10, 2023 11:58
@hannes
Copy link
Member Author

hannes commented Nov 10, 2023

@Tmonster here's what I was referring to

@Tmonster
Copy link
Contributor

I see, so similar to Union, intersect all is intersect with duplicates and intersect is automatically de-duplicated. Same with except all and except.

I feel like this logic could also go in my PR. intersect all can be transformed into a semi join while intersect is turned into a semi join with a distinct pushed on top? Except is similar but using an anti join instead.

My PR isn't ready yet. I still need to use the column binding replacer when I replace the set op.

Also just a thought, if I add the optimizer to convert intersect/except set ops to semi/anti joins early, and we keep the automatic deduplication, the join order optimizer has less opportunities for optimizing around the semi/anti joins since the group by operator won't allow it

@github-actions github-actions bot marked this pull request as draft November 10, 2023 13:49
@hannes hannes marked this pull request as ready for review November 10, 2023 13:51
@hannes
Copy link
Member Author

hannes commented Nov 10, 2023

No idea what to do about the storage serialisation test failure

@Mytherin
Copy link
Collaborator

The failing test seems to be that this is breaking forwards compatibility (i.e. v0.9.2 will not be able to read databases made with v0.10.0 containing views with UNION ALL after this patch). Breaking forwards compatibility is generally fine so we can merge this in spite of the test failure IMO. @carlopi agreed?

@Mytherin Mytherin changed the base branch from main to feature November 10, 2023 21:38
@carlopi
Copy link
Contributor

carlopi commented Nov 10, 2023

Here some kind of breakage, either having 0.9.1 not be able to read all 0.9.2+ files or 0.9.2 not be able to read all 0.10.0 files is probably the way to go.
We are generally fine with breaking forward, unsure whether it's best to do it now or wait post 0.9.x.

Unsure on what's worse, between changing behaviour between minor versions or live a couple of more months with what we know have decided is a wrong behaviour.

The only middle way that I could think of, but it's an hassle long term, is having a setting like intersect_all_compatilibity, default that to true for 0.9.2, and later switch it to false. That would make 0.9.2 be able to interact with both 0.9.1 (forward and backward) and 0.10.x (forward and backward) depending on the status of the flag.

@Mytherin Mytherin changed the base branch from feature to main November 11, 2023 08:24
@Mytherin Mytherin changed the base branch from main to feature November 11, 2023 08:24
@Mytherin
Copy link
Collaborator

I would say this is a feature and shouldn't go in 0.9.2, then we can break forwards compatibility between 0.9.2 and 0.10.0 on views with union/intersect/except

@github-actions github-actions bot marked this pull request as draft November 13, 2023 07:44
@hannes hannes marked this pull request as ready for review November 13, 2023 07:44
@github-actions github-actions bot marked this pull request as draft November 13, 2023 08:50
@hannes hannes marked this pull request as ready for review November 13, 2023 11:32
@github-actions github-actions bot marked this pull request as draft November 13, 2023 15:09
@hannes hannes marked this pull request as ready for review November 13, 2023 15:09
@Mytherin Mytherin changed the base branch from feature to main November 13, 2023 15:44
@Mytherin Mytherin changed the base branch from main to feature November 13, 2023 15:45
@Mytherin Mytherin changed the base branch from feature to main November 20, 2023 12:42
@Mytherin Mytherin merged commit b81d8c2 into duckdb:main Nov 22, 2023
@Mytherin
Copy link
Collaborator

Thanks!

@hannes hannes deleted the exceptintersectall branch November 22, 2023 11:13
@krlmlr
Copy link
Collaborator

krlmlr commented Dec 11, 2023

This PR breaks R tests, but I don't see a corresponding CI/CD run here. Are we still running R tests in this repository?

@Tmonster
Copy link
Contributor

This PR breaks R tests, but I don't see a corresponding CI/CD run here. Are we still running R tests in this repository?

We aren't running the R tests in this repository. Just checking the build. I'll investigate the failing test this week

@krlmlr
Copy link
Collaborator

krlmlr commented Dec 11, 2023

I'm on it, let's discuss later today.

@Tmonster
Copy link
Contributor

Have a PR to fix the R tests here duckdb/duckdb-r#54

krlmlr added a commit to duckdb/duckdb-r that referenced this pull request Dec 14, 2023
Merge pull request duckdb/duckdb#9636 from hannes/exceptintersectall
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants