Skip to content

Conversation

kryonix
Copy link
Contributor

@kryonix kryonix commented Apr 18, 2023

This pr adds support for materialized CTE expressions as an alternative to inlining, as I briefly mentioned in #404.

Inlining CTEs can be great, but it can also multiply the work. Take this query for example:

WITH t(x) AS
  (SELECT t FROM generate_series(1,10) AS _(t))
SELECT t1.x, t2.x FROM t AS t1, t AS t2;

Inlining duplicates the definition of t for each reference.
This results in the following query:

SELECT t1.x, t2.x FROM (SELECT t FROM generate_series(1,10) AS _(t)) AS t1(x),
                       (SELECT t FROM generate_series(1,10) AS _(t)) AS t2(x);

Now, if t is an expensive query, there may be a problem.


With this pr, CTEs can be annotated as being MATERIALIZED.
Such CTEs first fully materialize their result, which can then be read:

WITH t(x) AS MATERIALIZED
  (SELECT t FROM generate_series(1,10) AS _(t))
SELECT t1.x, t2.x FROM t AS t1, t AS t2;

Instead of copying t for each reference, t is calculated exactly once. Since the result is cached, no work is done multiple times.

@kryonix
Copy link
Contributor Author

kryonix commented Apr 18, 2023

This pr is currently a draft, as there are two remaining issues:


First, something during binding seems to be broken, as the following fails:

SELECT x FROM (WITH t(y, z) AS MATERIALIZED (SELECT 1, 2) SELECT * FROM t) AS _(x, y);
Error: INTERNAL Error: Failed to bind column reference "x" [0.0] (bindings: [9.0])

Edit: This is fixed.


Second, pipeline construction of two (or more) materialized recursive CTEs fails:

WITH RECURSIVE t(x) AS MATERIALIZED
(
  SELECT 1
    UNION ALL
  SELECT x+1
  FROM   t
  WHERE  x < 4
)
,
u(x) AS
(
  SELECT *
  FROM   t
    UNION ALL
  SELECT u.x + t.x
  FROM   u, t
  WHERE  u.x < 32
)
SELECT *
FROM   u;
Error: INTERNAL Error: Assertion triggered in file "/duckdb/src/parallel/executor.cpp" on line 156: event_map_entry != event_map.end()

Edit: This is fixed.

Copy link
Collaborator

@Mytherin Mytherin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Looks good - very cool feature! I have left some comments inline. One note is that we have been working to remove C-style pointers (see #7080) - this might cause a few merge conflicts on your end.

SELECT x FROM (WITH t(y, z) AS MATERIALIZED (SELECT 1, 2) SELECT * FROM t) AS _(x, y);

Error: INTERNAL Error: Failed to bind column reference "x" [0.0] (bindings: [9.0])

This is the column binding resolver that complains. Essentially "x" is bound as the first column to table binding 0, but table binding 0 is not accessible from that part of the plan.

Second, pipeline construction of two (or more) materialized recursive CTEs fails:

I'm not sure about what causes this exactly - I would guess that something goes wrong in the order in which BuildPipelines is executed so that the layered situation is not resolved correctly, or perhaps the incorrect pipeline is passed on to the child.

Perhaps it would also be interesting to add a benchmark where the benefits of the materialized CTEs are visible?

@kryonix
Copy link
Contributor Author

kryonix commented Apr 21, 2023

The binder issue seems to be resolved now.

Thank you for the comments. I will address them.

Thanks for the PR! Looks good - very cool feature! I have left some comments inline. One note is that we have been working to remove C-style pointers (see #7080) - this might cause a few merge conflicts on your end.

I think I will just rebase onto master and do the rewritings before I take this pr out of its draft status. Should be a clean solution.

kryonix added 26 commits May 9, 2023 10:43
@kryonix
Copy link
Contributor Author

kryonix commented May 25, 2023

I think all remaining issues are now resolved.

@kryonix kryonix requested a review from Mytherin May 25, 2023 15:01
@kryonix
Copy link
Contributor Author

kryonix commented May 25, 2023

NVM I guess. ALTERNATIVE_VERIFY still has some issues 👀

@kryonix
Copy link
Contributor Author

kryonix commented Jun 7, 2023

If you find the time, @Mytherin, I think this pr is ready for review now.

Copy link
Collaborator

@Mytherin Mytherin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fixes! Looks great to me now. One minor comment otherwise this is good to go. Could you also merge with feature so the CI can rerun? I believe the CI failures should be fixed then.

TransformCTE(*PGPointerCast<duckdb_libpgquery::PGWithClause>(stmt.withClause), result->cte_map,
materialized_ctes);
if (!materialized_ctes.empty()) {
throw NotImplementedException("Materialized CTEs are not implemented for delete.");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be update - also could you perhaps expand on the reason for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, the message is wrong. I will fix that.


There is an AST 'mismatch' currently. Materialized CTEs are represented as a QueryNode, whereasINSERT/UPDATE/DELETE are SQLStatements. This means, that it is easy to represent the materialized CTEs as a bunch of stacked QueryNodes (this happens in Transformer::TransformMaterializedCTE). It is not that easy for INSERT/UPDATE/DELETE, however. While there should not be a principle problem implementing that, my current plan is to tackle that in a separate pr.

statement ok
PRAGMA enable_verification

require noalternativeverify
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this necessary here?

Copy link
Contributor Author

@kryonix kryonix Jun 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternative verify would lead to non-MATERIALIZED CTEs here. Therefore, the tests would be unexpectedly successful.

@Mytherin Mytherin merged commit 09fd4e3 into duckdb:feature Jun 20, 2023
@Mytherin
Copy link
Collaborator

Thanks!

Mytherin added a commit that referenced this pull request Jun 12, 2025
Until now, DuckDB's default behavior for CTEs was to inline them, which
means that
the CTE is replaced with its definition in the main query. However,
inlining can not
only lead to performance issues, especially when the CTE is used
multiple times in
the main query, but also to incorrect results in some cases.

With the introduction of the `WITH ... AS [NOT] MATERIALIZED` syntax in
#7126,
users can specify that a CTE should be materialized, meaning that it
will be
computed once and stored in memory for the duration of the main query.
This can improve performance and ensure correct results when the CTE is
used multiple times.

This PR is a continuation of that work. While we still want to be able
to inline CTEs in some cases,
it should not be the default behavior anymore. More importantly, this
should not be the responsibility
of the `Binder`, but rather an `Optimizer` decision.

For CTEs with `MATERIALIZED` or `NOT MATERIALIZED` specified, nothing
changes.
If a CTE is marked as `MATERIALIZED`, it will be materialized, and if it
is marked
as `NOT MATERIALIZED`, it will be inlined—regardless of the possibly
questionable
performance or correctness implications.

For CTEs without a `MATERIALIZED` or `NOT MATERIALIZED` clause, the
`Optimizer` will
decide whether to inline or materialize the CTE.

The rules for this decision are currently a work in progress.

---

**Performance**

@neumannt released a solution of the Advent of Code 2024 using SQL:
https://github.com/neumannt/aoc24
All solutions rely heavily on CTEs, but do not use explicit
MATERIALIZATION hints.
I use his queries as a benchmark—using the full AoC input:

| DAY | Speedup with this PR vs. DuckDB main |
|-----|---------------------------------------|
| 1   |  2.3x                                 |
| 2   |  1.7x                                 |
| 3   |  1.6x                                 |
| 4   |  2.7x                                 |
| 5   | 99.3x                                 |
| 6   | 1810x (now ~18 sec, main ~9h)               |
| 7   | 11.8x                                 |
| 8   |  1.2x                                 |
| 9   |156.2x                                 |
| 10  |  1.7x                                 |
| 11  |  1.2x                                 |
| 12  |  2.3x                                 |
| 13  |  3.2x                                 |
| 14  |  308x (now ~1.5s, main ~8 min)               |
| 15  | > 10x (now ~2 min, main timeout)       |
| 16  | TBD                                   |
| 17  | Not supported                         |
| 18  | > 73x (now ~8.5 sec, main timeout)     |
| 19  |  2.2x                                 |
| 20  | 28x (now ~8 sec, main ~3 min 50s)           |
| 21  | 24.2x                                 |
| 22  | Not supported                         |
| 23  | 188x (now ~260ms, main ~48 sec)               |
| 24  | Not supported                         |
| 25  | > 2245x (now ~250ms, main timeout)     |

I think these numbers speak for themselves. All queries experienced a
speedup, and
the most problematic queries are now fast enough to run in a reasonable
time frame.
Speedups range from 1.2x to over 2000x, which—I think—is a significant
improvement.

---

`logical_operator_deep_copy` is a new visitor that allows us to _"deep
copy"_ a logical operator tree.
While we can already get a 1:1 copy of a logical operator using `Copy`,
we can not use that
copy to immediately replace a CTE reference in the main query. This is
because the copy
has the same `table_index`es as the original tree, which breaks some
sanity checks.
To fix this, we traverse the tree and assign new `table_index`es to each
operator.

---

TODO:
- [x] Improve the rules for inlining vs. materializing CTEs
- [x] Fix the unit tests for `WITH RECURSIVE` queries, that are
currently failing because they relied on the inlining behavior
- [x] Clean up some rough edges in the code
- [x] Add more tests for the new `Optimizer` behavior

---

Fix: #17456
Fix: #12875
Fix: #15416
Fix: #16891
Fix: #9940
Fix: #14244
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants