Conversation

@Mytherin commented Dec 8, 2023

Fixes #109

This PR adds support for vacuuming deleted rows within row groups during a checkpoint. During a checkpoint, we check whether adjacent row groups can be folded into fewer row groups; if so, they are merged. For example, consider the following row groups with the given number of non-deleted rows:

[1: 60K][2: 40K][3: 50K]

After this PR, the checkpoint will observe that row groups #1 and #2 can be merged together into a single row group, and move the database to the following state:

[1: 100K][2: 50K]

The row groups are rewritten with only the valid rows, and the old row groups are dropped, removing the deleted rows from the system and freeing up blocks.
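The folding step can be pictured with a small sketch (illustrative Python, not DuckDB's actual C++ implementation; the 120K cap is a stand-in for the real row group size constant):

```python
# Illustrative sketch: greedily fold adjacent row groups whose live
# (non-deleted) rows fit together under the row group capacity.
# ROW_GROUP_CAP is an assumption standing in for DuckDB's real constant.
ROW_GROUP_CAP = 120_000

def fold_row_groups(live_counts):
    """Merge adjacent row groups when their combined live rows fit in one group."""
    merged = []
    for count in live_counts:
        if merged and merged[-1] + count <= ROW_GROUP_CAP:
            merged[-1] += count   # fold into the previous group
        else:
            merged.append(count)  # start a new group
    return merged

# The example from the PR description:
print(fold_row_groups([60_000, 40_000, 50_000]))  # [100000, 50000]
```

Running this on the example above folds `[60K][40K][50K]` into `[100K][50K]`, matching the state shown in the description.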

Merges are performed with either 1, 2, or 3 target row groups. We always prefer merging into fewer target row groups, as that is cheaper from a performance perspective. This means at least ~25% of the rows in adjacent row groups must be deleted before merges are performed and rows are cleaned up automatically. If too few rows are deleted, they are not cleaned up, and we write tombstones as is currently the case.

Merge into one row group (requires >= 50% deleted):
[1: 60K][2: 60K] -> [1: 120K]

Merge into two row groups (requires >= 33% deleted):
[1: 80K][2: 80K][3: 80K] -> [1: 120K][2: 120K]

Merge into three row groups (requires >= 25% deleted):
[1: 90K][2: 90K][3: 90K][4: 90K] -> [1: 120K][2: 120K][3: 120K]
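A sketch of the threshold logic behind these percentages: merging k adjacent row groups into t target groups is only possible when all live rows fit into t full groups (the function name and 120K cap are illustrative assumptions, not DuckDB internals):

```python
ROW_GROUP_CAP = 120_000  # illustrative stand-in for the row group size

def can_merge(live_counts, num_targets):
    """Can these adjacent row groups be merged into num_targets row groups?"""
    if num_targets >= len(live_counts):
        return False  # not a reduction in row group count
    # All surviving rows must fit into the target groups.
    return sum(live_counts) <= num_targets * ROW_GROUP_CAP

# Matches the cases above:
print(can_merge([60_000, 60_000], 1))          # True  (>= 50% deleted)
print(can_merge([80_000, 80_000, 80_000], 2))  # True  (>= 33% deleted)
print(can_merge([90_000] * 4, 3))              # True  (>= 25% deleted)
print(can_merge([100_000, 100_000], 1))        # False (too few deletes)
```

Merging k groups into t requires `sum(live) <= t * cap`, i.e. a deleted fraction of at least `1 - t/k`, which is where the 50%/33%/25% thresholds come from.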

Performance

This PR significantly increases the work done during checkpointing when partial deletes are performed. Since checkpointing is currently still single-threaded, the performance of large DELETE statements can drop considerably. For example, in this extreme scenario performance drops by a factor of ~15:

DELETE FROM lineitem WHERE l_orderkey%2=0;

| v0.9.2 | New   |
| ------ | ----- |
| 0.12s  | 1.83s |

The reason for this degradation is that in the old delete code we would only write deleted markers for the deleted rows, whereas in this new extreme scenario we end up rewriting half the table (the non-deleted rows).
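Back-of-the-envelope arithmetic for the slowdown (the per-row costs below are assumptions for illustration, not measured constants): deleting half the rows used to cost one small tombstone per deleted row, while it now costs a full rewrite of every surviving row:

```python
# Rough write-amplification estimate for "DELETE ... WHERE key % 2 = 0".
# Both constants are assumptions chosen only to show the shape of the cost.
ROW_WIDTH_BYTES = 100   # assumed average width of a lineitem row
MARKER_BYTES = 1        # a deletion marker is tiny compared to a full row

total_rows = 6_000_000            # lineitem at SF1, roughly
deleted = total_rows // 2
survivors = total_rows - deleted

old_bytes = deleted * MARKER_BYTES        # v0.9.2: tombstones only
new_bytes = survivors * ROW_WIDTH_BYTES   # this PR: rewrite all survivors

print(new_bytes // old_bytes)  # rewrite moves ~100x more data in this model
```

The model overstates the measured ~15x factor (real checkpoints have fixed costs and compression), but it shows why rewriting survivors dominates writing markers.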

On the flip side, because deletes are cleaned up, queries on the table afterwards run faster, since we no longer need to scan deleted rows during execution. Running TPC-H Q1 on the lineitem table after the deletion gives the following performance:

| v0.9.2 | New    |
| ------ | ------ |
| 0.035s | 0.028s |

The more deletions we run, the worse the performance impact of leaving them around. For example, with the following chain of deletions/insertions we get the following timings:

CALL dbgen(sf=0);
COPY lineitem FROM 'lineitem.parquet';
DELETE FROM lineitem WHERE l_orderkey%2=0;
COPY lineitem FROM 'lineitem.parquet';
DELETE FROM lineitem WHERE l_orderkey%2=0;
COPY lineitem FROM 'lineitem.parquet';
DELETE FROM lineitem WHERE l_orderkey%2=0;
COPY lineitem FROM 'lineitem.parquet';
DELETE FROM lineitem WHERE l_orderkey%2=0;
COPY lineitem FROM 'lineitem.parquet';
DELETE FROM lineitem WHERE l_orderkey%2=0;
COPY lineitem FROM 'lineitem.parquet';
DELETE FROM lineitem WHERE l_orderkey%2=0;
|            | v0.9.2 | New    |
| ---------- | ------ | ------ |
| Load       | 6.41s  | 25.18s |
| Q01 (Cold) | 0.172s | 0.095s |
| Q01 (Hot)  | 0.125s | 0.091s |
| Size       | 977MB  | 686MB  |

While cleaning up deletes will always make checkpointing more expensive, it is especially problematic now because checkpoints are single-threaded. Parallelizing checkpoints should be addressed in a future PR and will greatly alleviate the performance problem here.

@Mytherin merged commit 3abc5eb into duckdb:main Dec 9, 2023
@szarnyasg added the "Needs Documentation" label Dec 9, 2023
szarnyasg commented Dec 9, 2023

@Mytherin I'll document this. When exactly is vacuuming deletes triggered?

  • The examples above seem to rely on automatic checkpointing (which happens based on the WAL size, see docs).
  • I'd assume an explicit CHECKPOINT call also triggers it.
  • How about VACUUM?

@Mytherin

Vacuuming deletes is triggered as part of a checkpoint (automatic or otherwise). The VACUUM statement does not trigger this. Note that this does not remove all deletes, but rather merges row groups that have a significant amount of deletes together. In the current implementation this requires ~25% of rows to be deleted in adjacent row groups.

@szarnyasg

Thanks!

krlmlr added a commit to duckdb/duckdb-r that referenced this pull request Dec 14, 2023
Merge pull request duckdb/duckdb#9931 from Mytherin/vacuumdeletes
@Mytherin mentioned this pull request Dec 14, 2023
@Mytherin deleted the vacuumdeletes branch Feb 14, 2024