@esotericnonsense (Contributor) commented Sep 17, 2017

Partial fix for issue #11315.

Every prune event flushes the dbcache to disk.
By default this happens roughly every 160 MiB, so high dbcache values are negated and IBD takes far longer than it would with pruning disabled.

This change adds a 'high water mark' for pruning, so that the actual size of the blk/rev files on disk can grow a reasonable amount beyond the target before a flush is forced.
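The trigger can be sketched as follows. This is a minimal illustration with hypothetical names, not the actual C++ change; the real patch works on the blk/rev file sizes inside the pruning logic:

```python
def should_prune(actual_mib: float, hwm_mib: float) -> bool:
    """Decide whether a prune (and the chainstate flush it forces) runs.

    Without a high water mark, any excess over the prune target triggers
    pruning; with one, the blk/rev files may grow up to hwm_mib first.
    """
    return actual_mib >= hwm_mib

# Matching the log lines below (target=550, hwm=3540):
should_prune(3510, 3540)  # False: 0 blk/rev pairs removed, no flush
should_prune(3545, 3540)  # True: prune back toward the target, flush once
```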

On a machine with prune=550 and dbcache=3000:

2017-09-17 22:04:56 Prune: target=550MiB hwm=3540MiB actual=3510MiB diff=-2960MiB max_prune_height=292477 removed 0 blk/rev pairs
2017-09-17 22:04:56 Prune: target=550MiB hwm=3540MiB actual=3516MiB diff=-2966MiB max_prune_height=292499 removed 0 blk/rev pairs
2017-09-17 22:04:57 Prune: target=550MiB hwm=3540MiB actual=468MiB diff=81MiB max_prune_height=292537 removed 21 blk/rev pairs
2017-09-17 22:04:57 Prune: UnlinkPrunedFiles deleted blk/rev (00103)
...

I haven't changed the 'diff' column in the debug log (it could perhaps be hwm - actual rather than target - actual).

I'm not sure whether this could increase disk space requirements in some cases; it may need documentation. With a very high dbcache value, if, say, 10 GiB of blocks come in that produce only 2 GiB of chainstate, you'd overshoot quite a bit, I think. It's a tradeoff: more frequent flushing means slower IBD.

Thanks to sipa and gmaxwell for helping out on IRC.


esotericnonsense commented Sep 18, 2017

Benchmarks, syncing against a localhost node. Sending node on HDD, syncing node on SSD. Clock starts at UpdateTip height=1. prune=550 dbcache=3000.

+--------+----------+------------------+-----------------+
| height | unpruned | pruned (this PR) | pruned (master) |
+--------+----------+------------------+-----------------+
| 250000 |     427s |             593s |            724s |
| 300000 |     916s |            1076s |           1402s |
| 350000 |    1443s |            1979s |           2707s |
+--------+----------+------------------+-----------------+

At height 350000, this PR results in a 529 MiB dbcache vs. a 2646 MiB dbcache unpruned.
The pruned node (on master) ends up with approx. 30 MiB.
110 seconds should be added for serializing the dbcache on the unpruned node's shutdown (the other two cases took single-digit seconds).

The final result is that the node can sync to height 350000 27% faster than without the PR, by giving the prune target ~3 GiB of leeway. I didn't want to spend the time syncing to the tip, but I suspect the results would be similar or better.
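For reference, the 27% figure follows directly from the benchmark table; this is just an arithmetic check:

```python
# Sync times to height 350000 from the table above, in seconds.
unpruned, pruned_pr, pruned_master = 1443, 1979, 2707

# Fractional improvement of this PR over pruned master.
speedup_vs_master = (pruned_master - pruned_pr) / pruned_master
print(f"{speedup_vs_master:.0%}")  # 27%
```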

As in the above post, this is only a 'partial fix' because the effective dbcache is still, empirically, far lower than the configured value.

edit: Ah yes, space requirements. In this test the chainstate folder's final size is ~1 GiB and pruning is allowed to overshoot by ~3 GiB, so the maximum disk space requirement rises by ~2 GiB in this example.


esotericnonsense commented Sep 18, 2017

An additional performance gain could be obtained by tying this high water mark to a percentage of the prune target.

For example, with prune=100000 you could let the data grow to 100 GB × 1.10 before pruning, or cap it at 100 GB and prune down to 100 GB × 0.90 (similar effect on the dbcache in either case).
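The two options can be written out as a quick sketch (hypothetical helper names; integer MiB arithmetic keeps the numbers exact):

```python
def hwm_overshoot_mib(target_mib: int, pct: int = 10) -> int:
    # Option A: let blk/rev grow to target * (1 + pct/100) before pruning.
    return target_mib * (100 + pct) // 100

def prune_floor_mib(target_mib: int, pct: int = 10) -> int:
    # Option B: never exceed the target, but prune down to
    # target * (1 - pct/100) so prunes (and flushes) happen less often.
    return target_mib * (100 - pct) // 100

hwm_overshoot_mib(100_000)  # 110000 MiB: may reach ~110 GB before pruning
prune_floor_mib(100_000)    # 90000 MiB: prune down to ~90 GB each time
```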

Looking at the documentation in -help:

automatically prune block files to stay under the specified target size in MiB

so the 'remain below' option probably makes more sense, but it retains the far slower IBD behaviour at low prune levels.

@sdaftuar

No strong feelings from me, but when we worked on the pruning implementation our goal was for the target to be something achievable. So if we decide it's worth exceeding it intentionally (e.g. for performance reasons during IBD), we should remember to communicate that clearly to users.

But now that we in theory support non-atomic flushes, perhaps we can use that to flush less often during IBD even while we prune.


luke-jr commented Nov 10, 2017

Indeed, users expect that if they set prune=5000, the blockchain size remains <= 5000 MB. Perhaps it would make sense to have a prune-extra option specifying additional space to free when pruning, reducing its frequency (we could default it to 10% or something).
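A sketch of that suggestion (hypothetical names; prune-extra was not an actual option at the time of this comment):

```python
def prune_goal_mib(prune_target_mib: int, prune_extra_mib: int) -> int:
    # Each prune frees prune_extra_mib below the target, deferring the
    # next prune (and its forced flush), while the on-disk blk/rev size
    # never exceeds the user's prune target.
    return prune_target_mib - prune_extra_mib

# With the suggested 10% default and prune=5000:
prune_goal_mib(5000, 5000 // 10)  # prune down to 4500 MiB
```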


Sjors commented Feb 10, 2018

See also #12404.

sipa added a commit that referenced this pull request Jul 14, 2018

ac51a26 During IBD, when doing pruning, prune 10% extra to avoid pruning again soon after (Luke Dashjr)

Pull request description:

  Pruning forces a chainstate flush, which can defeat the dbcache and harm performance significantly.

  Alternative to #11359

Tree-SHA512: 631e4e8f94f5699e98a2eff07204aa2b3b2325b2d92e8236b8c8d6a6730737a346e0ad86024e705f5a665b25e873ab0970ce7396740328a437c060f99e9ba4d9
@DrahtBot

Needs rebase


Sjors commented Jul 14, 2018

This is superseded by #11658 which was just merged.


maflcko commented Jul 14, 2018

Closing for now as per @Sjors

@maflcko maflcko closed this Jul 14, 2018
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jul 19, 2020

ac51a26 During IBD, when doing pruning, prune 10% extra to avoid pruning again soon after (Luke Dashjr)

Pull request description:

  Pruning forces a chainstate flush, which can defeat the dbcache and harm performance significantly.

  Alternative to bitcoin#11359

Tree-SHA512: 631e4e8f94f5699e98a2eff07204aa2b3b2325b2d92e8236b8c8d6a6730737a346e0ad86024e705f5a665b25e873ab0970ce7396740328a437c060f99e9ba4d9
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jul 24, 2020
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jul 27, 2020
UdjinM6 pushed a commit to UdjinM6/dash that referenced this pull request Jul 27, 2020
UdjinM6 pushed a commit to UdjinM6/dash that referenced this pull request Jul 27, 2020
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Dec 16, 2021