index: initial sync speedup, parallelize process #26966

furszy · 2023-01-25T13:36:23Z

The current procedure for building the block filter index involves processing filters one at a time;
Reading blocks, undo data, and previous headers from disk sequentially.

This PR introduces a new mechanism to perform the work concurrently. Dividing the filters
generation workload among a pool of workers that can be configured by the user,
significantly increasing the speed of the index construction process.

The same concurrent processing model has been applied to the transactions index as well.

The newly introduced init flag -indexworkers=<n> enables the concurrent sync
behavior.
Where "n" is the number of worker threads that will be spawned at startup to create ranges
of block filters during the initial sync process. Destroying the workers pool once the
initial sync completes.
Note: by default, the parallelized sync process is not enabled.

Now the juicy part:
In my computer, with the node in debug mode and on IBD, with -indexworkers=4, the
block filter index generation took less than an hour. While, in master, the sync took more than 7 hours.

Important Note:
As the access to the block data on disk is protected by cs_main, this new feature runs substantially
faster when the node is not in IBD.

Testing Notes:

Sync your node without any index.
Restart the node with one of the indexes (-blockfilterindex or -txindex) and -connect=0 (to sync only the index, without running the net/validation threads. Since threads won't be competing for cs_main, this will give you a more accurate result).

You’ll see a "[index name] is enabled at height [height]" log entry once it finishes. Then it’s just a matter of subtracting the index startup time from the "index synced" log time.

Keep in mind that threads are shared among all indexes you start. So if you run both indexes at the same time, your benchmark results cannot be compared against single-index runs.

DrahtBot · 2023-01-25T13:36:26Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/26966.

Reviews

See the guideline for information on the review process.

Type	Reviewers
Concept ACK	w0xlt, TheCharlatan, ismaelsadeeq
Approach ACK	ryanofsky
Stale ACK	pinheadmz

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#32997 (index: Deduplicate HashKey / HeightKey handling by mzumsande)
#32541 (index: store per-block transaction locations for efficient lookups by romanz)
#31308 (ci, iwyu: Treat warnings as errors for specific directories by hebasto)
#29770 (index: Check all necessary block data is available before starting to sync by fjahr)
#17783 (common: Disallow calling IsArgSet() on ALLOW_LIST options by ryanofsky)
#17581 (refactor: Remove settings merge reverse precedence code by ryanofsky)
#17580 (refactor: Add ALLOW_LIST flags and enforce usage in CheckArgFlags by ryanofsky)
#17493 (util: Forbid ambiguous multiple assignments in config file by ryanofsky)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

LLM Linter (✨ experimental)

Possible typos and grammar issues:

“the error will be already be” → “the error will already be” [duplicate “be” makes the sentence invalid]

^{drahtbot_id_4_m}

Sjors · 2023-01-26T14:38:27Z

Cool, will take it for a spin...

furszy · 2023-01-28T15:18:54Z

Cool @Sjors, just pushed a small update. Found a little bug.

Going to add an important note to the PR description (because otherwise testing results will vary a lot):

As the access to the block data on disk is protected by cs_main, this new feature runs substantially
faster when the node is not in IBD.
(where "substantially" here means full index sync, with 5 workers, in less than 20 minutes in my computer).

mzumsande

It's probably much easier suggested than done, but did you attempt to implement parallelization in a more general way so that other indices could benefit from it as well? On first glance, txindex, and the indices suggested in PRs (#24539, #26951) seem to be parallelizable as well (not coinstatsindex though).

furszy · 2023-01-30T13:37:06Z

It's probably much easier suggested than done, but did you attempt to implement parallelization in a more general way so that other indices could benefit from it as well?

Yeah, that is part of the plan. I started with the block filter index because it requires an special treatment that txindex parallelization does not require (block/undo data reading and block filters creation can be parallelized but writing must be done sequentially due the need to link filter headers to their predecessors to create the filters-chain on disk).

My idea was to start reviewing this one, so the process gets as clean as possible, and then move forward with the generalization step. It's usually more natural to abstract processes when the specific cases are well-defined.

w0xlt

Concept ACK

Perhaps it could have separate parallelization and index logic. So it could be reused in other indexes.

Sjors · 2023-02-07T13:47:32Z

Building the index on (AMD Ryzen 7950X, blocks stored on SSD):

master @ fe86616: 35'15" (mostly 1 alternating CPU thread)
this PR (rebased)
- n=8: 5'20" (uses about 8 of 32 CPU threads as expected)
- n=32: 4'26" (pleasantly close to 100% CPU usage with a dip every 10 seconds, but it drops to only 1 CPU in the last minute or two)

I made sure to not load any wallets and disabled other indexes.

I didn't test if the index was correct.

I wonder if, for users without this index, it would be faster to generate the index, rescan the wallet and then delete it again. Combined with #26951 you would only have to generate filters up to the age of the wallet (IIUC, cc @pstratem).

Note to self for future benchmarks: -txindex takes 1:07'14", -coinstatsindex takes 3.5 hours.

furszy · 2023-02-07T14:18:36Z

Great results @Sjors!.

Could also give it a run rebased on top #27006.
On master, the index initial sync is slower when the node is in IBD because the index thread has to compete for access to block data on disk through cs_main acquisition.

I didn't test if the index was correct.

The PR contain a test verifying it.

Side note:
I'm working on generalizing the parallelization flow so other indexes, like the txindex and #26951 can make use of it too.

pinheadmz

code review ACK 068537a

Built and tested on macos/arm64 as well as Debian/x86. Left several questions and suggestions. I really like ThreadPool as a util and look forward to using it for http workers as well, to share the code.

I'm currently rebuilding block filter and tx indexes with this branch to compare against master, which after 48 hours was only about 70% complete...

Show Signature

pinheadmz's public key is on openpgp.org

src/test/threadpool_tests.cpp

pinheadmz · 2025-07-24T15:21:37Z

src/test/threadpool_tests.cpp

+                counter.fetch_add(1);
+            });
+        }
+        std::this_thread::sleep_for(std::chrono::milliseconds{100});


d2138be

is this sleep to give tasks a chance to execute if the blocking breaks?

d2138be

is this sleep to give tasks a chance to execute if the blocking breaks?

Good eye.
IIRC, I added it to wait until the workers actually get blocked. Otherwise, the thread pool queue size would be greater than the expected value (because it did not consumed the blocking tasks).
Still, we could remove this by adding some ready_promises, as did in test 2. A bit more code but less fragile.

src/test/threadpool_tests.cpp

src/init.cpp

src/index/base.cpp

src/index/base.h

src/index/blockfilterindex.cpp

pinheadmz · 2025-07-25T17:15:28Z

src/index/base.h

+    /// async result batches in a synchronous fashion (if required).
+    [[nodiscard]] virtual std::any CustomProcessBlock(const interfaces::BlockInfo& block_info) {
+        // If parallel sync is enabled, the child class must implement this method.
+        if (AllowParallelSync()) return std::any();


76aa1c2

I wonder if this condition should be a bit more attention-getting, since it should never execute right? Either log something or Assume() for the benefit of future index developers?

76aa1c2

I wonder if this condition should be a bit more attention-getting, since it should never execute right? Either log something or Assume() for the benefit of future index developers?

Hmm, or we could remove this line and call CustomAppend by default. Then it would be up to the child index to decide whether it needs sequential ordering during parallel sync or not. This way, the tx and BIP352 indexes would require fewer lines of code.

src/util/threadpool.h

pinheadmz · 2025-07-27T11:01:26Z

Did a rough benchmark test. Froze a full node at height 906551 by restarting with -noconnect and also -txindex -blockfilterindex. First test was from master, after 48 hours I aborted the process after reaching only:

$ bitcoin-cli getindexinfo
{
  "txindex": {
    "synced": false,
    "best_block_height": 770289
  },
  "basic block filter index": {
    "synced": true,
    "best_block_height": 906551
  }
}

Running this PR (restarting with empty indexes/), both indexing operations were complete after about 16 hours:

2025-07-25T18:12:46Z txindex thread start
2025-07-25T18:12:46Z basic block filter index thread start
...
2025-07-25T19:48:16Z basic block filter index thread exit
2025-07-26T10:14:21Z txindex thread exit

{
  "txindex": {
    "synced": true,
    "best_block_height": 906551
  },
  "basic block filter index": {
    "synced": true,
    "best_block_height": 906551
  }
}

pinheadmz

re-ACK 9bba5ef

Changes since last review are minimal responses to my own review suggestions. Built on macos/arm64 and debian/x86. Ran functional and unit tests, ran with -indexworkers=16 on mainnet fullnode with >900000 blocks

Show Signature

pinheadmz's public key is on openpgp.org

pinheadmz · 2025-07-29T14:50:53Z

src/util/threadpool.h

+
+        // Create workers
+        for (int i = 0; i < num_workers; i++) {
+            m_workers.emplace_back(&util::TraceThread, m_name + "_threadpool_worker_" + util::ToString(i), [this] { WorkerThread(); });


This is bike shedding now so ok to ignore -- but when we set thread names on Linux they are truncated to 15 characters. So I dunno I guess "thread" and "worker" are redundant in the name of a thread?

bitcoin/src/util/threadnames.cpp

Lines 22 to 27 in 2f410ad

//! internal name.

static void SetThreadName(const char* name)

{

#if defined(PR_SET_NAME)

// Only the first 15 characters are used (16 - NUL terminator)

::prctl(PR_SET_NAME, name, 0, 0, 0);

…ations by caching last header 99afb9d refactor: init, simplify index shutdown code (furszy) 0faafb5 index: decrease ThreadSync cs_main contention (furszy) f1469eb index: cache last block filter header (furszy) a6756ec index: blockfilter, decouple header lookup into its own function (furszy) 331f044 index: blockfilter, decouple Write into its own function (furszy) bcbd7eb bench: basic block filter index initial sync (furszy) Pull request description: Work decoupled from bitcoin#26966 per request. The aim is to remove an unnecessary disk read operation that currently takes place with every new arriving block (or scanned block during background sync). Instead of reading the last filter header from disk merely to access its hash for constructing the next filter, this work caches it, occupying just 32 more bytes in memory. Also, reduces `cs_main` lock contention during the index initial sync process. And, simplifies the indexes initialization and shutdown procedure. Testing Note: To compare the changes, added a pretty basic benchmark in the second commit. Alternatively, could also test the changes by timing the block filter sync from scratch on any network; start the node with `-blockfilterindex` and monitor the logs until the syncing process finish. Local Benchmark Results: *Master (c252a0f): | ns/op | op/s | err% | total | benchmark |--------------------:|--------------------:|--------:|----------:|:---------- | 132,042,516.60 | 7.57 | 0.3% | 7.79 | `BlockFilterIndexSync` *PR (43a212c): | ns/op | op/s | err% | total | benchmark |--------------------:|--------------------:|--------:|----------:|:---------- | 126,915,841.60 | 7.88 | 0.6% | 7.51 | `BlockFilterIndexSync` ACKs for top commit: Sjors: re-ACK 99afb9d achow101: ACK 99afb9d TheCharlatan: Re-ACK 99afb9d andrewtoth: ACK 99afb9d Tree-SHA512: 927daadd68f4ee1ca781a89519539b895f5185a76ebaf525fbc246ea8dcf40d44a82def00ac34b188640802844b312270067f1b33e65a2479e06be9169c616de # Conflicts: # src/index/base.cpp # src/index/base.h # src/index/blockfilterindex.cpp

DrahtBot · 2025-08-08T00:19:14Z

🚧 At least one of the CI tasks failed.
_{Task TSan, depends, no gui: https://github.com/bitcoin/bitcoin/runs/47637510992}
_{LLM reason (✨ experimental): The CI failure is due to the threadpool_tests timing out.}

Hints

Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:

Possibly due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the
affected test.
An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

…ations by caching last header 99afb9d refactor: init, simplify index shutdown code (furszy) 0faafb5 index: decrease ThreadSync cs_main contention (furszy) f1469eb index: cache last block filter header (furszy) a6756ec index: blockfilter, decouple header lookup into its own function (furszy) 331f044 index: blockfilter, decouple Write into its own function (furszy) bcbd7eb bench: basic block filter index initial sync (furszy) Pull request description: Work decoupled from bitcoin#26966 per request. The aim is to remove an unnecessary disk read operation that currently takes place with every new arriving block (or scanned block during background sync). Instead of reading the last filter header from disk merely to access its hash for constructing the next filter, this work caches it, occupying just 32 more bytes in memory. Also, reduces `cs_main` lock contention during the index initial sync process. And, simplifies the indexes initialization and shutdown procedure. Testing Note: To compare the changes, added a pretty basic benchmark in the second commit. Alternatively, could also test the changes by timing the block filter sync from scratch on any network; start the node with `-blockfilterindex` and monitor the logs until the syncing process finish. Local Benchmark Results: *Master (c252a0f): | ns/op | op/s | err% | total | benchmark |--------------------:|--------------------:|--------:|----------:|:---------- | 132,042,516.60 | 7.57 | 0.3% | 7.79 | `BlockFilterIndexSync` *PR (43a212c): | ns/op | op/s | err% | total | benchmark |--------------------:|--------------------:|--------:|----------:|:---------- | 126,915,841.60 | 7.88 | 0.6% | 7.51 | `BlockFilterIndexSync` ACKs for top commit: Sjors: re-ACK 99afb9d achow101: ACK 99afb9d TheCharlatan: Re-ACK 99afb9d andrewtoth: ACK 99afb9d Tree-SHA512: 927daadd68f4ee1ca781a89519539b895f5185a76ebaf525fbc246ea8dcf40d44a82def00ac34b188640802844b312270067f1b33e65a2479e06be9169c616de

ismaelsadeeq · 2025-08-12T11:19:15Z

Fwiw I encountered a segfault previously while testing on master on previous PR HEAD 9bba5ef

I ran the node on mainnet with -noconnect -txindex -blockfilterindex -indexworkers=4

Although I was not able to later reproduce the crash it seems to be happen intermittently cc @furszy

furszy

Although I was not able to later reproduce the crash it seems to be happen intermittently

Thanks for the report! Check it now if you can. There was a very subtle bug on which the worker threads might have accessed an index Sync() local variable post-destruction.

And add option to customize thread pool workers count

It also adds coverage for initial sync from a particular block. Mimicking a node restart.

DrahtBot · 2025-08-20T11:26:41Z

🚧 At least one of the CI tasks failed.
_{Task MSan, depends: https://github.com/bitcoin/bitcoin/runs/48475050585}
_{LLM reason (✨ experimental): The CI failure is caused by a test timeout due to the index not reaching the expected height within the allowed time.}

Hints

Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:

Possibly due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the
affected test.
An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

DrahtBot added the UTXO Db and Indexes label Jan 25, 2023

furszy force-pushed the 2022_parallelize_blockfilter_index_2 branch 3 times, most recently from deb2e6e to d8b0f5e Compare January 25, 2023 16:29

furszy force-pushed the 2022_parallelize_blockfilter_index_2 branch from d8b0f5e to 65dc850 Compare January 28, 2023 15:04

furszy force-pushed the 2022_parallelize_blockfilter_index_2 branch from 65dc850 to 09cc56a Compare January 28, 2023 15:40

mzumsande reviewed Jan 29, 2023

View reviewed changes

furszy force-pushed the 2022_parallelize_blockfilter_index_2 branch from 09cc56a to 43e6237 Compare January 30, 2023 16:15

w0xlt reviewed Jan 30, 2023

View reviewed changes

furszy force-pushed the 2022_parallelize_blockfilter_index_2 branch 2 times, most recently from b2b62f5 to 4e1fba1 Compare January 30, 2023 20:55

furszy mentioned this pull request Jan 31, 2023

reduce cs_main scope, guard block index 'nFile' under a local mutex #27006

Closed

DrahtBot added the Needs rebase label Feb 17, 2023

furszy force-pushed the 2022_parallelize_blockfilter_index_2 branch 2 times, most recently from 1da78b2 to f264ed5 Compare February 21, 2023 14:01

DrahtBot removed the Needs rebase label Feb 21, 2023

pinheadmz approved these changes Jul 25, 2025

View reviewed changes

DrahtBot requested review from w0xlt, TheCharlatan, ryanofsky and ismaelsadeeq July 25, 2025 20:02

furszy force-pushed the 2022_parallelize_blockfilter_index_2 branch from 068537a to 9bba5ef Compare July 25, 2025 21:35

pinheadmz approved these changes Jul 29, 2025

View reviewed changes

furszy force-pushed the 2022_parallelize_blockfilter_index_2 branch from 9bba5ef to b349fba Compare August 7, 2025 22:43

DrahtBot added the CI failed label Aug 8, 2025

furszy force-pushed the 2022_parallelize_blockfilter_index_2 branch 2 times, most recently from 647f6f0 to f6a52c6 Compare August 8, 2025 02:33

DrahtBot removed the CI failed label Aug 8, 2025

furszy force-pushed the 2022_parallelize_blockfilter_index_2 branch from f6a52c6 to 49efddf Compare August 12, 2025 20:25

furszy commented Aug 12, 2025

View reviewed changes

furszy mentioned this pull request Aug 19, 2025

Indexes stuck on unknown best block after unclean shutdown #33208

Open

DrahtBot added the Needs rebase label Aug 19, 2025

furszy added 5 commits August 20, 2025 05:17

util: introduce general purpose thread pool

f56a416

init: provide thread pool to indexes

ed8cb94

And add option to customize thread pool workers count

index: implement index parallel sync

dcbde84

index: enable block filter index parallel sync

2b0279e

It also adds coverage for initial sync from a particular block. Mimicking a node restart.

txindex: enable parallel sync

69792b6

furszy force-pushed the 2022_parallelize_blockfilter_index_2 branch from 49efddf to 69792b6 Compare August 20, 2025 09:26

DrahtBot added the CI failed label Aug 20, 2025

DrahtBot removed the Needs rebase label Aug 20, 2025

	//! internal name.
	static void SetThreadName(const char* name)
	{
	#if defined(PR_SET_NAME)
	// Only the first 15 characters are used (16 - NUL terminator)
	::prctl(PR_SET_NAME, name, 0, 0, 0);

index: initial sync speedup, parallelize process #26966

Are you sure you want to change the base?

index: initial sync speedup, parallelize process #26966

Conversation

furszy commented Jan 25, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Testing Notes:

Uh oh!

DrahtBot commented Jan 25, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Coverage & Benchmarks

Reviews

Conflicts

LLM Linter (✨ experimental)

Uh oh!

Sjors commented Jan 26, 2023

Uh oh!

furszy commented Jan 28, 2023

Uh oh!

mzumsande left a comment

Choose a reason for hiding this comment

Uh oh!

furszy commented Jan 30, 2023

Uh oh!

w0xlt left a comment

Choose a reason for hiding this comment

Uh oh!

Sjors commented Feb 7, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

furszy commented Feb 7, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pinheadmz left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

pinheadmz Jul 24, 2025

Choose a reason for hiding this comment

Uh oh!

furszy Jul 25, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

pinheadmz Jul 25, 2025

Choose a reason for hiding this comment

Uh oh!

furszy Jul 25, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

pinheadmz commented Jul 27, 2025

Uh oh!

pinheadmz left a comment

Choose a reason for hiding this comment

Uh oh!

pinheadmz Jul 29, 2025

Choose a reason for hiding this comment

Uh oh!

DrahtBot commented Aug 8, 2025

Uh oh!

ismaelsadeeq commented Aug 12, 2025

Uh oh!

furszy left a comment

Choose a reason for hiding this comment

Uh oh!

DrahtBot commented Aug 20, 2025

Uh oh!

Uh oh!

furszy commented Jan 25, 2023 •

edited

Loading

DrahtBot commented Jan 25, 2023 •

edited

Loading

Sjors commented Feb 7, 2023 •

edited

Loading

furszy commented Feb 7, 2023 •

edited

Loading

pinheadmz left a comment •

edited

Loading