Skip to content

Conversation

martinus
Copy link
Contributor

@martinus martinus commented Jan 25, 2021

In #11281, commit ccd8ef6 the LOCK(cs_main) scope was reduced for the wallet. This does the same for the rest API, which is a huge performance boost for rest_block call when used from multiple threads.

I'm not 100% sure though if this is allowed here...

My test setup, on an Intel i7 with 6 cores (12 threads), locked to 3.2GHz:

  1. start a fully synced bitcoind, with this bitcoin.conf:
    server=1
    rest=1
    rpcport=8332
    rpcthreads=12
    rpcworkqueue=64
    txindex=1
    dbcache=2000
    
  2. Wait until log message shows loadblk thread exit, so that bitcoind is idle.
  3. Run ApacheBench: 10000 requests, 12 parallel threads, fetching block nr. 600000 in binary:
    ab -n 10000 -c 12 "http://127.0.0.1:8332/rest/block/00000000000000000007316856900e76b4f7a9139cfbfba89842c8d196cd5f91.bin"
    

Time per request (mean)

  • 97.434 [ms] on master
  • 20.431 [ms] this branch

So this can process 4.8 times as many requests, and saturates all my cores instead of keeping them partly idle waiting in the lock.

In bitcoin#11281, commit ccd8ef6 the `LOCK(cs_main)` scope was reduced for
the wallet. This does the same for the rest API, which is a huge
performance boost for `rest_block` call when used from multiple threads.

My test setup, on an Intel i7 with 6 cores (12 threads), locked to 3.2GHz:

1. start a fully synced bitcoind, with this `bitcoin.conf`:
   ```
   server=1
   rest=1
   rpcport=8332
   rpcthreads=12
   rpcworkqueue=64
   txindex=1
   dbcache=2000
   ```
2. Wait until log message shows `loadblk thread exit`, so that bitcoind
   is idle.
3. Run ApacheBench: 10000 requests, 12 parallel threads, fetching block
   nr. 600000 in binary:
   ```
   ab -n 10000 -c 12 "http://127.0.0.1:8332/rest/block/00000000000000000007316856900e76b4f7a9139cfbfba89842c8d196cd5f91.bin"
   ```
Requests per second:
* 97.434 [ms] (mean) on master
* 20.431 [ms] this branch

So this can process about 5 times as many requests, and saturates all my
cores instead of keeping them idle waiting in the lock.
Copy link
Contributor

@promag promag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK, looks like same approach can be used in other places.

}

if (!ReadBlockFromDisk(block, pblockindex, Params().GetConsensus()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to call with CDiskBlockPos, not with CBlockIndex*.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only moved the two lines out of the lock scope, the overload for ReadBlockFromDisk does that (I think CDiskBlockPos was renamed to FlatFilePos in 65a489e)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(While touching this) since the Apple "goto fail" vulnerability (https://dwheeler.com/essays/apple-goto-fail), we add brackets for conditionals of more than one line

-    if (!ReadBlockFromDisk(block, pblockindex, Params().GetConsensus()))
+    if (!ReadBlockFromDisk(block, pblockindex, Params().GetConsensus())) {
         return RESTERR(req, HTTP_NOT_FOUND, hashStr + " not found");
+    }

@jonatack
Copy link
Member

Concept ACK, will review soon.

@maflcko
Copy link
Member

maflcko commented Mar 17, 2021

See also #17161

Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 62f1a5b

Maybe here in zmqpublishnotifier.cpp as well, IIUC, since the lock is taken in ReadBlockFromDisk()

bool ReadBlockFromDisk(CBlock& block, const CBlockIndex* pindex, const Consensus::Params& consensusParams)
{
    FlatFilePos blockPos;
    {
        LOCK(cs_main);
        blockPos = pindex->GetBlockPos();
    }
diff --git a/src/zmq/zmqpublishnotifier.cpp b/src/zmq/zmqpublishnotifier.cpp
index 6ae866cc07..10893b3291 100644
--- a/src/zmq/zmqpublishnotifier.cpp
+++ b/src/zmq/zmqpublishnotifier.cpp
@@ -206,7 +206,6 @@ bool CZMQPublishRawBlockNotifier::NotifyBlock(const CBlockIndex *pindex)
     const Consensus::Params& consensusParams = Params().GetConsensus();
     CDataStream ss(SER_NETWORK, PROTOCOL_VERSION | RPCSerializationFlags());
     {
-        LOCK(cs_main);
         CBlock block;
         if(!ReadBlockFromDisk(block, pindex, consensusParams))

}

if (!ReadBlockFromDisk(block, pblockindex, Params().GetConsensus()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(While touching this) since the Apple "goto fail" vulnerability (https://dwheeler.com/essays/apple-goto-fail), we add brackets for conditionals of more than one line

-    if (!ReadBlockFromDisk(block, pblockindex, Params().GetConsensus()))
+    if (!ReadBlockFromDisk(block, pblockindex, Params().GetConsensus())) {
         return RESTERR(req, HTTP_NOT_FOUND, hashStr + " not found");
+    }

Copy link
Contributor

@theStack theStack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code-review ACK 62f1a5b

Happy to also rereview if you decide to pick up jonatack's suggestion: #21006 (comment)

@DrahtBot
Copy link
Contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #25168 (refactor: Avoid passing params where not needed by MarcoFalke)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@DrahtBot
Copy link
Contributor

🐙 This pull request conflicts with the target branch and needs rebase.

Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a "draft".

@achow101
Copy link
Member

Are you still working on this?

@martinus
Copy link
Contributor Author

@achow101 nope, I don't have the time/interest any more for this

@fanquake
Copy link
Member

Picked up in #26308.

maflcko pushed a commit that referenced this pull request Dec 8, 2022
…any requests per second

d7f61e7 rpc: reduce LOCK(cs_main) scope in gettxoutproof (Andrew Toth)
4d92b5a rpc: reduce LOCK(cs_main) scope in GetUndoChecked and getblockstats (Andrew Toth)
efd82ae rpc: reduce LOCK(cs_main) scope in blockToJSON (Andrew Toth)
f00808e rpc: reduce LOCK(cs_main) scope in GetBlockChecked and getblock (Andrew Toth)
7d253c9 zmq: remove LOCK(cs_main) from NotifyBlock (Andrew Toth)
c75e3d2 rest: reduce LOCK(cs_main) scope in rest_block (Andrew Toth)

Pull request description:

  Picking up from #21006.

  After commit ccd8ef6 it is no longer required to hold `cs_main` when calling `ReadBlockFromDisk`. This can be verified in `master` at https://github.com/bitcoin/bitcoin/blob/master/src/node/blockstorage.cpp#L755. Same can be seen for `UndoReadFromDisk` https://github.com/bitcoin/bitcoin/blob/master/src/node/blockstorage.cpp#L485.

  The first commit moves `ReadBlockFromDisk` outside the lock scope in `rest_block`, where we can see a huge performance improvement when fetching blocks with multiple threads.

  My test setup, on an Intel i7 with 8 cores (16 threads):

  1. Start a fully synced bitcoind, with this `bitcoin.conf`:
  ```
      rest=1
      rpcthreads=16
      rpcworkqueue=64
      rpcuser=user
      rpcpassword=password
  ```
  2. Run ApacheBench: 10000 requests, 16 parallel threads, fetching block nr. 750000 in binary:
  ```
      ab -n 10000 -c 16 "http://127.0.0.1:8332/rest/block/0000000000000000000592a974b1b9f087cb77628bb4a097d5c2c11b3476a58e.bin"
  ```

  Time per request (mean)
      183 ms on master
      30 ms this branch

  So this can process 6.1 times as many requests, and saturates all the cores instead of keeping them partly idle waiting in the lock. With 8 threads the mean times were 90 ms on master and 19 ms on this branch, a speedup of 4.7x.

  Big thanks to martinus for finding this and the original PR.

  The second commit is from a suggestion on the original PR by jonatack to remove the unnecessary `LOCK(cs_main)` in the zmq notifier's `NotifyBlock`.

  I also found that this approach could be applied to rpcs `getblock` (including `verbosity=3`), `getblockstats`, and `gettxoutproof` with similar very good results. The above benchmarks steps need to be modified slightly for RPC. Run the following ApacheBench command with different request data in a file named `data.json`:
  ```
  ab -p data.json -n 10000 -c 16 -A user:password "http://127.0.0.1:8332/"
  ```
  For `getblock`, use the following in `data.json`:
  ```
  {"jsonrpc": "1.0", "id": "curltest", "method": "getblock", "params": ["0000000000000000000592a974b1b9f087cb77628bb4a097d5c2c11b3476a58e"]}
  ```
  master - 184 ms mean request time
  branch - 28 ms mean request time

  For `getblock` with verbosity level 3, use the following in `data.json`:
  ```
  {"jsonrpc": "1.0", "id": "curltest", "method": "getblock", "params": ["0000000000000000000592a974b1b9f087cb77628bb4a097d5c2c11b3476a58e", 3]}
  ```
  This verbosity level fetches an undo file from disk, so it benefits from this approach as well. However, a lot of time is spent serializing to JSON so the performance gain is not as severe.
  master - 818 ms mean request time
  branch - 505 ms mean request time

  For `getblockstats`, use the following in `data.json`:
  ```
  {"jsonrpc": "1.0", "id": "curltest", "method": "getblockstats", "params": ["0000000000000000000592a974b1b9f087cb77628bb4a097d5c2c11b3476a58e", ["minfeerate","avgfeerate"]]}
  ```
  This request used a lock on reading both a block and undo file, so the results are very good.
  master - 244 ms mean request time
  branch - 28 ms mean request time

ACKs for top commit:
  MarcoFalke:
    re-ACK d7f61e7 💫
  hebasto:
    ACK d7f61e7, I have reviewed the code and it looks OK. Did not make benchmarking though.

Tree-SHA512: 305ac945b4571c5f47646d4f0e78180d7a3d40b2f70ee43e4b3e00c96a465f6d0b9c750b8e85c89ed833e557e2cdb5896743f07ef90e4e53d4ad85452b545886
@bitcoin bitcoin locked and limited conversation to collaborators Oct 14, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants