Skip to content

Conversation

furszy
Copy link
Member

@furszy furszy commented Apr 15, 2023

During initial block download, the node's wallet(s) scans every arriving block looking for data that it owns.
This process can be resource-intensive, as it involves sequentially scanning all transactions within each
arriving block.

To avoid wasting processing power, we can skip blocks that occurred before the wallet's creation time,
since these blocks are guaranteed not to contain any relevant wallet data.

This has direct implications (an speed improvement) on the underlying blockchain synchronization process
as well. The reason is that the validation interface queue is limited to 10 tasks per time. This means that no
more than 10 blocks can be waiting for the wallet(s) to be processed while we are synchronizing the chain
(activating the best chain to be more precise).
Which can be a bottleneck if blocks arrive and are processed faster from the network than what they are
processed by the wallet(s).

So, by skipping not relevant blocks in the wallet's IBD scanning process, we will also improve the chain
synchronization time.

@DrahtBot
Copy link
Contributor

DrahtBot commented Apr 15, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK pinheadmz, ishaanam, achow101
Concept ACK pablomartin4btc, MarcoFalke

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #27286 (wallet: Keep track of the wallet's own transaction outputs in memory by achow101)
  • #26836 (wallet: finish addressbook encapsulation and simplify addressbook migration by furszy)
  • #26596 (wallet: Migrate legacy wallets to descriptor wallets without requiring BDB by achow101)
  • #26008 (wallet: cache IsMine scriptPubKeys to improve performance of descriptor wallets by achow101)
  • #25907 (wallet: rpc to add automatically generated descriptors by achow101)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@pinheadmz
Copy link
Member

concept ACK, seems like a great idea.

@@ -83,6 +83,7 @@ struct BlockInfo {
const uint256& hash;
const uint256* prev_hash = nullptr;
int height = -1;
int64_t time = 0;
Copy link
Member

@maflcko maflcko Apr 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to use int64_t, when the block time is denoted in a type-safe NodeSeconds?

Edit: I guess all you need to do is switch from ...::GetBlockTime() to ...::Time()?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mainly to be consistent with the chain interface, wallet and spkm sources. Timestamps are represented as int64_t there. E.g. the FoundBlock class, findFirstBlockWithTimeAndHeight method, the CWallet::m_best_block_time member, first key time, descriptor creation time etc.

I think that is clear that block times are always in seconds, so the use of an arithmetic type shouldn't be much of a worry but we could switch all of them to NodeSeconds too (probably in another PR that does it all at once?).

fanquake added a commit that referenced this pull request Apr 17, 2023
e07dd5f test: fix bumpfee 'spend_one_input' occasional failure (furszy)

Pull request description:

  CI test failure, in master: https://cirrus-ci.com/task/5975232842825728.
  In #27469 https://cirrus-ci.com/task/6452468402356224

  Most of the subtests in `wallet_bumpfee.py` expect to find spendable UTXOs of 0.001 btc in the rbf wallet. They use the `spend_one_input()` method which fails if none of them exist.

  The sporadic failure comes from the recently added `test_feerate_checks_replaced_outputs` subtest that can spend all them. Leaving the next subtests with no 0.001 UTXOs to spend.

  To solve it, this PR moves the recently added case into a "context independent subtests" section.
  Which is placed at the end to not affect other cases.

ACKs for top commit:
  achow101:
    ACK e07dd5f
  pablomartin4btc:
    ACK e07dd5f.

Tree-SHA512: c150ed6fcfbb6f1502eaf6370a57aae0da991c0fc48e8bb3d446805f4336abba5d80ff0de26094914da95432dd0255030fe527001af4510dfdcefbc7230f14d6
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Apr 17, 2023
…ailure

e07dd5f test: fix bumpfee 'spend_one_input' occasional failure (furszy)

Pull request description:

  CI test failure, in master: https://cirrus-ci.com/task/5975232842825728.
  In bitcoin#27469 https://cirrus-ci.com/task/6452468402356224

  Most of the subtests in `wallet_bumpfee.py` expect to find spendable UTXOs of 0.001 btc in the rbf wallet. They use the `spend_one_input()` method which fails if none of them exist.

  The sporadic failure comes from the recently added `test_feerate_checks_replaced_outputs` subtest that can spend all them. Leaving the next subtests with no 0.001 UTXOs to spend.

  To solve it, this PR moves the recently added case into a "context independent subtests" section.
  Which is placed at the end to not affect other cases.

ACKs for top commit:
  achow101:
    ACK e07dd5f
  pablomartin4btc:
    ACK bitcoin@e07dd5f.

Tree-SHA512: c150ed6fcfbb6f1502eaf6370a57aae0da991c0fc48e8bb3d446805f4336abba5d80ff0de26094914da95432dd0255030fe527001af4510dfdcefbc7230f14d6
Copy link
Member

@pinheadmz pinheadmz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did some general code review. I see we already have nCreateTime for descriptors and part of this PR seems to be just bringing that up to the wallet level, so that's not a redundancy.

Also I don't know if this is a problematic edge case or it could just be a good way to test but you could create a key and send BTC to it in a block, then a few blocks later import into a new descriptor and rescan. The wallet should miss the first transaction.

@furszy furszy force-pushed the 2023_wallet_birthtime branch from 4698614 to 9785754 Compare April 18, 2023 14:35
@furszy
Copy link
Member Author

furszy commented Apr 18, 2023

Thanks for the review pinheadmz.

Also I don't know if this is a problematic edge case or it could just be a good way to test but you could create a key and send BTC to it in a block, then a few blocks later import into a new descriptor and rescan. The wallet should miss the first transaction.

This PR is not modifying the rescan process. Only IBD.
Guess that you provided a timestamp that is +2 hours ahead of the block time (network time variability window) or the "now" arg to skip rescan?

@furszy furszy force-pushed the 2023_wallet_birthtime branch from 9785754 to a2d3902 Compare April 18, 2023 14:42
Copy link
Member

@pablomartin4btc pablomartin4btc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK. Light code review, I'd to spend more time with it, which I'll do soon. Good stuff!

@furszy furszy force-pushed the 2023_wallet_birthtime branch 2 times, most recently from 3914f4f to 1a9a349 Compare April 19, 2023 00:29
@furszy furszy force-pushed the 2023_wallet_birthtime branch 2 times, most recently from 7bfe7b4 to 7be71a2 Compare April 20, 2023 21:53
@furszy
Copy link
Member Author

furszy commented Apr 20, 2023

Updated per feedback, thanks achow.

Changes:

  • Replaced BLANK_BIRTH_TIME constant by setting the m_birth_time to its max numeric value.
  • Tackled a block time variability concern by using the maximum time in the chain (a moving forward only timestamp) instead of the raw block time.
  • Extended the grace period to be twice as big as the one used in the rescan process. Even when this is not strictly needed, the rationale is to be a bit more conservative during IBD scanning because it's an automatic process with no user interaction (unlike the rescan process that is usually triggered by an user action such an import).

Copy link
Contributor

@ishaanam ishaanam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approach ACK, this looks good so far but I haven't reviewed the first commit yet.

@furszy furszy force-pushed the 2023_wallet_birthtime branch from 7be71a2 to 76396ca Compare May 23, 2023 21:25
@furszy furszy force-pushed the 2023_wallet_birthtime branch from 7cf50fb to 82bb783 Compare May 25, 2023 13:47
Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK. Haven't reviewed the code.

@pinheadmz
Copy link
Member

This PR is not modifying the rescan process. Only IBD.

sanity check: because we already skip old blocks during rescan?

#26679

@furszy
Copy link
Member Author

furszy commented May 25, 2023

sanity check: because we already skip old blocks during rescan?

#26679

Yes. Different entry points.

#26679 addressed the scanning of the existing chain during load time.
And here, the emphasis shifts to the synchronization process when the
wallet is active.
When the node activates the best chain on each processed block and
signal the "block connected" event.

@pinheadmz
Copy link
Member

To review this PR I wrote a simple functional test: pinheadmz@1f0f945 which surprised me by failing. I added logging and comments to the test to share it because I don't entirely understand what's happening but it looks to me like, the wallet generates a combo(...) descriptor with a timestamp of 1 and therefore the wallet creation time is always 1 and so the new logic doesn't actually skip any blocks. But if that's the case, does that also mean #26679 doesn't work either?

test output:

2023-05-25T23:10:29.244000Z TestFramework (INFO): PRNG seed is: 8288012825723617302
2023-05-25T23:10:29.245000Z TestFramework (INFO): Initializing test directory /tmp/bitcoin_func_test__5pngcq0
2023-05-25T23:10:29.886000Z TestFramework (INFO): Generate 10 blocks with 10-hour timestamp gaps
2023-05-25T23:10:29.927000Z TestFramework (INFO): Initial blockchain sync the wallet's node

not(500.00000000 == 250)

Why does this descriptor have a timestamp of 1?
{'desc': 'combo(022d715e6f0d7c037dbccd3327f0b3bc3c76258a7e017e1a64902f76406c87dfda)#pesd53nq', 'timestamp': 1, 'active': False}

2023-05-25T23:10:31.083000Z TestFramework (INFO): Stopping nodes
2023-05-25T23:10:31.306000Z TestFramework (WARNING): Not cleaning up dir /tmp/bitcoin_func_test__5pngcq0
2023-05-25T23:10:31.306000Z TestFramework (INFO): Tests successful

and then in bitcoin debug.log, this output from CWallet::blockConnected(): m_birth_time.load(): 1. Every time we run that function, the log says wallet birthday is 1

@furszy
Copy link
Member Author

furszy commented May 26, 2023

To review this PR I wrote a simple functional test: pinheadmz@1f0f945 which surprised me by failing. I added logging and comments to the test to share it because I don't entirely understand what's happening but it looks to me like, the wallet generates a combo(...) descriptor with a timestamp of 1 and therefore the wallet creation time is always 1 and so the new logic doesn't actually skip any blocks. But if that's the case, does that also mean #26679 doesn't work either?

The issue is on your test setup. The combo descriptor comes from the test framework default wallet creation process, not the inner wallet.

The framework, after creating the default wallet, imports one of the hardcoded private keys (link) used to receive the coinbases coins (link).

This means calling the test framework importprivkey, which is a proxy function with two possible outcomes:

  1. For legacy wallet, call the RPC command importprivkey: which doesn't receive a timestamp, so creation time is initialized to the minimum one.
  2. For descriptors wallet, call the RPC command importdescriptors with timestamp=0. (which then is changed by the node internally to 1, because of the minimum supported timestamp)

The simplest solution for the test is to create the wallet by yourself, by calling createwallet, instead of using the default one initialized by the framework.
And, remember to set the mock time before creating the wallet, so it uses the time established by you and not the clock time.

Thanks for testing :).

@pinheadmz
Copy link
Member

@furszy awwwwwesome thank you for catching that. Ok, passing test is now at pinheadmz@f123870

Copy link
Member

@pinheadmz pinheadmz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 82bb783

Reviewed code and tested locally, wrote a test. I am running this branch side-by-side with master on a VPS and will report who wins the race in a day or so ;-)

Show Signature
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

ACK 82bb7831fa6052620998c7eef47e48ed594248a8
-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEE5hdzzW4BBA4vG9eM5+KYS2KJyToFAmRwxNIACgkQ5+KYS2KJ
yTq5nhAA3dJDzRO4VmDUA8n9YO4RtpcQBHT/t9NOsoyNFszTZOQb47JqCTDSGBeZ
mwgdfTIMWDkIesi5lG81ATdGTvxeJjx/7OjN90x46Lj3bJuyVTJxrnrVtLuv/2tg
G9fnfXO2Ngod/daq9psvNhLfQLhW3CQNtsJ3x4NOJafKLOyxmd+dKxnXzJaKAicp
ZlnIyUtArxLiDFXRmJRpmKAX2TRj49HooiY6gKlyyTtiOitB48AQdsh6hzaQ2O3q
wyYzIwM8jEC3/hIbemPaoLphc6esevgXRCcEo3zv+VvnDdrtNBwXEKFtThkPniEq
5hF2zaVCB6cqlvjcFMnkZOVafxE9GWe7fJR7OPa0W2vS3YaSZD7zdhwTpK2KV0Cj
HG21b2x5VFep6cQDl1Xqlktu21ExAWg2zOwESCnrJ7riRXtb3+Xe/eny90aeYhyX
0+ORbVy8AVyBYC1+nBjqpJA8JAoncm3xgNTpbgvX8XrflSaLcbhod5UAh0R+NE4c
QGvGDa6ITO0+Cu/Yba4TwgOxzBmA/oXSxUecGR5eQi3KuKobVlfdLbKgjt1tfscs
t414zb/cjcgpxVMXrrLZMMttg06I9YvSVUEcPCeU2YgiLgigTqunwsZ7KsOJM5ih
HB4+y7/lIg+7kme6ZheZ221r7nWY6atR5piFe09AqLmKPBEviio=
=VjDW
-----END PGP SIGNATURE-----

pinheadmz's public key is on keybase

@DrahtBot DrahtBot requested review from achow101 and ishaanam May 26, 2023 14:41
@ishaanam
Copy link
Contributor

re-ACK 82bb783

@DrahtBot DrahtBot removed the request for review from ishaanam May 26, 2023 22:09
@achow101
Copy link
Member

re-ACK 82bb783

@DrahtBot DrahtBot removed the request for review from achow101 May 27, 2023 01:18
@achow101 achow101 merged commit 10c4a46 into bitcoin:master May 27, 2023
@furszy furszy deleted the 2023_wallet_birthtime branch May 27, 2023 01:42
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request May 29, 2023
…scanning prior birth time

82bb783 wallet: skip block scan if block was created before wallet birthday (furszy)
a082434 refactor: single method to append new spkm to the wallet (furszy)

Pull request description:

  During initial block download, the node's wallet(s) scans every arriving block looking for data that it owns.
  This process can be resource-intensive, as it involves sequentially scanning all transactions within each
  arriving block.

  To avoid wasting processing power, we can skip blocks that occurred before the wallet's creation time,
  since these blocks are guaranteed not to contain any relevant wallet data.

  This has direct implications (an speed improvement) on the underlying blockchain synchronization process
  as well. The reason is that the validation interface queue is limited to 10 tasks per time. This means that no
  more than 10 blocks can be waiting for the wallet(s) to be processed while we are synchronizing the chain
  (activating the best chain to be more precise).
  Which can be a bottleneck if blocks arrive and are processed faster from the network than what they are
  processed by the wallet(s).

  So, by skipping not relevant blocks in the wallet's IBD scanning process, we will also improve the chain
  synchronization time.

ACKs for top commit:
  ishaanam:
    re-ACK 82bb783
  achow101:
    re-ACK 82bb783
  pinheadmz:
    ACK 82bb783

Tree-SHA512: 70158c9657f1fcc396badad2c4410b7b7f439466142640b31a9b1a8cea4555e45ea254e48043c9b27f783d5e4d24d91855f0d79d42f0484b8aa83cdbf3d6c50b
@maflcko
Copy link
Member

maflcko commented May 30, 2023

Looks like the fuzz test needs an update?

82bb7831fa6052620998c7eef47e48ed594248a8 is the first bad commit
echo 'AQAAAAEAAAA='|base64  --decode  > /tmp/a
FUZZ=wallet_notifications ./src/test/fuzz/fuzz  /tmp/a
fuzz: wallet/test/fuzz/notifications.cpp:175: void wallet::(anonymous namespace)::wallet_notifications_fuzz_target(FuzzBufferType): Assertion `total_amount == bal_a + bal_b' failed.
==301525== ERROR: libFuzzer: deadly signal

@furszy
Copy link
Member Author

furszy commented May 30, 2023

Thanks Marco. Fixed in #27786.
Would be nice if the CI alert us about this kind of stuff prior merging the PR.
The base fuzzing corpus should have detected it right?.

@maflcko
Copy link
Member

maflcko commented May 31, 2023

The base fuzzing corpus should have detected it right?.

Yes, but the fuzz target is too slow right now, so the fuzz input directory is empty.

fanquake added a commit to bitcoin-core/gui that referenced this pull request May 31, 2023
a10f032 fuzz: fix wallet notifications.cpp (furszy)

Pull request description:

  Fixing bitcoin/bitcoin#27469 (comment).

  As the fuzzing test requires all blocks to be scanned by the wallet
  (because it is asserting the wallet balance at the end), we need to
  ensure that no blocks are skipped by the recently added wallet
  birth time functionality.

  This just means setting the chain accumulated time to the maximum
  value, so the wallet birth time is always below it, and the block is
  always processed by the wallet.

ACKs for top commit:
  MarcoFalke:
    lgtm ACK a10f032, thanks

Tree-SHA512: c9b38c52917cc36674415470752625b8161fc6b878b0b87d6926b462ba9666be3c225d396604c7e944a4c268fc35fc624807777aa0ed94bddbe18d8f8436de3c
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request May 31, 2023
a10f032 fuzz: fix wallet notifications.cpp (furszy)

Pull request description:

  Fixing bitcoin#27469 (comment).

  As the fuzzing test requires all blocks to be scanned by the wallet
  (because it is asserting the wallet balance at the end), we need to
  ensure that no blocks are skipped by the recently added wallet
  birth time functionality.

  This just means setting the chain accumulated time to the maximum
  value, so the wallet birth time is always below it, and the block is
  always processed by the wallet.

ACKs for top commit:
  MarcoFalke:
    lgtm ACK a10f032, thanks

Tree-SHA512: c9b38c52917cc36674415470752625b8161fc6b878b0b87d6926b462ba9666be3c225d396604c7e944a4c268fc35fc624807777aa0ed94bddbe18d8f8436de3c
achow101 added a commit that referenced this pull request Oct 17, 2023
…gging

8a553c9 wallet: Add TxStateString function for debugging and logging (Ryan Ofsky)

Pull request description:

  I found this useful while debugging silent conflict between #10102 and #27469 recently

ACKs for top commit:
  ishaanam:
    utACK 8a553c9
  achow101:
    ACK 8a553c9
  furszy:
    Code ACK 8a553c9

Tree-SHA512: 87965c66bcb59a21e7639878bb567e583a0e624735721ff7ad1104eed6bb9fba60607d0e3de7be3304232b3a55f48bab7039ea9c26b0e81963e59f9acd94f666
Frank-GER pushed a commit to syscoin/syscoin that referenced this pull request Oct 21, 2023
… and logging

8a553c9 wallet: Add TxStateString function for debugging and logging (Ryan Ofsky)

Pull request description:

  I found this useful while debugging silent conflict between bitcoin#10102 and bitcoin#27469 recently

ACKs for top commit:
  ishaanam:
    utACK 8a553c9
  achow101:
    ACK 8a553c9
  furszy:
    Code ACK 8a553c9

Tree-SHA512: 87965c66bcb59a21e7639878bb567e583a0e624735721ff7ad1104eed6bb9fba60607d0e3de7be3304232b3a55f48bab7039ea9c26b0e81963e59f9acd94f666
@furszy furszy mentioned this pull request Jan 4, 2024
18 tasks
@bitcoin bitcoin locked and limited conversation to collaborators May 30, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants