furszy
Member

@furszy furszy commented Nov 16, 2023

Work decoupled from #28574.

Instead of performing multiple single write operations per spkm
setup call, this PR batches them all within a single atomic db txn.

This speeds up the process and prevents the wallet from entering
an inconsistent state if any of the intermediate transactions fail
(which shouldn't happen, but if it does, it is better to not store
any spkm than to store them partially).
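
A minimal sketch of the batching idea described above, assuming a hypothetical in-memory `ToyBatch` class. It is not Bitcoin Core's `WalletBatch`; only the `TxnBegin`/`TxnCommit`/`TxnAbort` naming is borrowed for readability, and `SetupAllSpkms` and its `fail_at` parameter are made up for illustration.

```cpp
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory key/value batch; NOT Bitcoin Core's WalletBatch.
class ToyBatch {
public:
    explicit ToyBatch(std::map<std::string, std::string>& db) : m_db(db) {}

    // Start queueing writes instead of applying them one by one.
    bool TxnBegin() {
        if (m_pending) return false; // no nested transactions in this sketch
        m_pending.emplace();
        return true;
    }

    // Outside a transaction a write lands immediately.
    // Inside a transaction it is only queued.
    void Write(const std::string& key, const std::string& value) {
        if (m_pending) {
            m_pending->emplace_back(key, value);
        } else {
            m_db[key] = value;
        }
    }

    // Apply every queued write together.
    bool TxnCommit() {
        if (!m_pending) return false;
        for (const auto& [key, value] : *m_pending) m_db[key] = value;
        m_pending.reset();
        return true;
    }

    // Drop every queued write; the store is left untouched.
    bool TxnAbort() {
        if (!m_pending) return false;
        m_pending.reset();
        return true;
    }

private:
    std::map<std::string, std::string>& m_db;
    std::optional<std::vector<std::pair<std::string, std::string>>> m_pending;
};

// Hypothetical setup routine: writes one record per name, all or nothing.
// `fail_at` simulates a failure just before the record with that index.
bool SetupAllSpkms(ToyBatch& batch, const std::vector<std::string>& names, int fail_at = -1) {
    if (!batch.TxnBegin()) return false;
    for (int i = 0; i < static_cast<int>(names.size()); ++i) {
        if (i == fail_at) {
            batch.TxnAbort(); // nothing from this call reaches the store
            return false;
        }
        batch.Write("spkm/" + names[i], "descriptor-data");
    }
    return batch.TxnCommit();
}
```

With this shape, a failure partway through leaves the store exactly as it was before the call, which is the consistency property the description argues for.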

To compare the changes, a benchmark is added in the first commit.

@DrahtBot
Contributor

DrahtBot commented Nov 16, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

| Type | Reviewers |
| --- | --- |
| ACK | Sjors, achow101, BrandonOdiwuor, theStack |
| Approach ACK | S3RK |

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #28574 (wallet: optimize migration process, batch db transactions by furszy)
  • #28333 (wallet: Construct ScriptPubKeyMans with all data rather than loaded progressively by achow101)
  • #27865 (wallet: Track no-longer-spendable TXOs separately by achow101)
  • #27286 (wallet: Keep track of the wallet's own transaction outputs in memory by achow101)
  • #26008 (wallet: cache IsMine scriptPubKeys to improve performance of descriptor wallets by achow101)
  • #25907 (wallet: rpc to add automatically generated descriptors by achow101)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@DrahtBot DrahtBot changed the title wallet: batch all atomic spkms setup db writes in a single db txn wallet: batch all atomic spkms setup db writes in a single db txn Nov 16, 2023
@furszy furszy changed the title wallet: batch all atomic spkms setup db writes in a single db txn wallet: batch all individual spkms setup db writes in a single db txn Nov 16, 2023
@DrahtBot DrahtBot changed the title wallet: batch all individual spkms setup db writes in a single db txn wallet: batch all individual spkms setup db writes in a single db txn Nov 16, 2023
Contributor

@theStack theStack left a comment

Concept ACK

Didn't look deeper at the code yet, but ran the new benchmark (on an SSD) for comparison via

$ ./src/bench/bench_bitcoin -filter=WalletCreatePlain\|WalletCreateEncrypted

master (commit bb4554c which adds the benchmark)

| ns/op | op/s | err% | total | benchmark |
|---:|---:|---:|---:|:---|
| 797,011,189.00 | 1.25 | 0.6% | 8.74 | WalletCreateEncrypted |
| 391,984,469.00 | 2.55 | 1.1% | 4.30 | WalletCreatePlain |

PR head (1901360):

| ns/op | op/s | err% | total | benchmark |
|---:|---:|---:|---:|:---|
| 724,523,571.00 | 1.38 | 0.5% | 7.97 | WalletCreateEncrypted |
| 319,389,684.00 | 3.13 | 0.8% | 3.50 | WalletCreatePlain |

Looks like a rough ~10% speedup for encrypted and ~22% speedup for plain wallets on my machine. Even without speedups, I think it's good practice to avoid entering an inconsistent state, as mentioned in the PR description.
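
The percentages quoted above match a throughput ratio (ns/op before divided by ns/op after). As a small sketch, the two helper functions below (hypothetical names, not part of the benchmark framework) show both ways of expressing the change. With the numbers in the tables above, the throughput gain comes out at roughly 10.0% (encrypted) and 22.7% (plain), while the reduction in time per operation is roughly 9.1% and 18.5%.

```cpp
// Relative throughput gain: how much more work gets done per unit of time.
double ThroughputGainPercent(double ns_before, double ns_after) {
    return (ns_before / ns_after - 1.0) * 100.0;
}

// Relative reduction in time spent per operation.
double TimeReductionPercent(double ns_before, double ns_after) {
    return (1.0 - ns_after / ns_before) * 100.0;
}
```

For example, `ThroughputGainPercent(391984469.0, 319389684.0)` corresponds to the plain-wallet rows above.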

Member

@Sjors Sjors left a comment

ACK 1901360

Not sure if b85b75d is needed since we're getting rid of the legacy wallet anyway. But it looks correct.

Ran bench on MacBook Pro 2019 (2,3 GHz 8-Core Intel Core i9, SSD):

Before:

src/bench/bench_bitcoin -filter="WalletCreate[E|P].*" -min-time=30000

| ns/op | op/s | err% | total | benchmark |
|---:|---:|---:|---:|:---|
| 924,961,364.67 | 1.08 | 1.8% | 30.41 | WalletCreateEncrypted |
| 537,964,679.50 | 1.86 | 2.3% | 35.03 | WalletCreatePlain |

After:

| ns/op | op/s | err% | total | benchmark |
|---:|---:|---:|---:|:---|
| 777,090,017.50 | 1.29 | 1.3% | 32.65 | WalletCreateEncrypted |
| 361,932,639.25 | 2.76 | 0.6% | 33.34 | WalletCreatePlain |

Here's a commit that batches the descriptor import for external signers: Sjors@286883a (I had to make TopupWithDB() public)

@DrahtBot DrahtBot requested a review from theStack November 21, 2023 10:41
Instead of performing multiple atomic write
operations per descriptor setup call, batch
them all within a single atomic db txn.

Instead of performing multiple atomic write
operations per legacy spkm setup call, batch
them all within a single atomic db txn.

Instead of doing one db transaction per descriptor setup,
batch all descriptors' setup writes in a single db txn.

Speeding up the process and preventing the wallet from entering
an inconsistent state if any of the intermediate transactions
fail.
@furszy furszy force-pushed the 2023_wallet_batch_keypool_creation branch from 1901360 to 5def411 Compare November 22, 2023 02:03
Co-authored-by: furszy <matiasfurszyfer@protonmail.com>
@furszy furszy force-pushed the 2023_wallet_batch_keypool_creation branch from 5def411 to f053024 Compare November 22, 2023 02:07
@furszy
Member Author

furszy commented Nov 22, 2023

Updated per feedback, thanks for the review @Sjors.

> Here's a commit that batches the descriptor import for external signers: Sjors@286883a (I had to make TopupWithDB() public)

Thanks. Pulled the commit and made two changes to it:

  1. Instead of making TopupWithDB() public, made it protected. The ExternalSignerScriptPubKeyMan class is derived from DescriptorScriptPubKeyMan.
  2. Removed the unimplemented SetupDescriptor() function from DescriptorScriptPubKeyMan. Only ExternalSignerScriptPubKeyMan utilizes it.
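
The first change above can be sketched as follows. The `Toy*` classes are hypothetical stand-ins, not the real wallet classes; they only illustrate how a `protected` member stays hidden from outside callers while remaining callable from a derived class that wants to reuse the caller's batch.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a database batch; it just records what was written.
struct ToyBatch {
    std::vector<std::string> writes;
};

// Hypothetical base class, loosely modelled on the role of DescriptorScriptPubKeyMan.
class ToyDescriptorSpkm {
public:
    virtual ~ToyDescriptorSpkm() = default;

    // Public entry point: creates and uses its own batch.
    bool TopUp() {
        ToyBatch batch;
        return TopupWithDB(batch);
    }

protected:
    // Protected: derived classes may call it, outside code may not.
    bool TopupWithDB(ToyBatch& batch) {
        batch.writes.push_back("keypool");
        return true;
    }
};

// Hypothetical derived class, loosely modelled on ExternalSignerScriptPubKeyMan.
class ToyExternalSignerSpkm : public ToyDescriptorSpkm {
public:
    // Writes the descriptor and tops up using the batch supplied by the caller,
    // so both writes can belong to the same db transaction.
    bool SetupDescriptor(ToyBatch& batch, const std::string& desc) {
        batch.writes.push_back("descriptor:" + desc);
        return TopupWithDB(batch);
    }
};
```

Keeping the method `protected` rather than `public` limits who can drive a top-up with an arbitrary batch to the class hierarchy itself.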

@Sjors
Member

Sjors commented Nov 22, 2023

re-utACK f053024

CI failure seems spurious

@S3RK
Contributor

S3RK commented Nov 27, 2023

Concept and Approach ACK

@achow101
Member

achow101 commented Dec 4, 2023

ACK f053024

@DrahtBot DrahtBot requested a review from S3RK December 4, 2023 21:56
Contributor

@BrandonOdiwuor BrandonOdiwuor left a comment

ACK f053024

Contributor

@theStack theStack left a comment

Code-review ACK f053024

@fanquake fanquake merged commit a7f4f1a into bitcoin:master Dec 8, 2023
bilingual_str error_string;
std::vector<bilingual_str> warnings;

fs::path wallet_path = test_setup->m_path_root / strprintf("test_wallet_%d", random.rand32()).c_str();
Member

question: Does it need to be random? m_path_root should already be (more) random. If not, it would be good to remove it. This would also reduce the use of c_str, which (in other parts of the code) is fragile and can lead to bugs, so git grep c_str will be less verbose once unneeded uses of c_str are removed.
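
As a rough illustration of the suggestion above, here is a sketch using plain `std::filesystem` rather than Bitcoin Core's own `fs::path` wrapper (which has its own conversion rules). The function name `WalletPath` and the fixed directory name are assumptions for the example. With a fixed name, the path can be composed from a string literal and no `c_str()` call is involved.

```cpp
#include <filesystem>

// Build the wallet directory path under an already-unique root directory.
// The name is a fixed literal, so no formatting or c_str() is needed.
std::filesystem::path WalletPath(const std::filesystem::path& root) {
    return root / "test_wallet";
}
```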

Member Author

@furszy furszy Dec 8, 2023

no, it doesn't. Initially, the wallet_path was inside the bench.run() lambda to create wallets under different paths. This was to avoid flushing the wallet to disk and removing the directory within the benchmark execution (which has nothing to do with what we are trying to benchmark). Then, at the end, I changed the approach to occupy less memory.

But.. thinking it further now, we could go back to the previous approach and not be concerned about memory by setting the max iterations number. Also, probably for another working path, wouldn't be bad to introduce a way to clean up stuff within the benchmark execution that is deliberately not included in the bench results.

Thoughts?
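
One way to picture the "cleanup that is deliberately not included in the bench results" idea is the sketch below. `TimeWithoutCleanup` is a hypothetical helper, not part of nanobench or of Bitcoin Core's bench framework; it simply accumulates time for the measured step and runs the cleanup between timed windows.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper: runs `work` then `cleanup` once per iteration and
// returns the total time spent inside `work` only.
std::chrono::nanoseconds TimeWithoutCleanup(int iterations,
                                            const std::function<void()>& work,
                                            const std::function<void()>& cleanup) {
    using Clock = std::chrono::steady_clock;
    std::chrono::nanoseconds measured{0};
    for (int i = 0; i < iterations; ++i) {
        const auto start = Clock::now();
        work(); // e.g. creating the wallet
        measured += std::chrono::duration_cast<std::chrono::nanoseconds>(Clock::now() - start);
        cleanup(); // e.g. removing the wallet directory; not timed
    }
    return measured;
}
```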

Member

> Also, probably for another working path, wouldn't be bad to introduce a way to clean up stuff within the benchmark execution that is deliberately not included in the bench results.

Sgtm, but if the cleanup only takes a very short time, I think it is fine to just leave as-is.


fs::path wallet_path = test_setup->m_path_root / strprintf("test_wallet_%d", random.rand32()).c_str();
bench.run([&] {
auto wallet = CreateWallet(context, wallet_path.u8string(), /*load_on_start=*/std::nullopt, options, status, error_string, warnings);
Member

nit: u8string looks wrong here, according to the fs.h documentation.

achow101 added a commit that referenced this pull request Feb 12, 2024
9a3c5c8 scripted-diff: rename ZapSelectTx to RemoveTxs (furszy)
83b7628 wallet: batch and simplify ZapSelectTx process (furszy)
595d50a wallet: migration, remove extra NotifyTransactionChanged call (furszy)
a2b071f wallet: ZapSelectTx, remove db rewrite code (furszy)

Pull request description:

  Work decoupled from #28574. Brother of #28894.

  Includes two different, yet interconnected, performance and code improvements to the zap wallet transactions process.

  1) As the goal of the `ZapSelectTx` function is to erase tx records that match any of the inputted hashes, there is no need to traverse the whole database record by record. We could just check if the tx exists, and remove it directly by calling `EraseTx()`.

  2) Instead of performing single write operations per removed tx record, this PR batches them all within a single atomic db txn.

  Moreover, these changes will enable us to consolidate all individual write operations that take place during the wallet migration process into a single db txn in the future.
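
The first point above (erase by key instead of scanning every record) can be sketched with a hypothetical in-memory store. `TxStore` and `RemoveTxsByHash` are illustrative names only and do not correspond to the wallet's real types or functions.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the wallet's tx records, keyed by tx hash.
using TxStore = std::map<std::string, std::string>;

// Erase each requested hash by direct key lookup rather than walking all records.
// Returns the hashes that were actually found and removed.
std::vector<std::string> RemoveTxsByHash(TxStore& store, const std::vector<std::string>& hashes) {
    std::vector<std::string> removed;
    for (const auto& hash : hashes) {
        // map::erase(key) returns the number of elements removed (0 or 1 here).
        if (store.erase(hash) > 0) removed.push_back(hash);
    }
    return removed;
}
```

The cost then scales with the number of hashes to remove instead of the total number of records in the store.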

ACKs for top commit:
  achow101:
    ACK 9a3c5c8
  josibake:
    ACK 9a3c5c8

Tree-SHA512: fb2ecc48224c400ab3b1fbb32e174b5b13bf03794717727f80f01f55fb183883b067a68c0a127b2de8885564da15425d021a96541953bf38a72becc2e9929ccf
achow101 added a commit that referenced this pull request Oct 24, 2024
c98fc36 wallet: migration, consolidate external wallets db writes (furszy)
7c9076a wallet: migration, consolidate main wallet db writes (furszy)
9ef20e8 wallet: provide WalletBatch to 'SetupDescriptorScriptPubKeyMans' (furszy)
34bf079 wallet: refactor ApplyMigrationData to return util::Result<void> (furszy)
aacaaaa wallet: provide WalletBatch to 'RemoveTxs' (furszy)
57249ff wallet: introduce active db txn listeners (furszy)
91e065e wallet: remove post-migration signals connection (furszy)
055c053 wallet: provide WalletBatch to 'DeleteRecords' (furszy)
122d103 wallet: introduce 'SetWalletFlagWithDB' (furszy)
6052c78 wallet: decouple default descriptors creation from external signer setup (furszy)
f2541d0 wallet: batch MigrateToDescriptor() db transactions (furszy)
66c9936 bench: add coverage for wallet migration process (furszy)

Pull request description:

  Last step in a chain of PRs (#26836, #28894, #28987, #29403).

  #### Detailed Description:
  The current wallet migration process performs only individual db writes, accessing disk to
  delete all legacy records, clone and clean each address book entry for every created wallet,
  create each new descriptor (with their corresponding master key, caches and key pool), and
  also clone and delete each transaction that needs to be transferred to a different wallet.

  This work consolidates all individual disk writes into two batch operations: one for the descriptors
  creation from the legacy data and a second one for the execution of the migration process itself,
  efficiently dumping all the information to disk at once, atomically, at the end of each process.

  This represents a speedup and also a consistency improvement. During migration, we either
  want to succeed or fail. No other outcomes should be accepted. We should never leave a
  partially migrated wallet on disk and request the user to manually restore the previous wallet from
  a backup (at least not if we can avoid it).
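
The "either succeed or fail" property described above can be illustrated with a small scope-based sketch. `MigrationTxn` and `MigrateLegacyRecords` are hypothetical names; the class stages changes on a copy and publishes them only on `Commit()`, so leaving the scope early discards everything. This is a toy model of the idea, not how the wallet database implements transactions.

```cpp
#include <map>
#include <string>

// Hypothetical sketch: changes are staged on a private copy of the store.
class MigrationTxn {
public:
    explicit MigrationTxn(std::map<std::string, std::string>& db) : m_db(db), m_staged(db) {}

    void Write(const std::string& key, const std::string& value) { m_staged[key] = value; }
    void Erase(const std::string& key) { m_staged.erase(key); }

    // Publish all staged changes at once.
    void Commit() { m_db.swap(m_staged); }

    // No explicit abort needed: if Commit() is never called, the staged copy
    // is destroyed with this object and the real store is unchanged.

private:
    std::map<std::string, std::string>& m_db;
    std::map<std::string, std::string> m_staged;
};

// Hypothetical migration step: delete a legacy record and add a descriptor record.
bool MigrateLegacyRecords(std::map<std::string, std::string>& db, bool simulate_failure) {
    MigrationTxn txn(db);
    txn.Erase("legacy/key");
    txn.Write("descriptor/0", "new-descriptor");
    if (simulate_failure) return false; // early exit: nothing was published
    txn.Commit();
    return true;
}
```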

  Since the speedup depends on the storage device, benchmark results can vary significantly.
  Locally, I have seen a 15% speedup on a USB 3.2 pendrive.

  #### Note for Testers:
  The first commit introduces a benchmark for the migration process. This one can be
  cherry-picked on top of master to compare results pre and post changes.

  Please note that the benchmark setup may take some time (~70 seconds here) due to the absence
  of a batching mechanism for the address generation process (`GetNewDestination()` calls).

ACKs for top commit:
  achow101:
    ACK c98fc36
  theStack:
    re-ACK c98fc36
  pablomartin4btc:
    re-ACK c98fc36

Tree-SHA512: a52d5f2eef27811045d613637c0a9d0b7e180256ddc1c893749d98ba2882b570c45f28cc7263cadd4710f2c10db1bea33d88051f29c6b789bc6180c85b5fd8f6
@bitcoin bitcoin locked and limited conversation to collaborators Dec 13, 2024