wallet: batch all individual spkms setup db writes in a single db txn #28894
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Code Coverage
For detailed information about the code coverage, see the test coverage report.
Reviews
See the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.
Conflicts
Reviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
Concept ACK
Didn't look deeper at the code yet, but ran the new benchmark (on an SSD) for comparison via
$ ./src/bench/bench_bitcoin -filter=WalletCreatePlain\|WalletCreateEncrypted
master (commit bb4554c, which adds the benchmark):
ns/op | op/s | err% | total | benchmark |
---|---|---|---|---|
797,011,189.00 | 1.25 | 0.6% | 8.74 | WalletCreateEncrypted |
391,984,469.00 | 2.55 | 1.1% | 4.30 | WalletCreatePlain |
PR head (1901360):
ns/op | op/s | err% | total | benchmark |
---|---|---|---|---|
724,523,571.00 | 1.38 | 0.5% | 7.97 | WalletCreateEncrypted |
319,389,684.00 | 3.13 | 0.8% | 3.50 | WalletCreatePlain |
Looks like roughly a 10% speedup for encrypted and a 22% speedup for plain wallets on my machine. Even without speedups, I think it's good practice to avoid entering an inconsistent state, as mentioned in the PR description.
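The percentages above can be reproduced from the ns/op columns. The following is a minimal sketch with hypothetical helper names (they are not part of the benchmark framework): the quoted ~10% and ~22% correspond to the gain in throughput (op/s), while the reduction in time per operation works out to roughly 9% and 18.5% for the same numbers.

```cpp
#include <cassert>

// Hypothetical helper: percentage gain in throughput (op/s) when the time
// per operation drops from `before_ns` to `after_ns`.
double ThroughputGainPct(double before_ns, double after_ns)
{
    assert(before_ns > 0 && after_ns > 0);
    return (before_ns / after_ns - 1.0) * 100.0;
}

// Hypothetical helper: percentage reduction in time per operation.
double TimeReductionPct(double before_ns, double after_ns)
{
    assert(before_ns > 0 && after_ns > 0);
    return (1.0 - after_ns / before_ns) * 100.0;
}
```

For example, `ThroughputGainPct(391984469.0, 319389684.0)` uses the WalletCreatePlain rows of the two tables, and `ThroughputGainPct(797011189.0, 724523571.0)` the WalletCreateEncrypted rows.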
ACK 1901360
Not sure if b85b75d is needed since we're getting rid of the legacy wallet anyway. But it looks correct.
Ran bench on MacBook Pro 2019 (2.3 GHz 8-Core Intel Core i9, SSD):
Before:
src/bench/bench_bitcoin -filter="WalletCreate[E|P].*" -min-time=30000
ns/op | op/s | err% | total | benchmark |
---|---|---|---|---|
924,961,364.67 | 1.08 | 1.8% | 30.41 | WalletCreateEncrypted |
537,964,679.50 | 1.86 | 2.3% | 35.03 | WalletCreatePlain |
After:
ns/op | op/s | err% | total | benchmark |
---|---|---|---|---|
777,090,017.50 | 1.29 | 1.3% | 32.65 | WalletCreateEncrypted |
361,932,639.25 | 2.76 | 0.6% | 33.34 | WalletCreatePlain |
Here's a commit that batches the descriptor import for external signers: Sjors@286883a (I had to make TopupWithDB() public)
Instead of performing multiple atomic write operations per descriptor setup call, batch them all within a single atomic db txn.
Instead of performing multiple atomic write operations per legacy spkm setup call, batch them all within a single atomic db txn.
Instead of doing one db transaction per descriptor setup, batch all descriptors' setup writes in a single db txn. This speeds up the process and prevents the wallet from entering an inconsistent state if any of the intermediate transactions fail.
Force-pushed from 1901360 to 5def411.
Co-authored-by: furszy <matiasfurszyfer@protonmail.com>
Force-pushed from 5def411 to f053024.
Updated per feedback, thanks for the review @Sjors.
Thanks. Pulled the commit and made two changes to it:
re-utACK f053024. CI failure seems spurious.
Concept and Approach ACK
ACK f053024
ACK f053024
Code-review ACK f053024
bilingual_str error_string;
std::vector<bilingual_str> warnings;
fs::path wallet_path = test_setup->m_path_root / strprintf("test_wallet_%d", random.rand32()).c_str();
question: Does it need to be random? m_path_root should already be (more) random. If not, it would be good to remove it, also to reduce the use of c_str, which (in other parts of the code) is fragile and can lead to bugs. That way, git grep c_str will be less verbose once unnecessary uses of c_str are removed.
No, it doesn't. Initially, the wallet_path was inside the bench.run() lambda to create wallets under different paths. This was to avoid flushing the wallet to disk and removing the directory within the benchmark execution (which has nothing to do with what we are trying to benchmark). Then, at the end, I changed the approach to occupy less memory.
But thinking about it further now, we could go back to the previous approach and not be concerned about memory by setting the max iterations number. Also, probably for another working path, wouldn't be bad to introduce a way to clean up stuff within the benchmark execution that is deliberately not included in the bench results.
Thoughts?
Also, probably for another working path, wouldn't be bad to introduce a way to clean up stuff within the benchmark execution that is deliberately not included in the bench results.
Sgtm, but if the cleanup only takes a very short time, I think it is fine to just leave as-is.
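The idea discussed above (a distinct wallet path per iteration, with cleanup kept out of the measured time) can be sketched roughly as follows. This is only an illustration under stated assumptions: `TimeExcludingCleanup` is a hypothetical helper written with `std::chrono`, not an API of the benchmark framework used by the project, and the iteration index stands in for whatever would make each wallet path unique.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: runs `work(i)` for a fixed number of iterations and
// accumulates only the time spent inside `work`. `cleanup(i)` runs after each
// iteration while the clock is not being read, so it does not affect the
// result. The index `i` lets the caller derive a distinct path per iteration
// instead of using a random suffix.
std::chrono::nanoseconds TimeExcludingCleanup(int iterations,
                                              const std::function<void(int)>& work,
                                              const std::function<void(int)>& cleanup)
{
    assert(iterations >= 0);
    using Clock = std::chrono::steady_clock;
    std::chrono::nanoseconds total{0};
    for (int i = 0; i < iterations; ++i) {
        const auto start = Clock::now();
        work(i); // measured
        total += std::chrono::duration_cast<std::chrono::nanoseconds>(Clock::now() - start);
        cleanup(i); // deliberately not measured
    }
    return total;
}
```

Capping `iterations` bounds how many wallets exist at once, which is the memory concern raised above; whether the extra machinery is worth it depends on how long the cleanup actually takes.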
fs::path wallet_path = test_setup->m_path_root / strprintf("test_wallet_%d", random.rand32()).c_str();
bench.run([&] {
    auto wallet = CreateWallet(context, wallet_path.u8string(), /*load_on_start=*/std::nullopt, options, status, error_string, warnings);
nit: u8string looks wrong here, according to the fs.h documentation.
9a3c5c8 scripted-diff: rename ZapSelectTx to RemoveTxs (furszy)
83b7628 wallet: batch and simplify ZapSelectTx process (furszy)
595d50a wallet: migration, remove extra NotifyTransactionChanged call (furszy)
a2b071f wallet: ZapSelectTx, remove db rewrite code (furszy)
Pull request description:
Work decoupled from #28574. Brother of #28894.
Includes two different, yet interconnected, performance and code improvements to the zap wallet transactions process.
1) As the goal of the `ZapSelectTx` function is to erase tx records that match any of the inputted hashes, there is no need to traverse the whole database record by record. We could just check if the tx exists, and remove it directly by calling `EraseTx()`.
2) Instead of performing single write operations per removed tx record, this PR batches them all within a single atomic db txn.
Moreover, these changes will enable us to consolidate all individual write operations that take place during the wallet migration process into a single db txn in the future.
ACKs for top commit:
achow101: ACK 9a3c5c8
josibake: ACK 9a3c5c8
Tree-SHA512: fb2ecc48224c400ab3b1fbb32e174b5b13bf03794717727f80f01f55fb183883b067a68c0a127b2de8885564da15425d021a96541953bf38a72becc2e9929ccf
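The first point of that description (erase by key instead of walking every record) can be illustrated with a toy model. This is a sketch only: `ToyDb`, `RemoveByScan` and `RemoveDirect` are hypothetical names, a `std::map` stands in for the wallet database, and nothing here reflects the actual `ZapSelectTx`/`RemoveTxs` code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for the wallet database: tx hash -> serialized tx record.
using ToyDb = std::map<std::string, std::string>;

// Scan shape: walk every record and erase those whose hash was requested.
// `visited` reports how many records had to be inspected.
std::size_t RemoveByScan(ToyDb& db, const std::vector<std::string>& hashes, std::size_t& visited)
{
    const std::set<std::string> wanted(hashes.begin(), hashes.end());
    std::size_t removed = 0;
    visited = 0;
    for (auto it = db.begin(); it != db.end();) {
        ++visited;
        if (wanted.count(it->first) > 0) {
            it = db.erase(it); // erase returns the next valid iterator
            ++removed;
        } else {
            ++it;
        }
    }
    assert(removed <= wanted.size());
    return removed;
}

// Direct shape: one keyed erase per requested hash; other records are never read.
std::size_t RemoveDirect(ToyDb& db, const std::vector<std::string>& hashes)
{
    std::size_t removed = 0;
    for (const auto& hash : hashes) {
        removed += db.erase(hash); // map::erase(key) returns 0 or 1
    }
    return removed;
}
```

Both functions leave the store in the same state; the difference is that the scan touches every record, while the direct version does work proportional to the number of hashes being removed.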
c98fc36 wallet: migration, consolidate external wallets db writes (furszy)
7c9076a wallet: migration, consolidate main wallet db writes (furszy)
9ef20e8 wallet: provide WalletBatch to 'SetupDescriptorScriptPubKeyMans' (furszy)
34bf079 wallet: refactor ApplyMigrationData to return util::Result<void> (furszy)
aacaaaa wallet: provide WalletBatch to 'RemoveTxs' (furszy)
57249ff wallet: introduce active db txn listeners (furszy)
91e065e wallet: remove post-migration signals connection (furszy)
055c053 wallet: provide WalletBatch to 'DeleteRecords' (furszy)
122d103 wallet: introduce 'SetWalletFlagWithDB' (furszy)
6052c78 wallet: decouple default descriptors creation from external signer setup (furszy)
f2541d0 wallet: batch MigrateToDescriptor() db transactions (furszy)
66c9936 bench: add coverage for wallet migration process (furszy)
Pull request description:
Last step in a chain of PRs (#26836, #28894, #28987, #29403).
#### Detailed Description:
The current wallet migration process performs only individual db writes, accessing disk to delete all legacy records, clone and clean each address book entry for every created wallet, create each new descriptor (with its corresponding master key, caches and key pool), and also clone and delete each transaction that needs to be transferred to a different wallet.
This work consolidates all individual disk writes into two batch operations: one for the descriptors creation from the legacy data and a second one for the execution of the migration process itself. All the information is dumped to disk at once, atomically, at the end of each process.
This represents a speedup and also a consistency improvement. During migration, we either want to succeed or fail. No other outcomes should be accepted. We should never leave a partially migrated wallet on disk and request the user to manually restore the previous wallet from a backup (at least not if we can avoid it).
Since the speedup depends on the storage device, benchmark results can vary significantly. Locally, I have seen a 15% speedup on a USB 3.2 pendrive.
#### Note for Testers:
The first commit introduces a benchmark for the migration process. This one can be cherry-picked on top of master to compare results pre and post changes. Please note that the benchmark setup may take some time (~70 seconds here) due to the absence of a batching mechanism for the address generation process (`GetNewDestination()` calls).
ACKs for top commit:
achow101: ACK c98fc36
theStack: re-ACK c98fc36
pablomartin4btc: re-ACK c98fc36
Tree-SHA512: a52d5f2eef27811045d613637c0a9d0b7e180256ddc1c893749d98ba2882b570c45f28cc7263cadd4710f2c10db1bea33d88051f29c6b789bc6180c85b5fd8f6
Work decoupled from #28574.
Instead of performing multiple single write operations per spkm
setup call, this PR batches them all within a single atomic db txn.
This speeds up the process and prevents the wallet from entering
an inconsistent state if any of the intermediate transactions fail
(which shouldn't happen, but if it does, it is better to not store
any spkm rather than storing them partially).
To compare the changes, a benchmark is added in the first commit.
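The all-or-nothing behaviour described above can be sketched with a toy in-memory store. Everything in this block is an assumption made for illustration: `ToyBatch`, `Records` and `SetupAllSpkms` are hypothetical names, an empty key simulates a failing write, and the code is not the wallet's database layer.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Records = std::vector<std::pair<std::string, std::string>>;

// Toy batch over an in-memory key-value store. While a txn is open, writes
// are buffered and only reach the store on commit.
class ToyBatch
{
public:
    explicit ToyBatch(std::map<std::string, std::string>& db) : m_db(db) {}

    bool TxnBegin()
    {
        if (m_active) return false;
        m_active = true;
        return true;
    }

    bool Write(const std::string& key, const std::string& value)
    {
        if (key.empty()) return false; // simulated write failure
        if (m_active) {
            m_pending.emplace_back(key, value);
        } else {
            m_db[key] = value;
        }
        return true;
    }

    bool TxnCommit()
    {
        if (!m_active) return false;
        for (const auto& [key, value] : m_pending) m_db[key] = value;
        m_pending.clear();
        m_active = false;
        return true;
    }

    bool TxnAbort()
    {
        if (!m_active) return false;
        m_pending.clear(); // drop every buffered write
        m_active = false;
        return true;
    }

private:
    std::map<std::string, std::string>& m_db;
    Records m_pending;
    bool m_active{false};
};

// Writes the records of every spkm inside one txn. If any write fails,
// nothing is stored, so the store never holds a partial setup.
bool SetupAllSpkms(ToyBatch& batch, const std::vector<Records>& spkms)
{
    if (!batch.TxnBegin()) return false;
    for (const auto& records : spkms) {
        for (const auto& [key, value] : records) {
            if (!batch.Write(key, value)) {
                const bool aborted = batch.TxnAbort();
                assert(aborted); // a txn is open here, so the abort must succeed
                (void)aborted;
                return false;
            }
        }
    }
    return batch.TxnCommit();
}
```

With one txn per spkm instead, a failure in the second spkm would leave the first one already committed, which is the partially stored state the description wants to avoid.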