-
Notifications
You must be signed in to change notification settings - Fork 37.7k
wallet: Improve AvailableCoins performance by reducing duplicated operations #24699
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
aec587b
to
e6e77b3
Compare
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
Concept ACK very nice, cursory glance looks good. do you have any benchmarks on how much this improves performance when dealing with large wallets with many transactions? |
One of the simulation scenarios I ran has 10050 deposits and 4950 spends. It's runtime went from 4 hr 18 min to 38 min after the changes in this PR (plus one additional change to the simulation script which would apply a speedup that I'm not sure about, but would not dominate the total runtime). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK
for (const auto& entry : m_wallet->mapWallet) { | ||
result.emplace_back(MakeWalletTx(*m_wallet, entry.second)); | ||
result.emplace(MakeWalletTx(*m_wallet, entry.second)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be a bad change for the GUI when loading a big wallet.
The order by hash requirement comes from TransactionTablePriv::updateWallet
implementation, definitely something to improve/refactor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it won't be too bad, but fixing that seems like a bigger change than suitable for this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review ACK e6e77b3
e6e77b3
to
91c3f77
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reACK 91c3f77
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code reviewed 91c3f77.
Two things to add:
-
About the signing providers cache (fcc2160):
Seems that this going to cache the private keys when the wallet is unencrypted/unlocked and then keep them available in memory when the wallet gets encrypted/locked.--> update: meh nah.. could actually assert that the provider in the cache does not have a sk stored.Even when the changes are small, wouldn't hurt to add test coverage for it (like getting the signing provider without including the private keys, then get it with them, check cache update, etc).
Maybe, if you are ok, I could work on it (not sure how much it will take me to do it but.. I can give it a shot to start getting deeper over the descriptors architecture). -
I do agree with @promag comment for 3b8b47b but.. yeah, better to work on it on a future PR as well.
src/wallet/spend.cpp
Outdated
@@ -189,7 +189,7 @@ void AvailableCoins(const CWallet& wallet, std::vector<COutput>& vCoins, const C | |||
|
|||
bool solvable = provider ? IsSolvable(*provider, wtx.tx->vout[i].scriptPubKey) : false; | |||
bool spendable = ((mine & ISMINE_SPENDABLE) != ISMINE_NO) || (((mine & ISMINE_WATCH_ONLY) != ISMINE_NO) && (coinControl && coinControl->fAllowWatchOnly && solvable)); | |||
int input_bytes = GetTxSpendSize(wallet, wtx, i, (coinControl && coinControl->fAllowWatchOnly)); | |||
int input_bytes = CalculateMaximumSignedInputSize(wtx.tx->vout[i], provider.get(), /*use_max_sig=*/(coinControl && coinControl->fAllowWatchOnly)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seeing af54709:
nit: would be good to extract wtx.tx->vout[I]
into its own ref variable at the top of the vout
for loop (line 165). We are accessing the same value in the vector several times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do this if I have to touch again.
src/wallet/spend.cpp
Outdated
@@ -187,10 +187,13 @@ void AvailableCoins(const CWallet& wallet, std::vector<COutput>& vCoins, const C | |||
|
|||
std::unique_ptr<SigningProvider> provider = wallet.GetSolvingProvider(wtx.tx->vout[i].scriptPubKey); | |||
|
|||
bool solvable = provider ? IsSolvable(*provider, wtx.tx->vout[i].scriptPubKey) : false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Self-note for 7d64540:
The difference, aside from the clear speedup, is that we are no longer going to call VerifyScript
after producing the dummy signature (which if would had failed in the past, the node would had crashed for the IsSolvable
assertion).
Hmm, that would probably be better. I'll try that. I don't think there's really a benchmark for this so it'll be hard to measure the effect. I think the second solution is also faster as getting the range from the multimap appears to be O(n) on average too.
This was considered and it specifically only caches the SigningProvider before private keys are added by making a copy.
Feel free to try. You can reach out to me if you have any questions. |
Actually there are a lot of other places that |
If you like it, feel free to cherry-pick it: furszy/bitcoin@42fbb69 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code ACK 91c3f77
Code review re-ACK 1f61630 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reACK 1f61630
In AvailableCoins, we need to know whether we can solve for an output. This was done by using IsSolvable, which just calls ProduceSignature and produces a dummy signature. However, we already do that in order to get the size of the input by using CalculateMaximumSignedInputSize. As this function returns -1 if ProduceSignature fails, we can just remove the use of IsSolvable and check that input_bytes is not -1 to determine the solvability of an output.
1f61630
to
38ead65
Compare
reACK 38ead65 verified rebase with |
re-ACK 38ead65 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
diff ACK 38ead65
In order to avoid constantly re-deriving the same keys in DescriptorScriptPubKeyMan, cache the SigningProviders generated inside of GetSigningProvider.
For some reason, the primary consumer of getWalletTxs requires the transactions to be in hash order when it is processing them. std::map will iterate in hash order so the transactions end up in that order when placed into the vector. To ensure this order when mapWallet is no longer ordered, the vector is replaced with a set which will maintain the hash order.
38ead65
to
bc886fc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK bc886fc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
diff re-reACK bc886fc
…ducing duplicated operations
3a4f8bc bench: add benchmark for wallet 'AvailableCoins' function. (furszy) Pull request description: #### Rationale `AvailableCoins` is part of several important flows for the wallet; from RPC commands that create transactions like `fundrawtransaction`, `send`, `walletcreatefundedpsbt`, get the available balance, list the available coins with `listunspent` etc. to GUI connected processes that perform the same or similar actions: tx creation, available balance calculation, present the spendable coins in the coin control dialog. As we are improving this process in #24699, #25005 and there are more structural changes coming on the way. This benchmark aims to ensure us that, at least, there are no regressions (obviously performance improvements are great but, at least for me, this heads into the direction of having a base metric to compare future structural changes). #### Implementation Notes There are 5 new benchmarks, one per wallet supported output type (LEGACY, P2SH_SEGWIT, BECH32, BECH32M), plus a multi-output-type wallet benchmark which contains outputs from all the descriptor types. The test, by default, fills-up the wallet with 1k transactions, 2k outputs. Mainly to not consume much time if the user just want to verify that no substantial regressions were introduced. But, my expectation for those who are focused on this process is to use a much higher number locally to really note the differences across commits. ACKs for top commit: achow101: ACK 3a4f8bc hernanmarino: ACK 3a4f8bc aureleoules: ACK 3a4f8bc Tree-SHA512: d0bb4c165f1efa181b47cb31561e6217eff9135bcd1b6761a7292f9018e456d13d18a1b886c2e2268d35c52f9e1fd8e0f252972424e5c5f00c280620b79c5a1b
… function. 3a4f8bc bench: add benchmark for wallet 'AvailableCoins' function. (furszy) Pull request description: #### Rationale `AvailableCoins` is part of several important flows for the wallet; from RPC commands that create transactions like `fundrawtransaction`, `send`, `walletcreatefundedpsbt`, get the available balance, list the available coins with `listunspent` etc. to GUI connected processes that perform the same or similar actions: tx creation, available balance calculation, present the spendable coins in the coin control dialog. As we are improving this process in bitcoin#24699, bitcoin#25005 and there are more structural changes coming on the way. This benchmark aims to ensure us that, at least, there are no regressions (obviously performance improvements are great but, at least for me, this heads into the direction of having a base metric to compare future structural changes). #### Implementation Notes There are 5 new benchmarks, one per wallet supported output type (LEGACY, P2SH_SEGWIT, BECH32, BECH32M), plus a multi-output-type wallet benchmark which contains outputs from all the descriptor types. The test, by default, fills-up the wallet with 1k transactions, 2k outputs. Mainly to not consume much time if the user just want to verify that no substantial regressions were introduced. But, my expectation for those who are focused on this process is to use a much higher number locally to really note the differences across commits. ACKs for top commit: achow101: ACK 3a4f8bc hernanmarino: ACK 3a4f8bc aureleoules: ACK 3a4f8bc Tree-SHA512: d0bb4c165f1efa181b47cb31561e6217eff9135bcd1b6761a7292f9018e456d13d18a1b886c2e2268d35c52f9e1fd8e0f252972424e5c5f00c280620b79c5a1b
c497a19 Fix comment about how wallet txs are sorted (John Moffett) Pull request description: The wallet transactions in the node are not sorted by txid (or any hash) since bitcoin/bitcoin#24699. This is how they're stored in memory now: https://github.com/bitcoin-core/gui/blob/835212cd1d8f8fc7f19775f5ff8cc21c099122b2/src/wallet/wallet.h#L397-L399 ACKs for top commit: achow101: ACK c497a19 jarolrod: ACK c497a19 Tree-SHA512: e72559991688452ef254474d4235dc75fac655bce04909c3a0eece907360f4c6f57707db9b4373a4bd2271b23c57e863684c33e0728adf48e477c5499cdfdad7
… sorted c497a19 Fix comment about how wallet txs are sorted (John Moffett) Pull request description: The wallet transactions in the node are not sorted by txid (or any hash) since bitcoin#24699. This is how they're stored in memory now: https://github.com/bitcoin-core/gui/blob/835212cd1d8f8fc7f19775f5ff8cc21c099122b2/src/wallet/wallet.h#L397-L399 ACKs for top commit: achow101: ACK c497a19 jarolrod: ACK c497a19 Tree-SHA512: e72559991688452ef254474d4235dc75fac655bce04909c3a0eece907360f4c6f57707db9b4373a4bd2271b23c57e863684c33e0728adf48e477c5499cdfdad7
While running my coin selection simulations, I noticed that towards the end of the simulation, the wallet would become slow to make new transactions. The wallet generally performs much more slowly when there are a large number of transactions and/or a large number of keys. The improvements here are focused on wallets with a large number of transactions as that is what the simulations produce.
Most of the slowdown I observed was due to
DescriptorScriptPubKeyMan::GetSigningProvider
re-deriving keys every time it is called. To avoid this, it will now cache theSigningProvider
produced so that repeatedly fetching theSigningProvider
for the same script will not result in the same key being derived over and over. This has a side effect of making the function non-const, which makes a lot of other functions non-const as well. This helps with wallets with lots of address reuse (as my coin selection simulations are), but not if addresses are not reused as keys will end up needing to be derived the first timeGetSigningProvider
is called for a script.The
GetSigningProvider
problem was also exacerbated by unnecessarily fetching aSigningProvider
for the same script multiple times. ASigningProvider
is retrieved to be used inside ofIsSolvable
. A few lines later, we useGetTxSpendSize
which fetches aSigningProvider
and then callsCalculateMaximumSignedInputSize
. We can avoid a second call toGetSigningProvider
by usingCalculateMaximumSignedInputSize
directly with theSigningProvider
already retrieved forIsSolvable
.There is an additional slowdown where
ProduceSignature
with a dummy signer is called twice for each output. The first time isIsSolvable
checks thatProduceSignature
succeeds, thereby informing whether we have solving data. The second isCalculateMaximumSignedInputSize
which returns -1 ifProduceSignature
fails, and returns the input size otherwise. We can reduce this to one call ofProduceSignature
by usingCalculateMaximumSignedInputSize
's result to setsolvable
.Lastly, a lot of time is spent looking in
mapWallet
andmapTxSpends
to determine whether an output is already spent. The performance of these lookups is slightly improved by changing those maps to usestd::unordered_map
andstd::unordered_multimap
respectively.