-
Notifications
You must be signed in to change notification settings - Fork 37.7k
wallet: Increase OUTPUT_GROUP_MAX_ENTRIES to 100 #18418
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Concept ACK |
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
Will review when #17824 is merged and this is rebased. Some context for reviewers: https://bitcoincore.reviews/17824.html#l-327 (from here to the end of the discussion) |
Added some minor cleanups from #17824 |
Rebased, no code changes. |
Rebased, since #17824 was merged. |
Rebased |
src/wallet/init.cpp
Outdated
@@ -43,7 +43,7 @@ const WalletInitInterface& g_wallet_init_interface = WalletInit(); | |||
void WalletInit::AddWalletOptions(ArgsManager& argsman) const | |||
{ | |||
argsman.AddArg("-addresstype", strprintf("What type of addresses to use (\"legacy\", \"p2sh-segwit\", or \"bech32\", default: \"%s\")", FormatOutputType(DEFAULT_ADDRESS_TYPE)), ArgsManager::ALLOW_ANY, OptionsCategory::WALLET); | |||
argsman.AddArg("-avoidpartialspends", strprintf("Group outputs by address, selecting all or none, instead of selecting on a per-output basis. Privacy is improved as an address is only used once (unless someone sends to it after spending from it), but may result in slightly higher fees as suboptimal coin selection may result due to the added limitation (default: %u (always enabled for wallets with \"avoid_reuse\" enabled))", DEFAULT_AVOIDPARTIALSPENDS), ArgsManager::ALLOW_ANY, OptionsCategory::WALLET); | |||
argsman.AddArg("-avoidpartialspends", strprintf("Group outputs by public key, selecting many (possibly all) or none, instead of selecting on a per-output basis. Privacy is improved as public keys are mostly swept with fewer transactions and outputs are aggregated in clean change addresses. It may result in higher fees due to less optimal coin selection caused by this added limitation and possibly a larger-than-necessary number of inputs being used. Always enabled for wallets with \"avoid_reuse\" enabled, otherwise default: %u.", DEFAULT_AVOIDPARTIALSPENDS), ArgsManager::ALLOW_ANY, OptionsCategory::WALLET); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it by scriptPubKey? It seems to me that "address" might be more accurate / intuitive for users than public key?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. UTXOs could be sent to multiple addresses corresponding to the same pubkey, and they'd be grouped into different Output Groups.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A scriptPubKey can also have multiple address types and those would be grouped in the same output group, which was the motivation to make this more precise if I remember correctly. So spk would be the correct term to use here IMO. But it seems there is a lot of support for not making a change and spk might be too technical here, so I have reverted this bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A scriptPubKey can also have multiple address types
I don't believe this is true. A scriptPubKey will encode deterministically to a single address (base58 for P2PKH and P2SH, bech32 for P2WPKH and P2WSH, bech32m for segwit v1+).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this code changed since I opened this PR then. It used to be an issue, see #17605. As sipa mentions there it was never an issue on descriptor wallets but I didn't expect that behavior was changed on legacy wallets as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never mind, this was about marking as reused, not the grouping. I guess this is what made me do the doc change in the first place but I was just confused.
@@ -342,14 +342,14 @@ def test_full_destination_group_is_preferred(self): | |||
txid = self.nodes[1].sendtoaddress(address=ret_addr, amount=0.5) | |||
inputs = self.nodes[1].getrawtransaction(txid, 1)["vin"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I added a line to grab the amount and fee for this transaction here, i.e.
txinfo = self.nodes[1].gettransaction(txid)
print("Paid fee of {} when sending {}".format(txinfo["fee"], txinfo["amount"]))
It looks like the fee would be 0.00136966BTC 😱 I thought it might get caught by maxapsfee
but it went through. Am I missing something else that would catch this case?
I understand this scenario isn't typical, but is this maybe... not the best default option for users? 0.00137 is like $66, and they could be paying this fee on a 0.001BTC transaction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is a problem (or at least not worse than any alternative). The high fees are intrinsic in the UTXOs that are already owned by the wallet. The fact that they get realized when making this 0.001BTC payment is no better/worse than if they were realized when making a subsequent payment for 100BTC, or if they were used to make a subsequent payment of 0.00001 BTC.
The only way this could be improved is if our coin selection algorithm favored using relatively few coins when the prevailing fee rate is high, and favored using relatively many coins when the prevailing fee rate is low. I don't believe we have such logic in our wallet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(typed this before github showed johns anwer so there is some duplication here I think) It's not great in theory. But I think we can expect that users know what they are getting into here. The users have reused addresses a lot (willingly or unwillingly) and realized that this is a threat to their privacy. Then they have opted in to avoid reuse which warns users that this can create higher fees via high numbers of inputs. And there isn't even much of a risk here if somebody sets this option just for the fun of it. As long as there is no address reuse in that wallet, avoid reuse also won't kick in.
Because this is a privacy feature that users have opted into it seems the better trade-off to me that someone is surprised by a higher fee rather than being surprised that they could only consolidate 10 outputs in one transaction when they had 30 outputs in that destination. And then they have to further watch out to no compromise themselves when consolidating the remaining 20 outputs. What users are getting aside from privacy protection is a consolidation transaction. While the fees might be high on this first tx it will hopefully also create a change output that is larger and will be much cheaper to spend later on.
maxapsfee
is only relevant if avoid partial spends is not activated. If avoid partial spends is not set, the wallet will still try to avoid partial spends and prefer to use those coins that avoid partial spends of those that don't if the premium is below the threshold.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh thanks for explaining this. I can see how a user turning -avoidpartialspends
on would be consciously choosing to opt for privacy over lower short-term fees.
maxapsfee is only relevant if avoid partial spends is not activated. If avoid partial spends is not set, the wallet will still try to avoid partial spends and prefer to use those coins that avoid partial spends of those that don't if the premium is below the threshold.
Ohh my bad, thanks for being patient with me!
@@ -446,7 +446,7 @@ static RPCHelpMan sendtoaddress() | |||
{"estimate_mode", RPCArg::Type::STR, RPCArg::Default{"unset"}, std::string() + "The fee estimate mode, must be one of (case insensitive):\n" | |||
" \"" + FeeModes("\"\n\"") + "\""}, | |||
{"avoid_reuse", RPCArg::Type::BOOL, RPCArg::Default{true}, "(only available if avoid_reuse wallet flag is set) Avoid spending from dirty addresses; addresses are considered\n" | |||
"dirty if they have previously been used in a transaction."}, | |||
"dirty if they have previously been used in a transaction. If true, this also activates avoidpartialspends, grouping outputs by their addresses."}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Can mention avoidpartialspends in single quotes or a complete sentence to describe what it does: If true, it will avoid partial spends and group unspent outputs by address.
Reading the comments about increasing nit: I think it will be better if we have MIN, MAX and DEFAULT in such cases and allow user to set anything between MIN and MAX. For
Mentioned it as 'nit' because use of constants in Bitcoin Core is already discussed in few PRs: #21815 (comment) (4) Increasing it from 10 to 100 is not an issue, real issues are with coin selection however that is out of scope for this PR. Had one related question which was answered by Andrew Chow: https://bitcoin.stackexchange.com/q/106418/ Other issue: Tx:
I am not sure why other inputs were used because sum of inputs associated with Maybe because it will create a small change? But |
ACK e6fe1c3 |
@prayank23: Please try to focus on the PR in question. Additional features can be added in a separate PR e.g. user configurable max. The StackExchange post is interesting but I would argue not relevant to the intent of this PR. Relay connections are certainly not relevant. |
To review this PR you need to test
Never mentioned anything about relay. The link of other PR is about use of different constants in Bitcoin Core. It is relevant because value of OUTPUT_GROUP_MAX_ENTRIES was 10 earlier and 100 after this PR with not enough reasons to justify those numbers which affects Bitcoin Core users unaware of these things. |
I agree with you that the rationale (upsides and downsides to this change) for increasing from 10 to 100 seems weakly explained/explored and hence it is hard to assess Concept ACK/NACK. As @xekyo said in the PR review club:
It seems we only have gut feel of wallet devs as a reason for Concept ACKing. Ideally we would have more than that but we can explore in the PR review club later. I think how constants are set generally (e.g. relay connections) are a much broader subject best not explored in the comments of this PR. |
@michaelfolkson I've had a think about this and my rough framework for the soundness of a
It seems to me that 10 is too low for this and weakens the utility of
@prayank An |
|
retACK e6fe1c3 |
tACK Verified the tests are checking against intended behaviour. No strong feeling over the value of @glozow 's rationale here #18418 (comment) makes sense to me. But in the end this is also as arbitrary as 10. Would be better if this rationale could be included in the PR description. Was a little hard to find reasoning behind why it was decided to change it. |
I was thinking about possible examples in which a user has saved
In both cases its difficult to know if 100 will be good enough for all the users. Maybe it works for most of them.
Agree. It will be useful if users could decide this and save in Since we had discussed about other wallets in PR review club meeting, I was curious if they use any such max limit: Electrum: if any coin is spent from a user address, all coins are. Wasabi: Treat same-scriptpubkey coins as one big coin. If group has an unconfirmed, then the whole group is unconfirmed. Joinmarket : To prevent forced address reuse attacks, (sometimes called dust attacks, but that's a poor name), any deposit to an already-used address is freezed by default. Samourai: Not sure but I tested with 15 UTXOs for same address and they were all added as inputs for the transaction. I am assuming it is same as Electrum or Wasabi. |
ACK e6fe1c3 |
Concept NACK
Reason: Privacy. There are few other related issues as well which are mentioned in #22018 |
@prayank23: I suspect that increasing OUTPUT_GROUP_MAX_ENTRIES to unlimited (no max) would get some NACKs based on the discussions at the PR review clubs and @glozow's "low enough to not be a footgun (fee-wise) for users that turn on -avoidpartialspends" And an option in bitcoin.conf would have greater scope and more complexity than this PR though anyone is free to open that PR before or after this PR is merged. Hence I think this PR is a choice between the status quo (10) or 100 rather than the two choices you outline as your preferences. And there seems to be consensus amongst the wallet devs that 100 is better than 10. Hence Concept ACK, Approach ACK on this PR. You have to set some defaults. You can't have users needing to decide before they get started on a large number of variables where many of them will have no idea what they represent. Hence a new default of 100 with a possible future PR to make it an option seems the best path forward. |
10 or 100, both are bad for privacy. Increasing it from 10 to 100 is worse if you consider attacker will be able to know more about the version of core used by victim.
Then we should mention the privacy issues in docs, trade-offs and reasons for not improving privacy. |
I'm pretty sure keeping the version of Core private is near the bottom of the privacy priority wishlist. Otherwise we would make no improvements and add no features to Core that leak the Core version being used. The general advice appears to be that users should run a recent version (if not the most recent version) of Core. That doesn't leak anything regarding your transactions or your net worth.
I think everyone supports improving the docs. edit: This discussion should move to IRC and away from this PR I think. Your Concept NACK has been logged with a rationale, other reviewers can assess whether they agree with it or not. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code review ACK e6fe1c3
Follow-up to #17824.
This increases OUTPUT_GROUP_MAX_ENTRIES to 100 which means that OutputGroups will now be up to 100 outputs large, up from previously 10. The main motivation for this change is that during the PR review club on #17824 several participants signaled that 100 might be a better value here.
I think fees should be manageable for users but more importantly, users should know what they can expect when using the wallet with this configuration, so I also tried to clarify the documentation on
-avoidpartialspends
andavoid_reuse
a bit. If there are other additional ways how or docs where users can be made aware of the potential consequences of using these parameters, please let me know. Another small upside is that there seem to be a high number of batching transactions with 100 and 200 inputs(source) giving these transactions a bit of a larger anonymity set, although that is probably a very weak argument.