Skip to content

Conversation

achow101
Copy link
Member

We've long since moved away from using the legacy address type unless the user explicitly wants it, so we should not re-introduce using it as change automatically.

@kristapsk
Copy link
Contributor

For privacy reasons I might want to have P2PKH change if sending to P2PKH recipient.

@luke-jr
Copy link
Member

luke-jr commented Feb 19, 2022

If someone really doesn't want this behaviour, wouldn't they probably just not have a legacy/internal provider?

@achow101
Copy link
Member Author

If someone really doesn't want this behaviour, wouldn't they probably just not have a legacy/internal provider?

It is not possible to remove a provider. A user would have to create a blank wallet, create descriptors for the other address types, and import them.

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept NACK

As @kristapsk mentioned users should be able to express selectively.

@luke-jr
Copy link
Member

luke-jr commented Feb 19, 2022

It is not possible to remove a provider.

Maybe a better solution would be a way to disable using a provider?

@achow101
Copy link
Member Author

For privacy reasons I might want to have P2PKH change if sending to P2PKH recipient.

You can still choose to by setting the startup option -changetype, or explicitly setting the change type or change address during transaction creation. However as a default behavior, I don't think we should be allowing P2PKH change.

Concept NACK

As @kristapsk mentioned users should be able to express selectively.

Users are still able to. User overrides still allow P2PKH.

@maflcko
Copy link
Member

maflcko commented Feb 19, 2022

Related #23831

@S3RK
Copy link
Contributor

S3RK commented Feb 21, 2022

It is not possible to remove a provider.

Maybe a better solution would be a way to disable using a provider?

It's possible to deactivate a descriptor using importdescriptors with active:false which stops it from producing new addresses.

@achow101 achow101 requested a review from murchandamus March 11, 2022 15:05
Copy link
Contributor

@murchandamus murchandamus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK, but the motivation for this PR could be explained a bit better.

My take would be that as it is, Bitcoin Core will spend all types of UTXOs together anyway, so matching a recipient output's type to then spend the funds together with e.g. P2TR outputs in a later transaction would mean that in many cases there would be heavy indication what the change was upon spending (if it wasn't obvious for other tell-tales already). So, as it is, it would seem that output matching provides at best a temporary privacy benefit. On the other hand, legacy outputs incur significantly more cost in blockweight and fees upon spending. So, it seems fine to me to never create legacy outputs unless it's explicitly demanded or unavoidable. @josibake tried to convince me of the benefits of output matching generally recently, so I'd be curious to get a refresher of his arguments on this matter.

I am actually thinking right now that recipient output type matching could be a lot more beneficial, if we preferentially built input sets for coin selection from a single type of UTXOs. The resulting input sets would probably also be more waste-effective than mixed input sets, as e.g. legacy input sets would be preferred at low feerates and more modern output types should be more attractive at higher feerates than mixed sets of the same input count.

@@ -2036,8 +2036,6 @@ OutputType CWallet::TransactionChangeType(const std::optional<OutputType>& chang
any_wpkh = true;
} else if (type == TxoutType::SCRIPTHASH) {
any_sh = true;
} else if (type == TxoutType::PUBKEYHASH) {
any_pkh = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also delete the creation of the variable any_pkh in 2028, since it's not used anywhere.

if (has_legacy_spkman && any_pkh) {
// Currently pkh is the only type supported by the LEGACY spkman
return OutputType::LEGACY;
}

if (has_bech32m_spkman) {
return OutputType::BECH32M;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do like the idea of nudging people to use bech32m by default, but what's the point of a "default" if we then default to something else?

@josibake
Copy link
Member

@josibake tried to convince me of the benefits of output matching generally recently, so I'd be curious to get a refresher of his arguments on this matter.

Matching the change address to the payment address preserves the user's privacy at the time of the transaction and until the change UTXO is later spent. Additionally, users can take steps to keep from leaking information in a later transaction, e.g migrate the legacy UTXO to a newer output type (which on-chain would look like the UTXO was being spent to a bech32), before mixing it with other UTXOs to fund later transactions.

I am actually thinking right now that recipient output type matching could be a lot more beneficial, if we preferentially built input sets for coin selection from a single type of UTXOs. The resulting input sets would probably also be more waste-effective than mixed input sets, as e.g. legacy input sets would be preferred at low feerates and more modern output types should be more attractive at higher feerates than mixed sets of the same input count.

This is the ideal scenario! We keep output matching for all types to preserve privacy and we have coin selection avoid mixing different output types when selecting inputs and always prefer to get rid of older output types when possible.

I've been working on this in #24584 and would love feedback on the approach. This is better than not matching legacy addresses as it keeps the privacy improvements for all address types while also addressing concerns about later leaking information and concerns about creating new legacy UTXO's.

Concept -0 on not matching change outputs for legacy, in favor of #24584

@maflcko
Copy link
Member

maflcko commented Mar 16, 2022

while also addressing concerns about later leaking information and concerns about creating new legacy UTXO's.

I don't think ##24584 affects the creation of legacy UTXO's unless I am missing something?

@josibake
Copy link
Member

I don't think ##24584 affects the creation of legacy UTXO's unless I am missing something?

ah, I could have worded that better. if we keep matching for legacy addresses, under #24584 new legacy UTXOs will still be created. the logic in #24584 tries to always spend these older outputs first so that they are quickly rotated into new type UTXOs.

@murchandamus
Copy link
Contributor

ACK e33d46a

@achow101
Copy link
Member Author

achow101 commented Aug 1, 2022

Rebased for silent merge conflict.

We've long since moved away from using the legacy address type unless
the user explicitly wants it, so we should not re-introduce using it as
change.
@DrahtBot
Copy link
Contributor

DrahtBot commented Aug 4, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@murchandamus
Copy link
Contributor

reACK 16ada8d

@ghost
Copy link

ghost commented Aug 5, 2022

We've long since moved away from using the legacy address type unless the user explicitly wants it, so we should not re-introduce using it as change automatically.

Why not? If the user is sending bitcoin to legacy address it should be okay to use a legacy change address by default.

Isn't the PR undoing some of the privacy benefits added in #23789

@murchandamus
Copy link
Contributor

I've been thinking more about the proposed change here. I would argue that before #24584, matching output types mostly improved privacy temporarily for the time between a transaction was created and the change output was spent, as the circumstances in which the change was later spent would more often reveal that it was the change, e.g. when it got spent together with more modern outputs which would indicate that the other output probably was the cause for using legacy outputs. There might have been more of a benefit if the change was a solitary input on the next transaction. However, especially given that there are many other heuristics that also hint at the change output, it seemed a bit unreasonable to increase the future cost of spending the change output by a factor 2.17× for this tenuous privacy benefit.

After #24584 our wallets are presumably achieving single-type input sets for most transactions, and this actually changes the benefit of matching legacy outputs more to a question of “privacy vs cost”. Although other heuristics that allow recognizing change would still not be impacted (e.g. based on amounts, or general wallet fingerprints matching between original and change transaction but not recipient transaction), but at least the legacy output is less likely to be mixed in with other outputs.

On the other hand, there is also a meta question of whether creating more legacy outputs helps sustain the acceptability of legacy use. Fewer legacy outputs on the network in general would also make matching legacy and then later spending legacy stick out more. Right now feerates are extremely low, so maybe the additional cost is currently worth the privacy benefit, but if the feerates go up significantly again, sitting on more legacy outputs would be more costly—although they should get selected quickly at the current low feerates as well. And potential higher feerates in the future would prompt more people to switch to blockweight-efficient output types to lower their own future cost, which would reduce our creation of legacy outputs in turn then.

So, perhaps we don't need to get active at this time and remove matching of legacy outputs now. Most of the downsides will fix themselves in the long run anyway, and at current feerates the cost is likely low anyway.

I.e. code looks good to me, I could see the argument go either way for various reasons.

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NACK

Sorry for disagreeing with the changes that you worked on since Feb. This affects privacy and makes no sense. Not trying to block something at last moment that has lot of consensus but this PR started with disagreements.

Finding change in a transaction is a big win for chain analysis algos and @MarcoFalke did a great work in #23789 that would make it difficult. There were some issues which @josibake partially fixed in #24584. This is a rare event in bitcoin core and I dont want to undo it.

ℹ️ Even samourai (privacy wallet) has not fixed the issues mentioned in #24584

Privacy matters a lot in bitcoin but wasn't appreciated by everyone and I have seen comments for PRs in past (Example: Coin Control being opposed by one influential dev in 2011). In some cases bank accounts provide better privacy for clients. They don't broadcast every transaction, we do. So change address plays an important role in UTXO transactions.

World is not going to end if some users send bitcoin to legacy address using bitcoin core wallet which is already less used. However, this may inspire other projects using bitcoin core RPC.

@achow101
Copy link
Member Author

achow101 commented Aug 9, 2022

Why not? If the user is sending bitcoin to legacy address it should be okay to use a legacy change address by default.

Not necessarily. The user is sending Bitcoin to an address given to them by the recipient - they have no control over that address type. However the user may want to use segwit or taproot because those types are cheaper when spending. In a scenario where the user is trying to optimize for least fees (which many users do), it is actually not okay to send to change to legacy because that will incur higher fees in the future.

This affects privacy and makes no sense.

Privacy is not the end all be all of wallet design. Bitcoin Core's wallet is not a privacy focused wallet - we do not advertise as being private nor is privacy the primary focus. We try to encourage privacy and implement privacy improving behavior, but not to the detriment of other factors, such as cost to the user and effect on all the other nodes in the network. Given that legacy address types cost more to the user, cost more in block space, and can cost more to validate (in terms of CPU time), I think it is better for us to avoid making legacy outputs as much as possible, and that would include the contents of this PR. While it does have a negative impact on privacy, I do not think that the privacy gained from this behavior outweights the effects that continued legacy use has on everyone else.

A potential alternative is to add an option that enables strict change type matching. Instead of removing this behavior entirely, it can be put behind a startup option or a transaction creation option so that users who do want the additional privacy can choose to enable it.

@ghost
Copy link

ghost commented Aug 9, 2022

Privacy is not the end all be all of wallet design. Bitcoin Core's wallet is not a privacy focused wallet - we do not advertise as being private nor is privacy the primary focus. We try to encourage privacy and implement privacy improving behavior, but not to the detriment of other factors, such as cost to the user and effect on all the other nodes in the network.

Bitcoin Core may not be privacy focused in past. It all depends on users and reviewers.

Given that legacy address types cost more to the user, cost more in block space, and can cost more to validate (in terms of CPU time), I think it is better for us to avoid making legacy outputs as much as possible, and that would include the contents of this PR.

Users can decide if they want to use legacy or other addresses in that case if they have a choice.

While it does have a negative impact on privacy, I do not think that the privacy gained from this behavior outweights the effects that continued legacy use has on everyone else.

Privacy is either by default or nothing. Leaving loopholes for chain analysis firms to make algos based on these PRs does not make sense.

@josibake
Copy link
Member

Right now feerates are extremely low, so maybe the additional cost is currently worth the privacy benefit, but if the feerates go up significantly again, sitting on more legacy outputs would be more costly—although they should get selected quickly at the current low feerates as well. And potential higher feerates in the future would prompt more people to switch to blockweight-efficient output types to lower their own future cost, which would reduce our creation of legacy outputs in turn then.

So, perhaps we don't need to get active at this time and remove matching of legacy outputs now. Most of the downsides will fix themselves in the long run anyway, and at current feerates the cost is likely low anyway.

This sums up my feelings on this. I'd add that high feerates is a future problem, whereas privacy is a current problem. Not saying that we shouldn't be optimizing for fees now, but rather saying that I'd prefer we improve privacy today with slight tradeoffs in efficiency vs solutions that improve efficiency today at the expensive of privacy.

It's also worth mentioning that there are often ways to intervene with inefficient UTXOs (e.g migrating legacy UTXOs to newer types manually in low fee periods, improved coin-selection algos that are fee and address type aware, etc), whereas damaging privacy on-chain is a permanent mistake.

@maflcko
Copy link
Member

maflcko commented Aug 11, 2022

However, especially given that there are many other heuristics that also hint at the change output

Jup, just to mention one example: The Bitcoin Core wallet already sets the locktime, which is a "rare" (~20%) signal, so most change can already be assumed correctly by just excluding the outputs that were spent without locktime set.

If locktime isn't enough there are plenty of other fields:

@ghost
Copy link

ghost commented Aug 11, 2022

However, especially given that there are many other heuristics that also hint at the change output

Jup, just to mention one example: The Bitcoin Core wallet already sets the locktime, which is a "rare" (~20%) signal, so most change can already be assumed correctly by just excluding the outputs that were spent without locktime set.

If locktime isn't enough there are plenty of other fields:

There are a few other wallets that use same default locktime, version etc. as core and these aren't reasons to make privacy worse by not matching legacy address for change. Could be improvements in future or solved by discussion with devs of other wallets.

@murchandamus
Copy link
Contributor

Users can decide if they want to use legacy or other addresses in that case if they have a choice.

@1440000bytes: Just like they'd have a choice if this PR were merged—they would just have to explicitly select the legacy output type for their change…

Obviously, defaults matter which is why you care what the default is.

The argument is not about whether there is a privacy benefit (we all agree), but about how the privacy benefit weighs compared to the increased cost. I would request that you provide evidence for the tradeoff being worth it rather than just repeatedly demanding the best possible privacy at any cost—the former would be a contribution to the conversation, the latter lacks nuance and becomes just noise.

Also, if you don't mind, which other wallets match Bitcoin Core's fingerprints?

@ghost
Copy link

ghost commented Aug 11, 2022

@1440000bytes: Just like they'd have a choice if this PR were merged—they would just have to explicitly select the legacy output type for their change…

Obviously, defaults matter which is why you care what the default is.

There are already so many things users have to look at if they want privacy. I care about the changes in this pull request because neither I want to make such mistakes in bitcoin transactions nor I want to see others being affected. Example:

Alice: I just a did a transaction using bitcoin core and chainalysis won't be able to find change address in this

Bob: I hope you didn't send it to legacy address?

Alice: 😢

bumpfee RPC doesn't even allow users selecting inputs, change etc.

Privacy is more important than other things and if this bothers some developers, there should be an option in GUI and bitcoin.conf to optimize certain things in transactions based on fees, privacy etc. Example:

image

At least users won't have to carefully look at several things when doing bitcoin transactions.

The argument is not about whether there is a privacy benefit (we all agree), but about how the privacy benefit weighs compared to the increased cost. I would request that you provide evidence for the tradeoff being worth it rather than just repeatedly demanding the best possible privacy at any cost—the former would be a contribution to the conversation, the latter lacks nuance and becomes just noise.

  1. I am not demanding anything. I am disagreeing with the changes in this pull request as a reviewer and responding to comments.

  2. If we all agree there is a privacy benefit in not merging this pull request there is no point in speculating about future fees and fingerprints. If change address could be identified using other things, we should try to fix them if possible and not make things worse. If they really affect privacy then wallet: Strictly match tx change type to improve privacy #23789 wallet: avoid mixing different OutputTypes during coin selection #24584 are useless and should never have been merged.

Also, if you don't mind, which other wallets match Bitcoin Core's fingerprints?

Bluewallet: Version 2, Locktime 0
Blockstream Green: Version 2, Locktime 0
Hexa: Version 2, Locktime 0

I am sure there might be more wallets that use same version and locktime. Some wallets also match the fingerprint partially because either locktime or version matches. Wallets that use Bitcoin Core RPC which I had also mentioned in earlier comments would most probably have the same fingerprint. Transactions could even be sent to other bitcoin core user.


I am working as developer for an exchange and we will be using bitcoin core wallets so it can be added to the list of exchanges that use same fingerprint.

@murchandamus
Copy link
Contributor

#23789 and #24584 clearly were privacy improvements. While the former traded that off for additional cost, the latter actually seems to result both in a privacy improvement and a cost reduction. I agree with you that we should continue to work on removing privacy leaks. If you're looking for a project that is 100% focused on privacy, you will continue to be disappointed.

Beyond that I think it's fair I explain myself to you before going back to mostly ignoring you:
I get that you consider privacy the utmost priority. It's a valid point-of-view, and valuable for the project someone take that position. However, Bitcoin Core has a heterogeneous user base. While you care deeply for privacy, other users might care more about other things, e.g. usability or fees. Our decision should consider all of our users' needs, not just yours, and must be achievable within the available resources we an muster: we must consider security, reliability, privacy, cost, feasibility, maintainability, usability, relatability, and current network circumstances, etc.…

While you appear to gladly suffer any sort of discomfort, costs, and overabundance of configuration options (which need to be developed, maintained, documented, and may be confusing to users) for better privacy, another user might be surprised and annoyed to pay more than twice the fees and never even consider changing defaults. While the additional cost is easy to denominate, the privacy reduction is not a binary all-or-nothing as you make it out to be, but depends on its actual real-world effect in context of the transaction's publication. Privacy is inherently contextual to the corresponding UTXO pool, the involved parties, the unique transaction history they appear in as well as what everyone else is doing on the network and the threat-model you're considering.

As someone that does care above average about privacy (and has contributed to a number of privacy improvements over the years), but does not hold the "privacy-at-any-cost" position, frankly, I consider your constant barrage of “contributions” to the mailing list, issues and PRs a disservice to our goals. By intransigently taking the fundamental position “privacy-over-everything” and piss-poorly arguing for it, you're undermining more nuanced discussions on the topics you care most about and inject noise that hampers progress. Perhaps you should consider whether your actions actually further your goals or whether you're doing more harm than good.

@ghost
Copy link

ghost commented Aug 12, 2022

consider your constant barrage of “contributions” to the mailing list, issues and PRs a disservice to our goals

While I partially agree with a few things you mentioned, I don't need permission from anyone to contribute based on my interests. I will continue to post things I understand and can help bitcoin. Others are free to do it in other topics.

This pull request doesn't involve payjoin, coinjoin etc. and is related to a basic privacy associated with any UTXO model chain. Revealing change address to save a few dollars should never be a default in a project like bitcoin core. If users are okay with it, why not create the options suggested in previous comment?

In future there could be requests to only match taproot address by accusing others of being too privacy focused or their contributions or privacy works better with only taproot address. We should try to make it difficult that change address is not revealed and not create opportunities. There are better ways to deal with fees vs privacy options.


Calling other opinion as "noise" and their contributions as "barrage" won't help

@achow101
Copy link
Member Author

Since this was introduced in 23.0, there has not been a noticeable increase in legacy outputs being created, so there doesn't seem to be much of an effect either way. Given that fees are currently low, this change probably won't have much of an effect right now, so closing this. However, if feerates or legacy usage increases, or if users complain, we may want to revisit this in the future.

@achow101 achow101 closed this Aug 12, 2022
@bitcoin bitcoin locked and limited conversation to collaborators Aug 12, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants