Skip to content

Conversation

furszy
Copy link
Member

@furszy furszy commented Sep 1, 2022

Depends on #27601, please go there first.

This work aims to define, and implement a base standard mechanism to
detect individual change outputs.

Context

Currently, the wallet detects whether an output is change or not based
on data stored in the address book.

There is no notion of “change outputs”, the wallet detects change scripts.

Connoting that any address book record modification has implications
on all the historical outputs related to that particular destination. Meaning
that all those outputs can either be change or not. There is no middle-ground
granular distinction.

How Change Detection Currently Works?

The wallet walks-through the transaction outputs, extracts the script
destination and verify the following two points:

  1. If the destination doesn't exist in the address book, then the script
    is a "change address".

  2. If the destination exists in the address book, but it doesn't have a
    label, then the script is a "change address".

Motivation

There are a good number of problems in the current approach:

  • We make the wallet dependent on an external structure, with separate storage.
    Which has to be updated and maintained along with the wallet state.

  • It cannot be maintained nor recovered across different wallet instances.
    Cannot re-create the, possibly custom, address book data only by importing
    the wallet descriptor string.

  • As the address book is an structure that the user can freely modify, the change
    detection process might differ through different wallets.

  • The current rudimentary assumptions of "no address book entry" or "no label set for the address book entry"
    to denote that certain script destination is change or not can easily be broken:
    E.g. derive an address from one of the wallet’s external paths manually. Then send coins to it.
    As the receive destination wasn't created inside the wallet, the wallet has no associated address book entry.
    So, the reception is invalidly detected as change (added a test case for it).

  • The wallet can't detect change outputs on more complex scripts such as multi-sig change outputs.

  • The wallet is not able to detect change outputs going to an internal address if the internal address has a label.
    (E.g. the user can manually set a label for the internal address and, doing that, make that all the change
    outputs, in the wallet history, that were sent to the destination are no longer detected as change).

  • There isn’t a way to distinguish the external reception of coins into an internal address. Coins reception on any
    internal address are always detected as change.

New Change Detection Mechanism Goals

Aiming to:

  • Define a base mechanism to align different wallet implementations. Preventing each piece of software
    from diverging on the basic change outputs distinction.

  • Detect change outputs on-demand without requiring to maintain an external data structure synced with the
    latest wallet state.

  • Independently, and accurately, detect change outputs regardless data stored in structures that the user
    can freely modify.

  • Granular distinction between change vs non-change outputs that were sent to the same internal address.
    E.g. the reception of coins, from an external source, on internal addresses will not longer be detected
    as change anymore.

  • Expand the change detection to more complex scripts such as a multi-sig protected addresses. (While they
    are added into the wallet on an internal spkm)

Change Output Detection Rules

A transaction output is change if it fulfills the following points:

  1. At least one of the parent transaction inputs is from the wallet. (If none of them are, then the wallet is receiving coins
    on an internal address).

  2. The script extracted destination is from the wallet and is located in one of the internal script pub key manager.
    (e.g. derived from an internal derivation path)

What about legacy wallets?

If the legacy wallet is HD post-split, we have an internal derivation path, so we can follow the same process as
descriptors wallet. Unless the destination is on the pre-split key pool, in which case, we fallback to the follow-up
case.

If the legacy wallet is pre-split, we continue using the address book as we either have an HD wallet with keys
derived only on the external path, or we are using raw keypool.

———————————————————————

Extra Note

This PR, in about 85% at least, is about expanding the current test coverage for the change output detection area.

TO DO (still WIP):

  • Save “internal” flag on non-active descriptors so the wallet can use them on the change detection process.
    (which will fix the currently failing test cases).
  • Re-organize commits so tests always pass.
  • Verify backwards compatibility.

@DrahtBot DrahtBot added the Wallet label Sep 1, 2022
@furszy furszy changed the title wallet: standardize change output detection process [WIP] wallet: standardize change output detection process Sep 1, 2022
@DrahtBot
Copy link
Contributor

DrahtBot commented Sep 1, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/25979.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK ghost

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #29936 (fuzz: wallet: add target for CreateTransaction by brunoerg)
  • #29136 (wallet: addhdkey RPC to add just keys to wallets via new void(KEY) descriptor by achow101)
  • #28710 (Remove the legacy wallet and BDB dependency by achow101)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@ghost
Copy link

ghost commented Sep 2, 2022

Interesting pull request although I haven't tested it yet.

Does this also fix #20935 and #20795 ?

How Change Detection Currently Works?
The wallet walks-through the transaction outputs, extracts the script
destination and verify the following two points:

If the destination doesn't exist in the address book, then the script
is a "change address".

If the destination exists in the address book, but it doesn't have a
label, then the script is a "change address".

I was assuming change address had different derivation path. Is that false?

@furszy
Copy link
Member Author

furszy commented Sep 2, 2022

Does this also fix #20935 and #20795 ?

Yes for the change outputs detection part. It will allow us to properly detect each individual output as change or not, independently on what is stored in the address book entry (making the process deterministic across instances, and not dependent on a data structure that the user can freely modify).

As an edge case example; with this, users could even receive coins on an internal address and the new mechanism will properly detect it as an external reception, not a change output (the transaction inputs aren't from the wallet, so no coins are returning to it). Still, this behavior is obviously not encouraged, nor should be easy to do in the wallet, as it breaks the whole idea of internal/external derivation paths.

Plus, would recommend you to check #25685 as well. The first commit there fixes a misleading wallet behavior where even if you set add_inputs=false (telling the wallet to disallow automatic coin selection), the wallet will still fetch coins internally and use them in the Coin Selection process (if you don't pre-set inputs manually).

How Change Detection Currently Works?
The wallet walks-through the transaction outputs, extracts the script
destination and verify the following two points:

If the destination doesn't exist in the address book, then the script
is a "change address".

If the destination exists in the address book, but it doesn't have a
label, then the script is a "change address".

I was assuming change address had different derivation path. Is that false?

Not entirely, the assumption is most of the time correct. It breaks on pre-split legacy wallets, where we don't have an internal derivation path. In those wallet versions we (1) only use an external path to derive public and change addresses, or if we go further back in time, (2) we use a raw pool of keys with no derivation paths at all. (Thus why the initial change detection process was implemented using the address book).

@ghost
Copy link

ghost commented Sep 3, 2022

Concept ACK

@achow101
Copy link
Member

While I like the idea this is going for, I don't think it is sufficient. It relies on m_internal_spk_managers which only contains the currently active SPKMs. If a user were to remove a SPKM from being active, all of the addresses that were produced by that would no longer be detected as change. This is especially problematic with #25907 which rotates all of the currently active descriptors.

In general, I'm not sure that we can determine which output is change without storing additional metadata. What has been suggested before is to explicitly store an "IsChange" value in address book entries, with a "smart" fallback like what this PR does.

@furszy
Copy link
Member Author

furszy commented Jan 11, 2023

finally here. Thanks for the input achow101

While I like the idea this is going for, I don't think it is sufficient. It relies on m_internal_spk_managers which only contains the currently active SPKMs. If a user were to remove a SPKM from being active, all of the addresses that were produced by that would no longer be detected as change. This is especially problematic with #25907 which rotates all of the currently active descriptors.

I'm probably missing some context but.. wouldn't that be solvable by adding an internal field to the WalletDescriptor class? (32a990a). So we can keep track of non-active internal descriptors too.
It should work fine while we don't have any descriptor removal functionality (which would also mean to remove txs from the wallet etc) and require to keep historical records.

In general, I'm not sure that we can determine which output is change without storing additional metadata. What has been suggested before is to explicitly store an "IsChange" value in address book entries, with a "smart" fallback like what this PR does.

I think that we should have a more granular distinction and move from "change addresses" to "change outputs". Storing any extra metadata, if needed, inside the wallet transaction class.

A good use case for this is the reception of coins from an external source on internal addresses, which shouldn't be detected as change as it could be a simple reception or part of a dust attack.
if the tx has no inputs belonging to the wallet, then no output should be labeled as change.

furszy added 13 commits October 12, 2024 12:56
General concept:
 Create transactions using the local wallet to different destinations provided by an external wallet.
 Verifying that the local wallet can detect the change output prior and post the tx is added to the wallet.

The following cases are covered for a descriptor wallet:
 * 1) Create tx that sends to a legacy p2pkh addr and verify change detection.
 * 2) Create tx that sends to a p2wpkh addr and verify change detection.
 * 3) Create tx that sends to a wrapped p2wpkh addr and verify change detection.
 * 4) Create tx that sends to a taproot addr and verify change detection.

And the following ones for a legacy-only wallet:
 * 1) Create tx that sends to a legacy p2pkh addr and verify change detection.
 * 2) Create tx that sends to a p2wpkh addr and verify change detection.
 * 3) Create tx that sends to a wrapped p2wpkh addr and verify change detection.
…allet

The current change detection process assumption can easily be broken by creating an address manually,
from an external derivation path, and send coins to it. As the address was not created by the
wallet, it will not be in the addressbook, there by will be treated as change when it's clearly not.

The wallet will properly detect it once the transaction gets added to the wallet and the address
(and all the previous unused address) are derived and stored in the address book.
1) The change output goes to an address in one of the wallet external path.
2) The change goes back to the source.
   As the source is an external destination, and we are currently detecting
   change through it output script, the change will be marked as external
   (not change).
3) The user setting an address book label to a destination created from an
   internal key.
Currently, the wallet detects whether an output is change or not based
on data stored in the address book.

There is no notion of “change outputs”, the wallet detects change scripts.

Meaning that any address book data modification has implications
on all the historical outputs related to a particular destination as all of
them can either be change or not. There is no middle-ground.

The wallet walks-through the transaction outputs, extracts the script
destination and verify the following two points:

1) If the destination doesn't exist in the address book, then the script
    is a "change address".

2) If the destination exists in the address book, but it doesn't have a
     label, then the script is a "change address".

There are a good number of problems in the current approach:

- We make the wallet dependent on an external structure, with separate storage.
   Which has to be updated and maintained along with the wallet state.

- It cannot be maintained nor recovered across different wallet instances.
   Cannot re-create the address book data only by importing the wallet descriptor string.

- As the address book is an structure that the user can freely modify, the change
   detection process might differ through different wallets.

- The current rudimentary assumptions of "no address book entry" or "no label set for the address book entry"
  to denote that certain script destination is change or not can easily be broken:
  E.g. derive an address from one of the wallet’s external paths manually. Then send coins to it.
  As the receive destination wasn't created inside the wallet, the wallet has no associated address book entry.
  So, the reception is invalidly detected as change (added a test case for it).

- The wallet can't detect change outputs on more complex scripts such as multi-sig change outputs.

- The wallet is not able to detect any change output going to an internal address if the internal address has a label.
  (E.g. the user can manually set a label for the internal address and, doing that, make that all the change outputs,
   in the wallet history, that were sent to the destination are no longer detected as change).

- There isn’t a way to distinguish the external reception of coins into an internal address. Coins reception on any
   internal address are always detected as change.

Aiming to:

* Define a base mechanism to align different wallet implementations. Preventing each piece of software
  from diverging on the basic change outputs distinction.

* Detect change outputs on-demand without requiring to maintain an external data structure synced with the
   latest wallet state.

* Independently, and accurately, detect change outputs regardless data stored in structures that the user
   can freely modify.

* Granular distinction between change vs non-change outputs that were sent to the same internal address.
  E.g. the reception of coins, from an external source, on internal addresses will not longer be detected
  as change anymore.

* Expand the change detection to more complex scripts such as a multi-sig protected addresses. (While they
  are added into the wallet on an internal spkm)

A transaction output is change if it fulfills the following points:

1) At least one of the parent transaction inputs is from the wallet. (If none of them are, then the wallet is receiving coins
    on an internal address).

2) The script extracted destination is from the wallet and is located in one of the internal script pub key manager.
    (e.g. derived from an internal derivation path)

If the legacy wallet is HD post-split, we have an internal derivation path, so we can follow the same process as
descriptors wallet. Unless the destination is on the pre-split key pool, in which case, we fallback to the follow-up
case.

If the legacy wallet is pre-split, we continue using the address book as we either have an HD wallet with keys
derived only on the external path, or we are using raw keypool.
As the wallet is receiving those coins from outside, them should
not be detected as change.

E.g. the user manually obtained and shared one of the internal addresses
and received coins there.
…t to the same internal address

A transaction output is change if it fulfills the following rules:

1) At least one of the transaction inputs is from the wallet.
   (If none of them are, then the wallet is receiving coins
    on an internal address).

2) The script extracted destination is from the wallet and is
   located in one of the internal script pub key manager.
   (e.g. derived from an internal derivation path)
To detect whether active and non-active descriptors are internal or not.
@furszy furszy force-pushed the 2022_wallet_change_detection branch from 00534c4 to 679c918 Compare October 12, 2024 16:09
@DrahtBot
Copy link
Contributor

🐙 This pull request conflicts with the target branch and needs rebase.

@murchandamus
Copy link
Contributor

Hey @furszy, is this ready for review?

@DrahtBot
Copy link
Contributor

DrahtBot commented Feb 1, 2025

⌛ There hasn't been much activity lately and the patch still needs rebase. What is the status here?

  • Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
  • Is it no longer relevant? ➡️ Please close.
  • Did the author lose interest or time to work on this? ➡️ Please close it and mark it 'Up for grabs' with the label, so that it can be picked up in the future.

1 similar comment
@DrahtBot
Copy link
Contributor

DrahtBot commented May 1, 2025

⌛ There hasn't been much activity lately and the patch still needs rebase. What is the status here?

  • Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
  • Is it no longer relevant? ➡️ Please close.
  • Did the author lose interest or time to work on this? ➡️ Please close it and mark it 'Up for grabs' with the label, so that it can be picked up in the future.

@maflcko
Copy link
Member

maflcko commented May 4, 2025

The dependency was closed more than a year ago: #27601 (comment)

@furszy
Copy link
Member Author

furszy commented May 4, 2025

The dependency was closed more than a year ago: #27601 (comment)

Time flies. Closing for now as it is not in my priorities.

@furszy furszy closed this May 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants