Skip to content

Conversation

dergoegge
Copy link
Member

@dergoegge dergoegge commented Jul 19, 2023

We currently have two different identifiers for transactions: txid (refering to the hash of a transaction without witness data) and wtxid (referring to the hash of a transaction including witness data). Both are typed as uint256 which could lead to type-safety bugs in which one transaction identifier type is passed where the other would be expected.

This PR introduces explicit Txid and Wtxid types that (if used) would cause compilation errors for such type confusion bugs.

(Only the orphanage is converted to use these types in this PR)

@DrahtBot
Copy link
Contributor

DrahtBot commented Jul 19, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK stickies-v, hebasto, instagibbs, BrandonOdiwuor, glozow, achow101
Concept ACK TheCharlatan, luke-jr, ismaelsadeeq

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #28690 (build: Introduce internal kernel library by TheCharlatan)
  • #28481 (txorphanage: add size accounting, use wtxids, support multiple announcers by glozow)
  • #28438 (Use serialization parameters for CTransaction by ajtowns)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@maflcko
Copy link
Member

maflcko commented Jul 19, 2023

How is this different from GenTxid?

@dergoegge
Copy link
Member Author

How is this different from GenTxid?

I think mostly compile-time vs. runtime checks? GenTxid can hold both txid types but doesn't enforce type correctness at compile time. Any type checks would happen at runtime, e.g. assert(gtxid.IsWtxid()). Also nothing prevents passing the wrong txid type when constructing a GenTxid.

(I guess with this PR we could consider making GenTxid a std::variant<Txid, Wtxid>)

Copy link
Contributor

@stickies-v stickies-v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK.

We don't consistently have clear variable naming (e.g. the ambiguous hash; or txid being used to refer to a wtxid), sometimes requiring quite a bit of developer time to figure out the expected ~type. I have experienced this quite a bit myself.

Introducing types makes this explicit at relatively low overhead, and I think the approach of subclassing uint256 makes sense.

We'll have to be careful for this not to lead to unnecessary refactoring code churn, but that shouldn't be a reason not to make this change imo.

@glozow
Copy link
Member

glozow commented Jul 20, 2023

I've definitely written/seen bugs that could be prevented if Txid and Wtxid were different types. When I review code it's something that I look for and it comes up surprisingly often. Interfaces taking uint256 often implicitly expect them to be a txid, eg #23325.

Agree that checking at compile time would be nice and probably prevent bugs / make review easier. Concept ACK assuming we're ok with the amount of code change.

@dergoegge
Copy link
Member Author

We'll have to be careful for this not to lead to unnecessary refactoring code churn, but that shouldn't be a reason not to make this change imo.

In terms of refactoring, this should be similar to the time type-safety PRs in that we can gradually convert our code.

@instagibbs
Copy link
Member

instagibbs commented Jul 20, 2023

concept ACK, it comes up pretty often

to work around this I'm writing Assume's everywhere to make things clearer for me :)

@jonatack
Copy link
Member

jonatack commented Jul 20, 2023

If helpful, a somewhat similar pros/cons discussion took place in March 2021 in #21328.

@dergoegge dergoegge force-pushed the 2023-07-typesafe-txids branch from eb26626 to 369e790 Compare July 21, 2023 15:05
@dergoegge dergoegge changed the title rfc: Type-safe transaction identifiers util: Type-safe transaction identifiers Jul 21, 2023
@dergoegge dergoegge force-pushed the 2023-07-typesafe-txids branch from 369e790 to 1aecdbc Compare July 21, 2023 15:23
@dergoegge dergoegge force-pushed the 2023-07-typesafe-txids branch from 1aecdbc to c7338d8 Compare July 21, 2023 15:28
@dergoegge dergoegge marked this pull request as ready for review July 21, 2023 15:31
@dergoegge
Copy link
Member Author

Ready for review

@dergoegge dergoegge force-pushed the 2023-07-typesafe-txids branch from c7338d8 to 76fe2c3 Compare July 21, 2023 15:33
@TheCharlatan
Copy link
Contributor

Concept ACK

@luke-jr
Copy link
Member

luke-jr commented Jul 25, 2023

Concept ACK, but I'm not too sure about the Wtxid type. It's literally just a hash of the transaction, not really an identifier per se.

@dergoegge
Copy link
Member Author

Concept ACK, but I'm not too sure about the Wtxid type.

Could you clarify your objections about the Wtxid type? This PR aims to catch type confusion bugs (confusion between txids and wtxids) at compile-time and I don't see how to do that without these explicit types.

Copy link
Member

@instagibbs instagibbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 76fe2c3

played around with some extensions to see where it could go, I think longer term we could use GenTxid for situations where both are acceptable, and Wtxid/Txid when only one type is, and slowly migrate towards that. f.e. GenTxid could have a GetTypedHash to convert to the other.

@dergoegge
Copy link
Member Author

@MarcoFalke could you give this another look? You've done similar work on our time type-safety, so I'd appreciate your feedback.

Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left some quick comments

@dergoegge dergoegge force-pushed the 2023-07-typesafe-txids branch from 76fe2c3 to c1f9c3b Compare August 8, 2023 13:28
@DrahtBot DrahtBot requested review from BrandonOdiwuor and removed request for BrandonOdiwuor October 12, 2023 12:57
Copy link
Member

@instagibbs instagibbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK 940a499

@DrahtBot DrahtBot requested review from BrandonOdiwuor and removed request for BrandonOdiwuor October 12, 2023 13:44
Copy link
Contributor

@BrandonOdiwuor BrandonOdiwuor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK 940a499

Copy link
Member

@glozow glozow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reACK 940a499
range-diff from last review

@glozow
Copy link
Member

glozow commented Oct 12, 2023

Assuming we should wait until branch off to merge?

@maflcko maflcko added this to the 27.0 milestone Oct 12, 2023
@fanquake
Copy link
Member

Assuming we should wait until branch off to merge?

Any particular reason? If these start getting used more widely, it's just going to make doing backports more complicated if so.

@glozow
Copy link
Member

glozow commented Oct 12, 2023

Assuming we should wait until branch off to merge?

Any particular reason? If these start getting used more widely, it's just going to make doing backports more complicated if so.

Idc that much, was asked why I didn't merge yet. Conversions are still possible so it shouldn't be that complicated, but whatever you prefer. Could use this time to look at what it looks like to change the rest of the codebase and see if we run into any bumps.

@@ -1852,7 +1852,7 @@ void PeerManagerImpl::BlockConnected(const std::shared_ptr<const CBlock>& pblock
LOCK(m_recent_confirmed_transactions_mutex);
for (const auto& ptx : pblock->vtx) {
m_recent_confirmed_transactions.insert(ptx->GetHash());
if (ptx->GetHash() != ptx->GetWitnessHash()) {
if (ptx->HasWitness()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this an improvement? Comparing two uint256s is O(1) time and fairly cache friendly, whereas checking an arbitrary number of inputs' witnesses might not be either of those.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comparing Txid and Wtxid is no longer allowed. We could use ToUint256 or add a cache for HasWitness?

I like using HasWitness because it is immediately obvious what we are checking.

Copy link
Member

@glozow glozow Oct 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can CTransaction::HasWitness be changed to check whether hash != m_witness_hash (after the unavoidable first time when you need to iterate through the inputs to look for witnesses)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could move the "iterate through" part into ComputeWitnessHash, and have HasWitness just return hash.ToUint256() == m_witness_hash.ToUint256(). Note that that's an example of where ToUint256() not doing a copy would make sense.

uint256 m_wrapped;

// Note: Use FromUint256 externally instead.
transaction_identifier(const uint256& wrapped) : m_wrapped{wrapped} {}
Copy link
Contributor

@ajtowns ajtowns Oct 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the advantage of FromUint256 over just declaring this as explicit and making it public?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That might be better but would allow something like Txid{Wtxid{}} right now (because of the conversion function). We could make it a template like we do for Compare...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But (a) that would presumably be silly to write, and (b) the conversion function is going away, which will ensure any code like that also goes away.

@achow101
Copy link
Member

ACK 940a499

@achow101 achow101 merged commit 5572f98 into bitcoin:master Oct 26, 2023
@dergoegge
Copy link
Member Author

I'll open a follow up PR for @ajtowns's suggestions soon.

achow101 added a commit to bitcoin-core/gui that referenced this pull request Nov 28, 2023
…itness (28107 follow-up)

af1d2ff [primitives] Precompute result of CTransaction::HasWitness (dergoegge)

Pull request description:

  This addresses bitcoin/bitcoin#28107 (comment) from #28107.

ACKs for top commit:
  theStack:
    ACK af1d2ff
  achow101:
    ACK af1d2ff
  stickies-v:
    ACK af1d2ff
  TheCharlatan:
    ACK af1d2ff

Tree-SHA512: a77654ae429d0d7ce12daa309770e75beec4f8984734f80ed203156199425af43b50ad3d8aab85a89371a71356464ebd4503a0248fd0103579adfc74a55aaf51
AngusP added a commit to AngusP/bitcoin that referenced this pull request Mar 24, 2024
…ad of ambiguous uint256.

Wtxid/Txid types introduced in bitcoin#28107
AngusP added a commit to AngusP/bitcoin that referenced this pull request Mar 25, 2024
…ad of ambiguous uint256.

Wtxid/Txid types introduced in bitcoin#28107
AngusP added a commit to AngusP/bitcoin that referenced this pull request Mar 25, 2024
…ad of ambiguous uint256.

Wtxid/Txid types introduced in bitcoin#28107
AngusP added a commit to AngusP/bitcoin that referenced this pull request Mar 27, 2024
…ad of ambiguous uint256.

Wtxid/Txid types introduced in bitcoin#28107
AngusP added a commit to AngusP/bitcoin that referenced this pull request Mar 28, 2024
…ad of ambiguous uint256.

Wtxid/Txid types introduced in bitcoin#28107
glozow added a commit that referenced this pull request Apr 9, 2024
a8203e9 refactor: Simplify `extra_txn` to be a vec of CTransactionRef instead of a vec of pair<Wtxid, CTransactionRef> (AngusP)
c3c1843 refactor: Use typesafe Wtxid in compact block encoding message, instead of ambiguous uint256. (AngusP)

Pull request description:

  The first commit replaces `uint256` with typesafe `Wtxid` (or `Txid`) types introduced in #28107.

  The second commit then simplifies the extra tx `vector` to just be of `CTransactionRef`s instead of a `std::pair<Wtxid, CTransactionRef>`, as it's easy to get a `Wtxid` from a transaction ref.

ACKs for top commit:
  glozow:
    ACK a8203e9
  dergoegge:
    ACK a8203e9

Tree-SHA512: b4ba1423a8059f9fc118782bd8a668213d229c822f22b01267de74d6ea97fe4f2aad46b5da7c0178ecc9905293e9a0eafba1d75330297c055e27fd53c8c8ebfd
@bitcoin bitcoin locked and limited conversation to collaborators Oct 26, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.