Skip to content

Conversation

ajtowns
Copy link
Contributor

@ajtowns ajtowns commented Sep 9, 2023

Choose whether witness is included in transaction serialization via serialization parameter rather than the stream version. See #25284 and #19477 for previous context.

@DrahtBot
Copy link
Contributor

DrahtBot commented Sep 9, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK maflcko, theuni
Approach ACK sipa

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #28311 ([WIP] BIP300 (Drivechains) consensus-level logic by luke-jr)
  • #27865 (wallet: Track no-longer-spendable TXOs separately by achow101)
  • #26415 (rpc,rest,zmq: faster getblock, NotifyBlock and rest_block by reading raw block by andrewtoth)
  • #21283 (Implement BIP 370 PSBTv2 by achow101)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@sipa
Copy link
Member

sipa commented Sep 10, 2023

It doesn't seem crazy to me to drop -rpcserialversion and/or replace it with RPC specific parameters, but we should probably deprecate it first, no? We could try to do that in 26.0, and then continue with the changes in this PR for the release after the branching.

@ajtowns
Copy link
Contributor Author

ajtowns commented Sep 11, 2023

It doesn't seem crazy to me to drop -rpcserialversion and/or replace it with RPC specific parameters, but we should probably deprecate it first, no? We could try to do that in 26.0, and then continue with the changes in this PR for the release after the branching.

(Isn't that what I suggested in the OP? If there's a nuance, I've missed it)

I've opened up a PR to deprecate -rpcserialversion=0 in 26.0 at #28448.

@ajtowns ajtowns added this to the 27.0 milestone Sep 11, 2023
@maflcko
Copy link
Member

maflcko commented Sep 11, 2023

  • This is mainly because that uses RPCSerializationFlags() and rpcSerializationFlags() interfaces which don't really match the new model.

I don't understand this. It should be trivial to return a TransactionSerParams from RPCSerializationFlags, no?

It seems fine to add an option to RPC methods to disable witness, but shouldn't that be done before removing/deprecating the global setting?

@ajtowns
Copy link
Contributor Author

ajtowns commented Sep 11, 2023

I don't understand this. It should be trivial to return a TransactionSerParams from RPCSerializationFlags, no?

It's not clear to me that it would make sense to do that for interfaces::Chain::rpcSerializationFlags, and it's not clear to me that there's any ongoing value in -rpcserialversion to make it even worth thinking through... so I figured better to just put something up for discussion.

@maflcko
Copy link
Member

maflcko commented Sep 13, 2023

Would it make sense to split up the last commit from this pull, along with your two suggested follow-ups in #25284 (comment) and #25284 (comment) ?

@ajtowns
Copy link
Contributor Author

ajtowns commented Sep 14, 2023

Would it make sense to split up the last commit from this pull, along with your two suggested follow-ups in #25284 (comment) and #25284 (comment) ?

I think so. Done in #28473. (If you wanted to do it, possibly with other cleanups, feel free to consider it up-for-grabs)

@maflcko
Copy link
Member

maflcko commented Nov 9, 2023

re-ACK d0d4308 💇

Show signature

Signature:

untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
trusted comment: re-ACK d0d4308750c14c44ed2a9b60bb44d6ec5423bd5a 💇
W7OIAFLZyeG6udQ8cTBp0cfWedyHoACHlsbhwy+IQxob2O6zlcMNX6I/MhhtDU8eXKTmxIc9AnZidQXMa4EEDA==

@ajtowns ajtowns force-pushed the 202309-txwithparams branch from d0d4308 to a0c254c Compare November 13, 2023 22:54
@ajtowns
Copy link
Contributor Author

ajtowns commented Nov 13, 2023

Rebased over #28207

@maflcko
Copy link
Member

maflcko commented Nov 14, 2023

re-ACK a0c254c 🐜

Show signature

Signature:

untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
trusted comment: re-ACK a0c254c13a3ef21e257cca3493446c632b636b15 🐜
zfOZlUyCFuozt+jcaewrWGPruTgfXkr8HvGVTeV28+8MWPaLDUFBRzsUDaI8ox0eHjTDmbdeLHedy/iZQiV3Dw==

Copy link
Member

@theuni theuni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to step through everything to verify the correctness of the witness vs non-witness choices in the changed code. I agree with all of them.

Only skimmed the tests.

I left a comment about cleaning up version.h includes which I believe should be included as part of this PR.

Looks good otherwise.

@@ -51,9 +51,9 @@ bool ParseHashStr(const std::string& strHex, uint256& result);
// core_write.cpp
UniValue ValueFromAmount(const CAmount amount);
std::string FormatScript(const CScript& script);
std::string EncodeHexTx(const CTransaction& tx, const int serializeFlags = 0);
std::string EncodeHexTx(const CTransaction& tx, const bool without_witness = false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's kinda a shame to carry this default param forward. But it's also not much better to have a bool at each callsite. An enum class {witness/no_witness} would remove all ambiguity at the callsites, but that may be overkill.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That argument is only needed to support -rpcserialversion, which is deprecated as of #28448, so presumably can/will be pulled out later.

std::string SighashToStr(unsigned char sighash_type);
void ScriptToUniv(const CScript& script, UniValue& out, bool include_hex = true, bool include_address = false, const SigningProvider* provider = nullptr);
void TxToUniv(const CTransaction& tx, const uint256& block_hash, UniValue& entry, bool include_hex = true, int serialize_flags = 0, const CTxUndo* txundo = nullptr, TxVerbosity verbosity = TxVerbosity::SHOW_DETAILS);
void TxToUniv(const CTransaction& tx, const uint256& block_hash, UniValue& entry, bool include_hex = true, bool without_witness = false, const CTxUndo* txundo = nullptr, TxVerbosity verbosity = TxVerbosity::SHOW_DETAILS);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment about default bool.

try {
ssData >> tx_extended;
ssData >> TX_WITH_WITNESS(tx_extended);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this code is much more straightforward now.

@@ -2418,8 +2418,8 @@ void PeerManagerImpl::ProcessGetData(CNode& pfrom, Peer& peer, const std::atomic
CTransactionRef tx = FindTxForGetData(*tx_relay, ToGenTxid(inv));
if (tx) {
// WTX and WITNESS_TX imply we serialize with witness
int nSendFlags = (inv.IsMsgTx() ? SERIALIZE_TRANSACTION_NO_WITNESS : 0);
m_connman.PushMessage(&pfrom, msgMaker.Make(nSendFlags, NetMsgType::TX, *tx));
const auto maybe_with_witness = (inv.IsMsgTx() ? TX_NO_WITNESS : TX_WITH_WITNESS);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beautiful :)

@@ -4119,7 +4119,7 @@ void PeerManagerImpl::ProcessMessage(CNode& pfrom, const std::string& msg_type,
LogPrint(BCLog::NET, "Ignoring getheaders from peer=%d because active chain has too little work; sending empty response\n", pfrom.GetId());
// Just respond with an empty headers message, to tell the peer to
// go away but not treat us as unresponsive.
m_connman.PushMessage(&pfrom, msgMaker.Make(NetMsgType::HEADERS, std::vector<CBlock>()));
m_connman.PushMessage(&pfrom, msgMaker.Make(NetMsgType::HEADERS, std::vector<CBlockHeader>()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this makes more sense than defining witness-ness for an empty block, but just to clarify...

This is a no-op and would've been a no-op independent of this PR as well, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes -- serializing an empty vector is just WriteCompactSize(s, 0) no matter the elements that would have been in the vector, but to serialize a vector of blocks (even an empty one) you need to decide what to do about witness data (even though there is no witness data in an empty block), whereas you don't need to do that for block headers.

@@ -956,7 +956,7 @@ bool BlockManager::WriteBlockToDisk(const CBlock& block, FlatFilePos& pos) const
}

// Write index header
unsigned int nSize = GetSerializeSize(block, fileout.GetVersion());
unsigned int nSize = GetSerializeSize(TX_WITH_WITNESS(block));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If version is no longer used for fileout in these ops, can't we s/CAutoFile/AutoFile/g ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Little bit of a rabbit hole to do that -- need to change BufferedFile to accept an AutoFile first etc. Doesn't seem worth adding here to me. See https://github.com/ajtowns/bitcoin/commits/202311-autofile

@@ -102,7 +102,7 @@ CAmount CTransaction::GetValueOut() const

unsigned int CTransaction::GetTotalSize() const
{
return ::GetSerializeSize(*this, PROTOCOL_VERSION);
return ::GetSerializeSize(TX_WITH_WITNESS(*this));
Copy link
Member

@theuni theuni Nov 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, so much more clear. Nice.

I think we could probably drop version.h here now?

Edit: Oooh, we can drop it from some deep internal headers even (hash.h and consensus/validation.h). See theuni@9986da9 . I think I'd prefer to see that added to this PR so that we don't put it off.

Edit2: Updated version here as I missed undo.h the first time: theuni@3762e58

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be better to do that after dropping version from GetSerializeSize ? see #28438 (comment)

Copy link
Contributor Author

@ajtowns ajtowns Nov 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #28878. Does that seem okay?

@@ -1541,7 +1541,7 @@ static RPCHelpMan finalizepsbt()
std::string result_str;

if (complete && extract) {
ssTx << mtx;
ssTx << TX_WITH_WITNESS(mtx);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These can be a DataStreams now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of the psbt bits can switch to DataStream until CVectorWriter is updated.

if (gArgs.GetIntArg("-rpcserialversion", DEFAULT_RPC_SERIALIZE_VERSION) == 0)
flag |= SERIALIZE_TRANSACTION_NO_WITNESS;
return flag;
return (gArgs.GetIntArg("-rpcserialversion", DEFAULT_RPC_SERIALIZE_VERSION) == 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize this isn't new, but it's pretty icky to go to gArgs for something as low-level as serialization. Makes me wonder if we're doing that in any tight loops. Nothing to do with this PR, but it be nice to cache this in a local bool.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will also go away before the next release, see #28438 (comment)

@theuni
Copy link
Member

theuni commented Nov 14, 2023

Btw, the version.h suggesed change above, while large, can be trivially reviewed using git diff --stat, as #include <version.h> is either added or removed in one line and that's it.

Copy link
Member

@theuni theuni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing my feedback. Sounds like you already have plans for the things I complained about, so no reason to hold this up.

ACK a0c254c

@fanquake fanquake merged commit 1084621 into bitcoin:master Nov 15, 2023
@maflcko
Copy link
Member

maflcko commented Nov 15, 2023

@ajtowns To avoid duplicate work, it would be good if you opened all follow-up pulls you wanted to do, or reply with a list here. Others can then take the remaining follow-ups :)

@ajtowns
Copy link
Contributor Author

ajtowns commented Nov 16, 2023

@ajtowns To avoid duplicate work, it would be good if you opened all follow-up pulls you wanted to do, or reply with a list here. Others can then take the remaining follow-ups :)

I think followups are:

EDIT: err, I'm not particularly attached to any of these if anyone wants to take them up

@maflcko
Copy link
Member

maflcko commented Nov 16, 2023

CVectorWriter to VectorWriter, drop version from SpanReader, allow psbt's to be serialized via DataStream; Drop CDataStream; drop GetVersion(), GetType() ?

I did VectorWriter as part of #28892, which seems already large enough to be its own pull. The rest can be done in a follow-up, I guess.

Drop -rpcserialversion and the without_witness params to TxToUniv and EncodeHexTx

Done in #28890

CAutoFile

Looks like your commits are based on ##28878, so I guess that is the next pull to review for now?

@maflcko
Copy link
Member

maflcko commented Nov 20, 2023

SpanReader

Done in #28912. I guess the CDataStream one can be done as part of ##28451

const bool allow_witness;
SER_PARAMS_OPFUNC
};
static constexpr TransactionSerParams TX_WITH_WITNESS{.allow_witness = true};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if TX_WITH_WITNESS calls should be required and specified everywhere (107 places)

If you write stream << object, it seems like it would make sense for the object to be fully serialized by default, with an option to partially serialize it. I'm asking in the context of #28921 / #10102, because before this PR, CTransaction objects could be serialized normally like other objects, and now they require a TX_WITH_WITNESS wrapper to have the same behavior. If TX_WITH_WITNESS were just the default serialization behavior, I wouldn't need a special case for CTransaction objects in multiprocess code, and maybe other code could be simplified too. Does this make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach we've got looks like:

    dumb_serializer << SER_PARAMS(object)
    // file, CTransaction

which then gets translated to:

    dumb_serializer << smart_object
    // file, ParamsWrapper<TransactionSerParams, CTransaction>

then

    smart_serializer.Serialize(object, SER_PARAMS)
    // ParamsStream<TransactionSerParams, dumb_serializer>, CTransaction

The advantage of saying TX_WITH_WITNESS everywhere is that way we're being explicit about how objects are serialised to dumb streams everywhere, so we get compile errors if we're ever potentially inconsistent.

But I don't think that makes sense for multiprocess: we don't need to be using dumb streams in the first place there because the whole point is to pass objects between our own processes, no? So afaics, we should be using a smart serializer from the start; ie one that provides a GetParams() that returns something that can be be converted to TransactionSerParams. So doing ParamsStream ss{TX_WITH_WITNESS, stream} in CustomBuildField and CustomReadField would be an easy way to to start?

If you want to pass around both CNetAddr and CAddress objects as well, I think this would work:

// serialize.h
template<typename... SerParams>
class MultiSerParam
{
private:
    const std::tuple<SerParams...> m_params;
public:
    constexpr MultiSerParam(const SerParams&... t) : m_params{t...} { }

    template<typename T>
    constexpr operator const T&() const { return std::get<T>(m_params); }
};

// ipc/capnp/common-types.h ?
static constexpr MultiSerParam INTERNAL_SER_PARAMS(
  TX_WITH_WITNESS,
  CNetAddr::V2,
  CAddress::V2_NETWORK
);

void f(const CNetAddr& netaddr, const CAddress& addr, const CTransaction& tx)
{
    DataStream substream;
    ParamsStream ss{INTERNAL_SER_PARAMS, substream};
    ss << netaddr;
    ss << tx;
    ss << addr;
}

Note that making this actuall compile also requires tweaking CNetAddr like so:

-        if (s.GetParams().enc == Encoding::V2) {
+        CNetAddr::SerParams p(s.GetParams());
+        if (p.enc == Encoding::V2) {

Copy link
Contributor

@ryanofsky ryanofsky Nov 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! That approach of being able to combine different params is clever and solves my problem, avoiding boilerplate from having to create different streams for each object type.

Maybe at some point it would be useful to support default parameter values, too, so parameters like CNetAddr::V2 and CAddress::V2_NETWORK could be assumed if not specified. For some serialization parameters it may be better to be explicit, but for others there could be downsides. For example if I hardcode V2 in multiprocess code and a V3 serialization is added later it could cause subtle bugs if V3 data is silently lost in IPC calls. But there could be other solutions to this problem like having IpcSerParams with an option to fully serialize all data. In any case this is just a theoretical problem, so I have a good idea of what to do now.

Copy link
Contributor Author

@ajtowns ajtowns Nov 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the issue is the separation between the CNetAddr::SerParams declaration and the MultiSerParams(..,CNetAddr::V2,..) use, could instead make it:

class IPCSerParam
{
public:
    template<typename T>
    operator T&() const { return T::GetIPCSerParams(); }
};
static constexpr IPCSerParam IPC_SER_PARAMS;

class CNetAddr {
    struct SerParams {
        const Encoding enc;
        static SerParams GetIPCSerParams() { return {Encoding::V2}; }
        SER_PARAMS_OPFUNC
   };
};

That leaves it explicit for the dumb streams, and provides a clear, localised default for internal serialization of the object?

(I'm more worried about changing a default from v2 to v3 causing subtle bugs when deserialising network data or from a file, so I prefer being explicit everywhere and finding some other way to fix the other annoyances)

Copy link
Contributor

@ryanofsky ryanofsky Nov 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the idea of having default parameters is off the table then. So probably the way forward to make IPC code safer is have an IPC serialization parameters like your suggestion.

For now though I left the safety issue aside and implemented a way for IPC code to pass multiple stream parameters to ParamsStream in #28929 (used here in #10102), so the same stream can be used for all serialization, and there is no need for code that creates different streams depending on the object type. This solves the problem of parameters increasing boilerplate in multiprocess code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just saying what my opinion is; doesn't mean I'm necessarily right. What you've come up with looks pretty good to me though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #28438 (comment)

I think your v2->v3 example provides a convincing reason why defaulting to the latest version is not always safe. Before this, I was thinking that even if default transaction parameters were not a good idea, making the latest versions default would probably be good. But since this isn't the case, relying on defaults really isn't a good general solution and IPC code probably should specify some explicit parameters, even if they are not hardcoded version numbers. Another way to do this wouldn't require adding new IPC parameter type could be to define aliases like static constexpr SerParams MAX_VERSION = V2 and have IPC code use MAX_VERSION.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants