Skip to content

Conversation

theuni
Copy link
Member

@theuni theuni commented Aug 22, 2023

RFC because this is rather hackish and I'd like to know if anyone has a better idea.

I'm working on trimming out headers that libbitcoinkernel should not export. To be clear: the problem is not what headers libbitcoinkernel itself uses, but what headers it requires users to use. Basically.. the headers used directly and indirectly by bitcoin-chainstate.cpp.

Topping the list of headers we definitely don't want to require for users are bitcoin-config.h and clientversion.h. The former is somewhat complicated, but clientversion.h only comes from one place: dbwrapper.h.

This PR gets rid of that include by creating a new constructor that sets the version in the cpp file, working around the problem.

Another potential solution would be to create a MakeClientVersionCDataStream() function which just creates a stream and sets the version, and relies on the caller to do the rest.

Any better ideas? Otherwise I'll undraft and ask for review.

theuni added 3 commits August 22, 2023 19:36
…CLIENT_VERSION

clientversion.h should not be required for users of the kernel. dbwrapper
is a deeply nested header that is unlikely to be eliminated from the kernel
any time soon. These constructors will allow the version to be set from headers
without the include.
Prepare for removing clientversion include from dbwrapper.h. Add it where it is
currently picked up indirectly.
Use the new client_version_tag constructors instead.

Also fix the include order in streams.cpp
@DrahtBot
Copy link
Contributor

DrahtBot commented Aug 22, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK TheCharlatan, hebasto

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

No conflicts as of last run.

@theuni
Copy link
Member Author

theuni commented Aug 22, 2023

Ping @TheCharlatan @MarcoFalke @ryanofsky

@sipa
Copy link
Member

sipa commented Aug 22, 2023

Honestly, I think we should just get rid of the serialization types and version numbers. They're only used in a few places, and mostly to pass some flags around, mostly unrelated to their original design.

Things like #25284 may allow us to do so.

@TheCharlatan
Copy link
Contributor

Concept ACK

@hebasto
Copy link
Member

hebasto commented Aug 23, 2023

Topping the list of headers we definitely don't want to require for users are bitcoin-config.h and clientversion.h.

Concept ACK on that.

@ajtowns
Copy link
Contributor

ajtowns commented Aug 23, 2023

Sorry if this is naive, but did you consider just introducing a KERNEL_VERSION constant that only gets bumped when it results in meaningful changes (like PROTOCOL_VERSION for p2p)? But @sipa's suggestion seems better if it's feasible.

@@ -6,7 +6,6 @@
#define BITCOIN_DBWRAPPER_H

#include <attributes.h>
#include <clientversion.h>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you are removing this header, but I don't see how this is enforced in the future or currently. For example, the header may still be included indirectly, or added back at any time?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK. Will add some kind of enforcement once we decide on an approach. We'll want some generic way of marking a header as unusable by the kernel.

Another approach might be to have a whitelist of valid kernel headers, copy them to a scratch-space, and build bitcoin-chainstate outside of our tree. That becomes easily doable once @TheCharlatan finishes removing bitcoin-chainstate's boost dependency :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I hacked this up here, just to see how the list is looking: theuni@60da4fa#diff-4cb884d03ebb901069e4ee5de5d02538c40dd9b39919c615d8eaa9d364bbbd77R1002-R1096

@theuni
Copy link
Member Author

theuni commented Aug 23, 2023

Sorry if this is naive, but did you consider just introducing a KERNEL_VERSION constant that only gets bumped when it results in meaningful changes (like PROTOCOL_VERSION for p2p)? But @sipa's suggestion seems better if it's feasible.

@ajtowns Thing is, KERNEL_VERSION would still have a purely internal meaning in this case (and one unrelated to the kernel version :), it would just not be tied to clientversion anymore. That approach would allow a client application to change the value in the header and potentially get some different serialization behavior in their app. Admittedly in this case, from what I can tell, the version is serialized but unused so that's not a real problem.

The approach in this PR which hard-codes the values into the binary so that the downstream user can't possibly affect behavior.

But yes, I'm totally up for dropping this and switching to serialization params if that helps with the dependency here.

@theuni
Copy link
Member Author

theuni commented Aug 23, 2023

Hmm, maybe there's a much simpler solution...

With the offending CDataStreams turned into DataStreams, compile fails in 2 places because version is used:

bitcoin/src/chain.h

Lines 404 to 419 in 33da5d0

SERIALIZE_METHODS(CDiskBlockIndex, obj)
{
LOCK(::cs_main);
int _nVersion = s.GetVersion();
if (!(s.GetType() & SER_GETHASH)) READWRITE(VARINT_MODE(_nVersion, VarIntMode::NONNEGATIVE_SIGNED));
READWRITE(VARINT_MODE(obj.nHeight, VarIntMode::NONNEGATIVE_SIGNED));
READWRITE(VARINT(obj.nStatus));
READWRITE(VARINT(obj.nTx));
if (obj.nStatus & (BLOCK_HAVE_DATA | BLOCK_HAVE_UNDO)) READWRITE(VARINT_MODE(obj.nFile, VarIntMode::NONNEGATIVE_SIGNED));
if (obj.nStatus & BLOCK_HAVE_DATA) READWRITE(VARINT(obj.nDataPos));
if (obj.nStatus & BLOCK_HAVE_UNDO) READWRITE(VARINT(obj.nUndoPos));
// block header
READWRITE(obj.nVersion);
READWRITE(obj.hashPrev);

SERIALIZE_METHODS(CBlockLocator, obj)
{
int nVersion = s.GetVersion();
if (!(s.GetType() & SER_GETHASH))
READWRITE(nVersion);
READWRITE(obj.vHave);
}

Both of those, as far as I can tell, are reading/writing nVersion as a dummy value. The version used by streams is inconsistent.. sometimes it's CLIENT_VERSION, sometimes it's PROTOCOL_VERSION...

Is there any reason to keep the GetVersion() calls for these two since we're not using it at all? As far as I can tell it'd be safe to replace them with zeroes.

@maflcko
Copy link
Member

maflcko commented Aug 23, 2023

Is there any reason to keep the GetVersion() calls for these two since we're not using it at all? As far as I can tell it'd be safe to replace them with zeroes.

Good find. Might be better to use the max value previously written, or just no value at all (if possible), than serializing a zero.

@DrahtBot DrahtBot mentioned this pull request Aug 24, 2023
fanquake added a commit to bitcoin-core/gui that referenced this pull request Sep 7, 2023
…ddress serialization

fa626af Remove unused legacy CHashVerifier (MarcoFalke)
fafa3fc test: add tests that exercise WithParams() (MarcoFalke)
fac81af Use serialization parameters for CAddress serialization (MarcoFalke)
faec591 Support for serialization parameters (MarcoFalke)
fac42e9 Rename CSerAction* to Action* (MarcoFalke)
aaaa3fa Replace READWRITEAS macro with AsBase wrapping function (MarcoFalke)

Pull request description:

  It seems confusing that picking a wrong value for `ADDRV2_FORMAT` could have effects on consensus. (See the docstring of `ADDRV2_FORMAT`).

  Fix this by implementing bitcoin/bitcoin#19477 (comment) .

  This may also help with libbitcoinkernel, see bitcoin/bitcoin#28327

ACKs for top commit:
  TheCharlatan:
    ACK fa626af
  ajtowns:
    ACK fa626af

Tree-SHA512: 229d379da27308890de212b1fd2b85dac13f3f768413cb56a4b0c2da708f28344d04356ffd75bfcbaa4cabf0b6cc363c4f812a8f1648cff9e436811498278318
@fanquake
Copy link
Member

fanquake commented Sep 7, 2023

Note that #25284 has now been merged.

@theuni
Copy link
Member Author

theuni commented Sep 7, 2023

Note that #25284 has now been merged.

Great, thanks for the ping. Going to close this and rework after #25284 since the approach will be completely different now.

@theuni theuni closed this Sep 7, 2023
fanquake added a commit that referenced this pull request Sep 9, 2023
…iskBlockIndex

e73d2a8 refactor: remove clientversion include from dbwrapper.h (Cory Fields)
4240a08 refactor: Use DataStream now that version/type are unused (Cory Fields)
f15f790 Remove version/hashing options from CBlockLocator/CDiskBlockIndex (Cory Fields)

Pull request description:

  This is also a much simpler replacement for #28327.

  There are version fields in `CBlockLocator` and `CDiskBlockIndex` that have always been written but discarded when read.

  I intended to convert them to use SerParams as introduced by #25284, which [ended up looking like this](theuni@3e3af45). However because we don't currently have any definition of what a hash value would mean for either one of those, and we've never assigned the version field any meaning, I think it's better to just not worry about them.

  If we ever need to assign meaning in the future, we can introduce `SerParams` as was done for `CAddress`.

  As for the dummy values chosen:

  `CDiskBlockIndex::DUMMY_VERSION` was easy as the highest ever client version, and I don't expect any objection there.

  `CBlockLocator::DUMMY_VERSION` is hard-coded to the higest _PROTOCOL_ version ever used. This is to avoid a sudden bump that would be visible on the network if CLIENT_VERSION were used instead. In the future, if we ever need to use the value, we can discard anything in the CLIENT_VERSION range (for a few years as needed), as it's quite a bit higher.

  While reviewing, I suggest looking at the throwaway `SerParams` commit above as it shows where the call-sites are. I believe that should be enough to convince one's self that hashing is never used.

ACKs for top commit:
  TheCharlatan:
    Re-ACK e73d2a8
  ajtowns:
    reACK e73d2a8

Tree-SHA512: 45b0dd7c2e918493e2ee92a8e35320ad17991cb8908cb811150a96c5fd584ce177c775baeeb8675a602c90b9ba9203b8cefc0a2a0c6a71078b1d9c2b41e1f3ba
Frank-GER pushed a commit to syscoin/syscoin that referenced this pull request Sep 19, 2023
… and CDiskBlockIndex

e73d2a8 refactor: remove clientversion include from dbwrapper.h (Cory Fields)
4240a08 refactor: Use DataStream now that version/type are unused (Cory Fields)
f15f790 Remove version/hashing options from CBlockLocator/CDiskBlockIndex (Cory Fields)

Pull request description:

  This is also a much simpler replacement for bitcoin#28327.

  There are version fields in `CBlockLocator` and `CDiskBlockIndex` that have always been written but discarded when read.

  I intended to convert them to use SerParams as introduced by bitcoin#25284, which [ended up looking like this](theuni@3e3af45). However because we don't currently have any definition of what a hash value would mean for either one of those, and we've never assigned the version field any meaning, I think it's better to just not worry about them.

  If we ever need to assign meaning in the future, we can introduce `SerParams` as was done for `CAddress`.

  As for the dummy values chosen:

  `CDiskBlockIndex::DUMMY_VERSION` was easy as the highest ever client version, and I don't expect any objection there.

  `CBlockLocator::DUMMY_VERSION` is hard-coded to the higest _PROTOCOL_ version ever used. This is to avoid a sudden bump that would be visible on the network if CLIENT_VERSION were used instead. In the future, if we ever need to use the value, we can discard anything in the CLIENT_VERSION range (for a few years as needed), as it's quite a bit higher.

  While reviewing, I suggest looking at the throwaway `SerParams` commit above as it shows where the call-sites are. I believe that should be enough to convince one's self that hashing is never used.

ACKs for top commit:
  TheCharlatan:
    Re-ACK e73d2a8
  ajtowns:
    reACK e73d2a8

Tree-SHA512: 45b0dd7c2e918493e2ee92a8e35320ad17991cb8908cb811150a96c5fd584ce177c775baeeb8675a602c90b9ba9203b8cefc0a2a0c6a71078b1d9c2b41e1f3ba
@bitcoin bitcoin locked and limited conversation to collaborators Sep 6, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
Status: Done or Closed or Rethinking
Development

Successfully merging this pull request may close these issues.

8 participants