Skip to content

Conversation

luke-jr
Copy link
Member

@luke-jr luke-jr commented May 18, 2025

Refactoring to avoid unnecessary copies/complexity, at least in the mainnet paths.

Also abstracts out a new TemplateToJSON function to keep track of variable scoping better.

@DrahtBot
Copy link
Contributor

DrahtBot commented May 18, 2025

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32547.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK Sjors

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #29415 (Broadcast own transactions only via short-lived Tor or I2P connections by vasild)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@luke-jr luke-jr force-pushed the mining_avoid_block_copy branch from 7b1a1f1 to 39a3b5b Compare May 18, 2025 09:50
@DrahtBot
Copy link
Contributor

🚧 At least one of the CI tasks failed.
Task lint: https://github.com/bitcoin/bitcoin/runs/42427675996
LLM reason (✨ experimental): The CI failure is caused by linting errors, including spelling mistakes and duplicate includes.

Hints

Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:

  • Possibly due to a silent merge conflict (the changes in this pull request being
    incompatible with the current code in the target branch). If so, make sure to rebase on the latest
    commit of the target branch.

  • A sanitizer issue, which can only be found by compiling with the sanitizer and running the
    affected test.

  • An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

@Sjors
Copy link
Member

Sjors commented May 19, 2025

Concept ACK on adding a TemplateToJSON helper, as well as the new alias for BlockAssembler::Options.

This breaks the multiprocess build somehow, cc @ryanofsky, e.g.

[06:00:12.877] /ci_container_base/src/interfaces/mining.h:54:34: note: unimplemented pure virtual method 'getCoinbaseMerklePath' in 'ProxyClient'
[06:00:12.877]    54 |     virtual std::vector<uint256> getCoinbaseMerklePath() const = 0;

@ryanofsky
Copy link
Contributor

ryanofsky commented May 19, 2025

I think goals of this PR should be achievable without too much work, but unfortunately the implementation as of 39a3b5b isn't really compatible with IPC so a different approach needs to be taken. There are two problems currently:

  1. First, the pure virtual methods cannot be marked const. In general it doesn't make sense to mark virtual methods const, because parent classes can't know how subclasses will implement the virtual methods and what state they may access. The point of pure virtual methods is usually to declare abstract interfaces without leaking information about implementations or making assumptions about them.

  2. Methods called over IPC can't return raw pointers or references by default, without additional custom code. It's fine for IPC methods to return ordinary primitive or struct values, or shared_ptr or unique_ptr values, but generated IPC methods like getBlock and getCoinbaseCommitment can't automatically return raw pointers or references, because after they receive the block data or coinbase commitment data over the socket, then need to return values that can be safely accessed and cleaned up by callers. They can't just return references directly to the received socket data because that data will no longer be available after the methods return. So if they return references they need to copy the data somewhere where it won't be freed right away, but will be freed eventually (to not leak memory), and that requires writing custom code and tracking additional state.

I also don't know motivation for TemplateToJSON and BlockCreateOptions changes in this PR:

  • I think the point of making the CBlockTemplate::Options struct distinct from the BlockCreateOptions struct, while inheritiing from it, was that BlockCreateOptions was supposed to contain high-level options that were exposed to IPC clients, while CBlockTemplate::Options was supposed to contain lower-level options, not exposed externally over IPC, but could be used for internal code and testing. Current PR seems to collapse that distinction, and it is unclear what the goal of the change is.

  • The TemplateToJSON function seems like a nice way to break up code but seems like it should just be taking input data directly instead of an interface pointer that needs to be actively queried. The could make it simpler internally and also allow it to be easily called from contexts where there is only local data and no IPC.

All these things together suggest to me that the Mining interface should probably be exposing the CBlockTemplate struct directly and not hiding it behind so many accessor methods. This should allow dropping methods, simplifying the interface, and also making it more efficient by avoiding copies when it is called locally and no IPC is involved.

The simplest way to do this would be to add new two accessor methods to interfaces::BlockTemplate like:

    virtual const CBlockTemplate* getTemplate() { return nullptr; }
    virtual CBlockTemplate getTemplateData() = 0;

which could be implemented in node::BlockTemplateImpl as:

    const CBlockTemplate* getTemplate() override { return m_block_template.get(); }
    CBlockTemplate getTemplateData() override { return *m_block_template; }

Then if the mining interface is being called from inside the local process, getTemplate() will return a valid pointer which can be used directly. But if the mining interface is being called from a remote process getTemplate() will return null, and getTemplateData() can be used to request and copy the data.

This should work, but would require clients to take call getTemplate() before getTemplateData() to be as efficient as possible and avoid unnecessary copies.

An alternate approach that would avoid unnecessary copies and be simpler for clients but require a little more complexity in the IPC implementation would be to expose an interfaces::BlockTemplate method like

    virtual const CBlockTemplate& getTemplate() = 0;

with a custom implementation that caches the CBlockTemplate internally in the IPC client class so the references it returns will be valid and the data will be freed when the client is destroyed. There are a few examples of this pattern in #10102, but implementing this would requires using some more complicated c++ features like template specialization so would probably recommend trying the simpler approach first.

@Sjors
Copy link
Member

Sjors commented May 19, 2025

All these things together suggest to me that the Mining interface should probably be exposing the CBlockTemplate struct directly and not hiding it behind so many accessor methods.

The original reason for this was so that a Stratum v2 IPC sidecar can fetch the minimal info it needs for a NewTemplate message: basically the version field, some parts of the coinbase and its merkle path. It can do this without fetching the full block.

Later on it turned out that this interface is also very useful for submitting a solution without (again) having to send a full block over the wire. Ditto for an ergonomic way to wait for fees to rise or a new tip to arrive (with multiple clients).

If we assume that Stratum v2 sidecar is running on the same machine as the node, which seems pretty likely, the original optimisation might be overkill and it might as well get the full template. However it's also easier for the sidecar implementation if it doesn't have to hold on to these templates.

An approach where the sidecar calls getTemplateData(), and we drop getBlockHeader(), getBlock(), getCoinbaseTx() and getCoinbaseCommitment() is fine by me. I'm not worried about making the non-sidecar version Sjors#68 less efficient since that approach is deprecated.

The only tricky bits are getWitnessCommitmentIndex() and getCoinbaseMerklePath(), which I'd rather not implement on the sidecar side. We could cache these values in the CBlockTemplate struct (as we do with the coinbase SegWit commitment).

(since #31283 the sidecar doesn't use getTxFees() and getTxSigops() anymore so those were already on the nomination list for deletion)

cc @ismaelsadeeq

@ryanofsky
Copy link
Contributor

ryanofsky commented May 19, 2025

re: #32547 (comment)

All these things together suggest to me that the Mining interface should probably be exposing the CBlockTemplate struct directly and not hiding it behind so many accessor methods.

Sorry, I did not write that very clearly. I am not suggesting dropping the interfaces::BlockTemplate class and only exposing the CBlockTemplate struct. I am just suggesting adding a const CBlockTemplate* accessor to the interface which could return a pointer to the internal struct and avoid copying. The new accessor could replace some of the other accessors, or just augment them.

This would just be a minor change to avoid unnecessary copies, not a broader rethinking of the interface.

@Sjors
Copy link
Member

Sjors commented May 19, 2025

I agree it could just be an addition to the interface, though it also fits well if we want to prune it a bit.

@luke-jr
Copy link
Member Author

luke-jr commented May 20, 2025

In general it doesn't make sense to mark virtual methods const, because parent classes can't know how subclasses will implement the virtual methods and what state they may access. The point of pure virtual methods is usually to declare abstract interfaces without leaking information about implementations or making assumptions about them.

The constness of methods is a property of the interface. It is a way to say calling it will not change the object, and thereby allows using const references.

@maflcko
Copy link
Member

maflcko commented May 27, 2025

Can be turned into draft while the CI is red?

@ryanofsky
Copy link
Contributor

The constness of methods is a property of the interface. It is a way to say calling it will not change the object

Yes, I understand the intent of marking interfaces::BlockTemplate methods const is to say calling those methods will not change the object. And I agree using const to prevent changing the object would be a nice goal, but unfortunately that's not what const keyword does here.

Marking the methods const does not mark the block template data const or do anything that would prevent modifying it. It only makes the interface itself, which is stateless, const, and prevents subclasses implementing the interface from modifying their internal state while not preventing them from modifying CBlockTemplate data. Specifically it makes it impossible for the IPC subclass to update its IPC state, and does nothing to prevent the non-IPC subclass from modifying the block template because it is accessed through a non-const pointer. (It does look like that pointer could be const but that's orthogonal to whether these methods are const.

@fanquake fanquake marked this pull request as draft May 29, 2025 15:28
Copy link

@Dorex45 Dorex45 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consistent use of const correctness and references //
nice improvement for clarity and performance!

@maflcko
Copy link
Member

maflcko commented Aug 18, 2025

Are you still working on this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants