-
Notifications
You must be signed in to change notification settings - Fork 37.7k
Skip tx-rehashing on historic blocks #13098
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The build failure is due to an eight year old bug in clang: https://bugs.llvm.org/show_bug.cgi?id=8263 |
I like the goal here but I find the changes a bit overwhelming for achieving this. It seems like we could achieve this more easily in other ways, like (1) just building the hash on demand the first time it's requested, or (2) adding a deserialization flag to CTransaction that instructs it to skip the hash if we're not planning to use it? [I'm not sure if (1) is a great solution, since there might be some memory-cache-benefits to building the hash at the time of deserialization, and maybe no one likes the design of having a mutable hash inside of CTransaction, but it would be a super-small diff...] |
I didn't particularly like option (1) because it makes the caching run-time depended (it is always compiled in, but only used when it happens to be requested), compared to having it compile time enforced by typing Note that my approach is a variation of option (2) by passing through the type of the transaction to So I think having explicit types is advantageous in both code clarity and cleanness, with the tradeoff of a minimally larger diff. Note: I will push a fixed up version of the patch by next Monday to minimize the diff and (hopefully) work around the gcc and clang compiler bugs. |
Fair enough, I just don't want us to overlook simpler solutions that might be acceptable. I'm happy to let the reviewers with stronger feelings about the design of these data types chime in. |
eccacd5
to
49f3437
Compare
I just pushed a version that might compile on our buggy clang and that is according to the design that @sipa requested in #13011 (comment)
|
fac161a
to
fa7f987
Compare
@MarcoFalke Thanks for working on this, I'll review soon. @sdaftuar My preference is having separate types, for these reasons:
|
This seems like very useful work, but for serving out historic blocks it would perhaps be better if we could (largely) avoid deserializing and re-serializing them entirely as doing so is tremendously expensive (involving a billionity mallocs). Better is better however. |
Agree. The main thing I don't like about lazy evaluation is that it's much harder to reason about performance then. I think I prefer the current explicit solution, even though much more verbose.
Yes - I was also about to say this. How difficult would it be to serve the data directly from disk, without serialization/deserialization roundtrip?
|
My dumb try at this: #13151 |
Even with 13151 something like this might very useful for making rescan faster... rescan is slow enough to make it largely unusable now. I'd been thinking deserialization was worse than hashing in it, but the performance numbers above make me wonder. |
Certainly Concept ACK if we can get rescan up for this (even as a separate PR), just using it for serving blocks for old nodes is maybe less worth it (at least post-13151). |
Will do a rebase once the other pull is merged |
Nice work! |
fa7f987
to
e933553
Compare
0caab5a
to
1dd43e8
Compare
Any reason not to reuse |
@sipa I'd prefer to make it "type-safe", i.e. make it impossible/compile-time enforced that none of the expensive operations are called. Otherwise this seems too fragile. I think to make it compile-time enforced, this would require to change one of transaction types to be a class that takes template arguments and then add a static assert in |
@MarcoFalke But as long as you keep CMutableTransaction around, such a guarantee doesn't really exist, as that class still has uncached hashing operations (and should probably remain so, see a recent PR that speeds up signing by avoiding CMutableTransaction -> CTransaction conversions). My preference is just to evolve CTransaction into the validation-logic-optimized-efficient transaction representation, and CMutableTransaction into the simple-generic representation. That may mean that uncached hashes are still invoked in places where performance doesn't matter that much. |
After discussing this a bit with @MarcoFalke, I see the advantage of being able to enforce through the type system that no GetHash is unexpectedly called on transactions that don't have a cache in certain specific cases (like serving historical blocks). On the other hand, it's also introducing yet another type for the same data, which seems a high price. As it has const fields, conversion to other transaction types will be expensive if it's ever needed, but it does get the ability to take advantage of more efficient storage if they get implemented (like one malloc per transaction). Overall, I think my preference is to reuse CMutableTransaction here, but if we want the extra guarantees and an extra type, the current approach seems fine. |
1dd43e8
to
506c092
Compare
0650b71
to
f18c222
Compare
f18c222
to
775202d
Compare
} else { | ||
CBlockHeaderAndShortTxIDs cmpctblock(*pblock, fPeerWantsWitness); | ||
connman->PushMessage(pfrom, msgMaker.Make(nSendFlags, NetMsgType::CMPCTBLOCK, cmpctblock)); | ||
if (!ReadBlockFromDisk(block_read, pindex, consensusParams)) assert(!"cannot load block from disk"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like when !pblock && inv.type == MSG_FILTERED_BLOCK
or !pblock && inv.type == MSG_CMPCT_BLOCK
in previous code, the message would be dropped, but now the block gets read from disk? Is this an intentional change if so?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Conditional utACK 775202d if ProcessGetBlockData
behavior is not a problem, or can be left unchanged.
@@ -283,6 +288,43 @@ class CTransaction | |||
const int32_t nVersion; | |||
const uint32_t nLockTime; | |||
|
|||
/** Convert a CMutableTransaction into a CPureTransaction. */ | |||
CPureTransaction(const CMutableTransaction& tx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2018-09-18 21:44:48 cppcheck(pr=13098): [src/primitives/transaction.h:292]: (style) Class 'CPureTransaction' has a constructor with 1 argument that is not explicit.
@@ -283,6 +288,43 @@ class CTransaction | |||
const int32_t nVersion; | |||
const uint32_t nLockTime; | |||
|
|||
/** Convert a CMutableTransaction into a CPureTransaction. */ | |||
CPureTransaction(const CMutableTransaction& tx); | |||
CPureTransaction(CMutableTransaction&& tx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2018-09-18 21:44:48 cppcheck(pr=13098): [src/primitives/transaction.h:293]: (style) Class 'CPureTransaction' has a constructor with 1 argument that is not explicit.
while (state.KeepRunning()) { | ||
CPureBlock block; | ||
stream >> block; | ||
assert(stream.Rewind(sizeof(block_bench::block413567))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2018-09-18 21:44:48 cppcheck(pr=13098): [src/bench/checkblock.cpp:31]: (warning) Assert statement calls a function which may have desired side effects: 'Rewind'.
Reviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
Curious why this was closed. |
Rebased (with the p2p changes removed) here: #17529 |
When serving historic blocks, there is no need to calculate the hash of each transaction, so just skip it.
This is mostly refactoring with the following changes:
CPureTransaction
, a stripped down version ofCTransaction
(nohash
orGetHash()
members). Then deriveclass CTransaction
fromCPureTransaction
CPureBlock
which referencesCPureTransactions
in itsvtx
.Deserialization now only takes 14% of the wall clock time it took previously on my machine. You can check locally with:
Sending a single historic block now takes slightly less (90% of the wall clock time it took previously) on a local network. I presume the speedup is more pronounced when sending more than one block.