-
Notifications
You must be signed in to change notification settings - Fork 37.7k
test,refactor: extract script template helpers & widen sigop count coverage #32729
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code Coverage & BenchmarksFor details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32729. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
95b67f1
to
0582b63
Compare
🚧 At least one of the CI tasks failed. HintsTry to run the tests locally, according to the documentation. However, a CI failure may still
Leave a comment here, if you need help tracking down a confusing failure. |
0582b63
to
5d29ce5
Compare
5d29ce5
to
6bf4c8d
Compare
Rebased after #32279 - updating the tests with the new convenience template helpers instead of using |
1d49b3b
to
db4bf84
Compare
🚧 At least one of the CI tasks failed. HintsTry to run the tests locally, according to the documentation. However, a CI failure may still
Leave a comment here, if you need help tracking down a confusing failure. |
db4bf84
to
f8208e9
Compare
748b10b
to
cf8f476
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the reviews, addressed all concerns - either by applying it or explaining my take. Let me know if I left out something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed cf8f476
Main thing remaining is question about motivation behind moving methods (see inline comment).
@@ -222,15 +222,6 @@ bool CScript::IsPayToAnchor(int version, const std::vector<unsigned char>& progr | |||
program[1] == 0x73; | |||
} | |||
|
|||
bool CScript::IsPayToScriptHash() const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- In 6b167d5 etc, what is the gain from making these methods inline by moving them to the header?
- nit: If these moves are kept, would suggest using "move" rather than "pull" in the commit message.
- nit: Would suggest using "extract" rather than "add" (except for
IsPayToWitnessPubKeyHash()
which is new code).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- we don't introduce a header + implementation duplication - what would be the advantage in this case of separating them to header + impl?
2,3) done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2,3) thanks!
1): Arguments for avoiding the move:
➕: Avoids moving 4 functions to headers, minimizing the diff and making it clearer which parts of the body changed if any.
➕: Someone chose to have them like this, so best to avoid flip-flopping for minor reasons.
➕: Keeps the script.h header small and faster to re-parse for each compilation unit as it is a commonly included header, direct includes:
₿ git grep "<script/script\.h>" |wc -l
98
(➖): Not inlining adds some linking time, but those functions are rarely called.
➕: Increases readability for getting an overview of CScript
for programmers using editors that don't auto-collapse function bodies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer having them in the same place, @ryanofsky, where do you think these helpers belong, in header or cpp?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #32729 (comment)
I prefer having them in the same place, @ryanofsky, where do you think these helpers belong, in header or cpp?
No real opinion, I just assumed these needed to be moved for the optimization in d76d753.
Abstractly, I think it makes sense for these to be in a header file since they are low-level and natural candidates to inline. Ideally I don't think they'd be in this header file for reasons hodlinator mentioned, and I would probably want them to be standalone functions not class methods since I think the purpose of classes is to encapsulate internal state and these aren't accessing anything private.
But I think anything is fine here, and it makes sense to go with your preference if there aren't objections.
https://github.com/bitcoin/bitcoin/actions/runs/16872143934/job/47788785406?pr=32729#step:6:4338: test/script_tests.cpp(1620): Entering test case "GetOpName_no_missing_mnemonics"
/home/runner/work/_temp/src/test/script_tests.cpp:1622:25: runtime error: load of value 256, which is not a valid value for type 'opcodetype'
#0 0x559b4df3e847 in script_tests::GetOpName_no_missing_mnemonics::test_method() /home/runner/work/_temp/build-asan/src/test/./test/script_tests.cpp:1622:25
#1 0x559b4df3ba1b in script_tests::GetOpName_no_missing_mnemonics_invoker() /home/runner/work/_temp/build-asan/src/test/./test/script_tests.cpp:1620:1
#2 0x559b4d0f360d in boost::function0<void>::operator()() const /usr/include/boost/function/function_template.hpp:771:14
#3 0x559b4d174328 in boost::detail::forward::operator()() /usr/include/boost/test/impl/execution_monitor.ipp:1395:32
#4 0x559b4d174328 in boost::detail::function::function_obj_invoker0<boost::detail::forward, int>::invoke(boost::detail::function::function_buffer&) /usr/include/boost/function/function_template.hpp:137:18
#5 0x559b4d16de2d in boost::function0<int>::operator()() const /usr/include/boost/function/function_template.hpp:771:14
#6 0x559b4d0598fc in int boost::detail::do_invoke<boost::shared_ptr<boost::detail::translator_holder_base>, boost::function<int ()>>(boost::shared_ptr<boost::detail::translator_holder_base> const&, boost::function<int ()> const&) /usr/include/boost/test/impl/execution_monitor.ipp:308:30
#7 0x559b4d0598fc in boost::execution_monitor::catch_signals(boost::function<int ()> const&) /usr/include/boost/test/impl/execution_monitor.ipp:910:16
#8 0x559b4d059e0d in boost::execution_monitor::execute(boost::function<int ()> const&) /usr/include/boost/test/impl/execution_monitor.ipp:1308:16
#9 0x559b4d052498 in boost::execution_monitor::vexecute(boost::function<void ()> const&) /usr/include/boost/test/impl/execution_monitor.ipp:1404:5
#10 0x559b4d052498 in boost::unit_test::unit_test_monitor_t::execute_and_translate(boost::function<void ()> const&, unsigned long) /usr/include/boost/test/impl/unit_test_monitor.ipp:49:9
#11 0x559b4d0b5fd4 in boost::unit_test::framework::state::execute_test_tree(unsigned long, unsigned long, boost::unit_test::framework::state::random_generator_helper const*) /usr/include/boost/test/impl/framework.ipp:815:44
#12 0x559b4d0b4e9b in boost::unit_test::framework::state::execute_test_tree(unsigned long, unsigned long, boost::unit_test::framework::state::random_generator_helper const*) /usr/include/boost/test/impl/framework.ipp:784:58
#13 0x559b4d0b4e9b in boost::unit_test::framework::state::execute_test_tree(unsigned long, unsigned long, boost::unit_test::framework::state::random_generator_helper const*) /usr/include/boost/test/impl/framework.ipp:784:58
#14 0x559b4d0508eb in boost::unit_test::framework::run(unsigned long, bool) /usr/include/boost/test/impl/framework.ipp:1722:29
#15 0x559b4d07f350 in boost::unit_test::unit_test_main(boost::unit_test::test_suite* (*)(int, char**), int, char**) /usr/include/boost/test/impl/unit_test_main.ipp:250:9
#16 0x7f2c01d841c9 (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 282c2c16e7b6600b0b22ea0c99010d2795752b5f)
#17 0x7f2c01d8428a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 282c2c16e7b6600b0b22ea0c99010d2795752b5f)
#18 0x559b4cf44984 in _start (/home/runner/work/_temp/build-asan/bin/test_bitcoin+0x12c5984) (BuildId: 536afe9665aa0f1665ca9d21723b3d0c1c5aa9a6)
SUMMARY: UndefinedBehaviorSanitizer: invalid-enum-load /home/runner/work/_temp/src/test/script_tests.cpp:1622:25 |
cf8f476
to
738b620
Compare
To measure CheckBlock performance in isolation from deserialization overhead, a ResetChecked() method was introduced to re-enable the block's validation state, allowing repeated validation of the same block instance.
Add benchmark to measure the performance of counting legacy signature operations in a block, independent of other validation steps.
Previous `GetSigOpCount` method was overloaded to take either a `bool` or `scriptSig` as a parameter, without an explanation of when to call each overload. New `CountSigOps` method avoids the overloading and documents how it should be called. The name was chosen to be clearer and consistent with the newer `CountWitnessSigOps` function. Besides the renames, also added primitive argument name reminders to the call sites to highlight the meaning of the call. Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
The name `CountP2SHSigOps` was chosen to match `CountWitnessSigOps`, since the two functions are counterparts for handling `P2SH` and `SegWit` scripts. Also, it's called by `GetP2SHSigOpCount` in consensus, so the new name clarifies its significance. A goal of this change is to make the `CheckSigopsBIP54` function match the BIP54 specification which says "For each input in the transaction, count the number of CHECKSIG and CHECKMULTISIG in the input scriptSig and previous output's scriptPubKey, including the P2SH redeemScript." Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
This enables using these constants in script.h definitions in upcoming commits. No naming conflicts exist with these constant names.
Moved the implementation to the header (implicitly inline + noexcept), removed redundant `this` and parentheses and changed the `0x14`/`20` magic constants to descriptive `HASH160_OUTPUT_SIZE`. See: https://learnmeabitcoin.com/technical/script/p2sh/#scriptpubkey
Moved the implementation to the header (implicitly inline + noexcept), removed redundant `this` and parentheses, and changed the `0x20` magic constant to descriptive `WITNESS_V0_SCRIPTHASH_SIZE`. See: https://learnmeabitcoin.com/technical/script/p2wsh/#scriptpubkey
Moved the implementation to the header (implicitly inline + noexcept), removed redundant `this` and parentheses.
Moved the implementation to the header (implicitly inline + noexcept), removed redundant `this` and parentheses, and changed the `0x20` magic constant to descriptive `WITNESS_V1_TAPROOT_SIZE`. See: https://learnmeabitcoin.com/technical/script/p2tr/#scriptpubkey
The usages in `compressor.cpp` and `solver.cpp` were also updated to use the new method. See: https://learnmeabitcoin.com/technical/script/p2pkh/#scriptpubkey
The usages in `compressor.cpp` and `solver.cpp` were also updated to use the new method. Note that compressor has additional prefix checks as well, which are not properly exercised by the `compressed_p2pk` unit test. See: https://learnmeabitcoin.com/technical/script/p2pk/#scriptpubkey and https://learnmeabitcoin.com/technical/keys/public-key/#compressed
The usages in `compressor.cpp` and `solver.cpp` were also updated to use the new method. Note that compressor has additional prefix checks as well, which are not properly exercised by the `uncompressed_p2pk` unit test. See: https://learnmeabitcoin.com/technical/script/p2pk/#scriptpubkey and https://learnmeabitcoin.com/technical/keys/public-key/#uncompressed
738b620
to
8ec20fb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, addresses remaining nits + fixed test opcode name iteration boundary failure: https://github.com/bitcoin/bitcoin/compare/cf8f476ef25e4fe43e67970ee2ec6a265d49e763..738b620f314780751c4db3af87c9f22792723125
(rebased in a separate push)
@@ -222,15 +222,6 @@ bool CScript::IsPayToAnchor(int version, const std::vector<unsigned char>& progr | |||
program[1] == 0x73; | |||
} | |||
|
|||
bool CScript::IsPayToScriptHash() const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- we don't introduce a header + implementation duplication - what would be the advantage in this case of separating them to header + impl?
2,3) done
This template helper isn't used outside of tests yet, but for the sake of completeness - and since we're planning on using this for upcoming optimizations -, it's added as a last step. See: https://learnmeabitcoin.com/technical/script/p2wpkh/#scriptpubkey
* `CountSigOpsKnownTemplates` asserts the expected legacy/accurate sig-op totals for all common known script templates (P2PKH, P2SH, P2WPKH/WSH, P2TR, compressed & uncompressed P2PK, OP_RETURN, multisig). * `CountSigOpsErrors` feeds malformed PUSHDATA sequences to confirm the parser never crashes and still counts sig-ops that appear before the error. * `GetOpName_no_missing_mnemonics` checks all opcode names by category. Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com> Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
…validation Renames `HasValidOps()` to `HasValidBaseOps()` and `MAX_OPCODE` to `MAX_BASE_OPCODE` to clarify that pre-Taproot script validation should only accept opcodes up to OP_NOP10 (0xb9). This prevents Tapscript-only opcodes like OP_CHECKSIGADD (0xba) from being considered valid in scriptPubKey and scriptSig contexts, where only pre-Taproot opcodes should be allowed.
Add explicit test case checking that OP_CHECKSIGADD > MAX_BASE_OPCODE (i.e. that we have opcodes bigger than the max). The other changes in the test are just inlines to reduce scope. Also note that `OpCodeParser` deliberately skips `OP_CHECKSIGADD`, see: `script_parse_tests/parse_script`.
`OP_CHECKSIGADD` wasn't tested this way at all, but we should fuzz all possible values anyway.
8ec20fb
to
3219b59
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK 3219b59. A lot of test improvements since last review, only comment and whitespace changes in non-test code. I think I'm caught up on the discussion now too
@@ -222,15 +222,6 @@ bool CScript::IsPayToAnchor(int version, const std::vector<unsigned char>& progr | |||
program[1] == 0x73; | |||
} | |||
|
|||
bool CScript::IsPayToScriptHash() const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #32729 (comment)
I prefer having them in the same place, @ryanofsky, where do you think these helpers belong, in header or cpp?
No real opinion, I just assumed these needed to be moved for the optimization in d76d753.
Abstractly, I think it makes sense for these to be in a header file since they are low-level and natural candidates to inline. Ideally I don't think they'd be in this header file for reasons hodlinator mentioned, and I would probably want them to be standalone functions not class methods since I think the purpose of classes is to encapsulate internal state and these aren't accessing anything private.
But I think anything is fine here, and it makes sense to go with your preference if there aren't objections.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 3219b59
Renaming the sigop-counting functions while changing behavior of CScript::GetSigOpCount(const CScript& scriptSig)
/CountP2SHSigOps(const CScript& scriptSig, const CScript& scriptPubKey)
makes it relatively easy to verify the change, as no lingering GetSigOpCount
overloads remain.
Renaming HasValidOps()
-> HasValidBaseOps()
makes one paranoid that we haven't been supporting Tapscript op codes everywhere, but the function is not used in those contexts.
I disagree with moving IsPayToTaproot()
etc from .CPP to .H, but it's not blocking.
Stay humble and stack commits! :)
(Not advocating for merging commits, it is helpful to keep similar transforms applied to different areas in separate commits, other reviewers shouldn't be scared by the high count of them).
Summary
This PR extracts a set of cheap, inline helpers for recognizing the most common script templates, clarifies the definition of valid op-codes for pre-Taproot scripts, and splits the existing "deserialize + check-block" benchmark into three orthogonal micro-benchmarks.
The change is behavior-neutral for consensus and policy - every modified call-site performs the same checks as before, just through clearer helper functions.
Context
While working on
GetSigOpCount
optimizations for known script templates I noticed that feature-test coverage was thin for this code path.This PR therefore adds tests for error cases (malformed push-data encodings) and for the expected sigop totals of the various script templates (including edge cases like a sigop after an
OP_RETURN
).Given the difficulty of reviewing the original optimizations themselves, I split all test / benchmark work into this standalone PR.
The recent burst of sigops related refactor activity (#31624, #32521, #32533) underlines the need for extra safety.
The refactors here eliminate magic numbers, deduplicate template checks, and lay groundwork for future performance work.
Structure
script.h
, replace all ad-hoc byte-checks, and add tests where necessary.PUSHDATA
sequences and the pre-taproot / accurate sigop totals for every standard template.OP_CHECKSIGADD
>MAX_OPCODE
.ConsumeOpcodeType