Skip to content

Conversation

JeremyRubin
Copy link
Contributor

Currently, at the end of a test vector (valid or invalid) we iterate the flags one by one and unset them and check that it makes the transaction succeed or fail.

This is called 'minimality' in that we want to check that each flag is sufficient on it's own. However, I think we should be instead doing the 2**(specified) combinations of flag, to ensure that for all combinations of flags only the one we have specified is either valid or invalid.

Otherwise, we might have e.g. subsets of flags that have interactions we're not detecting here.

Interestingly, the asymptotic runtime here should be better on average (since we don't usually set that many flags, v.s. 32 iterations on the one by one checker), but the worst case is 2**32 flag combos. It's up to the test writers to not abuse this check.

Contrived example demonstrating the problem:

const auto sum = flag_a_set + flag_b_set +flag_c_set ;
// check parity is even
if (sum & 1) throw "Oops";

having set "a,b,c", the one-by-one checker for "a,b,c," expects failure would check ["a,b", "a,c", "b,c"] and see no failures since the parity of each is 2. However, ["a", "b", "c"] would fail since the parity is 1.

I'm not aware of any code that is written in this specific manner, but similar circumstance might arise naturally in the code.

@JeremyRubin JeremyRubin changed the title [Tests] Computer the Power Set of all flags instead of one by one exclusion [Tests] Compute the Power Set of all flags instead of one by one exclusion Sep 10, 2021
@fanquake fanquake added the Tests label Sep 10, 2021
@JeremyRubin
Copy link
Contributor Author

If anyone doesn't like the submask code, I could swap it for this algorithm (which is more efficient but not-obvious) https://cp-algorithms.com/algebra/all-submasks.html#toc-tgt-0

Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK. Here are the results for me before/after. It might be interesting to see the result with https://cp-algorithms.com/algebra/all-submasks.html if you're motivated to try it.

EDIT: added the results for both tests (tx_valid and tx_invalid).

ExcludeIndividualFlags

test/transaction_tests.cpp(187): Entering test suite "transaction_tests"
test/transaction_tests.cpp(189): Entering test case "tx_valid"
6
10
12
6
10
12
14
22
26
28
14
22
26
28
14
22
26
28
2
8
0
65536
6
10
12
0
0
0
0
0
0
0
0
0
0
2
16384
0
0
0
8
65536
8
65536
8
65536
8
65536
8
65536
0
40
65544
65568
65576
40
65544
65568
65576
32
65536
65568
40
65544
65568
65576
8
65536
8
65536
0
0
0
0
64
0
32
6
10
12
0
0
0
64
0
32
0
0
0
64
0
0
64
0
64
8
65536
0
8
65536
0
test/transaction_tests.cpp(189): Leaving test case "tx_valid"; testing time: 918607us
test/transaction_tests.cpp(280): Entering test case "tx_invalid"
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
512
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
1024
0
0
2049
4096
4097
0
1
0
1
0
1
0
1
0
1
0
1
0
1
0
1
0
1
0
1
2049
4096
4097
0
0
1
0
0
1
0
0
0
0
0
0
0
0
0
0
1
65536
0
test/transaction_tests.cpp(280): Leaving test case "tx_invalid"; testing time: 270583us

GenerateAllSubMasks

test/transaction_tests.cpp(204): Entering test suite "transaction_tests"
test/transaction_tests.cpp(206): Entering test case "tx_valid"
0
2
4
6
8
10
12
0
2
4
6
8
10
12
0
2
4
6
8
10
12
14
16
18
20
22
24
26
28
0
2
4
6
8
10
12
14
16
18
20
22
24
26
28
0
2
4
6
8
10
12
14
16
18
20
22
24
26
28
0
2
8
0
65536
0
2
4
6
8
10
12
0
0
0
0
0
0
0
0
0
0
0
2
16384
0
0
0
0
8
65536
0
8
65536
0
8
65536
0
8
65536
0
8
65536
0
0
8
32
40
65536
65544
65568
65576
0
8
32
40
65536
65544
65568
65576
0
32
65536
65568
0
8
32
40
65536
65544
65568
65576
0
8
65536
0
8
65536
0
0
0
0
64
0
32
0
2
4
6
8
10
12
0
0
0
64
0
32
0
0
0
64
0
0
64
0
64
0
8
65536
0
0
8
65536
0
test/transaction_tests.cpp(206): Leaving test case "tx_valid"; testing time: 1159591us
test/transaction_tests.cpp(297): Entering test case "tx_invalid"
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
512
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
1024
0
0
0
1
2049
4096
4097
0
1
0
1
0
1
0
1
0
1
0
1
0
1
0
1
0
1
0
1
0
1
2049
4096
4097
0
0
1
0
0
1
0
0
0
0
0
0
0
0
0
0
0
1
65536
0
test/transaction_tests.cpp(297): Leaving test case "tx_invalid"; testing time: 539616us

for (const unsigned int submask : all_flag_combos) {
flags_combos.insert(TrimFlags(submask));
}
flags_combos.erase(flags);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, clang-format

@@ -174,7 +174,7 @@ unsigned int FillFlags(unsigned int flags)
 std::set<unsigned int> GenerateAllSubMasks(unsigned int flags)
 {
     std::vector<unsigned int> set_flags;
-    const size_t n_flags = sizeof(unsigned int)*8;
+    const size_t n_flags = sizeof(unsigned int) * 8;
     set_flags.reserve(n_flags);
@@ -184,10 +184,10 @@ std::set<unsigned int> GenerateAllSubMasks(unsigned int flags)
     }
     const size_t combos = 1 << set_flags.size();
     // skip last all set flags
-    std::vector<unsigned int> all_flag_combos(combos-1);
-    for (size_t i = 0; i < combos-1; ++i){
-        for(size_t j = 0; j < set_flags.size(); ++j) {
-            if (i & (1<<j)) {
+    std::vector<unsigned int> all_flag_combos(combos - 1);
+    for (size_t i = 0; i < combos - 1; ++i) {
+        for (size_t j = 0; j < set_flags.size(); ++j) {
+            if (i & (1 << j)) {
                 all_flag_combos[i] |= set_flags[j];

const size_t n_flags = sizeof(unsigned int)*8;
set_flags.reserve(n_flags);
for (size_t i = 0; i < n_flags; ++i) {
const unsigned int flag = flags & (1 << i);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, can use braced initialization on these for type safety

-    const size_t n_flags = sizeof(unsigned int)*8;
+    const size_t n_flags{sizeof(unsigned int) * 8};
     set_flags.reserve(n_flags);
     for (size_t i = 0; i < n_flags; ++i) {
-        const unsigned int flag = flags & (1 << i);
+        const unsigned int flag{flags & (1 << i)};

const size_t combos = 1 << set_flags.size();
// skip last all set flags
std::vector<unsigned int> all_flag_combos(combos-1);
for (size_t i = 0; i < combos-1; ++i){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe hoist subtracting one to the declaration of combos

-    const size_t combos = 1 << set_flags.size();
+    const size_t combos = (1 << set_flags.size()) - 1;
     // skip last all set flags
-    std::vector<unsigned int> all_flag_combos(combos-1);
-    for (size_t i = 0; i < combos-1; ++i){
+    std::vector<unsigned int> all_flag_combos(combos);
+    for (size_t i = 0; i < combos; ++i) {

@DrahtBot
Copy link
Contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #22954 ([TESTS] Allow tx_invalid.json tests to include flag rules for if_unset: [A,B,C] then_unset: [D] by JeremyRubin)
  • #22876 ([TESTS] Update Transaction Tests to permit setting a flag as always on and disabling the exhaustive failure test by JeremyRubin)
  • #21702 (Implement BIP-119 Validation (CheckTemplateVerify) by JeremyRubin)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@JeremyRubin
Copy link
Contributor Author

I added the textbook algorithm for this: feel free to test that the output is the same then I can squash.

@jonatack
Copy link
Member

jonatack commented Sep 13, 2021

@JeremyRubin Nice. The textbook algorithm produces the same values in the tx_invalid test but fewer values in the tx_valid test. Among the values that it does not return compared to your algo, only 2 or 3 of them are unique. WDYT?

FancyPants version

test/transaction_tests.cpp(194): Entering test case "tx_valid"
0
2
4
6
8
10
12
0
2
4
6
8
10
12
0
2
4
6
8
10
12
14
16
18
20
22
24
26
28
0
2
4
6
8
10
12
14
16
18
20
22
24
26
28
0
2
4
6
8
10
12
14
16
18
20
22
24
26
28
0
2
8
0
0
2
4
6
8
10
12
0
0
0
0
0
0
0
2
16384
0
0
0
0
8
65536
0
8
65536
0
8
65536
0
8
65536
0
8
65536
0
0
8
32
40
65536
65544
65568
0
8
32
40
65536
65544
65568
0
32
65536
0
8
32
40
65536
65544
65568
0
8
65536
0
8
65536
0
0
0
2
4
6
8
10
12
0
0
0
0
0
0
0
0
8
65536
0
0
8
65536
0
test/transaction_tests.cpp(194): Leaving test case "tx_valid"; testing time: 597522us
test/transaction_tests.cpp(284): Entering test case "tx_invalid"
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
512
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
1024
0
0
0
1
2049
4096
4097
0
1
0
1
0
1
0
1
0
1
0
1
0
1
0
1
0
1
0
1
0
1
2049
4096
4097
0
0
1
0
0
1
0
0
0
0
0
0
0
0
0
0
0
1
65536
0
test/transaction_tests.cpp(284): Leaving test case "tx_invalid"; testing time: 170187us

Comment on lines 273 to 278
// Check that flags are maximal: transaction should fail if any unset flags are set.
for (auto flags_excluding_one : ExcludeIndividualFlags(verify_flags)) {
if (!CheckTxScripts(tx, mapprevOutScriptPubKeys, mapprevOutValues, ~flags_excluding_one, txdata, strTest, /* expect_valid */ false)) {
for (auto submask_flags : GenerateAllSubMasks(verify_flags)) {
if (!CheckTxScripts(tx, mapprevOutScriptPubKeys, mapprevOutValues, ~submask_flags, txdata, strTest, /* expect_valid */ false)) {
BOOST_ERROR("Too many flags unset: " << strTest);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand the intent of this code correctly, the idea is that given a set of chosen flags, we want to check that every possible superset of flags doesn't succeed. If that's so, I believe there's a problem with the current FlipAll( GenerateAllSubMasks( flags ) ) logic for computing all possible supersets.

Eg, assuming 3-bit words, for flags = 100, we'll generate all flags sub masks {000}, and then flip them {111}. However, what we really want is to generate all the ~flags = 011 sub masks {010, 001, 000}, and then flip them {101, 110, 111}.

In short, there's a missing bit flip before computing all the sub masks. Ie. it should be

FlipAll( GenerateAllSubMasks( ~flags ) )

With that said, there's still the issue that flipping flag bits doesn't interact well with the TrimFlags operation done inside GenerateAllSubMasks.

Eg. if the only flag bits wereVERIFY_P2SH and VERIFY_WITNESS (in that order), for flags = 00 we'd generate all the ~flags = 11 sub masks {10, 01, 00}, trim them {10, 00, 00} (01 is mapped to 00), remove duplicates {10, 00}, and finally flip them {01, 11}. However, we ended up with an invalid combination (01) when the correct result would have been {10, 11}.

More generally, after fixing the initial missing flip, the code would look something like

FlipAll( Unique( TrimFlags( GenerateAllSubMasksWithoutTrimming( ~flags ) ) ) )

when what we really want is something like:

for flag_superset in FlipAll( GenerateAllSubMasksWithoutTrimming( ~flags ) ) ) ):
    if flag_superset == TrimFlags( flag_superset ):
        # use flag_superset

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah you're totally right; i've been playing with it a bit as well and noticed that it seems to not be quite right (thanks @jonatack for pointing it out).

The logic is definitely off, and I think it may be intractible in this case since GenerateAllSubMasksWithoutTrimming(~flags) ends up being pretty big... with 20 it might be ok, but eventually there would be too many flags.

Copy link
Contributor

@esneider esneider Sep 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Maybe bound the amount of flipped bits? That way you can test some flag interactions, but bound the exponential growth. The bounding capability would also help to put a safeguard against exponential explosion in the tx_invalid test if there are too many set bits in the initial mask.

@JeremyRubin
Copy link
Contributor Author

Closing this for now -- needs more research/discussion on how to accomplish actually using the powerset properly, see #22948 (comment)

@bitcoin bitcoin locked and limited conversation to collaborators Oct 30, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants