Skip to content

Conversation

theStack
Copy link
Contributor

@theStack theStack commented Oct 19, 2022

This PR adds a fixed false-positive element check to the functional test rpc_scanblocks.py by using a pre-calculated scriptPubKey that collides with the regtest genesis block's coinbase output. Note that determining a BIP158 false-positive at runtime would also be possible, but take too long (we'd need to create and check ~800k output scripts on average, which took at least 2 minutes on average on my machine).

The introduced check is related to issue #26322 and more concretely inspired by PR #26325 which introduces an "accurate" mode that filters out these false-positives. The introduced cryptography routines (siphash for generic data) and helpers (BIP158 ranged hash calculation, relevant scriptPubKey per block determination) could potentially also be useful for more tests in the future that involve compact block filters.

@theStack theStack force-pushed the 202210-test-check_for_false_positives_in_scanblocks branch from a1e9934 to 5555e72 Compare October 19, 2022 17:47
@theStack
Copy link
Contributor Author

@sipa: Since you authored the SipHash implementation optimized for 256-bit ints in Python (9c8593d) and also the other ones in our main codebase, you could might want to take a look at the first commit. I'm pretty confident that it works as intended, but maybe there are some subtleties or corner-cases that I overlooked.

@DrahtBot DrahtBot added the Tests label Oct 19, 2022
@jamesob
Copy link
Contributor

jamesob commented Oct 19, 2022

Concept ACK, good idea.

We will need this in the next commit to calculate ranged hashes
of scriptPubKeys as defined in BIP158.
By now, we add one helper for calculating ranged hashes and another one
for finding relevant scriptPubKeys given a block.
@theStack theStack force-pushed the 202210-test-check_for_false_positives_in_scanblocks branch from 5555e72 to fa54d30 Compare October 19, 2022 23:35
@achow101
Copy link
Member

ISTM a simpler test would be to just have a fixed collision with the fixed addresses that we generate blocks to for each node. Then we would not need the siphash or bip158 specific things and could just do normal scanblocks and see that there are false positives for blocks mined by e.g. node 0 for some scriptPubKey that is clearly not used anywhere else.

A more meta-comment: should we even be scanning block 0 since it's coinbase UTXO is unspendable?

@theStack
Copy link
Contributor Author

theStack commented Oct 25, 2022

ISTM a simpler test would be to just have a fixed collision with the fixed addresses that we generate blocks to for each node. Then we would not need the siphash or bip158 specific things and could just do normal scanblocks and see that there are false positives for blocks mined by e.g. node 0 for some scriptPubKey that is clearly not used anywhere else.

Whether two scriptPubKeys BIP158-collide or not is not only dependent on the scripts, but also on the block hash of the block they'd appear in (that's the k used as key passed to SipHash). The only block hash that we know will basically always be constant is the one from genesis block, hence this choice. Though I assume right now the pre-generated functional test chain generates the exact same blocks and thus block hashes between runs (at least per node), the precalculated false-positives wouldn't work anymore if for whatever reason any minor detail of created blocks (e.g. nVersion, timestamp, included tx) ever changes in the future, which could be very annoying. If I didn't miss something, I think there is no easier way than to use the genesis block. The ranged hash BIP158 calculation routines (involving SipHash) are just there for a verification of the false-positive, which IMHO makes sense.

A more meta-comment: should we even be scanning block 0 since it's coinbase UTXO is unspendable?

Good question. I'd say yes, as I see scanblocks as a general tool for "show me all blocks containing txs that involve scriptPubKey xyz" rather than taking details like spendability into account (there are a lots of UTXOs that are unspendable, some even provably unspendable, like e.g. P2TR outputs with invalid x-only-pubkeys).

@achow101
Copy link
Member

achow101 commented Oct 25, 2022

ACK fa54d30

Whether two scriptPubKeys BIP158-collide or not is not only dependent on the scripts, but also on the block hash of the block they'd appear in (that's the k used as key passed to SipHash). The only block hash that we know will basically always be constant is the one from genesis block, hence this choice

Ah, I forgot about that. Although the test could make a fixed block manually as well, and that would sidestep any determinism issues.

@achow101 achow101 merged commit e25de33 into bitcoin:master Oct 26, 2022
@theStack theStack deleted the 202210-test-check_for_false_positives_in_scanblocks branch October 26, 2022 16:08
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Oct 27, 2022
… rpc_scanblocks.py

fa54d30 test: check for false-positives in rpc_scanblocks.py (Sebastian Falbesoner)
3bca6cd test: add compact block filter (BIP158) helper routines (Sebastian Falbesoner)
25ee74d test: add SipHash implementation for generic data in Python (Sebastian Falbesoner)

Pull request description:

  This PR adds a fixed false-positive element check to the functional test rpc_scanblocks.py by using a pre-calculated scriptPubKey that collides with the regtest genesis block's coinbase output. Note that determining a BIP158 false-positive at runtime would also be possible, but take too long (we'd need to create and check ~800k output scripts on average, which took at least 2 minutes on average on my machine).

  The introduced check is related to issue bitcoin#26322 and more concretely inspired by PR bitcoin#26325 which introduces an "accurate" mode that filters out these false-positives. The introduced cryptography routines (siphash for generic data) and helpers (BIP158 ranged hash calculation, relevant scriptPubKey per block determination) could potentially also be useful for more tests in the future that involve compact block filters.

ACKs for top commit:
  achow101:
    ACK fa54d30

Tree-SHA512: c6af50864146028228d197fb022ba2ff24d1ef48dc7d171bccfb21e62dd50ac80db5fae0c53f5d205edabd48b3493c7aa2040f628a223e68df086ec2243e5a93
@bitcoin bitcoin locked and limited conversation to collaborators Oct 26, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants