Skip to content

Conversation

stratospher
Copy link
Contributor

Add tests for the following v2 handshake scenarios:

  1. Disconnection happens when > MAX_GARBAGE_LEN bytes garbage is sent
  2. Disconnection happens when incorrect garbage terminator is sent
  3. Disconnection happens when garbage bytes are tampered with
  4. Disconnection happens when AAD of first encrypted packet after the garbage terminator is not filled
  5. bitcoind ignores non-empty version packet and no disconnection happens

All these tests require a modified v2 P2P class (different from EncryptedP2PState used in v2_p2p.py) to implement our custom handshake behaviour based on different scenarios and have been kept in a single test file (test/functional/p2p_v2_misbehaving.py). Shifted the test in test/functional/p2p_v2_earlykeyresponse.py which is of the same pattern to this file too.

@DrahtBot
Copy link
Contributor

DrahtBot commented Feb 14, 2024

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK mzumsande, theStack, achow101
Concept ACK kevkevinpal
Stale ACK sr-gi

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #29500 (test: create assert_not_equal util by kevkevinpal)
  • #29420 (test: extend the SOCKS5 Python proxy to actually connect to a destination by vasild)
  • #28521 (net: additional disconnect logging by Sjors)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@stratospher
Copy link
Contributor Author

@sr-gi, i tried this manually sending idea but i still think intermittent failures are possible there.

  • we can't get rid of can_data_be_received variable because if we don't use this variable, test would succeed irrespective of whether we send 4 bytes network magic first or 4 bytes from ellswift bytes first and we don't want that.
  • so since data_received() always happens in Network thread and send of ellswift bytes + setting can_data_be_received=True happens on MainThread, in the rare scenario that data_received() gets called before setting can_data_be_received, an intermittent failure could happen i think.

here's a branch where i tweaked the code you shared a bit with a sleep statement for making the test crash and reintroducing can_data_be_received variable.

@theStack
Copy link
Contributor

Concept ACK

Copy link
Contributor

@kevkevinpal kevkevinpal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

added two comments I will continue to test these changes as I am still trying to understand fully what the tests do exactly

@sr-gi
Copy link
Member

sr-gi commented Feb 29, 2024

  • we can't get rid of can_data_be_received variable because if we don't use this variable, test would succeed irrespective of whether we send 4 bytes network magic first or 4 bytes from ellswift bytes first and we don't want that.

I've been playing around sending anything different than the first 4 bytes as the network magic and test fail on my end (which is expected given data_received is called with magic_sent false on the first-byte mismatch).

Do you have any working example I can try to reproduce?

@stratospher
Copy link
Contributor Author

I've been playing around sending anything different than the first 4 bytes as the network magic and test fail on my end (which is expected given data_received is called with magic_sent false on the first-byte mismatch).

yes! that's expected behaviour and sending anything other than the first 4 bytes as network magic will make the test fail on master.

what i was trying to say was:

  1. on master
  • replacing return b"\xfa\xbf\xb5\xda" with any other 4 bytes will make the test fail and this is expected behaviour
  1. using this patch mentioned in test: fix intermittent failure in p2p_v2_earlykeyresponse #29352
  • replacing return b"\xfa\xbf\xb5\xda" with any other 4 bytes will make the test pass and this isn't behaviour we want
  • this happens because can_data_be_received variable is removed in the patch

so i was just trying to say that we can't remove can_data_be_received variable.

Copy link
Contributor

@mzumsande mzumsande left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

The one thing I'm a bit unsure about is that this duplicates quite a lot of code, especially in complete_handshake - the main implementation in v2_p2p.py and the overwritten version from this test could run out of sync in the future. But I don't have a good idea how to avoid this yet...

stratospher and others added 2 commits March 11, 2024 12:58
Early key response test is a special kind of test which requires
modified v2 handshake functions. More such tests can be added
where v2 handshake functions send incorrect garbage terminator,
excess garbage bytes etc.. Hence, rename p2p_v2_earlykey.py to a
general test file name - p2p_v2_misbehaving.py.

random_bitflip function (used in signature tests prior to this
commit) can be used in p2p_v2_misbehaving test to generate wrong
garbage terminator, wrong garbage bytes etc..
So, move the function to util.
Adds a new boolean parameter `expect_success` to the
`add_p2p_connection` method. If set, the node under
test doesn't wait for connection to be established
and is useful for testing scenarios when disconnection
is expected.

Without this parameter, intermittent test failures can happen
if the disconnection happens before wait_until for is_connected
is hit inside `add_p2p_connection`.

Co-Authored-By: Sebastian Falbesoner <sebastian.falbesoner@gmail.com>
@stratospher
Copy link
Contributor Author

The one thing I'm a bit unsure about is that this duplicates quite a lot of code, especially in complete_handshake - the main implementation in v2_p2p.py and the overwritten version from this test could run out of sync in the future. But I don't have a good idea how to avoid this yet...

hmm good point. will need to think about it more. simplest solution would be keeping it all in main implementation in v2_p2p.py haha but doesn't make sense to have it there just for this test file :(

@theStack
Copy link
Contributor

The one thing I'm a bit unsure about is that this duplicates quite a lot of code, especially in complete_handshake - the main implementation in v2_p2p.py and the overwritten version from this test could run out of sync in the future. But I don't have a good idea how to avoid this yet...

hmm good point. will need to think about it more. simplest solution would be keeping it all in main implementation in v2_p2p.py haha but doesn't make sense to have it there just for this test file :(

One idea to avoid at least reimplementing generate_keypair_and_garbage would be to allow setting a fixed garbage length with a new parameter (that is None by default, meaning that a random length will be picked), e.g:

diff --git a/test/functional/test_framework/v2_p2p.py b/test/functional/test_framework/v2_p2p.py
index 8f79623bd8..c27cd5e2fe 100644
--- a/test/functional/test_framework/v2_p2p.py
+++ b/test/functional/test_framework/v2_p2p.py
@@ -111,10 +111,11 @@ class EncryptedP2PState:
             # Responding, place their public key encoding first.
             return TaggedHash("bip324_ellswift_xonly_ecdh", ellswift_theirs + ellswift_ours + ecdh_point_x32)
 
-    def generate_keypair_and_garbage(self):
+    def generate_keypair_and_garbage(self, garbage_len=None):
         """Generates ellswift keypair and 4095 bytes garbage at max"""
         self.privkey_ours, self.ellswift_ours = ellswift_create()
-        garbage_len = random.randrange(MAX_GARBAGE_LEN + 1)
+        if garbage_len is None:
+            garbage_len = random.randrange(MAX_GARBAGE_LEN + 1)
         self.sent_garbage = random.randbytes(garbage_len)
         logger.debug(f"sending {garbage_len} bytes of garbage data")
         return self.ellswift_ours + self.sent_garbage

The overruled method of initiate_v2_handshake in the test could then call this with the parameter set if needed, e.g. for the EXCESS_GARBAGE case:

     def initiate_v2_handshake(self):
         if self.test_type == TestType.EARLY_KEY_RESPONSE:
             .....
         elif self.test_type == TestType.EXCESS_GARBAGE:
             # send > 4095 bytes garbage
             return self.generate_keypair_and_garbage(garbage_len=MAX_GARBAGE_LEN + random.randrange(1, 10))

(For the WRONG_GARBAGE test, bit-flipping self.sent_garbage right after calling generate_keypair_and_garbage should still be sufficient for causing a mismatch.)

@sr-gi
Copy link
Member

sr-gi commented May 6, 2024

I find the approach fairly hard to follow here. Having all the logic in the constructor with if/elses based on the connection type instead of having different constructors/a test for each type of failure feels really error-prone and difficult to make sure if a test is doing what we expect. I think it'd be better to have multiple classes that build from v2_p2p and modify what's needed, even if the file gets larger.

Just an example:

In the WRONG_GARBAGE_TERMINATOR case, we are modifying the gargabe in two ways. First, we are making it smaller than expected, but also, we are randomly flipping some of its bits. Turns out if you removed both changes, the tests still passes, but I'm not sure why. I would expect this not to be the case, since removing the relevant parts should make the class equal to the expected v2_p2p.

@stratospher
Copy link
Contributor Author

I think it'd be better to have multiple classes that build from v2_p2p and modify what's needed, even if the file gets larger.

that's a great suggestion @sr-gi! it's possible to design those classes in such a way that it avoids code duplication and the file is almost the same size!

Turns out if you removed both changes, the tests still passes, but I'm not sure why.

because we don't send a version message to ensure that the disconnection happens in the v2 handshake phase. in your case (different from what we are testing here), we send the correct garbage terminator but because we don't send a version message - disconnection is still expected.

Currently, we log the number of bytes of garbage when it is
generated. The log is a better fit for when the garbage
actually gets sent to the transport layer.
@DrahtBot
Copy link
Contributor

🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.

Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.

Leave a comment here, if you need help tracking down a confusing failure.

Debug: https://github.com/bitcoin/bitcoin/runs/25085973641

Prior to this commit, TestEncryptedP2PState would always
send initial_v2_handshake bytes in 2 parts (as required
by early key response test).

For generalising this test and having different v2 handshake
behaviour based on the test type, special behaviours like
sending initial_v2_handshake bytes in 2 parts are executed
only if test_type is set to EARLY_KEY_RESPONSE.
This test type is represented using EXCESS_GARBAGE.
…is sent

This test type is represented using WRONG_GARBAGE_TERMINATOR.
since the wrong garbage terminator is sent to TestNode, TestNode
will interpret all of the gabage bytes, wrong garbage terminator,
decoy messages and version packet it receives as garbage bytes.

If the length of all these is more than 4095 + 16, it will result
in a missing garbage terminator error. otherwise, it will result
in a V2 handshake timeout error.

Send only MAX_GARBAGE_LEN//2 bytes of garbage data to TestNode
so that the total length received by the TestNode is at max
= (MAX_GARBAGE_LEN//2) + 16 + 10*120 + 20 = 3283 bytes
(which is less than 4095 + 16 bytes) and we get a consistent
V2 handshake timeout error message.

If we do not limit the garbage length sent, we will intermittently
get both missing garbage terminator error and V2 handshake
timeout error based on the garbage length and decoy packets length
which are chosen at random.
… different

This test type is represented using WRONG_GARBAGE.
Here, garbage bytes sent to TestNode are assumed to be tampered with and
do not correspond to the garbage bytes which P2PInterface calculated and
uses.
This test type is represented using SEND_NO_AAD. If AAD of the first encrypted packet
sent after the garbage terminator (optional decoy packet/version packet) hasn't been
filled, disconnection happens.
…tion happens

This test type is represented using SEND_NON_EMPTY_VERSION_PACKET.
Copy link
Contributor

@mzumsande mzumsande left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK c9dacd9

I'd prefer combining some of the first commits with the respective test commits that actually make use of the change, but that's just my opinion ad maybe a matter of taste.

@DrahtBot DrahtBot requested a review from sr-gi June 25, 2024 18:46
@DrahtBot DrahtBot mentioned this pull request Jun 25, 2024
Copy link
Contributor

@theStack theStack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code-review ACK c9dacd9

Nice approach using a EncryptedP2PState subclass for each test scenario, I agree with other reviewers that this looks much cleaner now and is easier to follow.

"""64 bytes ellswift is sent in 2 parts during `initial_v2_handshake()`"""
self.v2_state = TestEncryptedP2PState()
if self.test_type == TestType.EARLY_KEY_RESPONSE:
self.v2_state = EarlyKeyResponseState(initiating=True, net='regtest')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small nit, potential follow-up: could set 'regtest' as default value for the net parameter in the EncryptedP2PState class constructor (__init__), so it doesn't have to be specified repeatedly here.

@achow101
Copy link
Member

achow101 commented Jul 9, 2024

ACK c9dacd9

@achow101 achow101 merged commit c51c694 into bitcoin:master Jul 9, 2024
@fanquake
Copy link
Member

Looks like this is causing intermittent CI failures. See #30419.

@hebasto
Copy link
Member

hebasto commented Jul 14, 2024

Ported to the CMake-based build system in hebasto#264.

@stratospher stratospher deleted the more-v2-tests branch July 17, 2024 06:39
kwvg added a commit to kwvg/dash that referenced this pull request Oct 15, 2024
kwvg added a commit to kwvg/dash that referenced this pull request Oct 23, 2024
kwvg added a commit to kwvg/dash that referenced this pull request Oct 23, 2024
kwvg added a commit to kwvg/dash that referenced this pull request Oct 23, 2024
kwvg added a commit to kwvg/dash that referenced this pull request Oct 24, 2024
kwvg added a commit to kwvg/dash that referenced this pull request Oct 24, 2024
PastaPastaPasta added a commit to dashpay/dash that referenced this pull request Oct 24, 2024
, bitcoin#29372, bitcoin#29460, bitcoin#29358, bitcoin#29511, bitcoin#29390, bitcoin#29431, bitcoin-core/gui#788 (BIP324 backports: part 4)

4735b82 merge bitcoin#29431: disconnection scenarios during v2 handshake (Kittywhiskers Van Gogh)
cc6b88e merge bitcoin-core/gui#788: update session ID tooltip (Kittywhiskers Van Gogh)
2455862 merge bitcoin#29390: speedup bip324_cipher.py unit test (Kittywhiskers Van Gogh)
062aaf1 merge bitcoin#29511: Fix intermittent failure in rpc_net.py --v2transport (Kittywhiskers Van Gogh)
54972e8 merge bitcoin#29358: use v2 everywhere for P2PConnection if --v2transport is enabled (Kittywhiskers Van Gogh)
4cce72f test: add missing debug log assertion in `p2p_invalid_messages.py` (Kittywhiskers Van Gogh)
bd2fe61 merge bitcoin#29460: assert rpc error for addnode v2transport not enabled (Kittywhiskers Van Gogh)
5ee15fa merge bitcoin#29372: fix intermittent failure in `rpc_setban.py --v2transport` (Kittywhiskers Van Gogh)
e278818 merge bitcoin#29352: fix intermittent failure in p2p_v2_earlykeyresponse (Kittywhiskers Van Gogh)
6b2a8b5 merge bitcoin#24748: functional tests for v2 P2P encryption (Kittywhiskers Van Gogh)
32500f2 merge bitcoin#27653: add unit test coverage for Python ECDSA implementation (Kittywhiskers Van Gogh)
9f476c6 net: add Dash network message short IDs, allocate range 128 onwards (Kittywhiskers Van Gogh)

Pull request description:

  ## Additional Information

  * Depends on #6329

  * Dash-specific P2P messages have been allocated short IDs after 128 based on a prior suggestion ([source](#6280 (comment))) as there are 255 potential short IDs (ID `0` is reserved for V1 fallback, [source](https://github.com/dashpay/dash/blob/a7bbcc823d800b293e9ec54dff518fa9929c763c/src/net.cpp#L1019)) and upstream uses 28 short IDs (though Dash has left ID `5` blank as we do not implement the `FEEFILTER` message, [source](https://github.com/dashpay/dash/blob/a7bbcc823d800b293e9ec54dff518fa9929c763c/src/net.cpp#L1024)).

    As it is unlikely that upstream will utilize more than 127 short IDs (and the spec refers to IDs after 32 as "undefined", [source](https://github.com/bitcoin/bips/blob/52894e1aa7af27f4ae7836a391643124c71e2639/bip-0324.mediawiki#v2-bitcoin-p2p-message-structure)), there shouldn't be an adverse effect to utilizing IDs >=128. The unified array of short IDs are accessible through `V2ShortIDs()`.

    * As there are checks to see if an ID *can* be valid by checking against the array size (which wouldn't work here as we create an array of 256 entries combining both upstream and Dash's allocated short IDs, filling the rest with blank values and we cannot ignore blank values to know if a value is unallocated as the blanking could also signal a reservation, [source](https://github.com/dashpay/dash/blob/a7bbcc823d800b293e9ec54dff518fa9929c763c/src/net.cpp#L1048-L1052)), such a check needs to be done by using `IsValidV2ShortID()`.
    * `V2ShortIDs()` isn't as elegant as desired as `std::fill` and `std::copy` are not `constexpr` until C++20 ([source](https://en.cppreference.com/w/cpp/algorithm/fill), [source](https://en.cppreference.com/w/cpp/algorithm/copy)) and until we drop C++17 support, we have to be mindful of that.

  * `masternode connect` will now _attempt_ to establish a P2Pv2 connection if the node *initiating* the connection has opted-in using the new argument (`v2transport`) and the node was started with P2Pv2 enabled (using the launch argument, `-v2transport`).

    This mirrors changes to behavior to `addconnection` introduced in [bitcoin#24748](bitcoin#24748)

  * The oversized payload test in `p2p_invalid_messages.py` will expect an excessively large message of size of `3145729` bytes (and in P2Pv2, `3145742` bytes), as opposed to upstream's `4000001` and `4000014` bytes respectively as Dash has a lower protocol limit of 3MiB ([source](https://github.com/dashpay/dash/blob/a7bbcc823d800b293e9ec54dff518fa9929c763c/src/net.h#L80-L81)) vs Bitcoin's 4MB ([source](https://github.com/bitcoin/bitcoin/blob/225718eda89d65a7041c33e1994d3bf7bd71bbdd/src/net.h#L62-L63))

  ## Breaking Changes

  None expected.

  ## Checklist:

  - [x] I have performed a self-review of my own code
  - [x] I have commented my code, particularly in hard-to-understand areas
  - [x] I have added or updated relevant unit/integration/functional/e2e tests
  - [x] I have made corresponding changes to the documentation **(note: N/A)**
  - [x] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_

ACKs for top commit:
  PastaPastaPasta:
    utACK 4735b82
  UdjinM6:
    light ACK 4735b82

Tree-SHA512: 4a9d586133633c109e6a8f20a6c6c5e4b24185fb616bcd8568e546b6e9f886e7a8707e811fd070bbe32c40df269f89a56343a24b67242c6147f9df27275af599
@bitcoin bitcoin locked and limited conversation to collaborators Jul 18, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants