Skip to content

Conversation

troygiorshev
Copy link
Contributor

This PR touches only the p2p_invalid_messages.py functional test module. There are two main goals accomplished here. First, it fixes the "invalid message size" test, which previously made a message that was invalid for multiple reasons. Second, it refactors the file into a single consistent style. This file appears to have originally had two authors, with different styles and some test duplication.

It should now be easier and quicker to understand this module, anticipating the upcoming BIP324 and AltNet changes.

This should probably go in ahead of #19107, but the two are not strictly related.

@fanquake fanquake added the Tests label Jun 5, 2020
@maflcko maflcko changed the title p2p: Fix and clean p2p_invalid_messages functional tests test: Fix and clean p2p_invalid_messages functional tests Jun 5, 2020
Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for splitting out the test changes. Though I am not sure about removing the _tweak_msg_data_size without replacement.
We should test the following cases:

header indicates size a, but payload is of size a-1
header indicates size a, but payload is of size a+1

Is this tested anywhere else?

Concept ACK 2719499 📇

Show signature and timestamp

Signature:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Concept ACK 271949980d 📇
-----BEGIN PGP SIGNATURE-----

iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
pUg9/wv/TOZ2pZ6cdN9wpOnvPdWaLFpI7dfr6MT5LQnkmi499eC/DUrCy788Xc9D
rSQd0M4Ex8qe8sRDxHwVZtk6NaXMGCSRThMhWq0loc/cz2U266JT8s4DXXW1tkRP
Dm+iIgjArC/8kkFPB7Ci9CuNWprEPbhjOCtwYMcrfk2U7WyBY+/+MMSJNVhK+Jqc
n/4wNKZ+zR5cDTJjttcTe6LyU7DPRkLSFBxCbE9BzPxuFwq0jqqNzW3PUgO8Y3Ly
SxfVyBFOqniKDWIuf7Q4+98/wnI9BpYAIb+9JChEuc8SShix1knYkhiXq4yehPui
ruTC7WlV0xmpQnpu3WTnhInizljR49w0UYTFT2EXc0ZPZNrSb3DbAxgzkQJZdA+9
GfwY+rvkAtdHg5SnFPJP1moFFQdleNjZYLS7cGkpsKmovsyL74p6qB/yKLTNxJzW
XcYgl3ju5ykOIPtzIptZYrJJn3Llkn128W6C+21X2uaF0NtlsXDn3mP/Qwm3B0yF
YXE9F8ud
=nUN3
-----END PGP SIGNATURE-----

Timestamp of file with hash d78ad05b778745908e2ab5b35cdd924d160c6acceba435f073770d6a7bdc6aac -

@jnewbery
Copy link
Contributor

jnewbery commented Jun 5, 2020

@MarcoFalke

We should test the following cases:

  1. header indicates size a, but payload is of size a-1
  2. header indicates size a, but payload is of size a+1

These aren't strictly header errors. In (1), the message won't finish being read, because the deserializer is waiting for another byte. When the following message is received:

  • the first byte of the header will be read as the final byte of the payload of the previous message
  • that payload may or may not succeed deserialization and processing in net_processing
  • the following message header will fail deserialization because it's misaligned. It'll fail the message start bytes check and be disconnected.

In (2), the following message header will be misaligned for similar reasons and will result in disconnect.

In both cases, it's the subsequent message that causes the disconnect. I don't think we necessarily need to test this, because invalid headers are tested elsewhere.

@troygiorshev troygiorshev force-pushed the p2p-refactor-fix-tests branch 2 times, most recently from eaaa04e to da3a874 Compare June 8, 2020 04:46
Test 1 is a duplicate of test_size() later in the file.  Inexplicably,
this test does not work on macOS, whereas test_size() does.

Test 2 is problematic for two reasons.  First, it always fails with an
invalid checksum, which is probably not what was intended.  Second, it's
not defined at this layer what the behavior should be.  Hypothetically,
if this test was fixed so that it gave messages with valid checksums,
then the message would pass successfully thought the network layer and
fail only in the processing layer.  A priori the network layer has no
idea what the size of a message "actually" is.

The "Why does behavior change at 78 bytes" is because of the following:

print(len(node.p2p.build_message(msg))) # 125
=> Payload size = 125 - 24 = 101
If we take 77 bytes, then there are 101 - 77 = 24 left
That's exactly the size of a header
So, bitcoind deserializes the header and rejects it for some other reason
(Almost always an invalid size (too large))
But, if we take 78 bytes, then there are 101 - 78 = 23 left
That's not enough to fill a header, so the socket stays open waiting for
more data.  That's why we sometimes have to push additional data in
order for the peer to disconnect.

Additionally, both of these tests use the "conn" variable.  For fun, go
look at where it's declared.  (Hint: test_large_inv().  Don't we all
love python's idea of scope?)
As well, this renames those variables to match PEP8 and this clears up
the comment relating to VALID_DATA_LIMIT.

Admittedly, this commit is mainly to make the following ones cleaner.
This test originally made a message with an invalid stated length, and
an invalid checksum.  This was because only the header was changed, but
the checksum stayed the same.  This was fine for now because we check
the header first to see if it has a valid stated size, and we disconnect
if it does not, so we never end up checking for the checksum.  If this
behavior was to change, this test would become a problem.  (Indeed I
discovered this when playing around with this behavior).  By instead
creating a message with an oversized payload from the start, we create a
message with an invalid stated length but a valid checksum, as intended.

Additionally, this takes advantage to the newly module-global
VALID_DATA_LIMIT as opposed to the magic 0x02000000.  Yes, 4MB < 32MiB,
but at the moment when receiving a message we check both, so this makes
the test tighter.
@troygiorshev troygiorshev force-pushed the p2p-refactor-fix-tests branch from da3a874 to f7a2ef0 Compare June 8, 2020 04:57
@troygiorshev
Copy link
Contributor Author

Rebased and made suggested changes. Will continue thinking about what the correct test should be for a malicious indicated size.

This is a simple refactor of the specified test.  It is now brought in
line with the rest of the tests in the module.  This should make things
easier to debug, as all of the tests are now grouped together at the
top.
@troygiorshev troygiorshev force-pushed the p2p-refactor-fix-tests branch from f7a2ef0 to af2a145 Compare June 8, 2020 12:52
@maflcko
Copy link
Member

maflcko commented Jun 8, 2020

I don't see the downside of having a test that does:

  • Send a corrupt msg with wrongly indicates size (off by one in either way)
  • Send a valid msg and observe a disconnect

It should be possible to write such a test in 5-10 lines of python. Is there any reason I am missing that such a test would be inappropriate?

@troygiorshev
Copy link
Contributor Author

Regarding the situation where the payload is too large, consider the following test. A node can not tell the difference between these two "messages", they're equivalent. As opposed to testing for an oversized payload, we should test separately for:

  • Checking that a node responds well to a message with an invalid checksum
  • Checking that a node responds well to <24 bytes of complete junk

Sending >=24 bytes of junk (enough to fill a header or more) is covered by the rest of the tests in this module.

The situation where the payload is too small is effectively the same. The beginning of the next message will be lost, and we'll be back to receiving junk (the rest of the next message).

An incorrect payload size is IMO too high up the chain of events to be included. It's what results from an incorrect payload size that we should be checking for.

def test_equivalent_large(self):
    '''Make a message with an oversized payload
    Or, make a smaller message with the wrong checksum, then send a junk byte
    '''
    conn = self.nodes[0].add_p2p_connection(P2PDataStore())
    msg_base = msg_generic(b'badmsg', data=b'd'*5)
    msg_1 = conn.build_message(msg_base)
    cut_len = (
        4 +  # magic
        12  # msgtype
    )
    msg_1 = msg_1[:cut_len] + struct.pack("<I", len(msg_base.data) - 1) + msg_1[cut_len + 4:]
    self.nodes[0].p2p.send_raw_message(msg_1)
    with self.nodes[0].assert_debug_log(['CHECKSUM ERROR (badmsg, 4 bytes), expected a359d53a was cdbcd284']):
        # The usual connected check no longer works because the messages are misaligned
        #conn.sync_with_ping(timeout=1)
        assert self.nodes[0].p2p.is_connected
        self.nodes[0].disconnect_p2ps()
    #
    # Equivalent
    #
    conn = self.nodes[0].add_p2p_connection(P2PDataStore())
    msg_base = msg_generic(b'badmsg', data=b'd'*(5-1))
    msg_1 = conn.build_message(msg_base)
    cut_len = (
        4 +  # magic
        12 +  # msgtype
        4 # size
    )
    msg_1 = msg_1[:cut_len] + b'\xcd\xbc\xd2\x84' + msg_1[cut_len + 4:]
    self.nodes[0].p2p.send_raw_message(msg_1)
    self.nodes[0].p2p.send_raw_message(b'd')
    with self.nodes[0].assert_debug_log(['CHECKSUM ERROR (badmsg, 4 bytes), expected a359d53a was cdbcd284']):
        # The usual connected check no longer works because the messages are misaligned
        #conn.sync_with_ping(timeout=1)
        assert self.nodes[0].p2p.is_connected
        self.nodes[0].disconnect_p2ps()

@troygiorshev
Copy link
Contributor Author

Just for fun, I'll point out that we wont always disconnect on a corrupted payload size. I've made a proof of concept where the corruption neither results in a disconnect nor a misalignment. All it needs is two normal bitcoin messages and two single bit flips (one in the first message's payload size, and the other in the second message's command). It can be found here.

@DrahtBot
Copy link
Contributor

DrahtBot commented Jun 10, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Coverage

Coverage Change (pull 19177, d8ef251) Reference (master, 19e9192)
Lines +0.0373 % 90.6714 %
Functions +0.0000 % 85.6033 %
Branches +0.0194 % 51.9388 %

Updated at: 2020-06-12T18:13:45.902164.

@jnewbery
Copy link
Contributor

jnewbery commented Jun 12, 2020

ACK af2a145

This is a good change and removes a lot of confused and confusing test code that isn't testing what it purports to.

Future PRs could clean this test further:

  • the "resource_exhaustion" test should be removed/renamed from resource exhaustion. This was originally added to verify that receiving large messages wouldn't result in memory blowup, but that was a bad idea (tests: add invalid P2P message tests #14522 (comment)) and had to be removed.
  • the "even though the node is being hammered by nonsense from one connection, it can still service other peers in a timely way." comment is wrong. All outgoing messages from the Python message thread are serial, so by the time we get to sending the pings, all of the previous large messages have been processed already.
  • I can't see any reason for launching a new async task to test the magic bytes. Why not just send a message with invalid magic bytes using send_raw_message? I see this was merged without review (qa: Make swap_magic_bytes in p2p_invalid_messages atomic #15697) and there's no indication of why it's needed.
  • the msg_unrecognized class can be removed and replaced with a msg_generic. The serialized length of a msg_unrecognized message seems to have caused confusion in the past (see VALID_DATA_LIMIT ) because people haven't realised the length of the length field for a serialized string

EDIT: struck out point 2. It was untrue as pointed out by Marco below.

@maflcko
Copy link
Member

maflcko commented Jun 12, 2020

all of the previous large messages have been processed already.

I don't think this is true. At least on travis the test failed when the ping was sent on a different connection. See #19177 (comment)

@jnewbery
Copy link
Contributor

I don't think this is true. At least on travis the test failed when the ping was sent on a different connection. See #19177 (comment)

You're right. I was wrong. When I run this locally, the message sending appears to be serial because bitcoind is able to keep up with the messages as quickly as they're delivered. When bitcoind is running more slowly, such as under sanitizers, it can't keep up and Python's asyncio is able to write to both the sockets for conn and conn2 while bitcoind is processing the large invalid messages. See https://travis-ci.org/github/bitcoin/bitcoin/jobs/695869826#L3577 for example.

@maflcko
Copy link
Member

maflcko commented Jun 12, 2020

I don't think we necessarily need to test this, because invalid headers are tested elsewhere.

Where?

In general, I still don't understand why it is ok to remove test coverage without any replacement. We have never done this in the past and I don't see why this pull should be an exception. Don't get me wrong, I like the cleanups here and I'd like to see it merged, but I think this should and can be done without removing coverage.

If the functional tests are the wrong place to put such a test, great then move it to the unit tests, but simply removing should be the option of last resort.

For reference (master):

Screenshot_2020-06-12 LCOV - total_coverage info - src net cpp(1)

this pull:

Screenshot_2020-06-12 LCOV - total_coverage info - src net cpp

@maflcko
Copy link
Member

maflcko commented Jun 12, 2020

Sorry, false alarm. The "red" line will only become "blue" when the socket buffer is filled in two chunks or more. This might happen non-deterministically based on network-congestion on the test machine and has nothing to do with the changes in this pull request.

@maflcko
Copy link
Member

maflcko commented Jun 12, 2020

re-ACK af2a145 🍦

Show signature and timestamp

Signature:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

re-ACK af2a145e57 🍦
-----BEGIN PGP SIGNATURE-----

iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
pUiFBwv+KRk8EKRDiTcHW/1+SbEEwKl26fup1QVoLiueaW5IWuwuG6k/vgx0ZIIA
SIUVmXEXgJ/wJ+8oWtT6rR/ulh5i1S6vR38d0o1HYg6vR+sUOeuAuC/LYoKxJMHD
/AVdf9tXHk0pxIJiPOfP4DJITc/DYqcIpEDVcj22I5HJ8nrkHJcd9prMmDmYs60e
qniyKvw6c6GufEr4ziSNgs6r9Gzng7hDV7wFX/0bzy24259tedShoLOZQPdEldqa
vVyWiKufdpXb5UUhxBt68fmYDTFW5ivWhwujZGiWjNNMmiDmqwMIIglxfKjm9o3a
lEqe/19jVDf6VGcG8uo2IZap/5G319q76298GhTaLyd8qA7kvLe/f418QKeKUoVk
NZa6LW1jx6A49xQ1iwcQKWYkklNOK8OTCJi1AlmlFVwPy5+PKztvEl8+myYAfuyo
qUJ+eY0sqdCjjJJ4KsHYPcWD1uWTP3XxDgRGGA3NWbdYaQyAMLWfryc7MGj/UJU6
qm7Tuj6b
=hxKG
-----END PGP SIGNATURE-----

Timestamp of file with hash c5c47520e34b1feab503ae488c8195cfa5311aa8dfb1311b7826eb27cebc4497 -

@maflcko maflcko merged commit f154071 into bitcoin:master Jun 12, 2020
@jnewbery
Copy link
Contributor

I can't see any reason for launching a new async task to test the magic bytes. Why not just send a message with invalid magic bytes using send_raw_message?

Done in #19264

sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Jun 13, 2020
…nal tests

af2a145 Refactor resource exhaustion test (Troy Giorshev)
5c4648d Fix "invalid message size" test (Troy Giorshev)
ff1e7b8 Move size limits to module-global (Troy Giorshev)
57890ab Remove two unneeded tests (Troy Giorshev)

Pull request description:

  This PR touches only the p2p_invalid_messages.py functional test module.  There are two main goals accomplished here.  First, it fixes the "invalid message size" test, which previously made a message that was invalid for multiple reasons.  Second, it refactors the file into a single consistent style.  This file appears to have originally had two authors, with different styles and some test duplication.

  It should now be easier and quicker to understand this module, anticipating the upcoming [BIP324](bitcoin#18242) and [AltNet](bitcoin#18989) changes.

  This should probably go in ahead of bitcoin#19107, but the two are not strictly related.

ACKs for top commit:
  jnewbery:
    ACK af2a145
  MarcoFalke:
    re-ACK af2a145 🍦

Tree-SHA512: 9b57561e142c5eaefac5665f7355c8651670400b4db1a89525d2dfdd20e872d6873c4f6175c4222b6f5a8e5210cf5d6a52da69b925b673a2e2ac30a15d670d1c
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request May 13, 2021
…tests

Summary:
af2a145e575f23c64909e6cf1fb323c603bda7ca Refactor resource exhaustion test (Troy Giorshev)
5c4648d17ba18e4194959963994cc6b37053f127 Fix "invalid message size" test (Troy Giorshev)
ff1e7b884447a5ba10553b2d964625f94e255bdc Move size limits to module-global (Troy Giorshev)
57890abf2c7919eddfec36178b1136cd44ffe883 Remove two unneeded tests (Troy Giorshev)

Pull request description:

  This PR touches only the p2p_invalid_messages.py functional test module.  There are two main goals accomplished here.  First, it fixes the "invalid message size" test, which previously made a message that was invalid for multiple reasons.  Second, it refactors the file into a single consistent style.  This file appears to have originally had two authors, with different styles and some test duplication.

  It should now be easier and quicker to understand this module, anticipating the upcoming [BIP324](bitcoin/bitcoin#18242) and [AltNet](bitcoin/bitcoin#18989) changes.

  This should probably go in ahead of #19107, but the two are not strictly related.

---

Depends on D9511

Backport of [[bitcoin/bitcoin#19177 | core#19177]]

Test Plan:
  ninja all && ./test/functional/test_runner.py p2p_invalid_messages

Reviewers: #bitcoin_abc, PiRK

Reviewed By: #bitcoin_abc, PiRK

Subscribers: PiRK

Differential Revision: https://reviews.bitcoinabc.org/D9519
random-zebra added a commit to PIVX-Project/PIVX that referenced this pull request Jun 28, 2021
…t framework update)

7b04c0a [Test] Clean duplicate connections creation in mining_pos_coldStaking.py (furszy)
15a799e [Test] MAX_PROTOCOL_MESSAGE_LENGTH PIVXified. (furszy)
0aedf35 [tests] Don't import asyncio to test magic bytes (John Newbery)
9bb5309 Refactor resource exhaustion test (Troy Giorshev)
589a780 Fix "invalid message size" test (Troy Giorshev)
8a0c12b Move size limits to module-global (Troy Giorshev)
b3391c5 Remove two unneeded tests (Troy Giorshev)
1404e68 test: Add various low-level p2p tests (furszy)
8aaf7e1 test: Remove fragile assert_memory_usage_stable (MarcoFalke)
f68e22c [Test] p2p_invalid_messages.py do not change msg.command in test_command and raise sync_with_ping timeout (furszy)
c851c0b Test framework: Wait for verack by default on every new connection (furszy)
c02e9a0 [Test] move framework subversion to string (furszy)
3472a39 Replace coroutine with async def in p2p_invalid_messages.py (Elichai Turkel)
33c7b19 net_processing: align msg checksum check properly. (furszy)
7f71b1b Hash P2P messages as they are received instead of at process-time (furszy)
215a638 test: Skip flaky p2p_invalid_messages test on macOS (Fabian Jahr)
c11f565 qa: Make swap_magic_bytes in p2p_invalid_messages atomic (MarcoFalke)
be979ad test: Fix race in p2p_invalid_messages (MarcoFalke)
6a72f0c qa: Add tests for invalid message headers (MarcoFalke)
51ddd3d Introduce p2p_invalid_messages.py test (furszy)
55a37b5 net: fix missing jump line in "Oversized message from peer.." error. (furszy)
0edfce5 test_node: get_mem_rss fixups (MarcoFalke)
6f21213 tests: add P2PConnection.send_raw_message (James O'Beirne)
ae68c6e tests: add utility to assert node memory usage hasn't increased (James O'Beirne)
8469afe test: forward timeouts properly in send_blocks_and_test (James O'Beirne)
db28a53 Skip is_closing() check when not available. (Daniel Kraft)
be9dacb tests: fixes mininode's P2PConnection sending messages on closing transport (marcoagner)
53599c8 qa: Avoid start/stop of the network thread mid-test (furszy)
688190c [qa] mininode: Expose connection state through is_connected (MarcoFalke)

Pull request description:

  Part of the deep and long net and ser work that I'm doing (and Tor v3 onion addresses support). Friend of #2359.

  Focused on the end goal of implementing the `p2p_invalid_messages` functional test which validates that invalid msg headers, msg types, oversized payloads and inventory msgs aren't accepted nor can cause a resource exhaustion. And an extra covered scenario, in `p2p_invalid_tx.py`, for the orphan pool overflow.

  Plus, to get up to the goal, had to work over the functional test framework as well.

  So.. adapted list:

  * bitcoin#9045.
  * bitcoin#13512.
  * bitcoin#13517.
  * bitcoin#13715.
  * bitcoin#13747.
  * bitcoin#14456.
  * bitcoin#14522.
  * bitcoin#14672.
  * bitcoin#14693.
  * bitcoin#14812.
  * bitcoin#15246.
  * bitcoin#15330.
  * bitcoin#15697.
  * bitcoin#16445.
  * bitcoin#17931.
  * bitcoin#17469.
  * bitcoin#18628 (only `p2p_invalid_tx.py` and `p2p_invalid_messages.py`. We don't support the other tests yet).
  * bitcoin#19177.
  * bitcoin#19264.

ACKs for top commit:
  random-zebra:
    utACK 7b04c0a and merging...

Tree-SHA512: 48d1f1a6acc24a9abce033fbf1f281ba4392147da39d118a694a09d63c3e0610cc1a29d711b434b16cc0d0e030dd9f1a89843870091b6999b862b9ab18a20679
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants