-
Notifications
You must be signed in to change notification settings - Fork 37.7k
test: Fix and clean p2p_invalid_messages functional tests #19177
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for splitting out the test changes. Though I am not sure about removing the _tweak_msg_data_size without replacement.
We should test the following cases:
header indicates size a, but payload is of size a-1
header indicates size a, but payload is of size a+1
Is this tested anywhere else?
Concept ACK 2719499 📇
Show signature and timestamp
Signature:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
Concept ACK 271949980d 📇
-----BEGIN PGP SIGNATURE-----
iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
pUg9/wv/TOZ2pZ6cdN9wpOnvPdWaLFpI7dfr6MT5LQnkmi499eC/DUrCy788Xc9D
rSQd0M4Ex8qe8sRDxHwVZtk6NaXMGCSRThMhWq0loc/cz2U266JT8s4DXXW1tkRP
Dm+iIgjArC/8kkFPB7Ci9CuNWprEPbhjOCtwYMcrfk2U7WyBY+/+MMSJNVhK+Jqc
n/4wNKZ+zR5cDTJjttcTe6LyU7DPRkLSFBxCbE9BzPxuFwq0jqqNzW3PUgO8Y3Ly
SxfVyBFOqniKDWIuf7Q4+98/wnI9BpYAIb+9JChEuc8SShix1knYkhiXq4yehPui
ruTC7WlV0xmpQnpu3WTnhInizljR49w0UYTFT2EXc0ZPZNrSb3DbAxgzkQJZdA+9
GfwY+rvkAtdHg5SnFPJP1moFFQdleNjZYLS7cGkpsKmovsyL74p6qB/yKLTNxJzW
XcYgl3ju5ykOIPtzIptZYrJJn3Llkn128W6C+21X2uaF0NtlsXDn3mP/Qwm3B0yF
YXE9F8ud
=nUN3
-----END PGP SIGNATURE-----
Timestamp of file with hash d78ad05b778745908e2ab5b35cdd924d160c6acceba435f073770d6a7bdc6aac -
These aren't strictly header errors. In (1), the message won't finish being read, because the deserializer is waiting for another byte. When the following message is received:
In (2), the following message header will be misaligned for similar reasons and will result in disconnect. In both cases, it's the subsequent message that causes the disconnect. I don't think we necessarily need to test this, because invalid headers are tested elsewhere. |
eaaa04e
to
da3a874
Compare
Test 1 is a duplicate of test_size() later in the file. Inexplicably, this test does not work on macOS, whereas test_size() does. Test 2 is problematic for two reasons. First, it always fails with an invalid checksum, which is probably not what was intended. Second, it's not defined at this layer what the behavior should be. Hypothetically, if this test was fixed so that it gave messages with valid checksums, then the message would pass successfully thought the network layer and fail only in the processing layer. A priori the network layer has no idea what the size of a message "actually" is. The "Why does behavior change at 78 bytes" is because of the following: print(len(node.p2p.build_message(msg))) # 125 => Payload size = 125 - 24 = 101 If we take 77 bytes, then there are 101 - 77 = 24 left That's exactly the size of a header So, bitcoind deserializes the header and rejects it for some other reason (Almost always an invalid size (too large)) But, if we take 78 bytes, then there are 101 - 78 = 23 left That's not enough to fill a header, so the socket stays open waiting for more data. That's why we sometimes have to push additional data in order for the peer to disconnect. Additionally, both of these tests use the "conn" variable. For fun, go look at where it's declared. (Hint: test_large_inv(). Don't we all love python's idea of scope?)
As well, this renames those variables to match PEP8 and this clears up the comment relating to VALID_DATA_LIMIT. Admittedly, this commit is mainly to make the following ones cleaner.
This test originally made a message with an invalid stated length, and an invalid checksum. This was because only the header was changed, but the checksum stayed the same. This was fine for now because we check the header first to see if it has a valid stated size, and we disconnect if it does not, so we never end up checking for the checksum. If this behavior was to change, this test would become a problem. (Indeed I discovered this when playing around with this behavior). By instead creating a message with an oversized payload from the start, we create a message with an invalid stated length but a valid checksum, as intended. Additionally, this takes advantage to the newly module-global VALID_DATA_LIMIT as opposed to the magic 0x02000000. Yes, 4MB < 32MiB, but at the moment when receiving a message we check both, so this makes the test tighter.
da3a874
to
f7a2ef0
Compare
Rebased and made suggested changes. Will continue thinking about what the correct test should be for a malicious indicated size. |
This is a simple refactor of the specified test. It is now brought in line with the rest of the tests in the module. This should make things easier to debug, as all of the tests are now grouped together at the top.
f7a2ef0
to
af2a145
Compare
I don't see the downside of having a test that does:
It should be possible to write such a test in 5-10 lines of python. Is there any reason I am missing that such a test would be inappropriate? |
Regarding the situation where the payload is too large, consider the following test. A node can not tell the difference between these two "messages", they're equivalent. As opposed to testing for an oversized payload, we should test separately for:
Sending >=24 bytes of junk (enough to fill a header or more) is covered by the rest of the tests in this module. The situation where the payload is too small is effectively the same. The beginning of the next message will be lost, and we'll be back to receiving junk (the rest of the next message). An incorrect payload size is IMO too high up the chain of events to be included. It's what results from an incorrect payload size that we should be checking for. def test_equivalent_large(self):
'''Make a message with an oversized payload
Or, make a smaller message with the wrong checksum, then send a junk byte
'''
conn = self.nodes[0].add_p2p_connection(P2PDataStore())
msg_base = msg_generic(b'badmsg', data=b'd'*5)
msg_1 = conn.build_message(msg_base)
cut_len = (
4 + # magic
12 # msgtype
)
msg_1 = msg_1[:cut_len] + struct.pack("<I", len(msg_base.data) - 1) + msg_1[cut_len + 4:]
self.nodes[0].p2p.send_raw_message(msg_1)
with self.nodes[0].assert_debug_log(['CHECKSUM ERROR (badmsg, 4 bytes), expected a359d53a was cdbcd284']):
# The usual connected check no longer works because the messages are misaligned
#conn.sync_with_ping(timeout=1)
assert self.nodes[0].p2p.is_connected
self.nodes[0].disconnect_p2ps()
#
# Equivalent
#
conn = self.nodes[0].add_p2p_connection(P2PDataStore())
msg_base = msg_generic(b'badmsg', data=b'd'*(5-1))
msg_1 = conn.build_message(msg_base)
cut_len = (
4 + # magic
12 + # msgtype
4 # size
)
msg_1 = msg_1[:cut_len] + b'\xcd\xbc\xd2\x84' + msg_1[cut_len + 4:]
self.nodes[0].p2p.send_raw_message(msg_1)
self.nodes[0].p2p.send_raw_message(b'd')
with self.nodes[0].assert_debug_log(['CHECKSUM ERROR (badmsg, 4 bytes), expected a359d53a was cdbcd284']):
# The usual connected check no longer works because the messages are misaligned
#conn.sync_with_ping(timeout=1)
assert self.nodes[0].p2p.is_connected
self.nodes[0].disconnect_p2ps() |
Just for fun, I'll point out that we wont always disconnect on a corrupted payload size. I've made a proof of concept where the corruption neither results in a disconnect nor a misalignment. All it needs is two normal bitcoin messages and two single bit flips (one in the first message's payload size, and the other in the second message's command). It can be found here. |
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. Coverage
Updated at: 2020-06-12T18:13:45.902164. |
ACK af2a145 This is a good change and removes a lot of confused and confusing test code that isn't testing what it purports to. Future PRs could clean this test further:
EDIT: struck out point 2. It was untrue as pointed out by Marco below. |
I don't think this is true. At least on travis the test failed when the ping was sent on a different connection. See #19177 (comment) |
You're right. I was wrong. When I run this locally, the message sending appears to be serial because bitcoind is able to keep up with the messages as quickly as they're delivered. When bitcoind is running more slowly, such as under sanitizers, it can't keep up and Python's asyncio is able to write to both the sockets for conn and conn2 while bitcoind is processing the large invalid messages. See https://travis-ci.org/github/bitcoin/bitcoin/jobs/695869826#L3577 for example. |
Where? In general, I still don't understand why it is ok to remove test coverage without any replacement. We have never done this in the past and I don't see why this pull should be an exception. Don't get me wrong, I like the cleanups here and I'd like to see it merged, but I think this should and can be done without removing coverage. If the functional tests are the wrong place to put such a test, great then move it to the unit tests, but simply removing should be the option of last resort. For reference (master): this pull: |
Sorry, false alarm. The "red" line will only become "blue" when the socket buffer is filled in two chunks or more. This might happen non-deterministically based on network-congestion on the test machine and has nothing to do with the changes in this pull request. |
re-ACK af2a145 🍦 Show signature and timestampSignature:
Timestamp of file with hash |
Done in #19264 |
…nal tests af2a145 Refactor resource exhaustion test (Troy Giorshev) 5c4648d Fix "invalid message size" test (Troy Giorshev) ff1e7b8 Move size limits to module-global (Troy Giorshev) 57890ab Remove two unneeded tests (Troy Giorshev) Pull request description: This PR touches only the p2p_invalid_messages.py functional test module. There are two main goals accomplished here. First, it fixes the "invalid message size" test, which previously made a message that was invalid for multiple reasons. Second, it refactors the file into a single consistent style. This file appears to have originally had two authors, with different styles and some test duplication. It should now be easier and quicker to understand this module, anticipating the upcoming [BIP324](bitcoin#18242) and [AltNet](bitcoin#18989) changes. This should probably go in ahead of bitcoin#19107, but the two are not strictly related. ACKs for top commit: jnewbery: ACK af2a145 MarcoFalke: re-ACK af2a145 🍦 Tree-SHA512: 9b57561e142c5eaefac5665f7355c8651670400b4db1a89525d2dfdd20e872d6873c4f6175c4222b6f5a8e5210cf5d6a52da69b925b673a2e2ac30a15d670d1c
…tests Summary: af2a145e575f23c64909e6cf1fb323c603bda7ca Refactor resource exhaustion test (Troy Giorshev) 5c4648d17ba18e4194959963994cc6b37053f127 Fix "invalid message size" test (Troy Giorshev) ff1e7b884447a5ba10553b2d964625f94e255bdc Move size limits to module-global (Troy Giorshev) 57890abf2c7919eddfec36178b1136cd44ffe883 Remove two unneeded tests (Troy Giorshev) Pull request description: This PR touches only the p2p_invalid_messages.py functional test module. There are two main goals accomplished here. First, it fixes the "invalid message size" test, which previously made a message that was invalid for multiple reasons. Second, it refactors the file into a single consistent style. This file appears to have originally had two authors, with different styles and some test duplication. It should now be easier and quicker to understand this module, anticipating the upcoming [BIP324](bitcoin/bitcoin#18242) and [AltNet](bitcoin/bitcoin#18989) changes. This should probably go in ahead of #19107, but the two are not strictly related. --- Depends on D9511 Backport of [[bitcoin/bitcoin#19177 | core#19177]] Test Plan: ninja all && ./test/functional/test_runner.py p2p_invalid_messages Reviewers: #bitcoin_abc, PiRK Reviewed By: #bitcoin_abc, PiRK Subscribers: PiRK Differential Revision: https://reviews.bitcoinabc.org/D9519
…t framework update) 7b04c0a [Test] Clean duplicate connections creation in mining_pos_coldStaking.py (furszy) 15a799e [Test] MAX_PROTOCOL_MESSAGE_LENGTH PIVXified. (furszy) 0aedf35 [tests] Don't import asyncio to test magic bytes (John Newbery) 9bb5309 Refactor resource exhaustion test (Troy Giorshev) 589a780 Fix "invalid message size" test (Troy Giorshev) 8a0c12b Move size limits to module-global (Troy Giorshev) b3391c5 Remove two unneeded tests (Troy Giorshev) 1404e68 test: Add various low-level p2p tests (furszy) 8aaf7e1 test: Remove fragile assert_memory_usage_stable (MarcoFalke) f68e22c [Test] p2p_invalid_messages.py do not change msg.command in test_command and raise sync_with_ping timeout (furszy) c851c0b Test framework: Wait for verack by default on every new connection (furszy) c02e9a0 [Test] move framework subversion to string (furszy) 3472a39 Replace coroutine with async def in p2p_invalid_messages.py (Elichai Turkel) 33c7b19 net_processing: align msg checksum check properly. (furszy) 7f71b1b Hash P2P messages as they are received instead of at process-time (furszy) 215a638 test: Skip flaky p2p_invalid_messages test on macOS (Fabian Jahr) c11f565 qa: Make swap_magic_bytes in p2p_invalid_messages atomic (MarcoFalke) be979ad test: Fix race in p2p_invalid_messages (MarcoFalke) 6a72f0c qa: Add tests for invalid message headers (MarcoFalke) 51ddd3d Introduce p2p_invalid_messages.py test (furszy) 55a37b5 net: fix missing jump line in "Oversized message from peer.." error. (furszy) 0edfce5 test_node: get_mem_rss fixups (MarcoFalke) 6f21213 tests: add P2PConnection.send_raw_message (James O'Beirne) ae68c6e tests: add utility to assert node memory usage hasn't increased (James O'Beirne) 8469afe test: forward timeouts properly in send_blocks_and_test (James O'Beirne) db28a53 Skip is_closing() check when not available. (Daniel Kraft) be9dacb tests: fixes mininode's P2PConnection sending messages on closing transport (marcoagner) 53599c8 qa: Avoid start/stop of the network thread mid-test (furszy) 688190c [qa] mininode: Expose connection state through is_connected (MarcoFalke) Pull request description: Part of the deep and long net and ser work that I'm doing (and Tor v3 onion addresses support). Friend of #2359. Focused on the end goal of implementing the `p2p_invalid_messages` functional test which validates that invalid msg headers, msg types, oversized payloads and inventory msgs aren't accepted nor can cause a resource exhaustion. And an extra covered scenario, in `p2p_invalid_tx.py`, for the orphan pool overflow. Plus, to get up to the goal, had to work over the functional test framework as well. So.. adapted list: * bitcoin#9045. * bitcoin#13512. * bitcoin#13517. * bitcoin#13715. * bitcoin#13747. * bitcoin#14456. * bitcoin#14522. * bitcoin#14672. * bitcoin#14693. * bitcoin#14812. * bitcoin#15246. * bitcoin#15330. * bitcoin#15697. * bitcoin#16445. * bitcoin#17931. * bitcoin#17469. * bitcoin#18628 (only `p2p_invalid_tx.py` and `p2p_invalid_messages.py`. We don't support the other tests yet). * bitcoin#19177. * bitcoin#19264. ACKs for top commit: random-zebra: utACK 7b04c0a and merging... Tree-SHA512: 48d1f1a6acc24a9abce033fbf1f281ba4392147da39d118a694a09d63c3e0610cc1a29d711b434b16cc0d0e030dd9f1a89843870091b6999b862b9ab18a20679
This PR touches only the p2p_invalid_messages.py functional test module. There are two main goals accomplished here. First, it fixes the "invalid message size" test, which previously made a message that was invalid for multiple reasons. Second, it refactors the file into a single consistent style. This file appears to have originally had two authors, with different styles and some test duplication.
It should now be easier and quicker to understand this module, anticipating the upcoming BIP324 and AltNet changes.
This should probably go in ahead of #19107, but the two are not strictly related.