Skip to content

Conversation

maflcko
Copy link
Member

@maflcko maflcko commented Apr 22, 2024

Currently the test passes, but may fail during shutdown, because blocks and transactions are synced with NUM_INBOUND * self.num_nodes peers, which may take a long time.

There is no need for this test to have this amount of inbounds.

So avoid the extraneous inbounds to speed up the test and avoid the intermittent test failures.

@DrahtBot
Copy link
Contributor

DrahtBot commented Apr 22, 2024

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK instagibbs, fjahr, theStack, achow101

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

@maflcko maflcko force-pushed the 2404-test-fix-p2p_tx_download-timeout- branch from fa003f2 to fa6c300 Compare April 22, 2024 13:03
@fjahr
Copy link
Contributor

fjahr commented Apr 22, 2024

@maflcko did you see #29926?

@maflcko
Copy link
Member Author

maflcko commented Apr 22, 2024

@maflcko did you see #29926?

It looks like 29926 is adding a call to setmocktime in the test. As the issue happens after all tests have passed, this can not fix the issue. Or am I missing something?

To explain it a bit more, the traceback is:

  File "/test/functional/p2p_tx_download.py", line 302, in <module>
    TxDownloadTest().main()
  File "/test/functional/test_framework/test_framework.py", line 155, in main
    exit_code = self.shutdown()
                ^^^^^^^^^^^^^^^
  File "/test/functional/test_framework/test_framework.py", line 314, in shutdown
    self.network_thread.close()
  File "/test/functional/test_framework/p2p.py", line 732, in close
    wait_until_helper_internal(lambda: not self.network_event_loop.is_running(), timeout=timeout)
  File "/test/functional/test_framework/util.py", line 293, in wait_until_helper_internal
    raise AssertionError("Predicate {} not true after {} seconds".format(predicate_source, timeout))
AssertionError: Predicate ''''
        wait_until_helper_internal(lambda: not self.network_event_loop.is_running(), timeout=timeout)
''' not true after 10.0 seconds

Which is after the test has passed successfully.

@fjahr
Copy link
Contributor

fjahr commented Apr 22, 2024

Hm, do you have an example failure where you have seen that traceback? Or is this from reproducing it locally? Because I haven't seen this, at least not the shutdown part. Here is one of the ones I have seen and it seems to be a different failure: https://cirrus-ci.com/task/5622109341220864

@maflcko
Copy link
Member Author

maflcko commented Apr 22, 2024

To get the full error log on Cirrus CI, you'll have to use the "Open Full Logs" button to see the full log. Usually the output is:

... (dots from the test runner)
traceback (from the failed test)
Full combined logs (from the failed test)
Full test summary (from the test runner)

@glozow
Copy link
Member

glozow commented Apr 22, 2024

I can see the self.shutdown() traceback in this one: https://api.cirrus-ci.com/v1/task/6740372997537792/logs/ci.log

@maflcko
Copy link
Member Author

maflcko commented Apr 22, 2024

I can see the self.shutdown() traceback in this one: https://api.cirrus-ci.com/v1/task/6740372997537792/logs/ci.log

Thanks, in this case it seems the issue is transactions relayed to the extraneous peers. I've edited the pull request description.

To see what message is the cause for the block, you can inspect the full combined log. In this example:

...
�[0;36m test  2024-04-19T18:08:46.732000Z TestFramework.p2p (DEBUG): Received message from 127.0.0.1:17361: msg_tx(tx=CTransaction(nVersion=2 vin=[CTxIn(prevout=COutPoint(hash=25a2ce431090120926452cfbf24c43f9385a3228cee487df3dc952985a5a0e4f n=0) scriptSig= nSequence=0)] vout=[CTxOut(nValue=24.92720000 scriptPubKey=51202913b252fe537830f843bfdc5fa7d20ba48639a87c86ff837b92d083c55ad7c1), CTxOut(nValue=0.00000000 scriptPubKey=6a4e6d070100010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010... (msg truncated) �[0m
�[0;36m test  2024-04-19T18:08:46.861000Z TestFramework.utils (ERROR): wait_until() failed. Predicate: '''' �[0m
�[0;36m                                           wait_until_helper_internal(lambda: not self.network_event_loop.is_running(), timeout=timeout)�[0m
�[0;36m                                   '''�[0m
�[0;36m test  2024-04-19T18:08:47.029000Z TestFramework.p2p (DEBUG): Received message from 127.0.0.1:17361: msg_tx(tx=CTransaction(nVersion=2 vin=[CTxIn(prevout=COutPoint(hash=a21e32ecb778b77cb1a46e21c3ae4e31a893dbf0d3dedae257a89dbe14bacf99 n=0) scriptSig= nSequence=0)] vout=[CTxOut(nValue=24.92330000 scriptPubKey=51202913b252fe537830f843bfdc5fa7d20ba48639a87c86ff837b92d083c55ad7c1), CTxOut(nValue=0.00000000 
...

@instagibbs
Copy link
Member

makes sense

ACK fa6c300

@fjahr
Copy link
Contributor

fjahr commented Apr 22, 2024

We should probably re-run the multiprocess job at least 2 more times manually here to ensure this actually fixes it and it wasn't just lucky this one time.

@maflcko
Copy link
Member Author

maflcko commented Apr 22, 2024

It is already re-run 10 times. You can double check by navigating to the Cirrus CI page for this pull request.

@fjahr
Copy link
Contributor

fjahr commented Apr 22, 2024

It is already re-run 10 times. You can double check by navigating to the Cirrus CI page for this pull request.

Thanks, ACK fa6c300

Copy link
Contributor

@theStack theStack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK fa6c300

Thanks for fixing!

@achow101
Copy link
Member

ACK fa6c300

@achow101 achow101 merged commit 10bd32a into bitcoin:master Apr 22, 2024
@maflcko maflcko deleted the 2404-test-fix-p2p_tx_download-timeout- branch April 23, 2024 05:00
@glozow
Copy link
Member

glozow commented Apr 23, 2024

Thank you @maflcko!

@bitcoin bitcoin locked and limited conversation to collaborators Apr 23, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants