test: Implicitly sync after generate*, unless opted out #23300

maflcko · 2021-10-18T11:27:51Z

The most frequent failure in functional tests are intermittent races. Fixing such bugs is cumbersome because it involves:

Noticing the failure
Fetching and reading the log to determine the test case that failed
Adding a self.sync_all() where it was forgotten
Spamming out a pr and waiting for review, which is already sparse

Also, writing a linter to catch those is not possible, nor is review effective in finding these bugs prior to merge.

Fix all future intermittent races caused by a missing sync_block call by calling sync_all implicitly after each generate*, unless opted out. This ensures that the code is race-free (with regards to blocks) when the tests pass once, instead of our current approach where the code can never be guaranteed to be race-free.

There are some scripted-diff cleanups (see #22567), but they will be submitted in a follow-up to reduce the conflicts in this pull.

practicalswift · 2021-10-18T12:28:27Z

Concept ACK

DrahtBot · 2021-10-18T20:48:05Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#23371 (test: MiniWallet: add P2TR support and use it per default by theStack)
#23075 (refactoring: Fee estimation functional test cleanups by darosior)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

luke-jr · 2021-10-29T15:46:16Z

test/functional/test_framework/test_framework.py

+    def no_op(self):
+        pass
+
+    def generate(self, generator, *args, sync_fun=None, **kwargs):


Suggest making it a required parameter, otherwise there's a high risk of backports breaking.

I don't think we'll backport any major test changes to 22.x (or earlier) at this point, and hopefully this pull is merged before the next major release, so backports to 23.x aren't an issue.

If reviewers insist that this should be done, I suggest doing it in a follow up as a scripted-diff to make it easy to review and keep the conflicts in this pull low.

hebasto · 2021-11-08T11:24:37Z

Concept ACK.

lsilva01

tACK facc352 on Ubuntu 20.04

brunoerg

tACK facc352 on MacOS 11.6

Nice. It solves the failure I was getting when running test_runner. I think there are some other tests you could apply it like rpc_packages.py.

maflcko · 2021-11-09T09:33:50Z

I think there are some other tests you could apply it like rpc_packages.py.

Can you explain this a bit more? Or maybe create a follow-up yourself?

hebasto · 2021-11-09T09:38:48Z

FWIW, was testing this PR, and with the following patch:

diff --git a/.cirrus.yml b/.cirrus.yml
index f7962e951..93866de06 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -164,7 +164,7 @@ task:
     - netsh int ipv4 set dynamicport tcp start=1025 num=64511
     - netsh int ipv6 set dynamicport tcp start=1025 num=64511
     # Exclude feature_dbcrash for now due to timeout
-    - python test\functional\test_runner.py --nocleanup --ci --quiet --combinedlogslen=4000 --jobs=4 --timeout-factor=8 --failfast --extended --exclude feature_dbcrash
+    - python test\functional\test_runner.py --nocleanup --ci --quiet --combinedlogslen=4000 --jobs=4 --failfast --extended --exclude feature_dbcrash
 
 task:
   name: 'ARM [unit tests, no functional tests] [bullseye]'
diff --git a/ci/test/00_setup_env.sh b/ci/test/00_setup_env.sh
index 8a9d808f5..0685d1f0c 100755
--- a/ci/test/00_setup_env.sh
+++ b/ci/test/00_setup_env.sh
@@ -41,7 +41,7 @@ export RUN_SECURITY_TESTS=${RUN_SECURITY_TESTS:-false}
 # By how much to scale the test_runner timeouts (option --timeout-factor).
 # This is needed because some ci machines have slow CPU or disk, so sanitizers
 # might be slow or a reindex might be waiting on disk IO.
-export TEST_RUNNER_TIMEOUT_FACTOR=${TEST_RUNNER_TIMEOUT_FACTOR:-40}
+export TEST_RUNNER_TIMEOUT_FACTOR=${TEST_RUNNER_TIMEOUT_FACTOR:-1}
 export TEST_RUNNER_ENV=${TEST_RUNNER_ENV:-}
 export RUN_FUZZ_TESTS=${RUN_FUZZ_TESTS:-false}
 export EXPECTED_TESTS_DURATION_IN_SECONDS=${EXPECTED_TESTS_DURATION_IN_SECONDS:-1000}

all tasks, except for the native Windows one, pass.

In the Windows task two tests fail due timeouts:

feature_pruning.py                                 | ✖ Failed  | 1381 s
rpc_misc.py                                        | ✖ Failed  | 63 s

maflcko · 2021-11-09T09:49:58Z

Thanks for testing. Do you have logs for the two timeouts?

hebasto · 2021-11-09T10:09:44Z

Thanks for testing. Do you have logs for the two timeouts?

https://api.cirrus-ci.com/v1/task/4512194625798144/logs/functional_tests.log

maflcko · 2021-11-09T10:23:26Z

The pruning one has a reorg and a call to verifychain, which may take a long time, so this is expected:

                                   test_framework.authproxy.JSONRPCException: 'verifychain' RPC took longer than 60.000000 seconds. Consider using larger timeout for calls that take longer to return. (-344)

The rpc_misc one looks like an unrelated issue.

hebasto · 2021-11-09T10:25:56Z

The rpc_misc one looks like an unrelated issue.

Could be easy tuned:

@@ -84,7 +84,7 @@ class RpcMiscTest(BitcoinTestFramework):
 
         # Restart the node with indices and wait for them to sync
         self.restart_node(0, ["-txindex", "-blockfilterindex", "-coinstatsindex"])
-        self.wait_until(lambda: all(i["synced"] for i in node.getindexinfo().values()))
+        self.wait_until(lambda: all(i["synced"] for i in node.getindexinfo().values()), timeout=120)
 
         # Returns a list of all running indices by default
         values = {"synced": True, "best_block_height": 200}

maflcko · 2021-11-09T10:27:29Z

The sync threads all exit within a second, so I don't understand why raising the timeout would fix this issue.

hebasto · 2021-11-09T11:31:33Z

@MarcoFalke

The sync threads all exit within a second...

That is not the case in CI environment. See my demo branch, and its CI log:

2021-11-09T11:28:10.405000Z TestFramework (INFO): Total time taken by `wait_until` call: 52.64929151535034 seconds.

maflcko · 2021-11-09T11:39:58Z

Oh you are right.

Any idea why coinstatsindex takes more than a minute on windows, but finishes in less than a second on linux?

hebasto · 2021-11-09T12:25:25Z

Any idea why coinstatsindex takes more than a minute on windows, but finishes in less than a second on linux?

Have no idea.

Locally, on Windows 10 Pro 21H1, it takes about 30 seconds.

…ted out

…on native Windows CI task 23c3dcb ci: Drop --failfast in functional tests on native Windows CI task (Hennadii Stepanov) Pull request description: As it was [discussed](bitcoin/bitcoin#22980 (comment)) in bitcoin/bitcoin#22980: > seeing _all_ of the failed tests can give us a useful hint for debugging (was thinking about that while reviewing and testing bitcoin/bitcoin#23300). There is a [concern](bitcoin/bitcoin#22980 (comment)) about such approach: > If there is a CI failure, it will be good to know the result as early as possible after opening the pull request. But, [OTOH](bitcoin/bitcoin#22980 (comment)): > the average amount of saved time for such an approach [using `--failfast`] is less significant than it could appear. ACKs for top commit: MarcoFalke: cr ACK 23c3dcb seems fine to give this a try Tree-SHA512: d28f5712b4edfdbcef48b0633017da9172cef1835bcea51eaeeabf15c133f6bb6888227afc130279c3899365a4fd0f200fb9b0c4cb1ff80f21cbc766b8907764

DrahtBot added the Tests label Oct 18, 2021

DrahtBot mentioned this pull request Oct 18, 2021

test: Fee estimation functional test cleanups #23075

Merged

maflcko force-pushed the 2110-testSyncImp branch from fa2b65f to fa7962b Compare October 19, 2021 08:52

DrahtBot mentioned this pull request Oct 27, 2021

test: MiniWallet: add P2TR support and use it per default #23371

Closed

test: Implicitly sync after generate*, unless opted out

facc352

maflcko force-pushed the 2110-testSyncImp branch from fa7962b to facc352 Compare October 29, 2021 11:35

luke-jr reviewed Oct 29, 2021

View reviewed changes

maflcko mentioned this pull request Nov 8, 2021

ci: Increase --timeout-factor in the native Windows CI task #23463

Closed

lsilva01 approved these changes Nov 8, 2021

View reviewed changes

lsilva01 mentioned this pull request Nov 8, 2021

CI failure: TestFramework.utils (ERROR): wait_until() failed. Predicate: '''' #23399

Closed

brunoerg approved these changes Nov 8, 2021

View reviewed changes

maflcko merged commit 94db963 into bitcoin:master Nov 9, 2021

maflcko deleted the 2110-testSyncImp branch November 9, 2021 09:01

hebasto mentioned this pull request Nov 9, 2021

qa: Ignore --failfast option of test_runner.py on Windows #22980

Closed

sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Nov 9, 2021

Merge bitcoin#23300: test: Implicitly sync after generate*, unless op…

26504de

…ted out

sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Nov 9, 2021

Merge bitcoin#23300: test: Implicitly sync after generate*, unless op…

6cca5e7

…ted out

hebasto mentioned this pull request Nov 15, 2021

ci: Drop --failfast in functional tests on native Windows CI task #23513

Merged

kiminuo mentioned this pull request Nov 23, 2021

Add feerate histogram to getmempoolinfo #21422

Closed

maflcko mentioned this pull request Jan 11, 2022

test: MiniWallet: support default from_node for sending/creating txs #24025

Closed

bitcoin locked and limited conversation to collaborators Nov 9, 2022

test: Implicitly sync after generate*, unless opted out #23300

test: Implicitly sync after generate*, unless opted out #23300

Uh oh!

Conversation

maflcko commented Oct 18, 2021

Uh oh!

practicalswift commented Oct 18, 2021

Uh oh!

DrahtBot commented Oct 18, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Conflicts

Uh oh!

luke-jr Oct 29, 2021

Choose a reason for hiding this comment

Uh oh!

maflcko Oct 29, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

hebasto commented Nov 8, 2021

Uh oh!

lsilva01 left a comment

Choose a reason for hiding this comment

Uh oh!

brunoerg left a comment

Choose a reason for hiding this comment

Uh oh!

maflcko commented Nov 9, 2021

Uh oh!

hebasto commented Nov 9, 2021

Uh oh!

maflcko commented Nov 9, 2021

Uh oh!

hebasto commented Nov 9, 2021

Uh oh!

maflcko commented Nov 9, 2021

Uh oh!

hebasto commented Nov 9, 2021

Uh oh!

maflcko commented Nov 9, 2021

Uh oh!

hebasto commented Nov 9, 2021

Uh oh!

maflcko commented Nov 9, 2021

Uh oh!

hebasto commented Nov 9, 2021

Uh oh!

Uh oh!

DrahtBot commented Oct 18, 2021 •

edited

Loading

maflcko Oct 29, 2021 •

edited

Loading