[WIP] Run unit tests in parallel #12831
TODO:
Concept ACK. Very nice!
Perhaps a stupid question, but why the need of a static list? Boost can enumerate the tests itself: http://www.boost.org/doc/libs/1_60_0/libs/test/doc/html/boost_test/utf_reference/rt_param_reference/list_content.html
@MarcoFalke The output from …
If we go with the static text file, Travis would automatically catch the case where someone adds a test and forgets to manually run …
@@ -0,0 +1,864 @@
#!/usr/bin/env python
Should be `python3`?
def __init__(self, output_dir):
    if sys.stdout.isatty():
        # stdout needs to be unbuffered since the output is interactive.
        sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)
This seems to be fine in Python 2.7 but is a problem in 3.x:
ValueError: can't have unbuffered text I/O
A buffer size of 0 is only valid for byte streams in Python 3. From the docs, the default buffering policy for interactive text files is line buffering, which is probably fine.
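A minimal sketch of a Python 3 compatible variant of the snippet above. This is an assumption about the fix, not the script's actual code: in Python 3, `buffering=0` is only allowed for binary streams, so line buffering (`buffering=1`) is used for the interactive text stream instead.

```python
import os
import sys

# Sketch (assumed fix, not the script's actual code): buffering=0 raises
# ValueError for text streams in Python 3, so use line buffering instead,
# which also matches the documented default for interactive text files.
if sys.stdout.isatty():
    sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', buffering=1)
```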
src/test/test_list.txt
base58_tests/base58_EncodeBase58
base58_tests/base58_DecodeBase58
base64_tests/base64_testvectors
abc/bech32_tests/bip173_testvectors_valid
Not sure why `abc/` is prepended to the path. Getting error:
Test setup error: no test cases matching filter
Removing `abc/` lets the tests run and pass.
This also poses an interesting problem. If the test case isn't found, it's listed under FAILED TESTS, which is somewhat confusing, because `bech32_tests/bip173_testvectors_valid` would pass; it's just that the listed path is wrong. Maybe the initial reader of FILE_TEST_LIST can verify that the path exists before handing it off to workers.
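A sketch of such a check, using hypothetical names (`verify_test_list` and both parameters are assumptions, since the script's actual structures aren't shown here):

```python
# Hypothetical sketch: validate the names read from FILE_TEST_LIST against
# the tests the binary actually reports, before handing work to workers.
def verify_test_list(listed_tests, known_tests):
    """Return only the valid entries; warn about names that match nothing."""
    known = set(known_tests)
    for name in listed_tests:
        if name not in known:
            print('WARNING: listed test not found in binary: %s' % name)
    return [t for t in listed_tests if t in known]
```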
parser.print_usage()
sys.exit(1)

if options.shard_count < 1:
Nit: if you are going to verify that the input is valid, you can check some other fields as well. Right now you could pass a negative value for `workers`, `repeat`, `timeout`, etc. without complaint.
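One way to do that, sketched under the assumption that the script uses `optparse` with the option names discussed here (the `validate_positive` helper is hypothetical):

```python
import optparse

# Hypothetical sketch: check all numeric options in one place instead of
# only shard_count. Option names mirror the ones mentioned above.
parser = optparse.OptionParser()
parser.add_option('--workers', type='int', default=1)
parser.add_option('--repeat', type='int', default=1)
parser.add_option('--timeout', type='int', default=None)
parser.add_option('--shard_count', type='int', default=1)

def validate_positive(options, parser):
    for name in ('workers', 'repeat', 'timeout', 'shard_count'):
        value = getattr(options, name)
        if value is not None and value < 1:
            parser.error('--%s must be a positive integer' % name)
```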
parser.add_option('--shard_count', type='int', default=1,
                  help='total number of shards (for sharding test execution '
                       'between multiple machines)')
parser.add_option('--shard_index', type='int', default=0,
Not sure it makes sense to give `shard_index` a default. I think you want to ensure that `shard_count` and `shard_index` are used in combination, so if `shard_count` is used and `options.shard_index` is `None`, you print proper usage. Right now `options.shard_index` will just default to 0, so you can't tell whether it's being used properly with `shard_count`.
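A sketch of that suggestion (the `check_sharding` helper is an assumption; only the two option names come from the diff): defaulting `shard_index` to `None` makes its absence detectable, so the two options can be required together.

```python
import optparse

# Hypothetical sketch: default shard_index to None so its absence can be
# detected, then require the two sharding options to be used together.
parser = optparse.OptionParser()
parser.add_option('--shard_count', type='int', default=1)
parser.add_option('--shard_index', type='int', default=None)

def check_sharding(options, parser):
    if options.shard_count > 1 and options.shard_index is None:
        parser.error('--shard_index is required when --shard_count is given')
    if options.shard_index is not None and not (
            0 <= options.shard_index < options.shard_count):
        parser.error('--shard_index must be in [0, shard_count)')
```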
task = self.tasks[task_id]

if self.running_groups is not None:
    test_group = task.test_name.split('.')[0]
Should this be `split('/')[0]`? Right now `task.test_name` is a line from `src/test/test_list.txt` like `bloom_tests/rolling_bloom`, so the split will always return the full path, since there is no `.`, making `option.serialize_test_cases` just run everything in parallel anyway.
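A quick demonstration of the observation above, using the example name from the thread: the Boost test names in the list contain no `.`, so splitting on `.` returns the whole name and every test ends up in its own group.

```python
# Boost test names from test_list.txt contain no '.', so splitting on '.'
# returns the whole name, while splitting on '/' yields the suite name.
test_name = 'bloom_tests/rolling_bloom'
print(test_name.split('.')[0])  # prints the full name unchanged
print(test_name.split('/')[0])  # prints 'bloom_tests', the suite name
```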
parser.add_option('--timeout', type='int', default=None,
                  help='Interrupt all remaining processes after the given '
                       'time (in seconds).')
parser.add_option('--serialize_test_cases', action='store_true',
I think this option is currently incompatible with sharding, since sharding distributes tests in a round-robin fashion, splitting test groups between shards. This is probably desired, but it needs documenting, or the two options should be prevented from being used in combination.
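A small sketch of why the two options conflict, assuming round-robin assignment by index (the `shard` helper and test names are illustrative, not the script's code):

```python
# Hypothetical round-robin sharding: members of the same test group land on
# different shards, which defeats --serialize_test_cases within one shard.
def shard(tests, shard_count, shard_index):
    return [t for i, t in enumerate(tests) if i % shard_count == shard_index]

tests = ['a_tests/t1', 'a_tests/t2', 'b_tests/t1', 'b_tests/t2']
# With two shards, each shard receives one test from group a_tests and one
# from b_tests, so no single shard can serialize a whole group by itself.
```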
Concept ACK - Tested it out and left some initial feedback. I realize it's WIP, so I just left broad comments.
Force-pushed from 4292954 to 2697e9f.
Thanks for looking at this! I kept the patches to the gtest-parallel script minimal. Feedback not about my patches should be submitted upstream: https://github.com/google/gtest-parallel Also, if someone knows more about autotools, help is very much appreciated to make it run on …
@theuni Mind giving a Concept ACK/NACK or some general comments?
I have an idea for a simpler approach, where you could tell the test binary there are N processes, and which one out of those it is. It would then partition the tests randomly into N groups, and only run one of them. If there's interest, I'll try to implement that soon.
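The idea above could be sketched roughly as follows; this is my own illustration, not @sipa's implementation, and the `partition` helper and salt parameter are assumptions. Hashing each test name gives a stable pseudo-random assignment, so the N processes cover all tests exactly once without coordinating.

```python
import hashlib

# Hypothetical sketch: hash each test name into one of N stable
# pseudo-random partitions; process `index` runs only its own share.
def partition(tests, n, index, salt=''):
    def bucket(name):
        digest = hashlib.sha256((salt + name).encode()).digest()
        return int.from_bytes(digest[:8], 'big') % n
    return [t for t in tests if bucket(t) == index]
```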
@sipa That wouldn't help with running the most time-expensive test first, or with avoiding two expensive tests ending up in the same group?
@MarcoFalke No, but I think that's an independent problem. If some tests take exorbitantly more time than others, perhaps those tests need to be split up.
See #10026 for an (outdated) list of slow unit tests. I haven't checked how practical it is to split them up, but there will always be tests that run slower than others.
Force-pushed from fd67998 to 4209ec2.
The currently longest running test on my machine seems to be "test_big_witness_transaction": …
Repeating concept ACK. Automatic linting comment: …
Force-pushed from 2b3062c to ed03835.
NACK. Preferring #12926 :-)
7ef9cd8 Increase entropy in test temp directory name (Pieter Wuille)
f6dfb0f Reorder travis builds (Pieter Wuille)
156db42 tests: run tests in parallel (Cory Fields)
66f3255 tests: split up actual tests and helper files (Cory Fields)

Pull request description:

This runs the unit tests (`src/test/test_bitcoin`) in 4 separate simultaneous processes, significantly speeding up some Travis runs (over 2x for win32). This uses an approach by @theuni that relies on `make` as the mechanism for distributing tests over processes (through `-j`). For every test .cpp file, we search for `BOOST_FIXTURE_TEST_SUITE` or `BOOST_AUTO_TEST_SUITE`, and then invoke the test binary for just that suite (using `-t`). The (verbose) output is stored in a temporary file, and only shown in the case of failure.

Some makefile reshuffling is necessary to avoid trying to run tests from `src/test/test_bitcoin.cpp`, for example, which contains framework/utility code but no real tests.

Finally, order the Travis jobs from slow to fast (apart from the arm/doc job, which goes first for fast failure). This should help reduce the total wall clock time between opening a PR and Travis finishing, in cases where not all jobs are started simultaneously.

This is an alternative to #12831.

Tree-SHA512: 9f82eb4ade14ac859618da533c7d9df2aa9f5592a076dcc4939beeffd109eda33f7d5480d8f50c0d8b23bf3099759e9f3a2d4c78efb5b66b04569b39b354c185
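The suite discovery described above can be sketched as follows; this is an illustration of the idea, not the actual makefile rule, and the `suite_name` helper and example source line are assumptions:

```python
import re

# Hypothetical sketch: scan a test .cpp file for its BOOST_*_TEST_SUITE
# declaration and build the matching test_bitcoin invocation (-t) for
# just that suite.
def suite_name(source_text):
    m = re.search(r'BOOST_(?:FIXTURE|AUTO)_TEST_SUITE\(\s*(\w+)', source_text)
    return m.group(1) if m else None

source = 'BOOST_FIXTURE_TEST_SUITE(base58_tests, BasicTestingSetup)'
command = ['src/test/test_bitcoin', '-t', suite_name(source)]
```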
As of commit 9ae552468cf096cb281d1ab7c87d9baea56e86c9 google/gtest-parallel@9ae5524
Force-pushed from ed03835 to 7785663.
Rebased
Unit tests can be run in parallel on systems with more than one CPU, and thus complete faster.
Since each test case is run separately from all the others, test cases no longer share global variables.