These changes, based on `optimized-20170205` (which should've been
`optimized-20170215`), remove the remaining subshells and pipelines that
have a measurable impact, without altering functionality. All pipelines
have been refactored away, and the only subshells that remain are:
- The call to `resolve_link` from `abs_dirname` in `libexec/bats`, since
it needs to invoke one of `greadlink`, `readlink`, or `true`. However,
it's been refactored to eliminate its own command substitution
containing a pipeline, and searches for the appropriate system command
only on the first call.
- The invocation of `tput cols` in `update_screen_width` from
`libexec/bats-format-tap-stream`.
- The invocation of the command passed as the arguments to `run` from
`libexec/bats-exec-test`, naturally, since the command must execute in
its own process and have its output captured.
- The invocation of the command passed as arguments to `buffer` from
`libexec/bats-format-tap-stream`. Keeping this subshell not only avoids
an annoying flicker in the test output compared to removing it
altogether, but somehow actually makes this formatting module perform
slightly better than with no buffering at all. My guess is that making
several terminal I/O calls with lots of formatting codes may be more
expensive than capturing all the text and formatting codes in subshells
and emitting a single terminal I/O call. Also, using a `while read` loop
to append to a string didn't appear to provide a noticeable performance
benefit, so the original implementation remains unchanged.
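As a rough illustration of the first and last items above, the following
sketch shows one way to look up a command only once and one way to
capture a command's output so it can be emitted in a single write. All
names here (`find_readlink_cmd`, `resolve_link_sketch`, `emit_fragments`,
`buffer_sketch`, `flush_sketch`) are hypothetical, and this is not the
code from `libexec/bats` or `libexec/bats-format-tap-stream`; it only
assumes Bash 3.1 or later.

```bash
#!/usr/bin/env bash
# Hypothetical sketch; names are illustrative, not the Bats sources.

_readlink_cmd=''

# Picks greadlink, readlink, or true once; later calls return at once.
# Call it in the current shell (not inside `$(...)`) so the cached
# value persists.
find_readlink_cmd() {
  [[ -n "$_readlink_cmd" ]] && return
  if command -v greadlink >/dev/null; then
    _readlink_cmd='greadlink'
  elif command -v readlink >/dev/null; then
    _readlink_cmd='readlink'
  else
    _readlink_cmd='true'
  fi
}

# Stores the link target of $1 in the variable named by $2 (empty if
# $1 is not a symlink). The command substitution is the one remaining
# subshell; it contains no pipeline.
resolve_link_sketch() {
  find_readlink_cmd
  printf -v "$2" '%s' "$("$_readlink_cmd" "$1" || true)"
}

# Emits one line as several small writes with formatting codes.
emit_fragments() {
  printf '\033[32m'    # switch to green
  printf ' ok %s' "$1"
  printf '\033[0m\n'   # reset formatting
}

_buffer=''

# Runs the given command in a subshell and appends its output to the
# pending buffer. The trailing "x" keeps command substitution from
# stripping the final newline.
buffer_sketch() {
  _buffer+="$("$@"; printf x)"
  _buffer="${_buffer%x}"
}

# Writes everything collected so far with a single printf call.
flush_sketch() {
  printf '%s' "$_buffer"
  _buffer=''
}
```

With these definitions, `buffer_sketch emit_fragments 'some test'`
followed by `flush_sketch` sends the colored line to the terminal in one
call instead of three.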
While the performance improvements are not as dramatic as those from
`optimized-20170205`, they are still significant, and only required a
few straightforward changes to the existing code.
All times were collected on a MacBook Pro with a 2.9GHz Intel Core i5
CPU and 8GB 1867MHz DDR3 RAM. The macOS times were collected on macOS
10.12.3. Times for other operating systems were collected on the same
machine, running under VMware Fusion 8.5.5.
These first two sets of times are for the current Bats test suite on
macOS, which now runs about one second, or ~23%, faster.
macOS/Bash 3.2.57(1)-release before:
46 tests, 0 failures
real 0m4.429s
user 0m2.627s
sys 0m1.473s
After:
real 0m3.404s
user 0m2.163s
sys 0m0.850s
macOS/Bash 4.4.12(1)-release before:
46 tests, 0 failures
real 0m4.497s
user 0m2.526s
sys 0m1.529s
After:
real 0m3.433s
user 0m2.038s
sys 0m0.890s
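The ~23% figure can be checked from the `real` times above. The helpers
below are a hypothetical sketch (`to_ms` and `percent_faster` are not
part of Bats); they use integer arithmetic only, assume exactly three
decimal digits in each time, and truncate rather than round the result.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for checking the percentages; not part of Bats.

# Converts a `time`-style value such as 1m25.806s to milliseconds and
# stores it in the variable named by $2. Assumes exactly three decimal
# digits.
to_ms() {
  local __t="${1%s}"
  local __min="${__t%%m*}"
  local __rest="${__t#*m}"
  local __sec="${__rest%%.*}"
  local __frac="${__rest#*.}"
  # The 10# prefix forces base 10 so values like 062 aren't read as octal.
  printf -v "$2" '%d' \
    "$(( (10#$__min * 60 + 10#$__sec) * 1000 + 10#$__frac ))"
}

# Stores the reduction from $1 (before) to $2 (after), as a percentage
# of $1 with one decimal place, in the variable named by $3.
percent_faster() {
  local __before __after __tenths
  to_ms "$1" __before
  to_ms "$2" __after
  __tenths=$(( (__before - __after) * 1000 / __before ))
  printf -v "$3" '%d.%d' "$(( __tenths / 10 ))" "$(( __tenths % 10 ))"
}

percent_faster 0m4.429s 0m3.404s result
printf '%s%%\n' "$result"   # prints 23.1%
```

Passing results back through `printf -v` rather than `$(...)` keeps the
helpers themselves free of subshells, in the spirit of these changes.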
The remaining times are for the https://github.com/mbland/go-script-bash
test suite at commit c150736fcec0d21a83510d8c6a92e22da733c369 (from
https://github.com/mbland/go-script-bash/pull/166). Improvements are in
the 7%-9% range on most platforms, but as expected, are significantly
greater on some Windows implementations, reaching 12%-13.5%.
macOS/Bash 3.2.57(1)-release before:
789 tests, 0 failures, 2 skipped
real 1m25.806s
user 1m1.779s
sys 0m18.484s
After (~7% faster):
real 1m19.487s
user 0m57.632s
sys 0m15.909s
macOS/Bash 4.4.12(1)-release before:
789 tests, 0 failures, 3 skipped
real 1m20.796s
user 0m55.405s
sys 0m18.684s
After (~7.5% faster):
real 1m14.554s
user 0m51.306s
sys 0m16.101s
Ubuntu Linux 16.10 (yakkety)/Bash 4.3.46(1)-release before:
789 tests, 0 failures, 3 skipped
real 1m9.426s
user 0m31.220s
sys 0m4.224s
After (~8% faster):
real 1m3.911s
user 0m29.856s
sys 0m3.948s
Arch Linux/Bash 4.4.12(1)-release before:
789 tests, 0 failures, 4 skipped
real 0m46.714s
user 0m30.503s
sys 0m3.713s
After (~9% faster):
real 0m42.468s
user 0m28.643s
sys 0m3.693s
Windows 10/Windows Subsystem for Linux/Bash 4.3.11(1)-release before:
789 tests, 0 failures, 3 skipped
real 3m37.525s
user 0m36.969s
sys 3m0.891s
After (~7.5% faster):
real 3m21.062s
user 0m34.078s
sys 2m44.313s
Windows 10/Cygwin/Bash 4.4.12(3)-release before:
789 tests, 0 failures, 3 skipped
real 5m9.691s
user 1m33.075s
sys 2m44.395s
After (~12% faster):
real 4m32.413s
user 1m22.036s
sys 2m22.253s
Windows 10/Git for Windows/Bash 4.3.46(2)-release run from the Windows
Command Prompt (not the Git for Windows MSYS2 terminal) before:
789 tests, 0 failures, 14 skipped
real 5m24.973s
user 1m34.273s
sys 2m47.360s
After (~13.5% faster):
real 4m41.114s
user 1m23.613s
sys 2m21.771s