-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Reduce TestIntegration#restart_does_not_drop_connections
restart frequency
#3589
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce TestIntegration#restart_does_not_drop_connections
restart frequency
#3589
Conversation
TestIntegration#restart_does_not_drop_connections
restart frequency
1732a15
to
ac6bdca
Compare
cfaf63a
to
d6155e8
Compare
def test_phased_restart_does_not_drop_connections_threads_fork_worker | ||
restart_does_not_drop_connections num_threads: 10, total_requests: 3_000, | ||
signal: :USR1 #, config: 'fork_worker', log: true | ||
signal: :USR1, config: 'fork_worker' | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is the only test that needs sleep 1
in restart_does_not_drop_connections
, what about passing that in from the test? Making it more clear why we need to sleep 1 second
Or does it always make sense to sleep 1 second? I'm not sure what we're waiting for, that should probably be explained better in the comment above the sleep.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or does it always make sense to sleep 1 second? I'm not sure what we're waiting for, that should probably be explained better in the comment above the sleep.
IMO it always makes sense to sleep 1 second. The sleep is just how long we wait before signalling another restart. All the tests that use #restart_does_not_drop_connections
at most set a total_requests
of 3000. With a sleep of 0.15 we restart ~10 times, so something like 300 requests before each restart. With a sleep of 1, ~3 times, so 1000 requests before each restart.
Seeing as the tests are to ensure connections aren't dropped:
- I don't think the number of restarts really matters, as long as there's at least one
- I don't think (hot/phased) restarting puma every 300 requests consecutively is a realistic scenario anyway
Essentially, this is just an arbitrary threshold that just so happens to time out one particular test, and I don't think adjusting it to the new value for all tests affects their accuracy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, thanks for explaining!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There were still 1/2 macOS failures each run with the increased sleep. Tried increasing it further and also added an assertion for restart count to ensure there are always at least 2 restarts and the assertion started failing i.e. with that high a sleep we only restart once. I instead went for an exponential back-off which seems to be yield a more consistent and desirable result.
The remaining failure seems to be unrelated, would appreciate if someone could rerun to confirm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, thanks.
Yes, unrelated: #3478 (comment)
eb47e82
to
e2191b5
Compare
e2191b5
to
b1f5f48
Compare
Good day. Thanks for working on this. The test probably evolved from code tested locally. That often can be an issue with GHA CI. I'm sure you've had test code that ran fine locally, but failed when run in GHA. I did a revamp of the code, where the number of restarts can be specified. See: master...MSP-Greg:puma:00-hot-restart I've run it in CI a few times, and it seems more reliable than the current code. Any thoughts? Feel free to use the code. I haven't worked on the |
Description
Context:
With
fork_worker
un-commented and the originalsleep
the test times out for me locally, but doesn't on this branch. We go from ~10 restarts to ~3 restarts in a non-fork_worker
test like this one, which I think is perfectly reasonable for what these tests are asserting.Fixing this is necessary to unblock #3297.
cc/ @MSP-Greg
Your checklist for this pull request
[ci skip]
to the title of the PR.#issue
" to the PR description or my commit messages.