ci(fix): resolve flakiness in upd::fuzz_test #2689

boquan-fang · 2025-07-01T01:31:08Z

Release Summary:

Resolved issues:

Description of changes:

The daily scheduled CI run detected flakiness in udp::fuzz_test. In my forked repo, I changed the test to replay the failing input. The test passed locally on my EC2 instance but failed in GitHub Actions. The changed code can be found in this snippet. I believe the test fails in GitHub Actions because my EC2 instance is faster than the machine the GitHub Actions job runs on. I then increased the test duration from 120 seconds to 300 seconds, and the test succeeded:

test stream::tests::request_response::udp::fuzz_test has been running for over 60 seconds
...
test stream::tests::request_response::udp::fuzz_test ... ok
test result: ok. 327 passed; 0 failed; 2 ignored; 0 measured; 0 filtered out; finished in 253.08s

I then ran the same test with the test duration set to 180 seconds, and it succeeded as well. Hence, I conclude that the udp::fuzz_test flakiness is caused by the test duration. Bolero sometimes generates a client and server that need a long time to run the test, which causes the flakiness. Increasing the test duration will mitigate that.
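To illustrate why a fixed duration can make the same input pass on a fast machine and fail on a slow one, here is a minimal sketch using only the Rust standard library. It is not the harness udp::fuzz_test uses; `run_with_limit` is a hypothetical helper introduced only for illustration.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: runs `work` on a helper thread and waits at most
/// `limit` for its result.
///
/// Returns `None` if the limit expires first, or if the worker panics
/// before sending a result. The helper thread is not cancelled on timeout;
/// it is simply left to finish on its own.
pub fn run_with_limit<T, F>(limit: Duration, work: F) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The send fails only if the receiver was dropped after a timeout,
        // so the error can be ignored.
        let _ = tx.send(work());
    });
    rx.recv_timeout(limit).ok()
}
```

With this helper, identical work returns `Some(..)` under a generous limit and `None` under a tight one, which is the pattern observed between the EC2 instance and the GitHub Actions job.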

Call-outs:

I intentionally chose 180 seconds over 300 seconds because we should increase the duration by the smallest amount that makes the test pass. If that proves insufficient, we can increase it again.
I also fixed a typo that I noticed in a comment.

Testing:

The test method is described in the Description of changes section above.

Analysis:

The failed input looks like:

(
    Client {
        delays: Delays {
            read: 1.058095ms,
            write: 1.560203ms,
            shutdown_write: 949.615µs,
            shutdown_read: 1.532273ms,
            drop: 1.61057ms,
        },
        count: 3,
        concurrency: 2,
        max_read_len: 29444,
        max_mtu: Some(
            3153,
        ),
    },
    Server {
        delays: Delays {
            read: 569.478µs,
            write: 1.385789ms,
            shutdown_write: 1.391925ms,
            shutdown_read: 746.348µs,
            drop: 800.292µs,
        },
        count: 4,
        max_read_len: 1,
        max_mtu: Some(
            2252,
        ),
    },
    [
        Request {
            count: 5,
            request_size: 98312,
            response_size: 95544,
        },
    ],
)

The server's max_read_len is set to 1 due to fuzz test randomness, which makes the server read extremely slowly. I believe that's why this test takes a long time.
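As a rough worked example of the effect, the sketch below counts the minimum number of read calls needed to drain the request payload. It assumes that `count: 5` in the Request entry means five requests of `request_size` bytes each, and that each read call returns at most `max_read_len` bytes. Both are my reading of the dump, not confirmed behavior of the test harness, and `min_read_calls` is a hypothetical helper.

```rust
/// Hypothetical helper: lower bound on the number of read calls needed to
/// drain `total_bytes` when each call returns at most `max_read_len` bytes.
pub fn min_read_calls(total_bytes: u64, max_read_len: u64) -> u64 {
    assert!(max_read_len > 0, "max_read_len must be non-zero");
    // Ceiling division: a partial final chunk still costs one read call.
    (total_bytes + max_read_len - 1) / max_read_len
}

/// Total request payload under the assumption stated above:
/// `count` requests of `request_size` bytes each.
pub fn total_request_bytes(count: u64, request_size: u64) -> u64 {
    count * request_size
}
```

Under these assumptions the payload is 5 × 98312 = 491560 bytes, so a max_read_len of 1 needs at least 491560 read calls. For comparison, the client's max_read_len of 29444 would need only 17 calls for the same number of bytes.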

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

maddeleine · 2025-07-01T17:25:25Z

Are you able to reliably reproduce this failure using that bolero input in a GitHub Actions environment?

Edit: Sounds like the answer is yes.

boquan-fang · 2025-07-01T17:37:38Z

This PR is pending a quic-attack run. We can merge it after the quic-attack check passes on this PR.

boquan-fang · 2025-07-02T20:32:24Z

This is the quic-attack job run against this PR. The run was successful.

ci(fix): resolve flakiness in upd::fuzz_test

d76b37b

boquan-fang requested review from maddeleine and WesleyRosenblum July 1, 2025 01:31

maddeleine approved these changes Jul 1, 2025


boquan-fang merged commit 1ff931d into aws:main Jul 2, 2025
117 checks passed

boquan-fang deleted the upd-fuzz-test-flakiness branch July 10, 2025 00:17

boquan-fang mentioned this pull request Aug 22, 2025

refactor(s2n-quic-dc): increase fuzz_test's timeout #2773

Closed
