Conversation

@gseddon commented May 6, 2025

Hi,
A bigger change here but I hope it is valuable! We want to contribute HTTP/3 support to Oha. This is an initial CR which refactors the client.rs to use common functions for http1 and http2 load generation.
It also introduces a HttpWorkType enum which is matched on to choose which functionality to execute, rather than the previous is_http2() function.

This lays the groundwork for adding a H3 work type in a following PR, and a http3 load generation common function.
This will be feature gated due to its experimental nature.
This has been split off into a separate PR to make them easier to understand and review individually. It's nice to be able to have a refactor PR which doesn't need to change any tests - it is internal structural changes only.

This PR has had an initial review at gseddon#1.
The HTTP/3 branch is at gseddon#2; I will open that PR once this one is merged.

@hatoo (Owner) commented May 7, 2025

Hi, HttpWorkType looks good.
But please leave the load generation functions as they were.
Your changes are bad for performance, and it's not working on my machine:

cargo run -- -z 10s http://127.0.0.1:3000
# stop after 50 requests ... 

@gseddon (Author) commented May 12, 2025

Hi, I fixed the performance bug for HTTP1 with 'deadline' enabled. Good catch, sorry. As for performance impact, these changes should have minimal effect. I've done a few runs comparing before and after and found no difference at all.
The only part I've split out into functions is the setup of the initial loop for each of the load generation patterns. These only get called once per load generation run, so the impact is essentially nothing.
Do you want me to perform some more benchmarking and come back with data?

@gseddon (Author) commented May 21, 2025

Would it be better if I split this into smaller changes?

@hatoo (Owner) commented May 21, 2025

Would it be better if I split this into smaller changes?

Sorry for the delay.
Yes, it would be very helpful if you split this into smaller PRs.

@hatoo (Owner) commented May 24, 2025

The only part I've split out into functions is the setup of the initial loop for each of the load generation patterns. These only get called once per load generation run, so the impact is essentially nothing.

That's not true for fn work(). It now uses a channel instead of an AtomicUsize, and that introduces some overhead.

In my environment, the difference is noticeable.

❯ cargo run --profile release-ci -- -n 10000000 -c 1000 http://localhost:3000
    Finished `release-ci` profile [optimized] target(s) in 0.15s
     Running `target/release-ci/oha -n 10000000 -c 1000 'http://localhost:3000'`
Summary:
  Success rate: 100.00%
  Total:        16.9223 secs
  Slowest:      0.1381 secs
  Fastest:      0.0000 secs
  Average:      0.0017 secs
  Requests/sec: 590935.7094
❯ oha -n 10000000 -c 1000 http://localhost:3000 # v1.8.0
Summary:
  Success rate: 100.00%
  Total:        16.3692 secs
  Slowest:      0.1259 secs
  Fastest:      0.0000 secs
  Average:      0.0016 secs
  Requests/sec: 610903.7964

Of course, the performance gap is small, but I care about it.

@hatoo (Owner) commented May 24, 2025

Could you make this PR contain only the HttpWorkType changes?
We can discuss the refactoring in other PRs. I think the general idea of the refactoring is good. Thank you.

@gseddon (Author) commented May 29, 2025

Hi, so I removed the refactoring for the work_until function that used the endless_emitter, because that had more of a slowdown than the other paths. What are your thoughts on the latest changes?
I think having a 1% performance difference on the 'slow' path is not so bad, given the amount of code that is removed. The 'fast' path should still be just as fast.
Also, what server are you using to test oha against? I want to use the same one so I can make sure we're measuring the same thing.

@hatoo (Owner) commented May 30, 2025

Could you limit this PR to just the HttpWorkType changes?
I really care about the performance of all paths.

For this code,

async fn parallel_work_http1(
    n_connections: usize,
    rx: AsyncReceiver<Option<Instant>>,
    report_tx: kanal::Sender<Result<RequestResult, ClientError>>,
    client: Arc<Client>,
    deadline: Option<std::time::Instant>,
) -> Vec<tokio::task::JoinHandle<()>> {

The rx: AsyncReceiver<Option<Instant>> adds some overhead for workers without latency correction, because they always send None, but the size of that message isn't zero.

I test against https://github.com/hatoo/sandbag

@gseddon (Author) commented May 30, 2025

Ok, I can do that. One thought to discuss, though: is it a good thing for oha to have different performance depending on how a user invokes it? For example, all of the qps paths will perform, say, 1% worse because they use the channels. So the user might see different performance from their application depending on whether they use -q or not, or even between the -n and -z flags. Would it be better to have the same function used for all of these paths so that the user gets consistent performance? Also, if you then optimise that one function, all of the load tests automatically get faster.

@hatoo (Owner) commented May 31, 2025

is it a good thing for oha to have different performance depending on the way a user invokes it? For example, all of the qps paths will perform say 1% worse because they use the channels. So the user might see a different performance of their application depending on whether they use -q or not, or even perhaps between the -n and -z flags. So is it better to have the same function used for all of these paths so that the user gets consistent performance?

Yeah, performance variation depending on command line flags is not good.
But keeping the code optimal is more important to me.

Also then if you performance optimise that one function, then all of the load tests automatically get faster.

I'm not saying I'll refuse all refactoring. I think we can refactor the work* functions to use a common function while keeping the current performance by using generics.

@hatoo (Owner) commented Jun 15, 2025

Done at #746

@hatoo closed this Jun 15, 2025