Add parallel streams throughput tests, and enable them in the EGW workflow #38027
/scale-egw
944b9cf to b5ebc1d
/scale-egw
Let's not enable the egress gateway functionality at all in the baseline test, rather than enabling it but not configuring any policy. This ensures that baseline Cilium CPU/memory usage is not influenced by any Egress Gateway related tasks.
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
b5ebc1d to e8e36e8
Rebased onto main to drop the commit already merged via #37981.
/scale-egw
Extend the CLI performance tests to support measuring the throughput with multiple parallel streams, rather than a single one. This allows bypassing possible single-flow limits due to resource exhaustion or cloud provider limits, such as [1]. Unfortunately, netperf does not support multiple flows out of the box, hence we need to be a bit creative to run multiple instances and parse the resulting output.
[1]: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-network-bandwidth.html
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
e8e36e8 to a4c8dfa
Following the extension of this logic to additionally support multiple parallel streams, let's add a unit test exercising the different paths, for increased confidence in its correctness, and to prevent future regressions.
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Enable the parallel streams tests in the egress gateway scale test, so that we circumvent the single-flow limitations imposed by AWS [1]. Additionally, let's change the node instance type to m5n.xlarge, as it supports up to 25 Gbps burst bandwidth.
[1]: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-network-bandwidth.html
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Gather the amount of traffic forwarded by the gateway node during the network performance test, and assert that it is close to 0 in the baseline test, and significantly higher in the EGW test. This serves as a sanity check that traffic is indeed flowing through the egress gateway when expected.
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
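A minimal sketch of the sanity check described in the last commit message, assuming the forwarded traffic is available as a byte count. The function name and both thresholds are hypothetical and chosen only for illustration; the actual check in the workflow may use different metrics and limits.

```python
def gateway_traffic_ok(forwarded_bytes, egw_enabled,
                       baseline_max=10 * 1024**2, egw_min=1024**3):
    """Check the traffic forwarded by the gateway node during the test.

    Both thresholds are assumptions made for this sketch:
    - baseline_max: upper bound for "close to 0" in the baseline test.
    - egw_min: lower bound for "significantly higher" in the EGW test.
    """
    if egw_enabled:
        # Traffic is expected to flow through the egress gateway.
        return forwarded_bytes >= egw_min
    # Baseline: the gateway node should forward (almost) nothing.
    return forwarded_bytes <= baseline_max
```

A baseline run that forwards many gigabytes, or an EGW run that forwards only a few kilobytes, would fail this check and point at a misrouted test.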
a4c8dfa to f46aeb9
/test
@giorio94 Nice! Do you have a sense of the before/after throughput gain?
Single-stream flows are limited to 5 Gbps on AWS [1]. The multi-flow tests instead saturate the 25 Gbps of bandwidth available on these instances (except UDP + egress gateway, which tops out at about 15 Gbps due to CPU saturation).
[1]: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-network-bandwidth.html
Extend the CLI performance tests to support measuring the throughput with multiple parallel streams, rather than a single one. This allows bypassing possible single-flow limits due to resource exhaustion or cloud provider limits, such as [1]. Unfortunately, netperf does not support multiple flows out of the box, hence we need to be a bit creative to run multiple instances and parse the resulting output.
Enable the new test as part of the egress gateway scale test, and change the node instance type to m5n.xlarge, as it supports up to 25 Gbps burst bandwidth.
Please review commit by commit, and refer to the individual commit messages for additional details.
[1]: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-network-bandwidth.html
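The idea of running several benchmark instances at once and combining their output can be sketched as follows. This is an illustration only, not the code from this PR (which lives in the CLI): the function names are hypothetical, and it assumes each instance prints its throughput as the last whitespace-separated token of its output.

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess


def run_command(cmd):
    """Run one benchmark process and return its standard output."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


def parse_throughput(output):
    """Parse one stream's result.

    Assumption: the last whitespace-separated token of the output is the
    throughput of that stream, as a plain number.
    """
    return float(output.split()[-1])


def measure_parallel(cmd, streams, runner=run_command):
    """Start `streams` copies of `cmd` concurrently and sum their throughput."""
    # One worker per stream, so that all instances run at the same time.
    with ThreadPoolExecutor(max_workers=streams) as pool:
        outputs = list(pool.map(runner, [cmd] * streams))
    return sum(parse_throughput(out) for out in outputs)
```

Here `cmd` stands for whatever netperf invocation the test uses; its exact flags are not shown in this conversation, so none are assumed. The `runner` parameter exists only so the aggregation can be exercised without starting real processes.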