Skip to content

Conversation

jrajahalme
Copy link
Member

@jrajahalme jrajahalme commented Nov 13, 2024

Add new --curl-parallel option to cilium connectivity test to allow testing with multiple parallel curl requests to add stress to agent/proxy synchronization. This is opt-in, so by default only one request will be sent by curl commands. Takes the number of concurrent requests, e.g., cilium connectivity test --curl-parallel=3

Add new agent command line option --proxy-max-concurrent-retries (default 128, helm value envoy.maxConcurrentRetries). This is an increase to the Envoy default of only 3 on each upstream cluster. Higher number is warranted due to the default maximum number of upstream connections being 1024, and the fact that Cilium configures upstream clusters that are shared for all local endpoints. The new default is applied also to Envoy Clusters configured via CEC/CCEC CRDs.

Finally, expose the agent command line option --http-retry-count (default 3) via a new helm value envoy.httpRetryCount to make tuning this easier.

Cilium-cli connectivity test now supports use of parallel requests with curl

@jrajahalme jrajahalme added release-note/misc This PR makes changes that have no direct user impact. cilium-cli-exclusive This PR only impacts cilium-cli binary labels Nov 13, 2024
@jrajahalme jrajahalme requested review from a team as code owners November 13, 2024 09:40
@github-actions github-actions bot added the cilium-cli This PR contains changes related with cilium-cli label Nov 13, 2024
@jrajahalme
Copy link
Member Author

/test

Copy link
Member

@tklauser tklauser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given this is mainly targetted towards stress-testing agent<->proxy interaction, would it make sense to make this conditional on tests using l7 policies?

It also looks like there are some legitimate failures, mainly in tests involving l7 policies.

@jrajahalme
Copy link
Member Author

Given this is mainly targetted towards stress-testing agent<->proxy interaction, would it make sense to make this conditional on tests using l7 policies?

It also looks like there are some legitimate failures, mainly in tests involving l7 policies.

Yes, but toFQDN policies could be affected also without Envoy redirection, depending on how the requests land on the shared DNS client, or if some of the requests would get responses before the bpf maps get updated, etc.

Just looked at the EKS failures, some of the upstream connections get errno 111 (ECONNREFUSED), unfortunately those flows were not in the hubble-flows files (first flow is just after the last failure).

@jrajahalme jrajahalme force-pushed the pr/jrajahalme/cilium-cli-enable-curl-parallel branch from 3012312 to eff74d2 Compare November 13, 2024 14:39
@jrajahalme jrajahalme requested review from a team as code owners November 13, 2024 14:39
@jrajahalme jrajahalme requested a review from sayboras November 13, 2024 14:39
@jrajahalme jrajahalme marked this pull request as draft November 13, 2024 14:39
@jrajahalme jrajahalme force-pushed the pr/jrajahalme/cilium-cli-enable-curl-parallel branch from eff74d2 to b42f93b Compare November 13, 2024 14:42
@jrajahalme
Copy link
Member Author

/test

@jrajahalme jrajahalme force-pushed the pr/jrajahalme/cilium-cli-enable-curl-parallel branch 2 times, most recently from 7a61ec1 to 5552c0d Compare November 13, 2024 16:27
@jrajahalme
Copy link
Member Author

/test

@jrajahalme
Copy link
Member Author

Found an issue with maxRetries not being set on the cluster circuit breakers. This defaults to 3 concurrent retries on any of our "policy enforcement clusters" and could be an issue since we share them for all the endpoints. Added change to make the limit to be 128 instead, which feels proportional to the default limit on concurrent connections (1024).

@jrajahalme jrajahalme force-pushed the pr/jrajahalme/cilium-cli-enable-curl-parallel branch from 5552c0d to c962cc6 Compare November 14, 2024 13:44
@jrajahalme
Copy link
Member Author

/test

@jrajahalme jrajahalme force-pushed the pr/jrajahalme/cilium-cli-enable-curl-parallel branch from c962cc6 to b949b6f Compare November 14, 2024 15:48
@jrajahalme jrajahalme changed the title cilium-cli: Test with multiple parallel curl requests cilium-cli: Support testing with multiple parallel curl requests Nov 14, 2024
@jrajahalme jrajahalme marked this pull request as ready for review November 14, 2024 15:56
@jrajahalme jrajahalme requested a review from tklauser November 14, 2024 15:57
@jrajahalme
Copy link
Member Author

/test

@jrajahalme jrajahalme force-pushed the pr/jrajahalme/cilium-cli-enable-curl-parallel branch from b949b6f to 64817c1 Compare November 14, 2024 17:48
@jrajahalme
Copy link
Member Author

Rebased for a CI fix

@jrajahalme
Copy link
Member Author

/test

@jrajahalme jrajahalme force-pushed the pr/jrajahalme/cilium-cli-enable-curl-parallel branch from 64817c1 to cce4572 Compare November 15, 2024 10:10
@jrajahalme
Copy link
Member Author

/test

@jrajahalme jrajahalme enabled auto-merge November 15, 2024 18:02
@tklauser tklauser added the dont-merge/needs-rebase This PR needs to be rebased because it has merge conflicts. label Nov 22, 2024
@tklauser
Copy link
Member

@jrajahalme it looks like this picked up a merge conflict. Could you please rebase?

@sayboras could you please review on behalf of @cilium/sig-servicemesh and @cilium/envoy?

Add new 'connectivity test' option '--curl-parallel' which, when set to a
value greater than one causes each curl invocation to issue parallel
requests on concurrent connections. This can be used to add stress to
agent/proxy (DNS and/or Envoy) synchronization.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
Envoy default limit on concurrent retries on a cluster is 3. This has
been seen hit on CI when tests use multiple parallel requests from
curl. Increase the limit to 128 via a new helm value
'envoy.maxConcurrentRetries' and Cilium Agent command line argument
'--proxy-max-concurrent-retries'.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
Envoy HTTP retry count defaults to 3. Expose it via helm value
'envoy.httpRetryCount' to make tuning of it easier.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
@jrajahalme jrajahalme force-pushed the pr/jrajahalme/cilium-cli-enable-curl-parallel branch from cce4572 to 42defdc Compare December 5, 2024 12:41
@jrajahalme
Copy link
Member Author

Rebased.

@jrajahalme
Copy link
Member Author

/test

@jrajahalme jrajahalme removed the dont-merge/needs-rebase This PR needs to be rebased because it has merge conflicts. label Dec 5, 2024
@jrajahalme jrajahalme added this pull request to the merge queue Dec 6, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Dec 6, 2024
Merged via the queue into main with commit 550fdd6 Dec 6, 2024
286 checks passed
@jrajahalme jrajahalme deleted the pr/jrajahalme/cilium-cli-enable-curl-parallel branch December 6, 2024 05:11
sayboras added a commit that referenced this pull request Dec 16, 2024
Relates: #35949
Signed-off-by: Tam Mach <tam.mach@cilium.io>
github-merge-queue bot pushed a commit that referenced this pull request Dec 16, 2024
Relates: #35949
Signed-off-by: Tam Mach <tam.mach@cilium.io>
@joestringer joestringer removed the cilium-cli-exclusive This PR only impacts cilium-cli binary label Jan 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cilium-cli This PR contains changes related with cilium-cli ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants