jrajahalme
Member

Delay listening on the Cilium xDS socket until endpoints have been restored. This eliminates the problem where Envoy times out on the initial fetch on the xDS streams.

The Cilium agent now waits until endpoints have been restored before it starts accepting new xDS streams.

@jrajahalme jrajahalme added kind/bug This is a bug in the Cilium logic. area/proxy Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers. release-note/bug This PR fixes an issue in a previous release of Cilium. affects/v1.14 This issue affects v1.14 branch affects/v1.15 This issue affects v1.15 branch release-blocker/1.16 This issue will prevent the release of the next version of Cilium. needs-backport/1.16 This PR / issue needs backporting to the v1.16 branch labels Nov 14, 2024
@jrajahalme jrajahalme requested review from a team as code owners November 14, 2024 12:34
@jrajahalme
Member Author

/test

@jrajahalme jrajahalme added affects/v1.16 This issue affects v1.16 branch and removed release-blocker/1.16 This issue will prevent the release of the next version of Cilium. labels Nov 14, 2024
@jrajahalme jrajahalme force-pushed the envoy-xDS-delay-listening branch from 9cb1867 to 86866f6 Compare November 14, 2024 17:50
@jrajahalme
Member Author

Rebased for a CI fix

@jrajahalme
Member Author

/test

@jrajahalme jrajahalme force-pushed the envoy-xDS-delay-listening branch from 86866f6 to 594444e Compare November 14, 2024 18:54
@jrajahalme
Member Author

/test

Initialize daemon.endpointRestoreComplete channel early so that it is not
possible for anyone to start waiting for it to get closed before it is
initialized, as that would lead to them waiting forever (receive from a
nil channel blocks forever).

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
Delay listening on the Cilium xDS socket until endpoints have been
restored. This eliminates the problem where Envoy times out on the
initial fetch on the xDS streams.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
@jrajahalme jrajahalme force-pushed the envoy-xDS-delay-listening branch from 594444e to c2fd6c6 Compare November 15, 2024 12:19
@jrajahalme jrajahalme requested a review from a team as a code owner November 15, 2024 12:19
@jrajahalme
Member Author

/test

@jrajahalme jrajahalme removed affects/v1.14 This issue affects v1.14 branch affects/v1.15 This issue affects v1.15 branch affects/v1.16 This issue affects v1.16 branch labels Nov 15, 2024
@jrajahalme jrajahalme enabled auto-merge November 15, 2024 18:02
@jrajahalme jrajahalme added this pull request to the merge queue Nov 16, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Nov 16, 2024
restorer, err := s.restorerPromise.Await(ctx)
if err == nil && restorer != nil {
log.Infof("Envoy: Waiting for endpoint restoration before serving xDS resources...")
err = restorer.WaitForEndpointRestore(ctx)
Contributor

Quick question, to confirm: is this the err value that is intended to be checked below against context.Canceled?

Merged via the queue into cilium:main with commit a17d694 Nov 16, 2024
66 checks passed
@jrajahalme jrajahalme deleted the envoy-xDS-delay-listening branch November 16, 2024 00:31
@jrajahalme jrajahalme added backport/author The backport will be carried out by the author of the PR. backport-pending/1.16 The backport for Cilium 1.16.x for this PR is in progress. and removed needs-backport/1.16 This PR / issue needs backporting to the v1.16 branch labels Nov 19, 2024
@github-actions github-actions bot added backport-done/1.16 The backport for Cilium 1.16.x for this PR is done. and removed backport-pending/1.16 The backport for Cilium 1.16.x for this PR is in progress. labels Dec 3, 2024
Labels
area/proxy Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers. backport/author The backport will be carried out by the author of the PR. backport-done/1.16 The backport for Cilium 1.16.x for this PR is done. kind/bug This is a bug in the Cilium logic. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
Archived in project
3 participants