Skip to content

Conversation

jrajahalme
Copy link
Member

@jrajahalme jrajahalme commented Nov 7, 2024

Use a controller to make sure the proxy port file is written on shutdown.

Move proxy port restoration to the cell just before the trigger for writing the proxy ports file is created. This removes the possibility that the trigger would run before the ports are restored.

Add a configurable time limit ('--restored-proxy-ports-age-limit', default 15 minutes) for the age of the restored proxy ports file to remove the chance that stale port numbers would be used e.g., on upgrade after a downgrade to Cilium 1.15 (which does not store the proxy ports to a file).

@jrajahalme jrajahalme added kind/bug This is a bug in the Cilium logic. area/proxy Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers. labels Nov 7, 2024
@jrajahalme jrajahalme requested a review from a team as a code owner November 7, 2024 13:04
@jrajahalme jrajahalme requested a review from sayboras November 7, 2024 13:04
@maintainer-s-little-helper maintainer-s-little-helper bot added dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Nov 7, 2024
@jrajahalme
Copy link
Member Author

/test

@jrajahalme jrajahalme added needs-backport/1.16 This PR / issue needs backporting to the v1.16 branch affects/v1.14 This issue affects v1.14 branch affects/v1.15 This issue affects v1.15 branch release-note/misc This PR makes changes that have no direct user impact. labels Nov 7, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Nov 7, 2024
@jrajahalme jrajahalme force-pushed the proxy-ports-persistence-fixes branch from 9fc3b55 to 985cecf Compare November 7, 2024 17:57
@jrajahalme jrajahalme requested a review from a team as a code owner November 7, 2024 17:57
@jrajahalme
Copy link
Member Author

/test

Copy link
Member

@sayboras sayboras left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Move proxy port restoration to the cell just before the trigger for
writing the proxy ports file is created. This removes the possibility
that the trigger would run before the ports are restored.

Add a configurable time limit ('--restored-proxy-ports-age-limit',
default 15 minutes) for the age of the restored proxy ports file to
remove the chance that stale port numbers would be used e.g., on upgrade
after a downgrade to Cilium 1.15 (which does not store the proxy ports to
a file).

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
This commit wraps the proxy ports checkpoint function in a controller. This
has two effects:

  1. If the checkpoint fails due to I/O errors, the checkpoint is
     retried until it succeeds. Previously, we relied on it being
     triggered again for it to be re-ran (though so far we have not
     observed any I/O failures in CI, so this likely was not an issue)
  2. By using a controller `StopFunc` and calling `RemoveControllerAndWait`
     we ensure the checkpoint function is ran one last time during
     shutdown. This is important, since `Trigger.Shutdown` by itself does not
     run any pending triggers. Note that any errors during shutdown are
     only logged and the controller is not re-tried.

The proxy ports trigger is closed when the proxy cell is
stopped, which happens e.g. when SIGTERM is received.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
@jrajahalme jrajahalme force-pushed the proxy-ports-persistence-fixes branch from 985cecf to c0a1d0d Compare November 9, 2024 13:32
@jrajahalme
Copy link
Member Author

Updated docs

@jrajahalme
Copy link
Member Author

/test

@jrajahalme
Copy link
Member Author

closing & reopening to kick CI again

@jrajahalme jrajahalme closed this Nov 9, 2024
@jrajahalme jrajahalme reopened this Nov 9, 2024
@jrajahalme
Copy link
Member Author

/test

Copy link
Contributor

@ldelossa ldelossa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jrajahalme jrajahalme enabled auto-merge November 11, 2024 19:05
@jrajahalme jrajahalme added this pull request to the merge queue Nov 11, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to no response for status checks Nov 11, 2024
@jrajahalme jrajahalme added this pull request to the merge queue Nov 12, 2024
Merged via the queue into cilium:main with commit 1dd7b63 Nov 12, 2024
94 of 109 checks passed
@jrajahalme jrajahalme deleted the proxy-ports-persistence-fixes branch November 12, 2024 09:37
@viktor-kurchenko viktor-kurchenko mentioned this pull request Nov 12, 2024
13 tasks
@viktor-kurchenko viktor-kurchenko added backport-pending/1.16 The backport for Cilium 1.16.x for this PR is in progress. and removed needs-backport/1.16 This PR / issue needs backporting to the v1.16 branch labels Nov 12, 2024
@github-actions github-actions bot added backport-done/1.16 The backport for Cilium 1.16.x for this PR is done. and removed backport-pending/1.16 The backport for Cilium 1.16.x for this PR is in progress. labels Nov 12, 2024
@github-actions github-actions bot added the backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. label Nov 13, 2024
@github-actions github-actions bot added the backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. label Nov 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects/v1.14 This issue affects v1.14 branch affects/v1.15 This issue affects v1.15 branch area/proxy Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. backport-done/1.16 The backport for Cilium 1.16.x for this PR is done. kind/bug This is a bug in the Cilium logic. release-note/misc This PR makes changes that have no direct user impact.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants