Skip to content

Conversation

jrajahalme
Copy link
Member

@jrajahalme jrajahalme commented Jun 8, 2024

Persist proxy ports in /var/run/cilium/state/proxy_ports_state.json so that proxy ports (including the DNS proxy port) can also be restored when BPF tproxy is used. This file is written each time we would upsert a TPROXY iptables rule and restored on restart. If the file does not exist proxy ports are restored from iptables rules instead. This is needed for upgrade compatibility.

Earlier commits in this series fix related issues in proxy port handling that are mainly meaningful for restoration with daemonset Envoy proxy:

  • Clear proxy port on failure so that we'll try another random port next time
  • Try previously used ports first when recreating a redirect
  • Rename AllocateProxyPort as AllocateCRDProxyPort, simplify prototype
  • Remove "localOnly", it is already always true

The last two commits introduce the main changes:

  • Restore all proxy ports from iptables rules (rather than only DNS proxy port)
  • Persist proxy ports in /var/run/cilium/state/proxy_ports_state.json

Since this mainly enhances use of daemonset Envoy proxy by reducing Listener churn on Cilium agent restarts and daemonset Envoy is enabled by default only from Cilium 1.16, there is no need to backport these changes.

@jrajahalme jrajahalme added area/proxy Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers. release-note/misc This PR makes changes that have no direct user impact. dont-merge/blocked Another PR must be merged before this one. labels Jun 8, 2024
@jrajahalme jrajahalme requested review from a team as code owners June 8, 2024 16:44
@github-actions github-actions bot added the sig/policy Impacts whether traffic is allowed or denied based on user-defined policies. label Jun 8, 2024
@jrajahalme
Copy link
Member Author

/test

@jrajahalme jrajahalme force-pushed the proxy-persist-proxy-ports branch from a62ba38 to 8c31a27 Compare June 9, 2024 11:27
@jrajahalme
Copy link
Member Author

/test

@jrajahalme jrajahalme force-pushed the proxy-persist-proxy-ports branch from 8c31a27 to 721fd19 Compare June 10, 2024 06:19
@jrajahalme jrajahalme removed the dont-merge/blocked Another PR must be merged before this one. label Jun 10, 2024
@jrajahalme
Copy link
Member Author

/test

@jrajahalme jrajahalme added the release-blocker/1.16 This issue will prevent the release of the next version of Cilium. label Jun 10, 2024
Copy link
Member

@pippolo84 pippolo84 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Some questions and minor things left inline.

@jrajahalme jrajahalme force-pushed the proxy-persist-proxy-ports branch from 721fd19 to 6cb72cf Compare June 11, 2024 12:13
Copy link
Member

@pippolo84 pippolo84 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks! 🚀

Clear proxy port on failure so that we'll try another random port next
time.

We have two failure cases for creating new redirects:

- syncronous, for DNS proxy. Simply zero the proxy port in the retry
  loop.

- asynchronous, for Envoy proxy. Clear proxy port state on revert
  callback.

Noticed the need for this change when accidentally trying to use port 1
for Envoy when developing another feature. In this case Envoy can't bind
the port, and all further tries were also trying the same port, as the
proxy port was not cleared on revert.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
"rulesPort" is non-zero when we have a datapath (iptables) rules with a
specific port. Try re-creating a redirect with that port if no other port
is configured. This reduces churn on the datapath, as the existing rule
can be reused without a need for an update.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
Rename AllocateProxyPort as AllocateCRDProxyPort to make it clear that it
is only to be used for CEC/CCEC CRD listeners. Also enforce the current
practice where CRD proxy ports are only accessed by name, and 'ingress'
member is always false for CRD ports.

Fix the unit test to verify that the proxy port was not reallocated for
the same listener name.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
We now only ever have 'localOnly = true' so that we always bind the proxy
listeners to localhost address. Remove support for all-hosts addresses.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
Previously we only reused the DNS proxy port from the datapath. Change to
reuse the old proxy ports for all proxy redirects from datapath, if
possible. This reduces Listener churn on daemonset Envoy on agent
restart.

With this change the listener update in Envoy logs after agent restart
looks like this:

  begin add/update listener: name=cilium-http-egress:13563 hash=11560876577076369351
  duplicate/locked listener 'cilium-http-egress:13563'. no add/update
  lds: add/update listener 'cilium-http-egress:13563' skipped

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
@jrajahalme jrajahalme force-pushed the proxy-persist-proxy-ports branch from 6cb72cf to f1e6866 Compare June 13, 2024 18:16
Write proxy ports into /var/cilium/state/proxy_ports_state.json and
restore from there. If the file is not available, only then restore from
iptables rules. Restoration from iptables is still needed for upgrades.

This fixes the issue with bpf TPROXY, as it does not rely on iptables
rules, so the proxy ports can not be recovered from them.

Note: File IO is patterned after similar methods for
      cache.CachingIdentityAllocator.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
@jrajahalme jrajahalme force-pushed the proxy-persist-proxy-ports branch from f1e6866 to 43e9793 Compare June 13, 2024 18:34
@jrajahalme
Copy link
Member Author

/test

@jrajahalme jrajahalme requested a review from mhofstetter June 13, 2024 20:46
Copy link
Member

@mhofstetter mhofstetter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - only one suggestion left 🎉

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jun 14, 2024
@jrajahalme jrajahalme added this pull request to the merge queue Jun 14, 2024
Merged via the queue into cilium:main with commit 9f210b5 Jun 14, 2024
@jrajahalme jrajahalme deleted the proxy-persist-proxy-ports branch June 14, 2024 10:20
@jrajahalme jrajahalme mentioned this pull request Oct 31, 2024
1 task
@jrajahalme jrajahalme added the backport-pending/1.15 The backport for Cilium 1.15.x for this PR is in progress. label Oct 31, 2024
@jrajahalme jrajahalme mentioned this pull request Oct 31, 2024
1 task
@jrajahalme jrajahalme added the backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. label Oct 31, 2024
@github-actions github-actions bot added backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. and removed backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. backport-pending/1.15 The backport for Cilium 1.15.x for this PR is in progress. labels Nov 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/proxy Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-blocker/1.16 This issue will prevent the release of the next version of Cilium. release-note/misc This PR makes changes that have no direct user impact. sig/policy Impacts whether traffic is allowed or denied based on user-defined policies.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants