Skip to content

Conversation

pchaigno and others added 16 commits November 12, 2024 10:38
[ upstream commit 9bf29a6 ]

This change is enough to cover IPsec+KPR+Ingress because Ingress tests
are already included in the CLI.

To double check this change, I modified the pull request to upload
sysdumps even in case of success and ran a quick sanity check on those
sysdumps.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
[ upstream commit 84f38b6 ]

This commit adds coverage for GatewayAPI+IPsec in the Gateway API
workflow.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
[ upstream commit 64dc69d ]

The sysdump filename is not built with the same commands between its
creation and its uploading. This wasn't an issue before the previous
commit because both outputs would still match. However, the previous
commit extended the matrix and surfaced this issue.

This commit ensures the exact same command is used to generate the
sysdump filenames at creation and upload times.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
[ upstream commit 91a844c ]

This commit adds some basic checks from the Cilium CLI connectivity
tests:
- no-unexpected-packet-drops is trivial and will report any unexpected
  packet drops.
- the pod-to-pod and node-to-node encryption tests will ensure that
  traffic is encrypted or not according to the configuration. This is
  useful here since the workflow may now be running with IPsec enabled.
- allow-all-except-world will check connectivity with some basic network
  policy. This is also useful in the context of IPsec as some
  IPsec-related bugs can cause the source identity to be lost. Having a
  policy that allows everything but not world is typically enough to catch
  such cases.

The check-log-errors test should also be enabled, but it is currently
failing because there are errors in logs whenever enabling the Gateway
API. It can be added once those errors are fixed.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
[ upstream commit 9e651bc ]

IPsec now works in combination with kube-proxy replacement. It is
however still not supported with BPF Host Routing as we need to go
through the Linux stack to perform the actual encryption and decryption.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
[ upstream commit 0dc222d ]

This workflow doesn't run the CLI connectivity tests, so it's missing
some basic checks such as unexpected packet drops. This commit adds it.

The check-log-errors test should also be enabled, but it is currently
failing because there are errors in logs whenever enabling Ingress. It
can be added once those errors are fixed.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
[ upstream commit 06cca1b ]

WireGuard strict mode isn't supported with IPv6. At the moment, a
warning is emitted if both are enabled at the same time, to warn the
user that IPv6 traffic won't be protected.

There is however not much the user can do in this case to make the
warning go away. Gray also pointed out that users should know that
already as they've had to configure encryption.strictMode.cidr with an
IPv4 CIDR. It is also documented in the guide. This commit therefore
reduces the log to an Info level.

Suggested-by: Gray Liang <gray.liang@isovalent.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
[ upstream commit a47a4d7 ]

The kernel requirement is actually >= 5.2. And as RHEL 8.6 also has
the relevant support, we can just remove the text that describes the
limitation.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
[ upstream commit 0495279 ]

When a node's public key changes, as is the case after a node restart,
all other nodes delete the peer entry for the old key and create a new
peer entry for the new key. However, the logic introduced by commit
3ebfa4d ("wireguard: Use dummy peer for allowed IP removal") fails
to reinitialize the allowed IPs state for the new entry, namely the
node IPs.

TestAgent_AllowedIPsRestoration actually did have test coverage for this
scenario, but passed with a false positive since k8s2PubKey and
k8s2PubKey2 were not different enough it seems. Setting k8s2PubKey2 to a
completely different key makes the unit test fail as it should have.

Reinitialize peer inside UpdatePeer() whenever a node's public key
changes to trigger a full resync of node and pod IPs for the new key,
and ensure that TestAgent_AllowedIPsRestoration properly tests for this
scenario.

Fixes: #35644
Fixes: #34612
Fixes: 3ebfa4d ("wireguard: Use dummy peer for allowed IP removal")

Signed-off-by: Jordan Rife <jrife@google.com>
Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
[ upstream commit 5264b70 ]

Pass the workflow step timeout (less 1 minute for incidentals) to go
test, as its default (10 minutes) has been exceeded causing unnecessary
test flakes.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
[ upstream commit 8272fd1 ]

Add Helm option 'envoy.initialFetchTimeoutSeconds' and Cilium CLI option
'proxy-initial-fetch-timeout' (seconds), defaulting to 30 seconds.
Envoy default is 15 seconds, which may go by while we wait for the
endpoints to have regenerated during restart.

[ Backporter's notes: conflicts resolved in the following files by
  running `make -C install/kubernetes && make -C Documentation update-helm-values`
  - install/kubernetes/cilium/README.md
  - Documentation/helm-values.rst
]

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
[ upstream commit b724ab0 ]

When using iptables for masquerading, the HostFW egress path needs to
differentiate between "host-originating traffic" and "pod-originating
but masqueraded with a Host IP". This uses several mechanisms:
1. for host-originating traffic, the skb is marked with MARK_MAGIC_HOST
   by an iptables rule. The to-netdev program then selects the packet's
   **real** source identity as HOST_ID.
2. for other traffic, the resolve_srcid_ipv*() call in
   handle_to_netdev_ipv*() performs an ipcache lookup to resolve the
   "apparent" source identity.

The HostFW code then takes the following decisions:
1. for a packet that is **neither* originating from Host **nor** SNATed
   with a Host IP, skip the HostFW path.
2. for a packet that is **not** originating from Host **but** SNATed with
   a Host IP, create a CT entry but skip the HostFW policy. The CT entry
   is needed so that replies can match it and skip Ingress Network policy.
3. for a packet that is genuinely originating from Host, create a CT entry
   **and** apply HostFW policy.

Unfortunately this logic broke with commit
b0e0b0c ("bpf: propagate src sec id from ingress bpf_overlay to egress bpf_host").
With this change the to-netdev program now also extracts the actual
identity from the skb->mark for pod-originating packets. When
resolve_srcid_ipv*() receives this identity, it skips the ipcache lookup.
And therefore ipv*_host_policy_egress_lookup() determines that the packet
is not SNATed with a Host IP, and exits the HostFW path without creating
the expected CT entry.

In short we take case (1), when we should take case (2). Thus inbound
replies can't be matched against a CT entry, fall into scope of Ingress
Network policy and are potentially dropped.

Quick-fix this problem by manually over-riding the source identity in the
HostFW path, so that we again perform the needed ipcache lookup. This
isn't pretty, but clearly restores old behavior and is straight-forward
from a review perspective.

Fixes: #35535
Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
[ upstream commit db39851 ]

This tests IPv4 with iptables masquerading for the following cases:

1. masqueraded pod traffic
2. Host traffic with egress-deny policy
3. Host traffic with egress-allow policy and reply

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
[ upstream commit 25c125b ]

This warning is emitted when the kernel doesn't support socket-level
LB tracing. That feature is however enabled by default and only kernels
5.7+ support it. That means this warning will be displayed by default to
a lot of users.

This commit reduces the log level to Info to avoid that and to be
consistent with other advanced features that are enabled by default.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
[ upstream commit 66c4826 ]

This code seems to access a variable without holding a lock, when that
variable can be modified at runtime by other logic. Add locking to avoid
weird and wonderful race conditions.

Found by code inspection.

Signed-off-by: Joe Stringer <joe@cilium.io>
Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
[ upstream commit 70ec9b3 ]

Move proxy port restoration to the cell just before the trigger for
writing the proxy ports file is created. This removes the possibility
that the trigger would run before the ports are restored.

Add a configurable time limit ('--restored-proxy-ports-age-limit',
default 15 minutes) for the age of the restored proxy ports file to
remove the chance that stale port numbers would be used e.g., on upgrade
after a downgrade to Cilium 1.15 (which does not store the proxy ports to
a file).

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
@viktor-kurchenko viktor-kurchenko added kind/backports This PR provides functionality previously merged into master. backport/1.16 This PR represents a backport for Cilium 1.16.x of a PR that was merged to main. labels Nov 12, 2024
Copy link
Member

@jrajahalme jrajahalme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM on my PRs :-)

[ upstream commit 1dd7b63 ]

This commit wraps the proxy ports checkpoint function in a controller. This
has two effects:

  1. If the checkpoint fails due to I/O errors, the checkpoint is
     retried until it succeeds. Previously, we relied on it being
     triggered again for it to be re-ran (though so far we have not
     observed any I/O failures in CI, so this likely was not an issue)
  2. By using a controller `StopFunc` and calling `RemoveControllerAndWait`
     we ensure the checkpoint function is ran one last time during
     shutdown. This is important, since `Trigger.Shutdown` by itself does not
     run any pending triggers. Note that any errors during shutdown are
     only logged and the controller is not re-tried.

The proxy ports trigger is closed when the proxy cell is
stopped, which happens e.g. when SIGTERM is received.

[ Backporter's notes: import conflicts manually resolved in
  `pkg/proxy/cell.go` file. ]

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
@viktor-kurchenko viktor-kurchenko force-pushed the pr/v1.16-backport-2024-11-12-10-38 branch from e7f6066 to 58c469d Compare November 12, 2024 10:02
@viktor-kurchenko
Copy link
Contributor Author

/test-backport-1.16

@pchaigno pchaigno marked this pull request as ready for review November 12, 2024 11:24
@pchaigno pchaigno requested review from a team as code owners November 12, 2024 11:24
@pchaigno pchaigno requested a review from a team as a code owner November 12, 2024 11:24
@pchaigno pchaigno added this pull request to the merge queue Nov 12, 2024
Merged via the queue into v1.16 with commit 212a6ff Nov 12, 2024
296 checks passed
@pchaigno pchaigno deleted the pr/v1.16-backport-2024-11-12-10-38 branch November 12, 2024 11:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.16 This PR represents a backport for Cilium 1.16.x of a PR that was merged to main. kind/backports This PR provides functionality previously merged into master.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants