Skip to content

Conversation

YutaroHayakawa
Copy link
Member

@YutaroHayakawa YutaroHayakawa commented May 23, 2024

@YutaroHayakawa YutaroHayakawa added kind/backports This PR provides functionality previously merged into master. backport/1.15 This PR represents a backport for Cilium 1.15.x of a PR that was merged to main. labels May 23, 2024
Copy link
Member

@giorio94 giorio94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! My commit looks good

Copy link
Contributor

@michi-covalent michi-covalent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ty~!

Copy link
Member

@mhofstetter mhofstetter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Yutaro! My commits look good.

Copy link
Member

@joestringer joestringer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving for backport of my PR. I also glanced through my codeowners and nothing obvious seemed out of place (though I would delegate to the original authors if they have specific concerns)

@auto-committer auto-committer bot temporarily deployed to release-base-images May 23, 2024 16:48 Inactive
@YutaroHayakawa YutaroHayakawa force-pushed the pr/v1.15-backport-2024-05-24-12-27 branch from 42f0011 to 8368e1e Compare May 24, 2024 01:02
@auto-committer auto-committer bot had a problem deploying to release-base-images May 24, 2024 01:18 Error
@YutaroHayakawa YutaroHayakawa force-pushed the pr/v1.15-backport-2024-05-24-12-27 branch from e6114df to 93fab9b Compare May 24, 2024 01:53
@YutaroHayakawa YutaroHayakawa temporarily deployed to release-base-images May 24, 2024 01:53 — with GitHub Actions Inactive
joestringer and others added 21 commits May 25, 2024 00:43
[ upstream commit 715ed4c ]

This script would check for CHARTS_PATH and CHARTS_TOOL, but it never
actually invoked them. We've renamed the script now so it's out of date.

Signed-off-by: Joe Stringer <joe@cilium.io>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit a37fd43 ]

Signed-off-by: Michael Mykhaylov <32168861+mikemykhaylov@users.noreply.github.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 57db22b ]

Syft appears to offer higher quality results, including being able to identify
the versions of individual libraries in golang binaries, which is not something
that we were able to do efficiently with `bom`.

Syft can also generate the SBOMs in spdx-json format, which was a pending TODO
from our initial implementation of SBOM generation.

Signed-off-by: Feroz Salam <feroz.salam@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 104a302 ]

The clustermesh logic is currently affected by a possible, although
rare, race condition occurring if the cluster configuration is being
retrieved while the connection to the remote cluster is stopped.
Indeed, this operation stops two controllers -- the one handling the
connection to the remote cluster and the one responsible for the
retrieval of the cluster config. However, this causes the
getRemoteCluster function to possibly terminate before the termination
of the second controller, in turn leading to a panic due to send on
closed channel. Let's fix this issue by explicitly removing only the
first controller, and letting the other terminate normally due to the
parent context having been terminated. Hence, ensuring that the
controller has always terminated before closing the cfgch channel.

Fixes: #32179
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 4aa6a7f ]

[ backporter's note: lb_service.go and lb_service_test.go are renamed in
  the upstream. Move the logic from service.go and service_test.go. ]

Filtering out backends which are terminating when creating local
endpoint state. This will result in quicker route withdrawal if local
backends go into terminating state, without waiting for graceful
shutdown period of the pods.

Signed-off-by: harsimran pabla <hpabla@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 3989f02 ]

Adds a note to the docs that `kube-apiserver`
entity is not available for ingress traffic
in AKS, EKS and GKE.

Signed-off-by: darox <maderdario@gmail.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 580d31b ]

It's causing unnecessary flakes. Money is not a concern so let's not
bother using spot instances.

Signed-off-by: Michi Mutsuzaki <michi@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 83e5a9e ]

Similar to EKS, money is not a concern here so let's not bother with
preemptible instances.

Signed-off-by: Michi Mutsuzaki <michi@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit d15410d ]

[ backporter's note: Adjust the minor cilium-cli invocation conflict ]

it's fine to ignore the "No egress gateway found" drop reason as this may be
caused by the kind=echo pods sending traffic while the egressgw policy map is
still being populated.

The actual connectivity test will ensure that the map is in sync with the policy
and that egressgw traffic always go through the correct gateway

Signed-off-by: Gilberto Bertin <jibi@cilium.io>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit a1e5295 ]

We had the value, but forgot to plumb it in.

Fixes: #32497

Signed-off-by: Casey Callendrello <cdc@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 9392745 ]

Sometimes, the L4LB tests timeout waiting for the docker (in docker) instance
to be ready.

```
+[06:55:18] docker run --privileged --name lb-node -d --network cilium-l4lb -v /lib/modules:/lib/modules docker:dind
ca31f2a72a098bf612d569fee976e7e17779a1546337748dc62b63c99da13271
+[06:55:18] docker exec -t lb-node mount bpffs /sys/fs/bpf -t bpf
+[06:55:18] docker run --name nginx -d --network cilium-l4lb nginx
544abbd0503584c582da50480d5f96eae5ecadb426d48d6fee078558b45451b2
+[06:55:18] docker exec -t lb-node docker ps
+[06:55:18] sleep 1
+[06:55:19] docker exec -t lb-node docker ps
+[06:55:19] sleep 1
+[06:55:20] docker exec -t lb-node docker ps
Error response from daemon: Container ca31f2a72a098bf612d569fee976e7e17779a1546337748dc62b63c99da13271 is not running
+[06:55:20] sleep 1
+[06:55:21] docker exec -t lb-node docker ps
Error response from daemon: Container ca31f2a72a098bf612d569fee976e7e17779a1546337748dc62b63c99da13271 is not running
+[06:55:21] sleep 1
+[06:55:22] docker exec -t lb-node docker ps
Error response from daemon: Container ca31f2a72a098bf612d569fee976e7e17779a1546337748dc62b63c99da13271 is not running
+[06:55:22] sleep 1
+[06:55:23] docker exec -t lb-node docker ps
Error response from daemon: Container ca31f2a72a098bf612d569fee976e7e17779a1546337748dc62b63c99da13271 is not running
+[06:55:20] docker exec -t lb-node docker ps
Error response from daemon: Container ca31f2a72a098bf612d569fee976e7e17779a1546337748dc62b63c99da13271 is not running
```

Unfortunately, fetching the LB logs after the failed test doesn't help either,
as this fails with the same error.

```
Run docker exec -t lb-node docker logs cilium-lb
  docker exec -t lb-node docker logs cilium-lb
...
Error response from daemon: Container ca31f2a72a098bf612d569fee976e7e17779a1546337748dc62b63c99da13271 is not running
```

Therefore, this commit adds an additional job step that fetches the
status and logs of the docker instance itself.

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 2b31c34 ]

The index used for ipv4PrivateIndex/ipv4PublicIndex was wrong in cases
where the address was skipped. Use the proper index to which the address
is appended.

Fixes: 7da6514 ("tables: Sort node addresses also by public vs private IP")
Signed-off-by: Jussi Maki <jussi@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 485b405 ]

No functional change, just to make further revision easier.

Signed-off-by: gray <gray.liang@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 9a68ec2 ]

This patch introduces a workaround to avoid kernel issue when deleting
xfrm states.

Let's start from the kernel issue.

After installing two xfrm states on the same host using below commands
(please note the differences on the mark, mask, src):

```
ip x s a src 10.244.1.43 dst 10.244.3.114 proto esp spi 0x00000003 reqid 1 mode tunnel replay-window 0 mark 0x2a450d00 mask 0xffff0f00 output-mark 0xd00 mask 0xffffff00 aead 'rfc4106(gcm(aes))' 0x42a89014074c49243219a20cc87cadd7b9c0d7d1 128 sel src 0.0.0.0/0 dst 0.0.0.0/0
ip x s a src 0.0.0.0 dst 10.244.3.114 proto esp spi 0x00000003 reqid 1 mode tunnel replay-window 0 mark 0xd00 mask 0xf00 output-mark 0xd00 mask 0xffffff00 aead 'rfc4106(gcm(aes))' 0x42a89014074c49243219a20cc87cadd7b9c0d7d1 128 sel src 0.0.0.0/0 dst 0.0.0.0/0
```

When trying to delete the first xfrm state using the following command,
Linux kernel will instead remove the second xfrm state but keep the
first:

```
ip x s d src 10.244.1.43 dst 10.244.3.114 proto esp spi 0x00000003 mark 0x2a450d00 mask 0xffff0f00
```

This causes troubles for cilium upgrade.

A real world scenario for cilium upgrade from 1.13.12 to 1.13.14 could
be like:
1. Before upgrade, the node has "old-style" xfrm state to catch mark
   "0xd00/0xf00" for ingress traffic; old bpf programs also set "0xd00"
   mark to ingress skbs;
2. Upgrade begins, bpf programs are reloaded to new version, thereafter
   ingress skbs are marked with "0xXXXX0d00";
3. After a short while, cilium-agent installs new xfrm states to catch
   traffic with specific mark "0xXXXX0d00";

During window between step 2 and 3, cilium relies on "old-style" xfrm
states "0xd00/0xf00" to catch traffic with specific mark "0xXXXX0d00".

So far so good.

However, in a large scale cluster it's inevitable to receive
NodeDeletion events during upgrade due to node churn. Once seeing a
NodeDeletion event, cilium-agent will remove the xfrm state for that
gone-away remote node.

Now we hit the aforementioned kernel issue: cilium-agent tries to delete
the xfrm state catching more specific mark, but kernel wrongly removes
the one catching general mark.

This causing traffic disruption until upgrade completes with all new xfrm
states installed.

This patch provides an elegant solution at low cost: if cilium-agent
wants to remove a xfrm state catching specific mark, it has to
temporarily remove the xfrm state catching general mark first and add it
back after:
1. Temporarily remove the xfrm states catching the general mark;
2. Remove the xfrm state we really care abot;
3. Add back the temporaily removed one on step 1;

Indeed there will be a small window between temporary removing and
adding back, but our past test shows the window lasts 200-900µs only, so short
that we shoudn't see many drops.

Suggested-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: gray <gray.liang@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit a9a5ca7 ]

[ backporter's note: The code structure of test/nat46x64/test.sh are
  different between main and v1.15. Take the v1.15's structure and
  apply restarting parameter change on top of it. ]

Docker in Docker container used within L4LB tests occasionally fails
to start due to a `sed: write error`.

log output:

```
Certificate request self-signature ok
/certs/server/cert.pem: OK
subject=CN = docker:dind server
Certificate request self-signature ok
subject=CN = docker:dind client
cat: can't open '/proc/net/arp_tables_names': No such file or directory
sed: write error
/certs/client/cert.pem: OK
iptables v1.8.10 (nf_tables)
```

To prevent this error from causing the entire test to fail,
this commit tries to fix this by restarting the container
in case of a failure up to 10 times.

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 3f995c4 ]

This commit enhances the "fetch dind information" GH action step
from the L4LB test to output all containers (including stopped ones)
and details about the lb-node container.

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit dd947b3 ]

Whenever GKE stopped supporting a particular version of GKE, we had to
manually remove it from all stable branches. Now instead of that, we
will dynamically check if it's supported and only then run the test.

Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit ae31ee9 ]

[ backporter's comment: Scale tests don't exist in the v1.15 branch.
  Exclude the changes for them. ]

Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 87119e9 ]

Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit f8afceb ]

The result of running

```
images/scripts/update-cni-version.sh 1.5.0
```

Signed-off-by: Anton Ippolitov <anton.ippolitov@datadoghq.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 62117e7 ]

[ backporter's note: Executed scripts/update-cilium-runtime-image.sh ]

Signed-off-by: Anton Ippolitov <anton.ippolitov@datadoghq.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
@YutaroHayakawa YutaroHayakawa force-pushed the pr/v1.15-backport-2024-05-24-12-27 branch from 5f74d58 to b17ef43 Compare May 24, 2024 15:43
@YutaroHayakawa YutaroHayakawa temporarily deployed to release-base-images May 24, 2024 15:43 — with GitHub Actions Inactive
@YutaroHayakawa
Copy link
Member Author

Rebased on top of the latest v1.15 to pull-in #32707.

@YutaroHayakawa
Copy link
Member Author

/test-backport-1.15

Copy link
Contributor

@marseel marseel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like GKE tests work now, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.15 This PR represents a backport for Cilium 1.15.x of a PR that was merged to main. kind/backports This PR provides functionality previously merged into master.
Projects
No open projects
Status: Released
Development

Successfully merging this pull request may close these issues.