YutaroHayakawa
Member

@YutaroHayakawa YutaroHayakawa commented May 24, 2024

@YutaroHayakawa YutaroHayakawa added kind/backports This PR provides functionality previously merged into master. backport/1.13 labels May 24, 2024
@auto-committer auto-committer bot temporarily deployed to release-base-images May 24, 2024 07:02 Inactive
@YutaroHayakawa YutaroHayakawa marked this pull request as ready for review May 24, 2024 07:57
@YutaroHayakawa YutaroHayakawa requested review from a team as code owners May 24, 2024 07:57
@YutaroHayakawa YutaroHayakawa requested a review from rolinh May 24, 2024 07:57
@YutaroHayakawa YutaroHayakawa force-pushed the pr/v1.13-backport-2024-05-24-01-59 branch from 2610ee3 to 12a8b5c Compare May 24, 2024 09:17
@YutaroHayakawa YutaroHayakawa temporarily deployed to release-base-images May 24, 2024 09:17 — with GitHub Actions Inactive
Member

@pchaigno pchaigno left a comment

It looks like you backported my PR twice, with the second backport being empty.

@YutaroHayakawa YutaroHayakawa force-pushed the pr/v1.13-backport-2024-05-24-01-59 branch from 12a8b5c to 78f449e Compare May 25, 2024 01:03
@YutaroHayakawa YutaroHayakawa temporarily deployed to release-base-images May 25, 2024 01:03 — with GitHub Actions Inactive
@YutaroHayakawa YutaroHayakawa temporarily deployed to release-base-images May 25, 2024 01:30 — with GitHub Actions Inactive
jibi and others added 10 commits May 27, 2024 14:55
[ upstream commit d15410d ]

[ backporter's note: .github/workflows/tests-e2e-upgrade.yaml doesn't
  exist on v1.13. Removed it. ]

It is fine to ignore the "No egress gateway found" drop reason, as it may be
caused by the kind=echo pods sending traffic while the egressgw policy map is
still being populated.

The actual connectivity test will ensure that the map is in sync with the policy
and that egressgw traffic always goes through the correct gateway.
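
As a rough illustration of what ignoring a drop reason means, the sketch below filters a stream of drop lines and keeps only those whose reason is not on an allow list. The helper name `unexpected_drops` and the sample line format are hypothetical and are not taken from the Cilium workflows.

```shell
# Hypothetical helper: reads drop lines from stdin and prints only those that
# do NOT contain any of the expected drop reasons passed as arguments.
unexpected_drops() {
  while IFS= read -r line; do
    skip=0
    for reason in "$@"; do
      case "$line" in
        *"$reason"*) skip=1 ;;   # expected reason found: ignore this line
      esac
    done
    [ "$skip" -eq 1 ] || printf '%s\n' "$line"
  done
}

# Usage with invented sample lines: only the second line is reported.
printf '%s\n' 'drop: No egress gateway found' 'drop: Policy denied' |
  unexpected_drops 'No egress gateway found'
```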

Signed-off-by: Gilberto Bertin <jibi@cilium.io>

[ upstream commit a1e5295 ]

We had the value, but forgot to plumb it in.

Fixes: #32497

Signed-off-by: Casey Callendrello <cdc@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>

[ upstream commit 9392745 ]

Sometimes, the L4LB tests time out waiting for the docker (in docker) instance
to be ready.

```
+[06:55:18] docker run --privileged --name lb-node -d --network cilium-l4lb -v /lib/modules:/lib/modules docker:dind
ca31f2a72a098bf612d569fee976e7e17779a1546337748dc62b63c99da13271
+[06:55:18] docker exec -t lb-node mount bpffs /sys/fs/bpf -t bpf
+[06:55:18] docker run --name nginx -d --network cilium-l4lb nginx
544abbd0503584c582da50480d5f96eae5ecadb426d48d6fee078558b45451b2
+[06:55:18] docker exec -t lb-node docker ps
+[06:55:18] sleep 1
+[06:55:19] docker exec -t lb-node docker ps
+[06:55:19] sleep 1
+[06:55:20] docker exec -t lb-node docker ps
Error response from daemon: Container ca31f2a72a098bf612d569fee976e7e17779a1546337748dc62b63c99da13271 is not running
+[06:55:20] sleep 1
+[06:55:21] docker exec -t lb-node docker ps
Error response from daemon: Container ca31f2a72a098bf612d569fee976e7e17779a1546337748dc62b63c99da13271 is not running
+[06:55:21] sleep 1
+[06:55:22] docker exec -t lb-node docker ps
Error response from daemon: Container ca31f2a72a098bf612d569fee976e7e17779a1546337748dc62b63c99da13271 is not running
+[06:55:22] sleep 1
+[06:55:23] docker exec -t lb-node docker ps
Error response from daemon: Container ca31f2a72a098bf612d569fee976e7e17779a1546337748dc62b63c99da13271 is not running
+[06:55:20] docker exec -t lb-node docker ps
Error response from daemon: Container ca31f2a72a098bf612d569fee976e7e17779a1546337748dc62b63c99da13271 is not running
```

Unfortunately, fetching the LB logs after the failed test doesn't help either,
as this fails with the same error.

```
Run docker exec -t lb-node docker logs cilium-lb
  docker exec -t lb-node docker logs cilium-lb
...
Error response from daemon: Container ca31f2a72a098bf612d569fee976e7e17779a1546337748dc62b63c99da13271 is not running
```

Therefore, this commit adds an additional job step that fetches the
status and logs of the docker instance itself.
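
A minimal sketch of this idea, assuming the container name `lb-node` from the log above: poll the inner Docker daemon a bounded number of times, and if it never becomes ready, query the outer container directly instead of going through `docker exec`. The function names and the `DOCKER` override are assumptions made for illustration, not the actual workflow step.

```shell
# DOCKER can be overridden (e.g. DOCKER=echo) to dry-run the commands.
: "${DOCKER:=docker}"

# Polls the inner Docker daemon once per second, for at most $1 attempts.
# Returns 0 as soon as "docker ps" inside lb-node succeeds, 1 otherwise.
wait_for_dind() {
  attempts="$1"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$DOCKER" exec lb-node docker ps >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Prints the status and logs of the outer docker:dind container itself.
# These commands work even when lb-node is not running.
dump_dind_info() {
  "$DOCKER" ps
  "$DOCKER" logs lb-node
}
```

A caller could then run `wait_for_dind 30 || dump_dind_info`, where the attempt count of 30 is an arbitrary example value.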

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>

[ upstream commit a9a5ca7 ]

[ backporter's note: Fixed a minor conflict caused by a difference in code
  structure. Removed unnecessary code coming from upstream and only picked
  the restart flag. ]

The Docker-in-Docker container used within the L4LB tests occasionally fails
to start due to a `sed: write error`.

Log output:

```
Certificate request self-signature ok
/certs/server/cert.pem: OK
subject=CN = docker:dind server
Certificate request self-signature ok
subject=CN = docker:dind client
cat: can't open '/proc/net/arp_tables_names': No such file or directory
sed: write error
/certs/client/cert.pem: OK
iptables v1.8.10 (nf_tables)
```

To prevent this error from failing the entire test, this commit
works around it by restarting the container on failure,
up to 10 times.
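
One way to get this behavior is Docker's built-in restart policy. The sketch below adds `--restart=on-failure:10` to the `docker run` command from the earlier log; whether the upstream commit uses exactly this flag value is an assumption, and the function name and `DOCKER` override are hypothetical.

```shell
# DOCKER can be overridden (e.g. DOCKER=echo) to dry-run the command.
: "${DOCKER:=docker}"

# Starts the docker:dind container with a bounded restart policy.
# "on-failure:10" asks the Docker daemon to restart the container when it
# exits with a non-zero status, at most 10 times.
start_lb_node() {
  "$DOCKER" run --privileged --name lb-node -d \
    --restart=on-failure:10 \
    --network cilium-l4lb \
    -v /lib/modules:/lib/modules \
    docker:dind
}
```

Letting the daemon handle the retries keeps the test script free of its own restart loop.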

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>

[ upstream commit 3f995c4 ]

This commit enhances the "fetch dind information" GH action step
from the L4LB test to output all containers (including stopped ones)
and details about the lb-node container.
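
The enhanced step could look roughly like the following sketch. `docker ps -a` lists stopped containers as well, and `docker inspect` prints the full state of `lb-node`, including its exit code and restart count. The function name and the `DOCKER` override are assumptions for illustration only.

```shell
# DOCKER can be overridden (e.g. DOCKER=echo) to dry-run the commands.
: "${DOCKER:=docker}"

# Prints all containers (running and stopped) and the details of lb-node.
dump_dind_details() {
  "$DOCKER" ps -a
  "$DOCKER" inspect lb-node
}
```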

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>

[ upstream commit dd947b3 ]

Whenever GKE stopped supporting a particular version, we had to
manually remove it from all stable branches. Instead, we now
dynamically check whether the version is supported and only then run the test.
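
The check itself can be sketched as a membership test against the list of versions GKE currently offers. The helper `is_supported` is hypothetical, and the version strings in the usage line are made-up examples; in a real workflow the list would come from the GKE API (for instance via `gcloud`), which is not shown here.

```shell
# Hypothetical helper: succeeds if the wanted minor version (e.g. "1.28")
# matches one of the supported versions read from stdin, one per line.
is_supported() {
  want="$1"
  while IFS= read -r version; do
    case "$version" in
      # Exact match, or the wanted version followed by a patch or suffix.
      "$want" | "$want".* | "$want"-*) return 0 ;;
    esac
  done
  return 1
}

# Usage with invented version strings: run the test only if 1.28 is offered.
if printf '%s\n' '1.29.4-gke.1043002' '1.28.9-gke.1000000' | is_supported '1.28'; then
  echo "run test"
else
  echo "skip test"
fi
```

Matching on `"$want".*` rather than a plain prefix keeps `1.2` from accidentally matching `1.28.x`.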

Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>

[ upstream commit ae31ee9 ]

[ backporter's note: .github/workflows/scale-test-100-gce.yaml and
  .github/workflows/scale-test-node-throughput-gce.yaml don't exist on
  v1.13. Removed them. ]

Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>

[ upstream commit 87119e9 ]

Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>

[ upstream commit f8afceb ]

The result of running

```
images/scripts/update-cni-version.sh 1.5.0
```

Signed-off-by: Anton Ippolitov <anton.ippolitov@datadoghq.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>

[ upstream commit 62117e7 ]

[ backporter's comment: Resolved the image hash conflict and executed make
  update-runtime-image and update-builder-image. ]

Signed-off-by: Anton Ippolitov <anton.ippolitov@datadoghq.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
@YutaroHayakawa YutaroHayakawa force-pushed the pr/v1.13-backport-2024-05-24-01-59 branch from f5783d3 to 6cbf76f Compare May 27, 2024 05:55
@YutaroHayakawa YutaroHayakawa temporarily deployed to release-base-images May 27, 2024 05:55 — with GitHub Actions Inactive
@YutaroHayakawa
Member Author

Rebasing to pull in bfbb9be.

@YutaroHayakawa
Member Author

YutaroHayakawa commented May 27, 2024

/test-backport-1.13

Job 'Cilium-PR-K8s-1.25-kernel-4.19' failed:

Test Name

K8sAgentPolicyTest Multi-node policy test with L7 policy using connectivity-check to check datapath

Failure Output

FAIL: cannot install connectivity-check

Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.25-kernel-4.19/1226/

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.25-kernel-4.19 so I can create one.

Then please upload the Jenkins artifacts to that issue.

Job 'Cilium-PR-K8s-1.17-kernel-4.19' failed:

Test Name

K8sAgentPolicyTest Multi-node policy test with L7 policy using connectivity-check to check datapath

Failure Output

FAIL: cannot install connectivity-check

Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.17-kernel-4.19/513/

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.17-kernel-4.19 so I can create one.

Then please upload the Jenkins artifacts to that issue.

Job 'Cilium-PR-K8s-1.17-kernel-4.19' failed:

Test Name

K8sAgentPolicyTest Multi-node policy test with L7 policy using connectivity-check to check datapath

Failure Output

FAIL: cannot install connectivity-check

Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.17-kernel-4.19/514/

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.17-kernel-4.19 so I can create one.

Then please upload the Jenkins artifacts to that issue.

Contributor

@darox darox left a comment

My change looks good.

Member

@pchaigno pchaigno left a comment

Thanks!

@YutaroHayakawa
Member Author

Cilium IPsec upgrade: 1.1.1.1 rate limit
Conformance Cluster Mesh: google.com rate limit

@YutaroHayakawa
Member Author

k8s-1.17-kernel-4.19: #13071
k8s-1.19-kernel-4.19: #13071
k8s-1.25-kernel-4.19: #13071

@YutaroHayakawa
Member Author

/test-1.17-4.19

@YutaroHayakawa
Member Author

/test-1.19-4.19

@YutaroHayakawa
Member Author

/test-1.25-4.19

@YutaroHayakawa
Member Author

k8s-1.17-kernel-4.19: #13071

@YutaroHayakawa
Member Author

/test-1.17-4.19

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 28, 2024
@YutaroHayakawa YutaroHayakawa merged commit f28c498 into v1.13 May 28, 2024
@YutaroHayakawa YutaroHayakawa deleted the pr/v1.13-backport-2024-05-24-01-59 branch May 28, 2024 03:14
Labels
kind/backports This PR provides functionality previously merged into master. ready-to-merge This PR has passed all tests and received consensus from code owners to merge.
Projects
No open projects
Status: Released