CI: GKE e2e broken #20013

@aanm

Description

2022-05-30T20:52:02.727624604Z level=debug msg="Compiling datapath" clang="clang version 10.0.0 (https://github.com/llvm/llvm-project.git 425a44e84b18e222cc38ac40603db1653a80a6cb)\nTarget: x86_64-unknown-linux-gnu\nThread model: posix\nInstalledDir: /usr/bin\n" debug=true llc="LLVM (http://llvm.org/):\n  LLVM version 10.0.0\n  Optimized build.\n  Default target: x86_64-unknown-linux-gnu\n  Host CPU: broadwell\n\n  Registered Targets:\n    bpf   - BPF (host endian)\n    bpfeb - BPF (big endian)\n    bpfel - BPF (little endian)\n" subsys=datapath-loader
2022-05-30T20:52:02.727653676Z level=debug msg="Launching compiler" args="[-emit-llvm -g -O2 -target bpf -std=gnu89 -nostdinc -D__NR_CPUS__=2 -Wall -Wextra -Werror -Wshadow -Wno-address-of-packed-member -Wno-unknown-warning-option -Wno-gnu-variable-sized-type-not-at-end -Wdeclaration-after-statement -Wimplicit-int-conversion -Wenum-conversion -I/var/run/cilium/state/globals -I/var/run/cilium/state/templates/8743fb584f8329a934ae5daf7c5e9e8d091fadabc143ffc04aa3be81963f43da -I/var/lib/cilium/bpf -I/var/lib/cilium/bpf/include -c /var/lib/cilium/bpf/bpf_host.c -o -]" subsys=datapath-loader target=clang
2022-05-30T20:52:02.910445133Z level=info msg="Initializing Cilium API" subsys=daemon
2022-05-30T20:52:02.910479123Z level=error msg="Failed to compile bpf_host.dbg.o: exit status 1" compiler-pid=329 linker-pid=330 subsys=datapath-loader
2022-05-30T20:52:02.910486313Z level=debug msg="/var/lib/cilium/bpf/bpf_host.c:420:10: error: implicit declaration of function 'lookup_ip4_remote_endpoint' [-Werror,-Wimplicit-function-declaration]" subsys=datapath-loader
2022-05-30T20:52:02.910523452Z level=debug msg="                info = lookup_ip4_remote_endpoint(ip4->saddr);" subsys=datapath-loader
2022-05-30T20:52:02.910532829Z level=debug msg="                       ^" subsys=datapath-loader
2022-05-30T20:52:02.910538147Z level=debug msg="/var/lib/cilium/bpf/bpf_host.c:420:10: note: did you mean 'lookup_ip4_endpoint'?" subsys=datapath-loader
2022-05-30T20:52:02.910542994Z level=debug msg="/var/lib/cilium/bpf/lib/eps.h:41:1: note: 'lookup_ip4_endpoint' declared here" subsys=datapath-loader
2022-05-30T20:52:02.910548533Z level=debug msg="lookup_ip4_endpoint(const struct iphdr *ip4)" subsys=datapath-loader
2022-05-30T20:52:02.910554932Z level=debug msg=^ subsys=datapath-loader
2022-05-30T20:52:02.910563230Z level=debug msg="/var/lib/cilium/bpf/bpf_host.c:420:8: error: incompatible integer to pointer conversion assigning to 'struct remote_endpoint_info *' from 'int' [-Werror,-Wint-conversion]" subsys=datapath-loader
2022-05-30T20:52:02.910570939Z level=debug msg="                info = lookup_ip4_remote_endpoint(ip4->saddr);" subsys=datapath-loader
2022-05-30T20:52:02.910578010Z level=debug msg="                     ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~" subsys=datapath-loader
2022-05-30T20:52:02.910585280Z level=debug msg="2 errors generated." subsys=datapath-loader
2022-05-30T20:52:02.910610943Z level=debug msg="JoinEP: Failed to compile" debug=true error="Failed to compile bpf_host.dbg.o: exit status 1" params="&{Source:bpf_host.c Output:bpf_host.dbg.o OutputType:obj Options:[]}" subsys=datapath-loader
2022-05-30T20:52:02.910631079Z level=error msg="BPF template object creation failed" bpfHeaderfileHash=8743fb584f8329a934ae5daf7c5e9e8d091fadabc143ffc04aa3be81963f43da error="failed to compile template program /var/run/cilium/state/templates/8743fb584f8329a934ae5daf7c5e9e8d091fadabc143ffc04aa3be81963f43da: Failed to compile bpf_host.dbg.o: exit status 1" subsys=datapath-loader
2022-05-30T20:52:02.910636199Z level=warning msg="BPF template compilation unsuccessful" bpfHeaderfileHash=8743fb584f8329a934ae5daf7c5e9e8d091fadabc143ffc04aa3be81963f43da error="Could not locate previously compiled BPF template" subsys=datapath-loader
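The clang diagnostics in the log above are interleaved with other agent messages. A minimal sketch for pulling out only the compiler `error:` and `note:` lines, assuming the logfmt layout shown above; the function name `clang_errors` and the `<cilium-pod>` placeholder are hypothetical:

```shell
# Hypothetical helper: read agent logs on stdin and keep only the clang
# diagnostics relayed by the datapath-loader subsystem.
clang_errors() {
  grep 'subsys=datapath-loader' | grep -E 'msg="[^"]*: (error|note): '
}

# Usage (assumes the agent runs in kube-system; <cilium-pod> is the
# agent pod on the affected node):
#   kubectl -n kube-system logs <cilium-pod> | clang_errors
```

The second `grep` matches only inside the `msg="..."` field, so summary lines such as "Failed to compile bpf_host.dbg.o: exit status 1" are skipped.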

commit sha 6a8fe29

Restarting the cilium container on an affected node makes the issue go away. It only happens on new nodes.
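One way to approximate the restart is to delete the agent pod on the affected node and let the DaemonSet recreate it. This is a sketch, not necessarily how the restart was done here: the function name is hypothetical, and the `kube-system` namespace and `k8s-app=cilium` label are assumed from a default Cilium install.

```shell
# Hypothetical helper: delete the Cilium agent pod scheduled on one node;
# the DaemonSet controller then starts a fresh pod there.
# Assumes namespace kube-system and label k8s-app=cilium.
restart_cilium_on_node() {
  node="$1"
  kubectl -n kube-system delete pod \
    -l k8s-app=cilium \
    --field-selector "spec.nodeName=${node}"
}

# Usage, with the node name from the `uname -a` output in this report:
#   restart_cilium_on_node gke-cluster1-default-pool-584196dd-rsbl
```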

Steps to reproduce:

CLUSTER_NAME=cluster1
CLUSTER_ZONE=europe-west6-a
image_type="COS_CONTAINERD"
gcloud container clusters create "$CLUSTER_NAME" \
--zone "$CLUSTER_ZONE" \
--image-type "$image_type" \
--num-nodes 2 \
--enable-ip-alias \
--create-subnetwork="" \
--machine-type e2-custom-2-4096 \
--disk-type pd-standard \
--disk-size 10GB \
--node-taints node.cilium.io/agent-not-ready=true:NoExecute \
--preemptible \
--version=1.22.8-gke.201

gcloud container clusters get-credentials $CLUSTER_NAME --zone $CLUSTER_ZONE

cilium install --cluster-name=$CLUSTER_NAME \
--chart-directory=install/kubernetes/cilium \
--helm-set=image.repository=quay.io/cilium/cilium-ci \
--helm-set=image.useDigest=false \
--helm-set=image.tag=6a8fe290930f73da0311813259f93ac244b7004a \
--helm-set=operator.image.repository=quay.io/cilium/operator \
--helm-set=operator.image.suffix=-ci \
--helm-set=operator.image.tag=6a8fe290930f73da0311813259f93ac244b7004a \
--helm-set=operator.image.useDigest=false \
--helm-set=clustermesh.apiserver.image.repository=quay.io/cilium/clustermesh-apiserver-ci \
--helm-set=clustermesh.apiserver.image.tag=6a8fe290930f73da0311813259f93ac244b7004a \
--helm-set=clustermesh.apiserver.image.useDigest=false \
--helm-set=hubble.relay.image.repository=quay.io/cilium/hubble-relay-ci \
--helm-set=hubble.relay.image.tag=6a8fe290930f73da0311813259f93ac244b7004a \
--wait=false \
--rollback=false \
--config monitor-aggregation=none \
--base-version=v1.12 \
--config=debug=true \
--version=

$ cilium version
cilium-cli: v0.11.7 compiled with go1.18 on linux/amd64
cilium image (default): v1.11.5
cilium image (stable): v1.11.5
cilium image (running): -ci:6a8fe290930f73da0311813259f93ac244b7004a
$ uname -a
Linux gke-cluster1-default-pool-584196dd-rsbl 5.10.90+ #1 SMP Sat Mar 5 10:09:49 UTC 2022 x86_64 Intel(R) Xeon(R) CPU @ 2.20GHz GenuineIntel GNU/Linux

Metadata

Labels

area/CI: Continuous Integration testing issue or flake
ci/flake: This is a known failure that occurs in the tree. Please investigate me!
priority/high: This is considered vital to an upcoming release.
