CI: Suite-k8s-1.26.K8sUpdates Tests upgrade and downgrade from a Cilium stable image to master #24687

Description

@julianwiedmann

Test Name

K8sUpdates Tests upgrade and downgrade from a Cilium stable image to master

Failure Output

Cilium "1.13.90" did not become ready in time

Stack Trace

/home/jenkins/workspace/Cilium-PR-K8s-1.26-kernel-net-next/src/github.com/cilium/cilium/test/ginkgo-ext/scopes.go:515
Timed out after 247.395s.
Cilium "1.13.90" did not become ready in time
Expected
    <*errors.errorString | 0xc0013c0a30>: {
        s: "unable to retrieve daemonset kube-system/cilium: Exitcode: -1 \nErr: signal: killed\nStdout:\n \t \nStderr:\n \t \n",
    }
to be nil
/home/jenkins/workspace/Cilium-PR-K8s-1.26-kernel-net-next/src/github.com/cilium/cilium/test/k8s/updates.go:230
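
For context: `Exitcode: -1` with `Err: signal: killed` (status 9 in the wrapped errors below) means the `kubectl` child process was SIGKILLed before it returned, most likely by the harness's command timeout, so the DaemonSet state was never actually read. For manual triage, a rough shell equivalent of the readiness check the harness polls (a sketch, not the harness's exact invocation):

```sh
# Compare desired vs. ready pods for the cilium DaemonSet; readiness
# means the two numbers match. Assumes a kubeconfig for the test cluster.
kubectl -n kube-system get daemonset cilium \
  -o jsonpath='{.status.desiredNumberScheduled} {.status.numberReady}'
```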

Standard Output

Number of "context deadline exceeded" in logs: 0
Number of "level=error" in logs: 0
Number of "level=warning" in logs: 0
Number of "Cilium API handler panicked" in logs: 0
Number of "Goroutine took lock for more than" in logs: 0
No errors/warnings found in logs
Number of "context deadline exceeded" in logs: 0
Number of "level=error" in logs: 0
Number of "level=warning" in logs: 0
Number of "Cilium API handler panicked" in logs: 0
Number of "Goroutine took lock for more than" in logs: 0
No errors/warnings found in logs
Number of "context deadline exceeded" in logs: 0
Number of "level=error" in logs: 0
Number of "level=warning" in logs: 0
Number of "Cilium API handler panicked" in logs: 0
Number of "Goroutine took lock for more than" in logs: 0
No errors/warnings found in logs
Cilium pods: []
Netpols loaded: 
CiliumNetworkPolicies loaded: 
Endpoint Policy Enforcement:
Pod   Ingress   Egress

Standard Error

18:45:40 STEP: Running BeforeAll block for EntireTestsuite K8sUpdates
18:45:40 STEP: Ensuring the namespace kube-system exists
18:45:40 STEP: WaitforPods(namespace="kube-system", filter="-l k8s-app=cilium-test-logs")
18:45:40 STEP: WaitforPods(namespace="kube-system", filter="-l k8s-app=cilium-test-logs") => <nil>
18:45:41 STEP: Waiting for pods to be terminated
18:45:47 STEP: Deleting Cilium and CoreDNS
18:45:47 STEP: Waiting for pods to be terminated
18:45:47 STEP: Cleaning Cilium state (db60bdbf42e18ba8404f3470b958148da8289370)
18:45:47 STEP: Cleaning up Cilium components
18:45:49 STEP: Waiting for Cilium to become ready
FAIL: Timed out after 247.395s.
Cilium "1.13.90" did not become ready in time
Expected
    <*errors.errorString | 0xc0013c0a30>: {
        s: "unable to retrieve daemonset kube-system/cilium: Exitcode: -1 \nErr: signal: killed\nStdout:\n \t \nStderr:\n \t \n",
    }
to be nil
=== Test Finished at 2023-04-02T18:49:57Z====
18:49:57 STEP: Running JustAfterEach block for EntireTestsuite K8sUpdates
===================== TEST FAILED =====================
18:50:17 STEP: Running AfterFailed block for EntireTestsuite K8sUpdates
cmd: kubectl get pods -o wide --all-namespaces
Exitcode: -1 
Err: signal: killed
Stdout:
 	 
Stderr:
 	 

Fetching command output from pods []
===================== Exiting AfterFailed =====================
18:51:57 STEP: Running AfterEach for block EntireTestsuite K8sUpdates
18:53:07 STEP: Cleaning up Cilium components
FAIL: terminating containers are not deleted after timeout
Expected
    <*fmt.wrapError | 0xc004212300>: {
        msg: "Pods are still not deleted after a timeout: 4m0s timeout expired: Failed to connect to apiserver: signal: killed",
        err: <*fmt.wrapError | 0xc004212200>{
            msg: "Failed to connect to apiserver: signal: killed",
            err: <*exec.ExitError | 0xc0042121e0>{
                ProcessState: {
                    pid: 3005,
                    status: 9,
                    rusage: {
                        Utime: {Sec: 0, Usec: 37971},
                        Stime: {Sec: 0, Usec: 25314},
                        Maxrss: 154428,
                        Ixrss: 0,
                        Idrss: 0,
                        Isrss: 0,
                        Minflt: 5152,
                        Majflt: 0,
                        Nswap: 0,
                        Inblock: 0,
                        Oublock: 0,
                        Msgsnd: 0,
                        Msgrcv: 0,
                        Nsignals: 0,
                        Nvcsw: 499,
                        Nivcsw: 10,
                    },
                },
                Stderr: nil,
            },
        },
    }
to be nil
FAIL: Timed out after 41.022s.
Cilium clean state "1.13.90" was not able to be deployed
The function passed to Eventually returned the following error:
Cannot retrieve Node IP for k8s1: cannot retrieve node to read IP: 
    <*errors.errorString | 0xc003e94470>: {
        s: "Cannot retrieve Node IP for k8s1: cannot retrieve node to read IP: ",
    }
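
The follow-on "Cannot retrieve Node IP for k8s1" looks like the same apiserver unavailability: the framework reads the node object's address list, and that read came back empty. A hedged sketch of the equivalent manual lookup:

```sh
# Print the address records (InternalIP, Hostname, ...) that a
# node-IP lookup would parse; empty output matches the failure above.
kubectl get node k8s1 \
  -o jsonpath='{range .status.addresses[*]}{.type}={.address}{"\n"}{end}'
```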
18:58:28 STEP: Waiting for Cilium to become ready
FAIL: Timed out after 241.193s.
Cilium "1.13.90" did not become ready in time
Expected
    <*errors.errorString | 0xc004cdab80>: {
        s: "unable to retrieve daemonset kube-system/cilium: Exitcode: -1 \nErr: signal: killed\nStdout:\n \t \nStderr:\n \t \n",
    }
to be nil
FAIL: Cilium "1.13.90" was not able to be clean up environment
Expected
    <*errors.errorString | 0xc004e8e960>: {
        s: "Cilium Init Container was not able to initialize or had a successful run: 4m0s timeout expired",
    }
to be nil
FAIL: Cilium "1.13.90" was not able to be deleted
Expected command: helm delete cilium --namespace=kube-system 
To succeed, but it failed:
Exitcode: 1 
Err: exit status 1
Stdout:
 	 
Stderr:
 	 WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /home/jenkins/workspace/Cilium-PR-K8s-1.26-kernel-net-next/src/github.com/cilium/cilium/test/vagrant-kubeconfig
	 WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /home/jenkins/workspace/Cilium-PR-K8s-1.26-kernel-net-next/src/github.com/cilium/cilium/test/vagrant-kubeconfig
	 Error: Kubernetes cluster unreachable: Get "https://k8s1:9443/version?timeout=32s": net/http: TLS handshake timeout
	 

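The helm failure points at the apiserver itself (TLS handshake timeout on https://k8s1:9443) rather than at the chart. Two quick probes to confirm that from the Jenkins node, assuming the job's test/vagrant-kubeconfig is still in place:

```sh
# Does the endpoint complete a TLS handshake and answer /version at all?
curl -k --max-time 10 https://k8s1:9443/version

# Same question through kubectl, bounded so a hang is obvious.
KUBECONFIG=test/vagrant-kubeconfig kubectl version --request-timeout=10s
```
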
FAIL: terminating containers are not deleted after timeout
Expected
    <*fmt.wrapError | 0xc0026cc3a0>: {
        msg: "Pods are still not deleted after a timeout: 4m0s timeout expired: Failed to connect to apiserver: signal: killed",
        err: <*fmt.wrapError | 0xc0026cc2e0>{
            msg: "Failed to connect to apiserver: signal: killed",
            err: <*exec.ExitError | 0xc0026cc2c0>{
                ProcessState: {
                    pid: 5208,
                    status: 9,
                    rusage: {
                        Utime: {Sec: 0, Usec: 43510},
                        Stime: {Sec: 0, Usec: 19777},
                        Maxrss: 154428,
                        Ixrss: 0,
                        Idrss: 0,
                        Isrss: 0,
                        Minflt: 5393,
                        Majflt: 0,
                        Nswap: 0,
                        Inblock: 0,
                        Oublock: 0,
                        Msgsnd: 0,
                        Msgrcv: 0,
                        Nsignals: 0,
                        Nvcsw: 560,
                        Nivcsw: 3,
                    },
                },
                Stderr: nil,
            },
        },
    }
to be nil
FAIL: terminating containers are not deleted after timeout
Expected
    <*fmt.wrapError | 0xc0026cc500>: {
        msg: "Pods are still not deleted after a timeout: 4m0s timeout expired: Failed to connect to apiserver: signal: killed",
        err: <*fmt.wrapError | 0xc0026cc440>{
            msg: "Failed to connect to apiserver: signal: killed",
            err: <*exec.ExitError | 0xc0026cc420>{
                ProcessState: {
                    pid: 5916,
                    status: 9,
                    rusage: {
                        Utime: {Sec: 0, Usec: 51422},
                        Stime: {Sec: 0, Usec: 10284},
                        Maxrss: 154428,
                        Ixrss: 0,
                        Idrss: 0,
                        Isrss: 0,
                        Minflt: 5110,
                        Majflt: 0,
                        Nswap: 0,
                        Inblock: 0,
                        Oublock: 0,
                        Msgsnd: 0,
                        Msgrcv: 0,
                        Nsignals: 0,
                        Nvcsw: 418,
                        Nivcsw: 6,
                    },
                },
                Stderr: nil,
            },
        },
    }
to be nil
19:14:40 STEP: Running AfterEach for block EntireTestsuite

[[ATTACHMENT|58e89773_K8sUpdates_Tests_upgrade_and_downgrade_from_a_Cilium_stable_image_to_master.zip]]
19:14:40 STEP: Running AfterAll block for EntireTestsuite K8sUpdates
19:15:00 STEP: Cleaning up Cilium components
FAIL: terminating containers are not deleted after timeout
Expected
    <*fmt.wrapError | 0xc0020e4820>: {
        msg: "Pods are still not deleted after a timeout: 4m0s timeout expired: Failed to connect to apiserver: signal: killed",
        err: <*fmt.wrapError | 0xc0020e4760>{
            msg: "Failed to connect to apiserver: signal: killed",
            err: <*exec.ExitError | 0xc0020e4740>{
                ProcessState: {
                    pid: 6979,
                    status: 9,
                    rusage: {
                        Utime: {Sec: 0, Usec: 63094},
                        Stime: {Sec: 0, Usec: 0},
                        Maxrss: 154428,
                        Ixrss: 0,
                        Idrss: 0,
                        Isrss: 0,
                        Minflt: 5314,
                        Majflt: 0,
                        Nswap: 0,
                        Inblock: 0,
                        Oublock: 0,
                        Msgsnd: 0,
                        Msgrcv: 0,
                        Nsignals: 0,
                        Nvcsw: 434,
                        Nivcsw: 6,
                    },
                },
                Stderr: nil,
            },
        },
    }
to be nil

Resources

Anything else?

The branch (#24686) included #24676, so either this is a different bug, or we didn't actually fix the regression.

Metadata

Assignees

Labels

area/CI (Continuous Integration testing issue or flake), ci/flake (This is a known failure that occurs in the tree. Please investigate me!)
