CI: Host firewall: unable to upgrade connection #15455

@pchaigno

Description

This test failed a couple of times in different PRs:

Stacktrace

/home/jenkins/workspace/Cilium-PR-K8s-1.20-kernel-4.9/src/github.com/cilium/cilium/test/ginkgo-ext/scopes.go:518
Failed to reach 192.168.36.11:80 from testclient-vvxhw
Expected command: kubectl exec -n 202103232024k8sdatapathconfighostfirewallwithvxlan testclient-vvxhw -- curl --path-as-is -s -D /dev/stderr --fail --connect-timeout 5 --max-time 20 http://192.168.36.11:80/ -w "time-> DNS: '%{time_namelookup}(%{remote_ip})', Connect: '%{time_connect}',Transfer '%{time_starttransfer}', total '%{time_total}'" 
To succeed, but it failed:
Exitcode: 1 
Err: exit status 1
Stdout:
 	 
Stderr:
 	 error: unable to upgrade connection: Authorization error (user=kube-apiserver-kubelet-client, verb=create, resource=nodes, subresource=proxy)
	 

/home/jenkins/workspace/Cilium-PR-K8s-1.20-kernel-4.9/src/github.com/cilium/cilium/test/k8sT/DatapathConfiguration.go:649
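
For context, the Stderr above is not a failure of the curl itself: `kubectl exec` never started, because the kubelet refused to upgrade the exec stream. With the kubelet's webhook authorizer, the apiserver's kubelet-client identity (kube-apiserver-kubelet-client) must be allowed `create` on `nodes/proxy`; the denial here points at either missing RBAC for that user or a transient failure of the SubjectAccessReview check. Below is a minimal sketch for inspecting and, if the RBAC really is missing, granting that permission on a test cluster; the `kubelet-api-admin` binding name is only illustrative, and this is the generic kubeadm-style remediation, not a confirmed root cause of this flake.

# Check whether the apiserver's kubelet-client identity is allowed to reach the
# kubelet API (assumes the caller may impersonate users on this test cluster):
kubectl auth can-i create nodes --subresource=proxy --as=kube-apiserver-kubelet-client

# Generic remediation if the permission is genuinely missing: bind the built-in
# system:kubelet-api-admin ClusterRole to that user ("kubelet-api-admin" is just
# an illustrative binding name):
kubectl create clusterrolebinding kubelet-api-admin \
    --clusterrole=system:kubelet-api-admin \
    --user=kube-apiserver-kubelet-client

Given the ci/flake label, a transient SubjectAccessReview error during the exec upgrade is also plausible; the commands above only help rule out the RBAC side.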

Standard Output

Number of "context deadline exceeded" in logs: 0
Number of "level=error" in logs: 0
Number of "level=warning" in logs: 0
Number of "Cilium API handler panicked" in logs: 0
Number of "Goroutine took lock for more than" in logs: 0
No errors/warnings found in logs
Number of "context deadline exceeded" in logs: 0
Number of "level=error" in logs: 0
Number of "level=warning" in logs: 0
Number of "Cilium API handler panicked" in logs: 0
Number of "Goroutine took lock for more than" in logs: 0
No errors/warnings found in logs
Number of "context deadline exceeded" in logs: 2
Number of "level=error" in logs: 0
⚠️  Number of "level=warning" in logs: 24
Number of "Cilium API handler panicked" in logs: 0
Number of "Goroutine took lock for more than" in logs: 0
Top 5 errors/warnings:
Unable to update ipcache map entry on pod add
failed to resolve identity
Unable to restore endpoint, ignoring
Unable to serve pprof API
github.com/cilium/cilium/pkg/k8s/watchers/endpoint_slice.go:90: watch of *v1beta1.EndpointSlice ended with: an error on the server (\
Cilium pods: [cilium-cn8kr cilium-v8t6w]
Netpols loaded: 
CiliumNetworkPolicies loaded: 
Endpoint Policy Enforcement:
Pod                           Ingress   Egress
grafana-d69c97b9b-hlvb6                 
prometheus-655fb888d7-rqk9s             
coredns-867bf6789f-26clv                
testclient-vvxhw                        
testclient-zfhh2                        
testserver-f7q56                        
testserver-szc2d                        
Cilium agent 'cilium-cn8kr': Status: Ok  Health: Ok Nodes "" ContinerRuntime:  Kubernetes: Ok KVstore: Ok Controllers: Total 42 Failed 0
Cilium agent 'cilium-v8t6w': Status: Ok  Health: Ok Nodes "" ContinerRuntime:  Kubernetes: Ok KVstore: Ok Controllers: Total 28 Failed 0

Standard Error

20:21:42 STEP: Installing Cilium
20:21:43 STEP: Waiting for Cilium to become ready
20:23:39 STEP: Validating if Kubernetes DNS is deployed
20:23:39 STEP: Checking if deployment is ready
20:23:39 STEP: Checking if kube-dns service is plumbed correctly
20:23:39 STEP: Checking if pods have identity
20:23:39 STEP: Checking if DNS can resolve
20:23:45 STEP: Kubernetes DNS is not ready: 5s timeout expired
20:23:45 STEP: Restarting Kubernetes DNS (-l k8s-app=kube-dns)
20:23:46 STEP: Checking service kube-system/kube-dns plumbing in cilium pod cilium-cn8kr: unable to find service backend 10.0.0.67:53 in datapath of cilium pod cilium-cn8kr
20:24:01 STEP: Waiting for Kubernetes DNS to become operational
20:24:01 STEP: Checking if deployment is ready
20:24:01 STEP: Checking if kube-dns service is plumbed correctly
20:24:01 STEP: Checking if pods have identity
20:24:01 STEP: Checking if DNS can resolve
20:24:02 STEP: Validating Cilium Installation
20:24:02 STEP: Performing Cilium controllers preflight check
20:24:02 STEP: Performing Cilium status preflight check
20:24:02 STEP: Performing Cilium health check
20:24:03 STEP: Performing Cilium service preflight check
20:24:03 STEP: Performing K8s service preflight check
20:24:05 STEP: Waiting for cilium-operator to be ready
20:24:05 STEP: WaitforPods(namespace="kube-system", filter="-l name=cilium-operator")
20:24:05 STEP: WaitforPods(namespace="kube-system", filter="-l name=cilium-operator") => <nil>
20:24:05 STEP: Making sure all endpoints are in ready state
20:24:06 STEP: Creating namespace 202103232024k8sdatapathconfighostfirewallwithvxlan
20:24:06 STEP: Deploying demo_hostfw.yaml in namespace 202103232024k8sdatapathconfighostfirewallwithvxlan
20:24:06 STEP: Waiting for 4m0s for 8 pods of deployment demo_hostfw.yaml to become ready
20:24:06 STEP: WaitforNPods(namespace="202103232024k8sdatapathconfighostfirewallwithvxlan", filter="")
20:24:21 STEP: WaitforNPods(namespace="202103232024k8sdatapathconfighostfirewallwithvxlan", filter="") => <nil>
20:24:21 STEP: Applying policies /home/jenkins/workspace/Cilium-PR-K8s-1.20-kernel-4.9/src/github.com/cilium/cilium/test/k8sT/manifests/host-policies.yaml
20:24:33 STEP: Checking host policies on ingress from local pod
20:24:33 STEP: WaitforPods(namespace="202103232024k8sdatapathconfighostfirewallwithvxlan", filter="-l zgroup=testClient")
20:24:33 STEP: WaitforPods(namespace="202103232024k8sdatapathconfighostfirewallwithvxlan", filter="-l zgroup=testClient") => <nil>
20:24:33 STEP: WaitforPods(namespace="202103232024k8sdatapathconfighostfirewallwithvxlan", filter="-l zgroup=testServerHost")
20:24:33 STEP: WaitforPods(namespace="202103232024k8sdatapathconfighostfirewallwithvxlan", filter="-l zgroup=testServerHost") => <nil>
20:24:38 STEP: Checking host policies on ingress from remote pod
20:24:38 STEP: WaitforPods(namespace="202103232024k8sdatapathconfighostfirewallwithvxlan", filter="-l zgroup=testClient")
20:24:38 STEP: WaitforPods(namespace="202103232024k8sdatapathconfighostfirewallwithvxlan", filter="-l zgroup=testClient") => <nil>
20:24:38 STEP: WaitforPods(namespace="202103232024k8sdatapathconfighostfirewallwithvxlan", filter="-l zgroup=testServerHost")
20:24:38 STEP: WaitforPods(namespace="202103232024k8sdatapathconfighostfirewallwithvxlan", filter="-l zgroup=testServerHost") => <nil>
FAIL: Failed to reach 192.168.36.11:80 from testclient-vvxhw
Expected command: kubectl exec -n 202103232024k8sdatapathconfighostfirewallwithvxlan testclient-vvxhw -- curl --path-as-is -s -D /dev/stderr --fail --connect-timeout 5 --max-time 20 http://192.168.36.11:80/ -w "time-> DNS: '%{time_namelookup}(%{remote_ip})', Connect: '%{time_connect}',Transfer '%{time_starttransfer}', total '%{time_total}'" 
To succeed, but it failed:
Exitcode: 1 
Err: exit status 1
Stdout:
 	 
Stderr:
 	 error: unable to upgrade connection: Authorization error (user=kube-apiserver-kubelet-client, verb=create, resource=nodes, subresource=proxy)
	 

=== Test Finished at 2021-03-23T20:24:47Z====
20:24:47 STEP: Running JustAfterEach block for EntireTestsuite K8sDatapathConfig
===================== TEST FAILED =====================
20:24:47 STEP: Running AfterFailed block for EntireTestsuite K8sDatapathConfig
cmd: kubectl get pods -o wide --all-namespaces
Exitcode: 0 
Stdout:
 	 NAMESPACE                                            NAME                               READY   STATUS    RESTARTS   AGE    IP              NODE   NOMINATED NODE   READINESS GATES
	 202103232024k8sdatapathconfighostfirewallwithvxlan   testclient-host-g8nnv              1/1     Running   0          44s    192.168.36.11   k8s1   <none>           <none>
	 202103232024k8sdatapathconfighostfirewallwithvxlan   testclient-host-wlm48              1/1     Running   0          44s    192.168.36.12   k8s2   <none>           <none>
	 202103232024k8sdatapathconfighostfirewallwithvxlan   testclient-vvxhw                   1/1     Running   0          44s    10.0.1.77       k8s2   <none>           <none>
	 202103232024k8sdatapathconfighostfirewallwithvxlan   testclient-zfhh2                   1/1     Running   0          44s    10.0.0.76       k8s1   <none>           <none>
	 202103232024k8sdatapathconfighostfirewallwithvxlan   testserver-f7q56                   2/2     Running   0          44s    10.0.1.177      k8s2   <none>           <none>
	 202103232024k8sdatapathconfighostfirewallwithvxlan   testserver-host-n6rt9              2/2     Running   0          44s    192.168.36.11   k8s1   <none>           <none>
	 202103232024k8sdatapathconfighostfirewallwithvxlan   testserver-host-q89jm              2/2     Running   0          44s    192.168.36.12   k8s2   <none>           <none>
	 202103232024k8sdatapathconfighostfirewallwithvxlan   testserver-szc2d                   2/2     Running   0          44s    10.0.0.86       k8s1   <none>           <none>
	 cilium-monitoring                                    grafana-d69c97b9b-hlvb6            1/1     Running   0          97m    10.0.1.214      k8s2   <none>           <none>
	 cilium-monitoring                                    prometheus-655fb888d7-rqk9s        1/1     Running   0          97m    10.0.1.93       k8s2   <none>           <none>
	 kube-system                                          cilium-cn8kr                       1/1     Running   0          3m7s   192.168.36.12   k8s2   <none>           <none>
	 kube-system                                          cilium-operator-5878c8bc5c-22nfr   1/1     Running   0          3m7s   192.168.36.12   k8s2   <none>           <none>
	 kube-system                                          cilium-operator-5878c8bc5c-85m7m   1/1     Running   0          3m7s   192.168.36.11   k8s1   <none>           <none>
	 kube-system                                          cilium-v8t6w                       1/1     Running   0          3m7s   192.168.36.11   k8s1   <none>           <none>
	 kube-system                                          coredns-867bf6789f-26clv           1/1     Running   0          66s    10.0.1.249      k8s2   <none>           <none>
	 kube-system                                          etcd-k8s1                          1/1     Running   0          99m    192.168.36.11   k8s1   <none>           <none>
	 kube-system                                          kube-apiserver-k8s1                1/1     Running   0          99m    192.168.36.11   k8s1   <none>           <none>
	 kube-system                                          kube-controller-manager-k8s1       1/1     Running   0          99m    192.168.36.11   k8s1   <none>           <none>
	 kube-system                                          kube-proxy-bx8ds                   1/1     Running   0          99m    192.168.36.11   k8s1   <none>           <none>
	 kube-system                                          kube-proxy-lbqpl                   1/1     Running   0          98m    192.168.36.12   k8s2   <none>           <none>
	 kube-system                                          kube-scheduler-k8s1                1/1     Running   0          99m    192.168.36.11   k8s1   <none>           <none>
	 kube-system                                          log-gatherer-cnn8x                 1/1     Running   0          97m    192.168.36.11   k8s1   <none>           <none>
	 kube-system                                          log-gatherer-dvhg5                 1/1     Running   0          97m    192.168.36.12   k8s2   <none>           <none>
	 kube-system                                          registry-adder-fbxwg               1/1     Running   0          98m    192.168.36.12   k8s2   <none>           <none>
	 kube-system                                          registry-adder-fc8b8               1/1     Running   0          98m    192.168.36.11   k8s1   <none>           <none>
	 
Stderr:
 	 

Fetching command output from pods [cilium-cn8kr cilium-v8t6w]
cmd: kubectl exec -n kube-system cilium-cn8kr -- cilium status
Exitcode: 0 
Stdout:
 	 KVStore:                Ok   Disabled
	 Kubernetes:             Ok   1.20 (v1.20.4) [linux/amd64]
	 Kubernetes APIs:        ["cilium/v2::CiliumClusterwideNetworkPolicy", "cilium/v2::CiliumEndpoint", "cilium/v2::CiliumNetworkPolicy", "cilium/v2::CiliumNode", "core/v1::Namespace", "core/v1::Node", "core/v1::Pods", "core/v1::Service", "discovery/v1beta1::EndpointSlice", "networking.k8s.io/v1::NetworkPolicy"]
	 KubeProxyReplacement:   Disabled   
	 Cilium:                 Ok   1.9.90 (v.1.9.90-r.a2a7594)
	 NodeMonitor:            Listening for events on 3 CPUs with 64x4096 of shared memory
	 Cilium health daemon:   Ok   
	 IPAM:                   IPv4: 7/255 allocated from 10.0.1.0/24, IPv6: 7/255 allocated from fd02::100/120
	 BandwidthManager:       Disabled
	 Host Routing:           Legacy
	 Masquerading:           IPTables [IPv4: Enabled, IPv6: Enabled]
	 Controller Status:      42/42 healthy
	 Proxy Status:           OK, ip 10.0.1.42, 0 redirects active on ports 10000-20000
	 Hubble:                 Ok              Current/Max Flows: 1193/4095 (29.13%), Flows/s: 14.04   Metrics: Disabled
	 Cluster health:         2/2 reachable   (2021-03-23T20:24:03Z)
	 
Stderr:
 	 

cmd: kubectl exec -n kube-system cilium-cn8kr -- cilium endpoint list
Exitcode: 0 
Stdout:
 	 ENDPOINT   POLICY (ingress)   POLICY (egress)   IDENTITY   LABELS (source:key[=value])                                                          IPv6        IPv4         STATUS   
	            ENFORCEMENT        ENFORCEMENT                                                                                                                                
	 203        Disabled           Disabled          32434      k8s:io.cilium.k8s.policy.cluster=default                                             fd02::12d   10.0.1.249   ready   
	                                                            k8s:io.cilium.k8s.policy.serviceaccount=coredns                                                                       
	                                                            k8s:io.kubernetes.pod.namespace=kube-system                                                                           
	                                                            k8s:k8s-app=kube-dns                                                                                                  
	 262        Enabled            Enabled           1          k8s:cilium.io/ci-node=k8s2                                                                                    ready   
	                                                            reserved:host                                                                                                         
	 309        Disabled           Disabled          4          reserved:health                                                                      fd02::12c   10.0.1.175   ready   
	 729        Disabled           Disabled          26056      k8s:io.cilium.k8s.policy.cluster=default                                             fd02::15a   10.0.1.77    ready   
	                                                            k8s:io.cilium.k8s.policy.serviceaccount=default                                                                       
	                                                            k8s:io.kubernetes.pod.namespace=202103232024k8sdatapathconfighostfirewallwithvxlan                                    
	                                                            k8s:zgroup=testClient                                                                                                 
	 1334       Disabled           Disabled          25739      k8s:app=prometheus                                                                   fd02::1af   10.0.1.93    ready   
	                                                            k8s:io.cilium.k8s.policy.cluster=default                                                                              
	                                                            k8s:io.cilium.k8s.policy.serviceaccount=prometheus-k8s                                                                
	                                                            k8s:io.kubernetes.pod.namespace=cilium-monitoring                                                                     
	 2226       Disabled           Disabled          10726      k8s:io.cilium.k8s.policy.cluster=default                                             fd02::14c   10.0.1.177   ready   
	                                                            k8s:io.cilium.k8s.policy.serviceaccount=default                                                                       
	                                                            k8s:io.kubernetes.pod.namespace=202103232024k8sdatapathconfighostfirewallwithvxlan                                    
	                                                            k8s:zgroup=testServer                                                                                                 
	 2608       Disabled           Disabled          3821       k8s:app=grafana                                                                      fd02::1ed   10.0.1.214   ready   
	                                                            k8s:io.cilium.k8s.policy.cluster=default                                                                              
	                                                            k8s:io.cilium.k8s.policy.serviceaccount=default                                                                       
	                                                            k8s:io.kubernetes.pod.namespace=cilium-monitoring                                                                     
	 
Stderr:
 	 

cmd: kubectl exec -n kube-system cilium-v8t6w -- cilium status
Exitcode: 0 
Stdout:
 	 KVStore:                Ok   Disabled
	 Kubernetes:             Ok   1.20 (v1.20.4) [linux/amd64]
	 Kubernetes APIs:        ["cilium/v2::CiliumClusterwideNetworkPolicy", "cilium/v2::CiliumEndpoint", "cilium/v2::CiliumNetworkPolicy", "cilium/v2::CiliumNode", "core/v1::Namespace", "core/v1::Node", "core/v1::Pods", "core/v1::Service", "discovery/v1beta1::EndpointSlice", "networking.k8s.io/v1::NetworkPolicy"]
	 KubeProxyReplacement:   Disabled   
	 Cilium:                 Ok   1.9.90 (v.1.9.90-r.a2a7594)
	 NodeMonitor:            Listening for events on 3 CPUs with 64x4096 of shared memory
	 Cilium health daemon:   Ok   
	 IPAM:                   IPv4: 4/255 allocated from 10.0.0.0/24, IPv6: 4/255 allocated from fd02::/120
	 BandwidthManager:       Disabled
	 Host Routing:           Legacy
	 Masquerading:           IPTables [IPv4: Enabled, IPv6: Enabled]
	 Controller Status:      28/28 healthy
	 Proxy Status:           OK, ip 10.0.0.96, 0 redirects active on ports 10000-20000
	 Hubble:                 Ok              Current/Max Flows: 1907/4095 (46.57%), Flows/s: 10.04   Metrics: Disabled
	 Cluster health:         2/2 reachable   (2021-03-23T20:24:04Z)
	 
Stderr:
 	 

cmd: kubectl exec -n kube-system cilium-v8t6w -- cilium endpoint list
Exitcode: 0 
Stdout:
 	 ENDPOINT   POLICY (ingress)   POLICY (egress)   IDENTITY   LABELS (source:key[=value])                                                          IPv6       IPv4        STATUS   
	            ENFORCEMENT        ENFORCEMENT                                                                                                                              
	 494        Enabled            Enabled           1          k8s:cilium.io/ci-node=k8s1                                                                                  ready   
	                                                            k8s:node-role.kubernetes.io/control-plane                                                                           
	                                                            k8s:node-role.kubernetes.io/master                                                                                  
	                                                            reserved:host                                                                                                       
	 787        Disabled           Disabled          10726      k8s:io.cilium.k8s.policy.cluster=default                                             fd02::60   10.0.0.86   ready   
	                                                            k8s:io.cilium.k8s.policy.serviceaccount=default                                                                     
	                                                            k8s:io.kubernetes.pod.namespace=202103232024k8sdatapathconfighostfirewallwithvxlan                                  
	                                                            k8s:zgroup=testServer                                                                                               
	 1051       Disabled           Disabled          4          reserved:health                                                                      fd02::25   10.0.0.18   ready   
	 1661       Disabled           Disabled          26056      k8s:io.cilium.k8s.policy.cluster=default                                             fd02::2    10.0.0.76   ready   
	                                                            k8s:io.cilium.k8s.policy.serviceaccount=default                                                                     
	                                                            k8s:io.kubernetes.pod.namespace=202103232024k8sdatapathconfighostfirewallwithvxlan                                  
	                                                            k8s:zgroup=testClient                                                                                               
	 
Stderr:
 	 

===================== Exiting AfterFailed =====================
20:25:01 STEP: Running AfterEach for block EntireTestsuite K8sDatapathConfig Host firewall
20:25:02 STEP: Running AfterEach for block EntireTestsuite K8sDatapathConfig
20:25:02 STEP: Deleting deployment demo_hostfw.yaml
20:25:02 STEP: Deleting namespace 202103232024k8sdatapathconfighostfirewallwithvxlan
20:25:02 STEP: Deleting namespace 202103232024k8sdatapathconfighostfirewallwithvxlan
20:25:17 STEP: Running AfterEach for block EntireTestsuite

Metadata

Labels

area/CI: Continuous Integration testing issue or flake
area/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
area/host-firewall: Impacts the host firewall or the host endpoint.
ci/flake: This is a known failure that occurs in the tree. Please investigate me!
