Closed
Bug report
General Information
Trying to use cilium-node-init's `restartPods` functionality doesn't work on AKS with Ubuntu 18.04 images. Connectivity test pods created before `reconfigureKubelet` completes fail to become ready (pass). Looking at the logs for cilium-node-init, the restart isn't happening.
I believe the issue is that none of the branches in this if statement match on the Azure images:
- The one I think should match, `if grep -q 'docker' /etc/crictl.yaml; then`, doesn't, because the file `/etc/crictl.yaml` doesn't exist, so grep errors.
- I think we could improve it with `if [ ! -f /etc/crictl.yaml ] || grep -q 'docker' /etc/crictl.yaml; then` (check for the existence of the file). In my testing this works for me; I can see the "Restarting kubenet managed pods" msg in the cilium-node-init logs.
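The difference between the two conditions can be reproduced without a cluster. The following is only a sketch: `uses_docker_runtime` is a hypothetical helper name (it is not part of the cilium-node-init script), and the config path is made a parameter purely so the check can be exercised against temporary files. A missing file makes plain `grep -q` exit non-zero, so the original branch is skipped; the proposed condition treats a missing file as the docker case.

```shell
# Hypothetical helper (not from the node-init script): succeeds when the
# docker branch should be taken. A missing crictl config counts as docker,
# mirroring the proposed condition.
uses_docker_runtime() {
  crictl_config="${1:-/etc/crictl.yaml}"
  [ ! -f "$crictl_config" ] || grep -q 'docker' "$crictl_config"
}

# Original condition, for comparison: grep fails when the file is missing,
# so this branch is never taken on such nodes.
original_condition() {
  grep -q 'docker' "${1:-/etc/crictl.yaml}" 2>/dev/null
}
```

On a node without `/etc/crictl.yaml`, `original_condition` fails while `uses_docker_runtime` succeeds, which matches the behaviour described above.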
How to reproduce the issue
- Cilium version: 1.8.2
- Run aks install (filling in bash vars with your own)
- Helm install cilium
- Immediately apply the connectivity-check.yaml
  - Note: this might need to be reapplied a couple of times to get the cilium network policy.
  - Creating these pods before cilium-node-init finishes is important, so that they get created under a kubenet policy before the reconfigure changes kubenet->cni and restarts the kubelet. There are other ways this bug shows up, but this is the clearest way to demonstrate it.
aksargs=(
--subscription "$SUB"
--resource-group "$RG"
--name "$NAME"
--kubernetes-version 1.17.7
--vm-set-type "VirtualMachineScaleSets"
# Causes it to not create a public IP for the api-server
--enable-private-cluster
# don't use Azure CNI; but we will overwrite this later w/ cilium
--network-plugin kubenet
--load-balancer-sku "standard"
--vnet-subnet-id "$SUBNET_ID"
# Not really used; but needs to be defined
--docker-bridge-address="172.17.8.1/23"
# Internal IPs of kubernetes services
--service-cidr "172.17.0.0/21"
--dns-service-ip="172.17.0.10" # They ask for it to be `.10`... sure
--pod-cidr "172.17.32.0/19"
--service-principal "$APPID"
--client-secret "$APPPWD"
#https://docs.microsoft.com/en-us/azure/aks/cluster-configuration#generation-2-virtual-machines-preview
# importantly this triggers ubuntu 18.04 images
--aks-custom-headers "usegen2vm=true"
)
ciliumhelmargs=(
--version 1.8.2
--namespace cilium
--set config.ipam=kubernetes
# Rewrite kubelet config file to enable CNI w/ the node-init DaemonSet.
--set global.nodeinit.enabled=true
--set nodeinit.reconfigureKubelet=true
--set nodeinit.removeCbrBridge=true
# Any pods that are already running won't get the above changes; have nodeinit restart them.
# This doesn't actually work right now (the bug reported here).
--set nodeinit.restartPods=true
# Use cilium native routing
--set global.tunnel=disabled
--set global.endpointRoutes.enabled=true
--set global.nativeRoutingCIDR=172.17.32.0/19
)
az aks create "${aksargs[@]}"
az aks get-credentials --resource-group "$RG" --name "$NAME"
kubectl create ns cilium
# Pass the array defined above (ciliumhelmargs)
helm install cilium cilium/cilium "${ciliumhelmargs[@]}"
kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/1.8.2/examples/kubernetes/connectivity-check/connectivity-check.yaml
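To confirm whether the restart branch ran, the node-init logs can be searched for the message quoted earlier. This is a rough sketch: `restart_logged` is a hypothetical helper, and the commented-out invocation assumes the DaemonSet is named `cilium-node-init` in the `cilium` namespace used above.

```shell
# Hypothetical helper: reads log text on stdin and succeeds if the
# restart message from the node-init script is present.
restart_logged() {
  grep -qF 'Restarting kubenet managed pods'
}

# Example use against a live cluster (assumed DaemonSet name and namespace;
# note that `kubectl logs daemonset/...` only reads from one of its pods):
# kubectl -n cilium logs daemonset/cilium-node-init | restart_logged \
#   && echo "restart ran" || echo "restart did NOT run"
```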