increased pilot pushes after istiod upgrade from 1.11.8 to 1.12.8 #39652

@turbotankist

Description

Bug Description

Istio 1.11.8 reached End of Support, so we decided to upgrade our clusters to 1.12.8.
We deploy Istio with istio-operator.
The upgrade itself succeeded, but at some point we noticed that some jobs were failing because istio-proxy was not ready. We checked istiod resource usage and found it had increased dramatically.
I tried downgrading only the istiod deployment to 1.11.8, and everything returned to normal. Twenty minutes later I switched the version back to 1.12.8, and resource usage climbed again.
(screenshot: istiod resource usage)
istiod shows no unusual log entries; there are simply far more of them than with the earlier version:

...
{"level":"info","time":"2022-06-27T18:36:36.676194Z","scope":"ads","msg":"CDS: PUSH for node:istio-ingressgateway-dps-7cc486466b-lnpr4.istio-system resources:6977 size:6.3MB cached:3497/3499"}
{"level":"info","time":"2022-06-27T18:36:36.677275Z","scope":"ads","msg":"EDS: PUSH for node:panel-7c6659d478-vcjs5.panel-dev resources:3061 size:772.2kB empty:0 cached:3061/3061"}
{"level":"info","time":"2022-06-27T18:36:36.715449Z","scope":"ads","msg":"LDS: PUSH for node:panel-7c6659d478-vcjs5.panel-dev resources:551 size:2.9MB"}
{"level":"info","time":"2022-06-27T18:36:37.100065Z","scope":"ads","msg":"NDS: PUSH for node:delay-queue-handler-stable-374-546894796f-8qppx.data-ceh-stable resources:1 size:297.4kB"}
{"level":"info","time":"2022-06-27T18:36:37.136090Z","scope":"ads","msg":"RDS: PUSH for node:clientpool-6c6c6bc77c-2vx8w.core-clientpool-dev resources:263 size:2.0MB cached:0/263"}
{"level":"info","time":"2022-06-27T18:36:37.141757Z","scope":"ads","msg":"NDS: PUSH for node:clientpool-6c6c6bc77c-2vx8w.core-clientpool-dev resources:1 size:297.4kB"}
{"level":"info","time":"2022-06-27T18:36:37.150535Z","scope":"ads","msg":"EDS: PUSH for node:istio-ingressgateway-vpn-ccb9bd5fb-29j5h.istio-system resources:6944 size:1.7MB empty:80 cached:6738/6944"}
...
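To get a feel for what changed, it can help to count pushes per xDS type rather than eyeballing the log stream. A minimal sketch (not part of the report) that parses istiod's JSON log lines, assuming the `scope: "ads"` / `"<TYPE>: PUSH ..."` format shown above:

```python
import json

# Sample istiod "ads" push log lines, taken from the snippet above.
lines = [
    '{"level":"info","time":"2022-06-27T18:36:36.676194Z","scope":"ads","msg":"CDS: PUSH for node:istio-ingressgateway-dps-7cc486466b-lnpr4.istio-system resources:6977 size:6.3MB cached:3497/3499"}',
    '{"level":"info","time":"2022-06-27T18:36:36.677275Z","scope":"ads","msg":"EDS: PUSH for node:panel-7c6659d478-vcjs5.panel-dev resources:3061 size:772.2kB empty:0 cached:3061/3061"}',
    '{"level":"info","time":"2022-06-27T18:36:36.715449Z","scope":"ads","msg":"LDS: PUSH for node:panel-7c6659d478-vcjs5.panel-dev resources:551 size:2.9MB"}',
]

counts = {}
for line in lines:
    entry = json.loads(line)
    # Count only ADS push events, keyed by xDS type (CDS/EDS/LDS/RDS/NDS).
    if entry.get("scope") == "ads" and ": PUSH" in entry.get("msg", ""):
        xds_type = entry["msg"].split(":", 1)[0]
        counts[xds_type] = counts.get(xds_type, 0) + 1

print(counts)  # -> {'CDS': 1, 'EDS': 1, 'LDS': 1}
```

Piping `kubectl logs` from both istiod versions through something like this over the same time window would show which push type is responsible for the increase.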


I also tried upgrading to 1.13.5, with no improvement.
Could some legacy config of ours be causing this?

Version

istioctl version
client version: 1.12.8
control plane version: 1.12.8
data plane version: 1.12.8 (495 proxies), 1.11.8 (511 proxies)

kubectl version --short
Client Version: v1.21.1
Server Version: v1.21.12-eks-a64ea69

Additional Information

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: istio-state
  namespace: istio-system
spec:
  addonComponents:
    istiocoredns:
      enabled: false
  components:
    base:
      enabled: true
    cni:
      enabled: false
    egressGateways:
      ...
    ingressGateways:
      ...
  
    pilot:
      enabled: true
      k8s:
        env:
          - name: PILOT_ENABLE_VIRTUAL_SERVICE_DELEGATE
            value: 'true'
          - name: POD_NAME
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: metadata.name
          - name: POD_NAMESPACE
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: metadata.namespace
        hpaSpec:
          maxReplicas: 4
          metrics:
            - resource:
                name: cpu
                targetAverageUtilization: 100
              type: Resource
          minReplicas: 2
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
          timeoutSeconds: 5
        resources:
          requests:
            cpu: 500m
            memory: 2048Mi
        strategy:
          rollingUpdate:
            maxSurge: 100%
            maxUnavailable: 25%
  hub: docker.io/istio
  meshConfig:
    accessLogEncoding: JSON
    accessLogFile: /dev/stdout
    accessLogFormat: ''
    certificates: []
    defaultConfig:
      proxyMetadata:
        ISTIO_META_DNS_AUTO_ALLOCATE: 'false'
        ISTIO_META_DNS_CAPTURE: 'true'
      tracing:
        zipkin:
          address: 'jaeger-es-collector.jaeger-prod.svc.cluster.local:9411'
    enablePrometheusMerge: false
    enableTracing: true
    localityLbSetting:
      enabled: true
    outboundTrafficPolicy:
      mode: ALLOW_ANY
    protocolDetectionTimeout: 0s
    trustDomain: cluster.local
  profile: default
  tag: 1.12.8
  values:
    gateways:
      istio-egressgateway:
        autoscaleEnabled: true
        name: istio-egressgateway
        secretVolumes:
          - mountPath: /etc/istio/egressgateway-certs
            name: egressgateway-certs
            secretName: istio-egressgateway-certs
          - mountPath: /etc/istio/egressgateway-ca-certs
            name: egressgateway-ca-certs
            secretName: istio-egressgateway-ca-certs
        type: ClusterIP
      istio-ingressgateway:
        autoscaleEnabled: true
        env: {}
        name: istio-ingressgateway
        secretVolumes:
          - mountPath: /etc/istio/wafgateway-certs
            name: wafgateway-certs
            secretName: istio-wafgateway-certs
          - mountPath: /etc/istio/wafgateway-ca-certs
            name: wafgateway-ca-certs
            secretName: istio-wafgateway-ca-certs
          - mountPath: /etc/istio/nexus-certs
            name: nexus-certs
            secretName: nexus-certs
        type: NodePort
        zvpn:
          enabled: false
    global:
      arch:
        amd64: 2
      configValidation: true
      defaultNodeSelector: {}
      defaultPodDisruptionBudget:
        enabled: true
      defaultResources:
        requests:
          cpu: 100m
      imagePullPolicy: IfNotPresent
      imagePullSecrets: []
      istioNamespace: istio-system
      jwtPolicy: third-party-jwt
      logAsJson: true
      logging:
        level: 'default:info'
      meshID: dev
      meshNetworks: {}
      mountMtlsCerts: false
      multiCluster:
        clusterName: eks-dev
        enabled: true
      network: "eks-cluster-one"
      omitSidecarInjectorConfigMap: false
      oneNamespace: false
      operatorManageWebhooks: false
      pilotCertProvider: istiod
      priorityClassName: ''
      proxy:
        autoInject: enabled
        clusterDomain: cluster.local
        componentLogLevel: 'misc:error'
        enableCoreDump: false
        excludeIPRanges: 169.254.169.254/32
        excludeInboundPorts: ''
        excludeOutboundPorts: '9093,9094'
        image: proxyv2
        includeIPRanges: '*'
        lifecycle:
          preStop:
            exec:
              command:
                - sh
                - '-c'
                - sleep 10
        logLevel: warning
        privileged: false
        readinessFailureThreshold: 30
        readinessInitialDelaySeconds: 1
        readinessPeriodSeconds: 2
        resources:
          limits:
            cpu: 2000m
            memory: 1224Mi
          requests:
            cpu: 50m
            memory: 200Mi
        statusPort: 15020
        tracer: zipkin
      proxy_init:
        image: proxyv2
        resources:
          limits:
            cpu: 500m
            memory: 50Mi
          requests:
            cpu: 10m
            memory: 10Mi
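For comparing the two versions quantitatively, istiod's `pilot_xds_pushes` counters (exposed on the monitoring port, 15014) are more precise than log volume. A sketch of summing them from a scraped metrics dump; the sample values below are illustrative, not taken from this report:

```shell
# Hypothetical dump, e.g. obtained with:
#   kubectl -n istio-system port-forward deploy/istiod 15014 &
#   curl -s localhost:15014/metrics > metrics.txt
cat > metrics.txt <<'EOF'
pilot_xds_pushes{type="cds"} 69101
pilot_xds_pushes{type="eds"} 448345
pilot_xds_pushes{type="lds"} 69213
pilot_xds_pushes{type="rds"} 66512
EOF

# Sum pushes across all xDS types.
awk '/^pilot_xds_pushes/ {sum += $2} END {print sum}' metrics.txt
```

Taking this total twice, a fixed interval apart, on both 1.11.8 and 1.12.8 gives a push rate to compare directly.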
