Strange behaviour about LocalityLoadBalancerSetting.failoverPriority config #40198

@ktalg

Description

Bug Description

There are 3 simple workloads: curl, httpbin-v1 (hv1) and httpbin-v2 (hv2); httpbin is the server workload and curl is the client.

apiVersion: v1
kind: Service
metadata:
  name: httpbin
  labels:
    app: httpbin
    service: httpbin
spec:
  ports:
    - name: http
      port: 8000
      targetPort: 80
  selector:
    app: httpbin
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbin-v1
spec:
  replicas: 2
  selector:
    matchLabels:
      app: httpbin
      version: v1
  template:
    metadata:
      labels:
        app: httpbin
        version: v1
        a: a
        b: b
    spec:
      containers:
        - image: docker.io/kennethreitz/httpbin
          imagePullPolicy: IfNotPresent
          name: httpbin
          ports:
            - containerPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbin-v2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpbin
      version: v2
  template:
    metadata:
      labels:
        app: httpbin
        version: v2
        a: a
    spec:
      containers:
        - image: docker.io/kennethreitz/httpbin
          imagePullPolicy: IfNotPresent
          name: httpbin
          ports:
            - containerPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: curl
spec:
  replicas: 1
  selector:
    matchLabels:
      app: curl
      version: v1
  template:
    metadata:
      labels:
        app: curl
        version: v1
        a: a
        b: b
    spec:
      containers:
        - image: docker.io/curlimages/curl
          imagePullPolicy: IfNotPresent
          name: curl
          command:
            - sleep
            - 365d
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: dr
spec:
  host: httpbin.test.svc.cluster.local
  trafficPolicy:
    outlierDetection:
      consecutive5xxErrors: 20
      interval: 5m
      baseEjectionTime: 15m
    loadBalancer:
      localityLbSetting:
        failoverPriority:
          - "b"
        enabled: true
---

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: vs
spec:
  hosts:
    - h.vs
  http:
    - route:
        - destination:
            host: httpbin.test.svc.cluster.local
            port:
              number: 8000

And I got the following pods:

httpbin-v1-84f69dfc6-4sg4h   172.22.128.222
httpbin-v1-84f69dfc6-gccv7   172.22.128.223
httpbin-v2-5fcbfd7b4c-v4p7l  172.22.128.66
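(Side note, not in the original report: the namespace is test, per the DestinationRule host, and the label placement that matters below can be double-checked with plain kubectl; curl and the two httpbin-v1 pods carry b: b, while the httpbin-v2 pod only carries a: a.)

kubectl get pods -n test -l app=httpbin --show-labels
kubectl get pods -n test -l app=curl --show-labels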

According to the documentation for destinationRule.LocalityLoadBalancerSetting.failoverPriority, the priority label "can be any label specified on both client and server workloads"; "b" is used as the failoverPriority key here. For the curl workload, hv1's 2 pods should therefore get the highest LB priority (they match on b: b), while hv2 should get a lower priority. But the opposite is true: the 2 pods of hv1 end up with the lower priority (see the expected layout sketched after the dump below). Envoy's endpoint information (taken from the curl pod) is as follows (some unimportant fields removed):

{
  "endpoint_config": {
    "endpoints": [
      {
        "lb_endpoints": [
          {
            "endpoint": {
              "address": {
                "socket_address": {
                  "address": "172.22.128.66",
                  "port_value": 80
                }
              },
              "health_check_config": {}
            },
            "health_status": "HEALTHY",
            "load_balancing_weight": 1
          }
        ],
        "load_balancing_weight": 1
      },
      {
        "lb_endpoints": [
          {
            "endpoint": {
              "address": {
                "socket_address": {
                  "address": "172.22.128.222",
                  "port_value": 80
                }
              },
              "health_check_config": {}
            },
            "health_status": "HEALTHY",
            "load_balancing_weight": 1
          },
          {
            "endpoint": {
              "address": {
                "socket_address": {
                  "address": "172.22.128.223",
                  "port_value": 80
                }
              },
              "health_check_config": {}
            },
            "health_status": "HEALTHY",
            "load_balancing_weight": 1
          }
        ],
        "load_balancing_weight": 2,
        "priority": 1
      }
    ],
    "policy": {
      "overprovisioning_factor": 140
    }
  }
}
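To make the mismatch concrete, this is roughly what I would expect the same dump to look like under the documented semantics (my own reconstruction from the data above, further trimmed; not actual Envoy output): the two hv1 endpoints stay in the default priority-0 group and the hv2 endpoint is demoted to priority 1.

{
  "endpoint_config": {
    "endpoints": [
      {
        "lb_endpoints": [
          { "endpoint": { "address": { "socket_address": { "address": "172.22.128.222", "port_value": 80 } } } },
          { "endpoint": { "address": { "socket_address": { "address": "172.22.128.223", "port_value": 80 } } } }
        ],
        "load_balancing_weight": 2
      },
      {
        "lb_endpoints": [
          { "endpoint": { "address": { "socket_address": { "address": "172.22.128.66", "port_value": 80 } } } }
        ],
        "load_balancing_weight": 1,
        "priority": 1
      }
    ]
  }
}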

Even stranger, when the number of replicas of hv1 is changed to 1 (no other configuration changes), the priority setting works again! Envoy's endpoint information (again from the curl pod) is as follows (some unimportant fields removed):

{
  "endpoint_config": {
    "endpoints": [
      {
        "lb_endpoints": [
          {
            "endpoint": {
              "address": {
                "socket_address": {
                  "address": "172.22.128.222",
                  "port_value": 80
                }
              },
              "health_check_config": {}
            },
            "health_status": "HEALTHY",
            "load_balancing_weight": 1
          }
        ],
        "load_balancing_weight": 1
      },
      {
        "lb_endpoints": [
          {
            "endpoint": {
              "address": {
                "socket_address": {
                  "address": "172.22.128.66",
                  "port_value": 80
                }
              },
              "health_check_config": {}
            },
            "health_status": "HEALTHY",
            "load_balancing_weight": 1
          }
        ],
        "load_balancing_weight": 1,
        "priority": 1
      }
    ],
    "policy": {
      "overprovisioning_factor": 140
    }
  }
}

Very strange!
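(For anyone reproducing this: the dumps above should be obtainable from the sidecar's Envoy admin endpoint; the exact command below is my assumption, not necessarily how they were captured.)

kubectl exec -n test deploy/curl -c istio-proxy -- pilot-agent request GET 'config_dump?include_eds=true'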

Version

   ~  istioctl version                                                                                                            
client version: 1.12.8
control plane version: 1.12.8
data plane version: 1.12.8 (30 proxies), 1.11.7 (19 proxies)

   ~  kubectl version --short                                                                                                     
Flag --short has been deprecated, and will be removed in the future. The --short output will become the default.
Client Version: v1.24.1
Kustomize Version: v4.5.4
Server Version: v1.20.11
WARNING: version difference between client (1.24) and server (1.20) exceeds the supported minor version skew of +/-1

Additional Information

No response

Affected product area

  • Docs
  • Installation
  • Networking
  • Performance and Scalability
  • Extensions and Telemetry
  • Security
  • Test and Release
  • User Experience
  • Developer Infrastructure
  • Upgrade
  • Multi Cluster
  • Virtual Machine
  • Control Plane Revisions

Is this the right place to submit this?

  • This is not a security vulnerability
  • This is not a question about how to use Istio
