Jaeger Ingester - Unable to create consumer","error":"kafka: client has run out of available brokers to talk to #3059

@dgoscn

Describe the bug
My team and I are trying to bring up a Jaeger Ingester on EKS, passing the Jaeger flags as container args. When the pod is created, the logs show that it was unable to create a Kafka consumer. To troubleshoot, we wrote a Python script, and that script is able to connect to the Kafka cluster.

To Reproduce
Steps to reproduce the behavior:

  1. This is the deployment file that we use. We apply it with:
    kubectl apply -f jaeger-ingester.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: jaeger
    component: ingester
  name: jaeger-ingester
  namespace: jaeger
spec:
  selector:
    matchLabels:
      app: jaeger
      component: ingester
  replicas: 1
  template:
    metadata:
      labels:
        app: jaeger
        component: ingester
    spec:
     containers:
        - name: jaeger-ingester
          args:
          - --log-level=debug
          - --span-storage.type=elasticsearch
          - --es.server-urls=https://vpc-es-cluster-ZZZ-XXX.us-east-1.es.amazonaws.com/
          - --kafka.consumer.brokers=kafka-url:31101
          - --kafka.consumer.protocol-version=SSL
          - --kafka.consumer.tls.enabled=true
          - --kafka.consumer.tls.cert=/opt/keys/SOME-cert.pem
          - --kafka.consumer.tls.key=/opt/keys/SOME-key.pem
          - --kafka.consumer.tls.ca=/opt/keys/CARoot
          - --kafka.consumer.topic=observability-topic
          - --kafka.consumer.group-id=group_consumer_jaeger
          - --ingester.deadlockInterval=2m
          - --ingester.parallelism=5
          image: jaegertracing/jaeger-ingester:1.22.0 #we use our own ECR Repository
          env:
            - name: "SPAN_STORAGE_TYPE"
              value: "elasticsearch"
          ports:
            # IngesterAdminHTTP is the default admin HTTP port (health check, metrics, etc.)
            - name: jaeger-ingester
              containerPort: 14270
          # volume secrets
          volumeMounts:
          - name: kafka-ing
            mountPath: "/opt/keys"
            readOnly: true
     volumes:
     - name: kafka-ing
       secret:
         secretName: kafka-key
         items:
         - key: SOME-key.pem
           path: ./SOME-key.pem
         - key: SOME-cert.pem
           path: ./SOME-cert.pem
         - key: CARoot
           path: ./CARoot
    # WE HAVE THESE SOME-CERT inside of our manifests in local directory when we run the deployment.
    #kubectl create secret generic kafka-key -n jaeger --from-file=./SOME-cert.pem --from-file=./SOME-cert.pem --from-file=./CARoot
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: jaeger
    component: ingester
  name: jaeger-ingester
  namespace: jaeger
spec:
  ports:
    - name: jaeger-ingester
      port: 14270
      protocol: TCP
      targetPort: 14270
  selector:
    app: jaeger
    component: ingester
  type: ClusterIP
  2. kubectl logs pods/jaeger-pods-generated -n jaeger

Expected behavior
kubectl logs pods/jaeger-ingester-POD -n jaeger

2021/06/04 22:01:00 maxprocs: Leaving GOMAXPROCS=2: CPU quota undefined
{"level":"info","ts":1622844060.9307475,"caller":"flags/service.go:117","msg":"Mounting metrics handler on admin server","route":"/metrics"}
{"level":"info","ts":1622844060.9308012,"caller":"flags/service.go:123","msg":"Mounting expvar handler on admin server","route":"/debug/vars"}
{"level":"info","ts":1622844060.9309163,"caller":"flags/admin.go:105","msg":"Mounting health check on admin server","route":"/"}
{"level":"info","ts":1622844060.9309664,"caller":"flags/admin.go:111","msg":"Starting admin HTTP server","http-addr":":14270"}
{"level":"info","ts":1622844060.930975,"caller":"flags/admin.go:97","msg":"Admin server started","http.host-port":"[::]:14270","health-status":"unavailable"}
{"level":"info","ts":1622844061.055035,"caller":"config/config.go:189","msg":"Elasticsearch detected","version":7}
{"level":"debug","ts":1622844062.216302,"caller":"consumer/deadlock_detector.go:147","msg":"Global deadlock detector disabled"}
{"level":"info","ts":1622844062.216349,"caller":"healthcheck/handler.go:129","msg":"Health Check state change","status":"ready"}
{"level":"info","ts":1622844062.2163684,"caller":"consumer/consumer.go:79","msg":"Starting main loop"}
{"level":"info","ts":1622844064.5668762,"caller":"consumer/consumer.go:167","msg":"Starting error handler","partition":1}
{"level":"info","ts":1622844064.5669289,"caller":"consumer/consumer.go:110","msg":"Starting message handler","partition":1}
{"level":"debug","ts":1622844064.5670233,"caller":"consumer/deadlock_detector.go:98","msg":"Partition deadlock detector disabled"}
{"level":"debug","ts":1622844064.5750434,"caller":"consumer/consumer.go:138","msg":"Got msg","msg":{"Headers":null,"Timestamp":"0001-01-01T00:00:00Z","BlockTimestamp":"0001-01-01T00:00:00Z","Key":"zKSauSG41837BRHUE==","Value":"KZ01Vd3Flenc9Iiwib3BlcmF0aW9uTmFtZSI6IkdldERyaXZlciIsInJlZ","Topic":"observability-topic","Partition":1,"Offset":4332888}}
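When comparing a working run like the one above against the failing pod, it can help to pull the JSON entries out of the `kubectl logs` output and filter them by level. The following is a minimal sketch using only the Python standard library; the helper names `iter_log_entries` and `entries_at_level` are made up for illustration and are not part of Jaeger. It assumes the entries are JSON objects with a `level` field, as in the log above, and it tolerates entries that are run together on one line or preceded by plain-text lines such as the `maxprocs` message.

```python
import json


def iter_log_entries(text):
    """Yield every JSON object found in text, skipping non-JSON fragments."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            return
        try:
            # raw_decode parses one JSON value and reports where it ended.
            entry, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pos = start + 1  # not a JSON object; keep scanning
            continue
        pos = end
        yield entry


def entries_at_level(text, level):
    """Return only the entries whose "level" field equals level."""
    return [e for e in iter_log_entries(text) if e.get("level") == level]
```

For example, saving the pod log to a file and calling `entries_at_level(open("ingester.log").read(), "error")` would list only the error entries (the file name is a placeholder).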

Version (please complete the following information):

  • OS: cat /etc/os-release
    NAME="Amazon Linux"
    VERSION="2"
    ID="amzn"
    ID_LIKE="centos rhel fedora"
    VERSION_ID="2"
    PRETTY_NAME="Amazon Linux 2"
    ANSI_COLOR="0;33"
    CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
    HOME_URL="https://amazonlinux.com/"

  • Jaeger version: 1.22.0

  • Deployment: Kubernetes, 1.17

What troubleshooting steps did you try?
We first tested with telnet, and the connection worked. We then used Python code to test connectivity with the Kafka broker:

from kafka import KafkaConsumer

try:
    consumer = KafkaConsumer(
        bootstrap_servers='kafka-url:31101',
        group_id='group_consumer_jaeger',
        ssl_cafile='/opt/keys/CARoot',
        ssl_certfile='/opt/keys/SOME-cert.pem',
        ssl_keyfile='/opt/keys/SOME-key.pem',
        security_protocol='SSL',
    )
except Exception as e:
    # Stop here: without a consumer the calls below cannot run.
    raise SystemExit('Error creating consumer: %s' % e)

# Subscribe with a list of topic names, then print the topic's partition ids.
consumer.subscribe(['observability-topic'])
print(consumer.partitions_for_topic('observability-topic'))

This is the output:

python test-kafka-brokers.py
set([0, 1, 2])

We also commented out the Jaeger flags present in the deployment, but we received the same error.
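A telnet test only proves that the TCP port is reachable, so a check that stops right after the TLS handshake may help separate certificate problems from Kafka-level problems. Below is a minimal sketch using only the Python standard library. The function name `check_tls` is hypothetical, and the sketch assumes the broker requires a client certificate and that the CA file is PEM encoded. Hostname verification is left enabled, so a failure can also mean that the broker certificate does not match the name used to connect.

```python
import socket
import ssl


def check_tls(host, port, cafile, certfile, keyfile, timeout=5.0):
    """Attempt a mutual-TLS handshake and return (ok, detail)."""
    try:
        # Trust only the given CA and present the client certificate and key.
        context = ssl.create_default_context(cafile=cafile)
        context.load_cert_chain(certfile=certfile, keyfile=keyfile)
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return True, tls.version()
    except OSError as exc:
        # Covers missing files, DNS and connection errors, and ssl.SSLError.
        return False, "%s: %s" % (type(exc).__name__, exc)
```

With the values from this issue, the call would be `check_tls('kafka-url', 31101, '/opt/keys/CARoot', '/opt/keys/SOME-cert.pem', '/opt/keys/SOME-key.pem')`. A result of `(True, ...)` indicates that the certificates and network path work from where the script runs, which points the investigation at the ingester flags rather than the keys.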
