Hubble UI fails to render service map after helm deploy #18732

@thejosephstevens

Description

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

I deployed Cilium on AWS (EKS) with Helm, using the manifests rendered by the command below.

helm template cilium cilium/cilium --version 1.11.0 \
  --namespace kube-system \
  --set ipam.mode=cluster-pool \
  --set tunnel=vxlan \
  --set localRedirectPolicy=true \
  --set egressMasqueradeInterfaces=eth+ \
  --set nodeinit.enabled=false \
  --set hubble.tls.auto.method="cronJob" \
  --set hubble.listenAddress=":4244" \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true \
  --set upgradeCompatibility="1.9" \
  --set encryption.enabled=true \
  --set encryption.nodeEncryption=false \
  --set encryption.type=ipsec \
  --set prometheus.enabled=true \
  --set operator.prometheus.enabled=true \
  --set hubble.metrics.enabled="{dns,drop,tcp,flow,icmp,http}" \
  --set labels="k8s:io.kubernetes.pod.namespace k8s:k8s-app k8s:app k8s:name k8s:spark-role"
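For reference, rendering the chart to a file first makes it possible to inspect the generated hubble-relay and hubble-ui objects before applying them (a sketch; the flag list is abbreviated and the output filename is arbitrary):

```shell
# Render the manifests to a file (same flags as above, abbreviated here)
helm template cilium cilium/cilium --version 1.11.0 \
  --namespace kube-system \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true \
  > cilium.yaml

# Inspect the hubble-related objects, then apply
grep -n 'hubble' cilium.yaml | head -n 40
kubectl apply -f cilium.yaml
```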

The entire cluster is healthy. (We're using a private registry, but the images are just copies of those referenced by the 1.11.0 Helm chart.)

❯ cilium status
    /¯¯\
 /¯¯\__/¯¯\    Cilium:         OK
 \__/¯¯\__/    Operator:       OK
 /¯¯\__/¯¯\    Hubble:         OK
 \__/¯¯\__/    ClusterMesh:    disabled
    \__/

DaemonSet         cilium             Desired: 2, Ready: 2/2, Available: 2/2
Deployment        cilium-operator    Desired: 2, Ready: 2/2, Available: 2/2
Deployment        hubble-relay       Desired: 1, Ready: 1/1, Available: 1/1
Deployment        hubble-ui          Desired: 1, Ready: 1/1, Available: 1/1
Containers:       cilium             Running: 2
                  cilium-operator
                  hubble-relay       Running: 1
                  hubble-ui          Running: 1
Cluster Pods:     44/44 managed by Cilium
Image versions    hubble-relay    quay.io/ascendio/hubble-relay:v1.11.0: 1
                  hubble-ui       quay.io/ascendio/hubble-ui:v0.8.3: 1
                  hubble-ui       quay.io/ascendio/hubble-ui-backend:v0.8.3: 1
                  hubble-ui       quay.io/ascendio/envoy:v1.18.4: 1
                  cilium          quay.io/ascendio/cilium:v1.11.0: 2

I run kubectl port-forward -n kube-system hubble-ui-5b7f99fcb6-5qqf2 :8080, navigate to the forwarded local address in my browser, and see this error:

(screenshot: Hubble UI fails to render the service map)

I see some connection retries in hubble-relay, although they seem to stabilize after a single retry each. None of the hubble-ui backend, frontend, or proxy containers is producing errors.
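One way to narrow this down (a sketch based on the standard Cilium docs; the service name, port mapping, and hubble CLI invocation are assumptions, not taken from this report) is to query hubble-relay directly, bypassing the UI entirely:

```shell
# Forward the relay service locally (the chart conventionally maps
# service port 80 to the relay's gRPC port 4245; adjust if your
# values override this).
kubectl port-forward -n kube-system svc/hubble-relay 4245:80 &

# Requires the hubble CLI. If status and observe return data here,
# the relay <-> agent path is healthy and the fault is likely UI-side.
hubble --server localhost:4245 status
hubble --server localhost:4245 observe --last 5
```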

Cilium Version

❯ cilium version
cilium-cli: v0.10.0 compiled with go1.17.4 on darwin/amd64
cilium image (default): v1.11.0
cilium image (stable): v1.11.1
cilium image (running): v1.11.0

Kernel Version

Linux version 5.4.156-83.273.amzn2.x86_64 (mockbuild@ip-10-0-39-220) (gcc version 7.3.1 20180712 (Red Hat 7.3.1-13) (GCC)) #1 SMP Sat Oct 30 12:59:07 UTC 2021

Kubernetes Version

❯ kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.5", GitCommit:"aea7bbadd2fc0cd689de94a54e5b7b758869d691", GitTreeState:"clean", BuildDate:"2021-09-15T21:10:45Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"darwin/arm64"}
Server Version: version.Info{Major:"1", Minor:"21+", GitVersion:"v1.21.5-eks-bc4871b", GitCommit:"5236faf39f1b7a7dabea8df12726f25608131aa9", GitTreeState:"clean", BuildDate:"2021-10-29T23:32:16Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}

Sysdump

cilium-sysdump-20220207-230512.zip

Relevant log output

# hubble relay
level=warning msg="Failed to create gRPC client" address="192.168.9.213:4244" error="connection error: desc = \"transport: error while dialing: dial tcp 192.168.9.213:4244: connect: connection refused\"" hubble-tls=true next-try-in=10s peer=ip-192-168-9-213.ec2.internal subsys=hubble-relay
level=info msg=Connecting address="192.168.9.213:4244" hubble-tls=true peer=ip-192-168-9-213.ec2.internal subsys=hubble-relay
level=info msg=Connected address="192.168.9.213:4244" hubble-tls=true peer=ip-192-168-9-213.ec2.internal subsys=hubble-relay

# hubble-ui backend
level=info msg="initialized with TLS disabled\n" subsys=config
level=info msg="listening at: 0.0.0.0:8090\n" subsys=ui-backend

# hubble-ui proxy
❯ kubectl logs -n kube-system hubble-ui-5b7f99fcb6-fx9j7 -c proxy | grep -v info
[2022-01-25 18:51:33.139][1][warning][main] [source/server/server.cc:506] No admin address given, so no admin HTTP server started.
  - name: base
    static_layer:
      {}
  - name: admin
    admin_layer:
      {}
[2022-01-25 18:51:33.151][1][warning][main] [source/server/server.cc:642] there is no configured limit to the number of allowed active connections. Set a limit via the runtime key overload.global_downstream_max_connections
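Given the refused dials to port 4244 in the relay log above, it may also be worth confirming that each agent actually exposes its Hubble server on the node address (a sketch; the exec target and the tooling available inside the image are assumptions):

```shell
# Check the Hubble server state from inside an agent pod
kubectl -n kube-system exec ds/cilium -- cilium status --verbose | grep -i hubble

# If the image ships ss, check the listening socket directly
kubectl -n kube-system exec ds/cilium -- ss -tlnp | grep 4244
```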

Anything else?

I'm seeing this same issue on EKS, AKS, and GKE, so I'm assuming it's either a bug or a problem with my config (we're using a very similar config in all three clouds).

Code of Conduct

  • I agree to follow this project's Code of Conduct

Labels

  • kind/bug: This is a bug in the Cilium logic.
  • kind/community-report: This was reported by a user in the Cilium community, e.g. via Slack.
  • needs/triage: This issue requires triaging to establish severity and next steps.
