Description
At the moment, Hubble Relay's Kubernetes readiness and liveness probes simply check whether the TCP port is open:
```yaml
readinessProbe:
  tcpSocket:
    port: grpc
livenessProbe:
  tcpSocket:
    port: grpc
```
This kind of check says little about whether Hubble is actually ready: the port can be open while Relay is unable to serve any flows. Instead, the user has to manually inspect Relay's logs for common errors, such as no access to the peer service, mTLS authentication failures, broken pod-to-host connectivity, or no connection to any Hubble Observer.
We should introduce a more meaningful health check which performs some basic checks, for example:
- Have we received cluster information from the peer service?
- Are we connected to a Hubble Observer on at least one node?
- (Bonus): Do the connected Hubble Observers contain any flows in their ring buffer?
@kaworu mentioned that this health check could be exposed via the gRPC endpoint, which an upcoming Kubernetes version seems to support. This means we do not have to create a custom HTTP server just for this. @kaworu feel free to provide more details.
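As a rough sketch of what the gRPC-based variant could look like on a Kubernetes version with built-in gRPC probes: this assumes Relay would implement the standard gRPC Health Checking Protocol (`grpc.health.v1.Health`) on its existing gRPC port, which it does not do today. The port number 4245 is an assumption about the Relay listen port, and it is written as a number because the `grpc` probe type does not accept named ports.

```yaml
# Sketch only: requires Relay to serve grpc.health.v1.Health (not yet implemented).
readinessProbe:
  grpc:
    port: 4245   # assumed Relay gRPC port; the grpc probe needs a numeric port
livenessProbe:
  grpc:
    port: 4245
```

With this shape, Relay could report `NOT_SERVING` until it has received peer information and connected to at least one Hubble Observer, and Kubernetes would keep the pod out of the Service endpoints in the meantime.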
For older versions of Kubernetes, we could provide a `hubble-relay healthz` sub-command and use a command-based readiness/liveness probe.
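A minimal sketch of the command-based fallback, assuming the proposed `hubble-relay healthz` sub-command exists, is on the container's `PATH`, and exits with a non-zero status when Relay is unhealthy. None of this is implemented yet, and the `periodSeconds` value is only illustrative.

```yaml
# Sketch only: `hubble-relay healthz` is the hypothetical sub-command proposed in this issue.
readinessProbe:
  exec:
    command:
      - hubble-relay
      - healthz
  periodSeconds: 10   # illustrative value
livenessProbe:
  exec:
    command:
      - hubble-relay
      - healthz
  periodSeconds: 10   # illustrative value
```

The sub-command would act as a small client that queries the same health endpoint, so both probe styles could share one implementation of the checks.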