An error has occurred while serving routing_http_client metrics #9891

@lidel

Version

0.20.0

Config

"Routing": {
		"Methods": null,
		"Routers": null,
		"Type": "auto"
	},

Description

http://127.0.0.1:5001/debug/metrics/prometheus ends up in a broken state after multiple days of uptime:

An error has occurred while serving metrics:

14 error(s) occurred:
* collected metric "routing_http_client_length" { label:<name:"host" value:"cid.contact" > label:<name:"operation" value:"FindProviders" > histogram:<sample_count:17760 sample_sum:67.00000000000001 bucket:<cumulative_count:17741 upper_bound:1 > bucket:<cumulative_count:17742 upper_bound:2 > bucket:<cumulative_count:17753 upper_bound:5 > bucket:<cumulative_count:17760 upper_bound:10 > bucket:<cumulative_count:17760 upper_bound:11 > bucket:<cumulative_count:17760 upper_bound:12 > bucket:<cumulative_count:17760 upper_bound:15 > bucket:<cumulative_count:17760 upper_bound:20 > bucket:<cumulative_count:17760 upper_bound:50 > bucket:<cumulative_count:17760 upper_bound:100 > bucket:<cumulative_count:17760 upper_bound:200 > bucket:<cumulative_count:17760 upper_bound:500 > > } was collected before with the same name and label values
* collected metric "routing_http_client_latency" { label:<name:"code" value:"0" > label:<name:"error" value:"DNSTimeout" > label:<name:"host" value:"cid.contact" > label:<name:"operation" value:"FindProviders" > histogram:<sample_count:2 sample_sum:17000 bucket:<cumulative_count:0 upper_bound:1 > bucket:<cumulative_count:0 upper_bound:2 > bucket:<cumulative_count:0 upper_bound:5 > bucket:<cumulative_count:0 upper_bound:10 > bucket:<cumulative_count:0 upper_bound:20 > bucket:<cumulative_count:0 upper_bound:50 > bucket:<cumulative_count:0 upper_bound:100 > bucket:<cumulative_count:0 upper_bound:200 > bucket:<cumulative_count:0 upper_bound:500 > bucket:<cumulative_count:0 upper_bound:1000 > bucket:<cumulative_count:0 upper_bound:2000 > bucket:<cumulative_count:0 upper_bound:5000 > bucket:<cumulative_count:2 upper_bound:10000 > bucket:<cumulative_count:2 upper_bound:20000 > > } was collected before with the same name and label values
* collected metric "routing_http_client_latency" { label:<name:"code" value:"404" > label:<name:"error" value:"None" > label:<name:"host" value:"cid.contact" > label:<name:"operation" value:"FindProviders" > histogram:<sample_count:17741 sample_sum:1.5591060000000058e+06 bucket:<cumulative_count:0 upper_bound:1 > bucket:<cumulative_count:0 upper_bound:2 > bucket:<cumulative_count:0 upper_bound:5 > bucket:<cumulative_count:3130 upper_bound:10 > bucket:<cumulative_count:6917 upper_bound:20 > bucket:<cumulative_count:13167 upper_bound:50 > bucket:<cumulative_count:13207 upper_bound:100 > bucket:<cumulative_count:14781 upper_bound:200 > bucket:<cumulative_count:17124 upper_bound:500 > bucket:<cumulative_count:17739 upper_bound:1000 > bucket:<cumulative_count:17740 upper_bound:2000 > bucket:<cumulative_count:17741 upper_bound:5000 > bucket:<cumulative_count:17741 upper_bound:10000 > bucket:<cumulative_count:17741 upper_bound:20000 > > } was collected before with the same name and label values
* collected metric "routing_http_client_latency" { label:<name:"code" value:"0" > label:<name:"error" value:"Canceled" > label:<name:"host" value:"cid.contact" > label:<name:"operation" value:"FindProviders" > histogram:<sample_count:490 sample_sum:33772.00000000005 bucket:<cumulative_count:0 upper_bound:1 > bucket:<cumulative_count:0 upper_bound:2 > bucket:<cumulative_count:0 upper_bound:5 > bucket:<cumulative_count:0 upper_bound:10 > bucket:<cumulative_count:3 upper_bound:20 > bucket:<cumulative_count:251 upper_bound:50 > bucket:<cumulative_count:415 upper_bound:100 > bucket:<cumulative_count:470 upper_bound:200 > bucket:<cumulative_count:490 upper_bound:500 > bucket:<cumulative_count:490 upper_bound:1000 > bucket:<cumulative_count:490 upper_bound:2000 > bucket:<cumulative_count:490 upper_bound:5000 > bucket:<cumulative_count:490 upper_bound:10000 > bucket:<cumulative_count:490 upper_bound:20000 > > } was collected before with the same name and label values
* collected metric "routing_http_client_latency" { label:<name:"code" value:"200" > label:<name:"error" value:"None" > label:<name:"host" value:"cid.contact" > label:<name:"operation" value:"FindProviders" > histogram:<sample_count:19 sample_sum:4727 bucket:<cumulative_count:0 upper_bound:1 > bucket:<cumulative_count:0 upper_bound:2 > bucket:<cumulative_count:0 upper_bound:5 > bucket:<cumulative_count:0 upper_bound:10 > bucket:<cumulative_count:0 upper_bound:20 > bucket:<cumulative_count:0 upper_bound:50 > bucket:<cumulative_count:0 upper_bound:100 > bucket:<cumulative_count:5 upper_bound:200 > bucket:<cumulative_count:18 upper_bound:500 > bucket:<cumulative_count:19 upper_bound:1000 > bucket:<cumulative_count:19 upper_bound:2000 > bucket:<cumulative_count:19 upper_bound:5000 > bucket:<cumulative_count:19 upper_bound:10000 > bucket:<cumulative_count:19 upper_bound:20000 > > } was collected before with the same name and label values
* collected metric "routing_http_client_latency" { label:<name:"code" value:"0" > label:<name:"error" value:"DeadlineExceeded" > label:<name:"host" value:"cid.contact" > label:<name:"operation" value:"FindProviders" > histogram:<sample_count:79 sample_sum:789989 bucket:<cumulative_count:0 upper_bound:1 > bucket:<cumulative_count:0 upper_bound:2 > bucket:<cumulative_count:0 upper_bound:5 > bucket:<cumulative_count:0 upper_bound:10 > bucket:<cumulative_count:0 upper_bound:20 > bucket:<cumulative_count:0 upper_bound:50 > bucket:<cumulative_count:0 upper_bound:100 > bucket:<cumulative_count:0 upper_bound:200 > bucket:<cumulative_count:0 upper_bound:500 > bucket:<cumulative_count:0 upper_bound:1000 > bucket:<cumulative_count:0 upper_bound:2000 > bucket:<cumulative_count:0 upper_bound:5000 > bucket:<cumulative_count:23 upper_bound:10000 > bucket:<cumulative_count:79 upper_bound:20000 > > } was collected before with the same name and label values
* collected metric "routing_http_client_latency" { label:<name:"code" value:"0" > label:<name:"error" value:"Net" > label:<name:"host" value:"cid.contact" > label:<name:"operation" value:"FindProviders" > histogram:<sample_count:30 sample_sum:41185 bucket:<cumulative_count:0 upper_bound:1 > bucket:<cumulative_count:0 upper_bound:2 > bucket:<cumulative_count:0 upper_bound:5 > bucket:<cumulative_count:9 upper_bound:10 > bucket:<cumulative_count:19 upper_bound:20 > bucket:<cumulative_count:21 upper_bound:50 > bucket:<cumulative_count:21 upper_bound:100 > bucket:<cumulative_count:21 upper_bound:200 > bucket:<cumulative_count:21 upper_bound:500 > bucket:<cumulative_count:21 upper_bound:1000 > bucket:<cumulative_count:22 upper_bound:2000 > bucket:<cumulative_count:26 upper_bound:5000 > bucket:<cumulative_count:30 upper_bound:10000 > bucket:<cumulative_count:30 upper_bound:20000 > > } was collected before with the same name and label values
* collected metric "routing_http_client_latency" { label:<name:"code" value:"404" > label:<name:"error" value:"None" > label:<name:"host" value:"cid.contact" > label:<name:"operation" value:"FindProviders" > histogram:<sample_count:17741 sample_sum:1.5591060000000058e+06 bucket:<cumulative_count:0 upper_bound:1 > bucket:<cumulative_count:0 upper_bound:2 > bucket:<cumulative_count:0 upper_bound:5 > bucket:<cumulative_count:3130 upper_bound:10 > bucket:<cumulative_count:6917 upper_bound:20 > bucket:<cumulative_count:13167 upper_bound:50 > bucket:<cumulative_count:13207 upper_bound:100 > bucket:<cumulative_count:14781 upper_bound:200 > bucket:<cumulative_count:17124 upper_bound:500 > bucket:<cumulative_count:17739 upper_bound:1000 > bucket:<cumulative_count:17740 upper_bound:2000 > bucket:<cumulative_count:17741 upper_bound:5000 > bucket:<cumulative_count:17741 upper_bound:10000 > bucket:<cumulative_count:17741 upper_bound:20000 > > } was collected before with the same name and label values
* collected metric "routing_http_client_latency" { label:<name:"code" value:"0" > label:<name:"error" value:"Canceled" > label:<name:"host" value:"cid.contact" > label:<name:"operation" value:"FindProviders" > histogram:<sample_count:490 sample_sum:33772.00000000005 bucket:<cumulative_count:0 upper_bound:1 > bucket:<cumulative_count:0 upper_bound:2 > bucket:<cumulative_count:0 upper_bound:5 > bucket:<cumulative_count:0 upper_bound:10 > bucket:<cumulative_count:3 upper_bound:20 > bucket:<cumulative_count:251 upper_bound:50 > bucket:<cumulative_count:415 upper_bound:100 > bucket:<cumulative_count:470 upper_bound:200 > bucket:<cumulative_count:490 upper_bound:500 > bucket:<cumulative_count:490 upper_bound:1000 > bucket:<cumulative_count:490 upper_bound:2000 > bucket:<cumulative_count:490 upper_bound:5000 > bucket:<cumulative_count:490 upper_bound:10000 > bucket:<cumulative_count:490 upper_bound:20000 > > } was collected before with the same name and label values
* collected metric "routing_http_client_latency" { label:<name:"code" value:"200" > label:<name:"error" value:"None" > label:<name:"host" value:"cid.contact" > label:<name:"operation" value:"FindProviders" > histogram:<sample_count:19 sample_sum:4727 bucket:<cumulative_count:0 upper_bound:1 > bucket:<cumulative_count:0 upper_bound:2 > bucket:<cumulative_count:0 upper_bound:5 > bucket:<cumulative_count:0 upper_bound:10 > bucket:<cumulative_count:0 upper_bound:20 > bucket:<cumulative_count:0 upper_bound:50 > bucket:<cumulative_count:0 upper_bound:100 > bucket:<cumulative_count:5 upper_bound:200 > bucket:<cumulative_count:18 upper_bound:500 > bucket:<cumulative_count:19 upper_bound:1000 > bucket:<cumulative_count:19 upper_bound:2000 > bucket:<cumulative_count:19 upper_bound:5000 > bucket:<cumulative_count:19 upper_bound:10000 > bucket:<cumulative_count:19 upper_bound:20000 > > } was collected before with the same name and label values
* collected metric "routing_http_client_latency" { label:<name:"code" value:"0" > label:<name:"error" value:"DeadlineExceeded" > label:<name:"host" value:"cid.contact" > label:<name:"operation" value:"FindProviders" > histogram:<sample_count:79 sample_sum:789989 bucket:<cumulative_count:0 upper_bound:1 > bucket:<cumulative_count:0 upper_bound:2 > bucket:<cumulative_count:0 upper_bound:5 > bucket:<cumulative_count:0 upper_bound:10 > bucket:<cumulative_count:0 upper_bound:20 > bucket:<cumulative_count:0 upper_bound:50 > bucket:<cumulative_count:0 upper_bound:100 > bucket:<cumulative_count:0 upper_bound:200 > bucket:<cumulative_count:0 upper_bound:500 > bucket:<cumulative_count:0 upper_bound:1000 > bucket:<cumulative_count:0 upper_bound:2000 > bucket:<cumulative_count:0 upper_bound:5000 > bucket:<cumulative_count:23 upper_bound:10000 > bucket:<cumulative_count:79 upper_bound:20000 > > } was collected before with the same name and label values
* collected metric "routing_http_client_latency" { label:<name:"code" value:"0" > label:<name:"error" value:"Net" > label:<name:"host" value:"cid.contact" > label:<name:"operation" value:"FindProviders" > histogram:<sample_count:30 sample_sum:41185 bucket:<cumulative_count:0 upper_bound:1 > bucket:<cumulative_count:0 upper_bound:2 > bucket:<cumulative_count:0 upper_bound:5 > bucket:<cumulative_count:9 upper_bound:10 > bucket:<cumulative_count:19 upper_bound:20 > bucket:<cumulative_count:21 upper_bound:50 > bucket:<cumulative_count:21 upper_bound:100 > bucket:<cumulative_count:21 upper_bound:200 > bucket:<cumulative_count:21 upper_bound:500 > bucket:<cumulative_count:21 upper_bound:1000 > bucket:<cumulative_count:22 upper_bound:2000 > bucket:<cumulative_count:26 upper_bound:5000 > bucket:<cumulative_count:30 upper_bound:10000 > bucket:<cumulative_count:30 upper_bound:20000 > > } was collected before with the same name and label values
* collected metric "routing_http_client_latency" { label:<name:"code" value:"0" > label:<name:"error" value:"DNSTimeout" > label:<name:"host" value:"cid.contact" > label:<name:"operation" value:"FindProviders" > histogram:<sample_count:2 sample_sum:17000 bucket:<cumulative_count:0 upper_bound:1 > bucket:<cumulative_count:0 upper_bound:2 > bucket:<cumulative_count:0 upper_bound:5 > bucket:<cumulative_count:0 upper_bound:10 > bucket:<cumulative_count:0 upper_bound:20 > bucket:<cumulative_count:0 upper_bound:50 > bucket:<cumulative_count:0 upper_bound:100 > bucket:<cumulative_count:0 upper_bound:200 > bucket:<cumulative_count:0 upper_bound:500 > bucket:<cumulative_count:0 upper_bound:1000 > bucket:<cumulative_count:0 upper_bound:2000 > bucket:<cumulative_count:0 upper_bound:5000 > bucket:<cumulative_count:2 upper_bound:10000 > bucket:<cumulative_count:2 upper_bound:20000 > > } was collected before with the same name and label values
* collected metric "routing_http_client_length" { label:<name:"host" value:"cid.contact" > label:<name:"operation" value:"FindProviders" > histogram:<sample_count:17760 sample_sum:67.00000000000001 bucket:<cumulative_count:17741 upper_bound:1 > bucket:<cumulative_count:17742 upper_bound:2 > bucket:<cumulative_count:17753 upper_bound:5 > bucket:<cumulative_count:17760 upper_bound:10 > bucket:<cumulative_count:17760 upper_bound:11 > bucket:<cumulative_count:17760 upper_bound:12 > bucket:<cumulative_count:17760 upper_bound:15 > bucket:<cumulative_count:17760 upper_bound:20 > bucket:<cumulative_count:17760 upper_bound:50 > bucket:<cumulative_count:17760 upper_bound:100 > bucket:<cumulative_count:17760 upper_bound:200 > bucket:<cumulative_count:17760 upper_bound:500 > > } was collected before with the same name and label values
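
Each error above is the result of a uniqueness check: within one scrape, a series is identified by its metric name plus its label values, and a repeated identity is rejected. The following is a minimal, stdlib-only Go sketch of such a check. It is an illustration, not the Prometheus client's actual code, and `sample`, `key`, and `findDuplicates` are hypothetical names. The 14 errors cover 7 distinct series, each listed twice. If the real check behaves like this sketch, each series was yielded three times in a single scrape.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sample is a hypothetical stand-in for one collected series:
// a metric name plus its label set.
type sample struct {
	name   string
	labels map[string]string
}

// key builds a canonical identity from the name and the sorted label pairs,
// so that label order does not affect the comparison.
func key(s sample) string {
	pairs := make([]string, 0, len(s.labels))
	for k, v := range s.labels {
		pairs = append(pairs, k+"="+v)
	}
	sort.Strings(pairs)
	return s.name + "{" + strings.Join(pairs, ",") + "}"
}

// findDuplicates returns one error for every sample whose identity
// was already seen earlier in the same slice.
func findDuplicates(samples []sample) []error {
	seen := make(map[string]bool)
	var errs []error
	for _, s := range samples {
		k := key(s)
		if seen[k] {
			errs = append(errs, fmt.Errorf("collected metric %s was collected before with the same name and label values", k))
			continue
		}
		seen[k] = true
	}
	return errs
}

func main() {
	length := sample{
		name:   "routing_http_client_length",
		labels: map[string]string{"host": "cid.contact", "operation": "FindProviders"},
	}
	// The same series yielded three times produces two errors.
	errs := findDuplicates([]sample{length, length, length})
	fmt.Println(len(errs))
}
```

In this model, one error per repeated occurrence means the count of errors for a series is one less than the number of times it was yielded.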

This looks like a bug in boxo/routing/http/client and needs analysis.
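
Until the root cause is found, the broken state can at least be detected by checking whether the endpoint's response starts with the error text quoted in this report. Below is a minimal Go sketch, assuming the error page always begins with that text. `looksBroken` and `probe` are hypothetical helpers, not part of Kubo or boxo.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

// errorPrefix is the first line of the error page quoted in this issue.
const errorPrefix = "An error has occurred while serving metrics"

// looksBroken reports whether a response body is the error page
// rather than a metrics exposition.
func looksBroken(body string) bool {
	return strings.HasPrefix(strings.TrimSpace(body), errorPrefix)
}

// probe fetches url and applies looksBroken to the response body.
func probe(url string) (bool, error) {
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return false, err
	}
	return looksBroken(string(body)), nil
}

func main() {
	broken, err := probe("http://127.0.0.1:5001/debug/metrics/prometheus")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println("broken:", broken)
}
```

Matching on the body rather than only on the status code keeps the check tied to the exact symptom reported here.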

Labels

* P2: Medium: Good to have, but can wait until someone steps up
* exp/intermediate: Prior experience is likely helpful
* kind/bug: A bug in existing code (including security flaws)
* topic/IPNI: Issues related to InterPlanetary Network Indexer
* topic/routing: Topic routing
