-
Notifications
You must be signed in to change notification settings - Fork 173
PS: Expose request metrics #3232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PS: Expose request metrics #3232
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 8 of 8 files at r1.
Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @lukedirtwalker)
go/path_srv/internal/metrics/requests.go, line 29 at r1 (raw file):
) var reqResults = []string{RequestCached, RequestFetched, ErrCrypto, ErrDB, ErrTimeout, ErrReply}
If it is up to me, I would not include in the description the possible label values. People can get the values from prometheus and there is only risk to forget
updating it.
If we have to keep that, I would like to have it next to the description (some line above) or in the metric/metric.go.
No strong preference though.
go/path_srv/internal/metrics/requests.go, line 33 at r1 (raw file):
// RequestOkLabels contains the labels for a request that succeeded. type RequestOkLabels struct { Type proto.PathSegType
type
is a bit dangerous name (e.g. if in the future we want to make it unexported it will give strange errors). Maybe RequestType
go/path_srv/internal/metrics/requests.go, line 46 at r1 (raw file):
func (l RequestOkLabels) Values() []string { return []string{l.Type.String(), strconv.FormatBool(l.CacheOnly), strconv.FormatUint(uint64(l.DstISD), 10)}
can we have only .String()
here?
go/path_srv/internal/metrics/requests.go, line 69 at r1 (raw file):
total *prometheus.CounterVec replySegs *prometheus.CounterVec replyRevs *prometheus.CounterVec
maybe this could be one-liner
go/path_srv/internal/metrics/requests.go, line 74 at r1 (raw file):
func newRequests() Request { return Request{ total: prom.NewCounterVec(Namespace, "", "requests_total",
subsystem is missing,
go/path_srv/internal/metrics/requests.go, line 79 at r1 (raw file):
RequestLabels{}.Labels()), replySegs: prom.NewCounterVec(Namespace, "", "requests_reply_segs_total", "Number of segments in reply to path request.", RequestOkLabels{}.Labels()),
The description confuses me because it does not have the word requests
.
What is requests?
go/path_srv/internal/metrics/requests.go, line 86 at r1 (raw file):
// Total returns the counter for requests total. func (r Request) Total(l RequestLabels) prometheus.Counter {
Total
as a name might not be accurate, maybe Requests
go/path_srv/internal/metrics/requests_test.go, line 26 at r1 (raw file):
func TestRequestLabels(t *testing.T) { promtest.CheckLabelsStruct(t, metrics.RequestOkLabels{}) promtest.CheckLabelsStruct(t, metrics.RequestLabels{})
maybe we could follow the table test driven approach
https://github.com/scionproto/scion/blob/master/go/beacon_srv/internal/metrics/metrics_test.go#L24-L41
go/path_srv/internal/segreq/handler.go, line 60 at r1 (raw file):
ctx := request.Context() logger := log.FromCtx(ctx) labels := metrics.RequestLabels{
Ideally we should initialize all the label values. (e.g the label value of type for the first error should be proto.PathSegType_unset
)
labels := metrics.Labels{ifid: "0", result: success}
if err!=nil {
A pattern that we follow is the label.WithResult()
Check this
scion/go/lib/infra/modules/trust/trust.go
Lines 203 to 218 in 2eab837
l := metrics.LookupLabels{ | |
Client: addrLocation(client, store.ia), | |
Trigger: metrics.FromCtx(ctx), | |
ReqType: metrics.TRCReq, | |
CacheOnly: opts.LocalOnly, | |
Result: metrics.ErrInternal, | |
} | |
trcObj, err := store.trustdb.GetTRCVersion(ctx, isd, version) | |
if err != nil { | |
metrics.Store.Lookup(l.WithResult(metrics.ErrDB)).Inc() | |
return nil, err | |
} | |
if trcObj != nil { | |
metrics.Store.Lookup(l.WithResult(metrics.OkCached)).Inc() | |
return trcObj, nil | |
} |
go/path_srv/internal/segreq/handler.go, line 88 at r1 (raw file):
logger.Error("Failed to handler request", "err", err) sendAck(proto.Ack_ErrCode_reject, err.Error()) // TODO(lukedirtwalker): set result label if we have error categorization.
left?
go/path_srv/internal/segreq/handler.go, line 108 at r1 (raw file):
logger.Error("[segReqHandler] Failed to send reply", "err", err) labels.Result = metrics.ErrReply metrics.Requests.Total(labels).Inc()
the withLabel()
approach
go/path_srv/internal/segreq/handler.go, line 171 at r1 (raw file):
} func determineReplyType(segs segfetcher.Segments) proto.PathSegType {
my preference is that the label functionality to live within the metrics package.
Unless we need to have cycle import (but we can usually avoid it).
For example https://github.com/scionproto/scion/blob/master/go/beacon_srv/internal/metrics/beaconing.go#L66
d30b061
to
0938e7c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @karampok)
go/path_srv/internal/metrics/requests.go, line 29 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
If it is up to me, I would not include in the description the possible label values. People can get the values from prometheus and there is only risk to forget
updating it.If we have to keep that, I would like to have it next to the description (some line above) or in the metric/metric.go.
No strong preference though.
Done.
go/path_srv/internal/metrics/requests.go, line 33 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
type
is a bit dangerous name (e.g. if in the future we want to make it unexported it will give strange errors). MaybeRequestType
Done.
go/path_srv/internal/metrics/requests.go, line 46 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
can we have only
.String()
here?
Done.
go/path_srv/internal/metrics/requests.go, line 69 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
maybe this could be one-liner
It's not something we usually use in our codebase.
go/path_srv/internal/metrics/requests.go, line 74 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
subsystem is missing,
Done.
go/path_srv/internal/metrics/requests.go, line 79 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
The description confuses me because it does not have the word
requests
.
What is requests?
Done.
go/path_srv/internal/metrics/requests.go, line 86 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
Total
as a name might not be accurate, maybeRequests
Done.
go/path_srv/internal/metrics/requests_test.go, line 26 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
maybe we could follow the table test driven approach
https://github.com/scionproto/scion/blob/master/go/beacon_srv/internal/metrics/metrics_test.go#L24-L41
I don't see why, it isn't much code that is duplicated. With table driven approach we would have more lines. But I don't feel strongly about this.
go/path_srv/internal/segreq/handler.go, line 60 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
Ideally we should initialize all the label values. (e.g the label value of type for the first error should be
proto.PathSegType_unset
)labels := metrics.Labels{ifid: "0", result: success} if err!=nil {
A pattern that we follow is the
label.WithResult()
Check thisscion/go/lib/infra/modules/trust/trust.go
Lines 203 to 218 in 2eab837
l := metrics.LookupLabels{ Client: addrLocation(client, store.ia), Trigger: metrics.FromCtx(ctx), ReqType: metrics.TRCReq, CacheOnly: opts.LocalOnly, Result: metrics.ErrInternal, } trcObj, err := store.trustdb.GetTRCVersion(ctx, isd, version) if err != nil { metrics.Store.Lookup(l.WithResult(metrics.ErrDB)).Inc() return nil, err } if trcObj != nil { metrics.Store.Lookup(l.WithResult(metrics.OkCached)).Inc() return trcObj, nil }
Hm yeah I like the With approach. proto.PathSegType_unset
is already the default. For results I'd rather have an error default, so that I can easily see that something is wrong in the metrics if we forget to add the With somewhere.
go/path_srv/internal/segreq/handler.go, line 88 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
left?
? This can't be done easily, so yeah this will stay here.
go/path_srv/internal/segreq/handler.go, line 108 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
the
withLabel()
approach
Done.
go/path_srv/internal/segreq/handler.go, line 171 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
my preference is that the label functionality to live within the metrics package.
Unless we need to have cycle import (but we can usually avoid it).
For example https://github.com/scionproto/scion/blob/master/go/beacon_srv/internal/metrics/beaconing.go#L66
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commit description needs updating
Reviewed 4 of 4 files at r2.
Reviewable status: all files reviewed, 9 unresolved discussions (waiting on @lukedirtwalker)
go/path_srv/internal/metrics/metrics.go, line 35 at r2 (raw file):
const ( Success = prom.Success RegistrationNew = "new"
= "request_new"
From doc/Metrics.md
Use values that can be searched with regex. E.g. prepend err_ for every error result.
go/path_srv/internal/metrics/requests.go, line 87 at r1 (raw file):
// Total returns the counter for requests total. func (r Request) Total(l RequestLabels) prometheus.Counter { return r.total.WithLabelValues(l.Values()...)
r.total
->r.count
just to be in order.
go/path_srv/internal/metrics/requests.go, line 82 at r2 (raw file):
replySegs: prom.NewCounterVec(Namespace, subsystem, "reply_segs_total", "Number of segments in reply to segment requests.", RequestOkLabels{}.Labels()), replyRevs: prom.NewCounterVec(Namespace, subsystem, "reply_revs_total",
reply_revocation_total
, kormat once asked me not to short the names.
go/path_srv/internal/metrics/requests.go, line 83 at r2 (raw file):
Number of segments in (returned) reply to path request.
One is with returned, other not
go/path_srv/internal/metrics/requests.go, line 94 at r2 (raw file):
// ReplySegsTotal returns the counter for the number of segments in a seg reply. func (r Request) ReplySegsTotal(l RequestOkLabels) prometheus.Counter {
Suffix Total
can be dropped
go/path_srv/internal/metrics/requests.go, line 100 at r2 (raw file):
// ReplyRevsTotal returns the counter for the number of revocations in a seg // reply. func (r Request) ReplyRevsTotal(l RequestOkLabels) prometheus.Counter {
ditto
go/path_srv/internal/metrics/requests.go, line 104 at r2 (raw file):
} func DetermineReplyType(segs segfetcher.Segments) proto.PathSegType {
requests.go:104:1: exported function DetermineReplyType should have comment or be unexported
golinter message
go/path_srv/internal/metrics/requests_test.go, line 26 at r1 (raw file):
Previously, lukedirtwalker (Lukas Vogel) wrote…
I don't see why, it isn't much code that is duplicated. With table driven approach we would have more lines. But I don't feel strongly about this.
Well, we have three preferences:
You the above, me the table-driven approach, and Dominik the map-table-driven :).
Maybe we should go to public voting!
Feel free to keep it.
go/path_srv/internal/segreq/handler.go, line 88 at r1 (raw file):
Previously, lukedirtwalker (Lukas Vogel) wrote…
? This can't be done easily, so yeah this will stay here.
It is not clear for me the task.
I would suggest that you create an issue so we can work it. Or not :)
I am not sure when we should add issue and when TODO in the code.
For me usually a TODO in the code will not really be fixed unless a person run into that TODO twice.
0938e7c
to
a926542
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 6 of 9 files reviewed, 9 unresolved discussions (waiting on @karampok)
go/path_srv/internal/metrics/metrics.go, line 35 at r2 (raw file):
Previously, karampok (Konstantinos) wrote…
= "request_new"
From doc/Metrics.md
Use values that can be searched with regex. E.g. prepend err_ for every error result.
Good point. Done.
go/path_srv/internal/metrics/requests.go, line 87 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
r.total
->r.count
just to be in order.
Done.
go/path_srv/internal/metrics/requests.go, line 82 at r2 (raw file):
Previously, karampok (Konstantinos) wrote…
reply_revocation_total
, kormat once asked me not to short the names.
Done.
go/path_srv/internal/metrics/requests.go, line 83 at r2 (raw file):
Previously, karampok (Konstantinos) wrote…
Number of segments in (returned) reply to path request.
One is with returned, other not
Done.
go/path_srv/internal/metrics/requests.go, line 94 at r2 (raw file):
Previously, karampok (Konstantinos) wrote…
Suffix
Total
can be dropped
Done.
go/path_srv/internal/metrics/requests.go, line 100 at r2 (raw file):
Previously, karampok (Konstantinos) wrote…
ditto
Done.
go/path_srv/internal/metrics/requests.go, line 104 at r2 (raw file):
Previously, karampok (Konstantinos) wrote…
requests.go:104:1: exported function DetermineReplyType should have comment or be unexported
golinter message
Done.
go/path_srv/internal/metrics/requests_test.go, line 26 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
Well, we have three preferences:
You the above, me the table-driven approach, and Dominik the map-table-driven :).
Maybe we should go to public voting!Feel free to keep it.
I generally like table-driven. But in this case I just don't see the advantage of it, it's very little repetition.
go/path_srv/internal/segreq/handler.go, line 88 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
It is not clear for me the task.
I would suggest that you create an issue so we can work it. Or not :)
I am not sure when we should add issue and when TODO in the code.For me usually a TODO in the code will not really be fixed unless a person run into that TODO twice.
Done. (#3240)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r3.
Reviewable status:complete! all files reviewed, all discussions resolved
Adds metrics: ``` # HELP ps_requests_reply_revs_total Number of revocations returned in reply to path request. # TYPE ps_requests_reply_revs_total counter ps_requests_reply_revs_total{cache_only="false",dst_isd="1",type="down"} 0 # HELP ps_requests_reply_segs_total Number of segments in reply to path request. # TYPE ps_requests_reply_segs_total counter ps_requests_reply_segs_total{cache_only="false",dst_isd="1",type="down"} 4 # HELP ps_requests_total Number of path requests. "result" can be one of: [cached,fetched,err_crypto,err_db,err_timeout,err_reply] # TYPE ps_requests_total counter ps_requests_total{cache_only="false",dst_isd="1",result="ok_success",type="down"} 4 ``` Contributes scionproto#3106
a926542
to
68cc687
Compare
Adds metrics:
Contributes #3106
This change is