Skip to content

Conversation

lukedirtwalker
Copy link
Collaborator

@lukedirtwalker lukedirtwalker commented Oct 8, 2019

Adds metrics:

# HELP ps_requests_replied_revocations_total Number of revocations in reply to segments requests.
# TYPE ps_requests_replied_revocations_total counter
ps_requests_replied_revocations_total{cache_only="false",dst_isd="1",seg_type="down"} 0
# HELP ps_requests_replied_segments_total Number of segments in reply to segment requests.
# TYPE ps_requests_replied_segments_total counter
ps_requests_replied_segments_total{cache_only="false",dst_isd="1",seg_type="down"} 4
# HELP ps_requests_total Number of segment requests total. "result" indicates the outcome.
# TYPE ps_requests_total counter
ps_requests_total{cache_only="false",dst_isd="1",result="ok_success",seg_type="down"} 4

Contributes #3106


This change is Reviewable

@lukedirtwalker lukedirtwalker added PS c/observability Metrics, logging, tracing labels Oct 8, 2019
Copy link
Contributor

@karampok karampok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 8 of 8 files at r1.
Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @lukedirtwalker)


go/path_srv/internal/metrics/requests.go, line 29 at r1 (raw file):

)

var reqResults = []string{RequestCached, RequestFetched, ErrCrypto, ErrDB, ErrTimeout, ErrReply}

If it is up to me, I would not include in the description the possible label values. People can get the values from prometheus and there is only risk to forget
updating it.

If we have to keep that, I would like to have it next to the description (some line above) or in the metric/metric.go.
No strong preference though.


go/path_srv/internal/metrics/requests.go, line 33 at r1 (raw file):

// RequestOkLabels contains the labels for a request that succeeded.
type RequestOkLabels struct {
	Type      proto.PathSegType

type is a bit dangerous name (e.g. if in the future we want to make it unexported it will give strange errors). Maybe RequestType


go/path_srv/internal/metrics/requests.go, line 46 at r1 (raw file):

func (l RequestOkLabels) Values() []string {
	return []string{l.Type.String(), strconv.FormatBool(l.CacheOnly),
		strconv.FormatUint(uint64(l.DstISD), 10)}

can we have only .String() here?


go/path_srv/internal/metrics/requests.go, line 69 at r1 (raw file):

	total     *prometheus.CounterVec
	replySegs *prometheus.CounterVec
	replyRevs *prometheus.CounterVec

maybe this could be one-liner


go/path_srv/internal/metrics/requests.go, line 74 at r1 (raw file):

func newRequests() Request {
	return Request{
		total: prom.NewCounterVec(Namespace, "", "requests_total",

subsystem is missing,


go/path_srv/internal/metrics/requests.go, line 79 at r1 (raw file):

			RequestLabels{}.Labels()),
		replySegs: prom.NewCounterVec(Namespace, "", "requests_reply_segs_total",
			"Number of segments in reply to path request.", RequestOkLabels{}.Labels()),

The description confuses me because it does not have the word requests.
What is requests?


go/path_srv/internal/metrics/requests.go, line 86 at r1 (raw file):

// Total returns the counter for requests total.
func (r Request) Total(l RequestLabels) prometheus.Counter {

Total as a name might not be accurate, maybe Requests


go/path_srv/internal/metrics/requests_test.go, line 26 at r1 (raw file):

func TestRequestLabels(t *testing.T) {
	promtest.CheckLabelsStruct(t, metrics.RequestOkLabels{})
	promtest.CheckLabelsStruct(t, metrics.RequestLabels{})

maybe we could follow the table test driven approach
https://github.com/scionproto/scion/blob/master/go/beacon_srv/internal/metrics/metrics_test.go#L24-L41


go/path_srv/internal/segreq/handler.go, line 60 at r1 (raw file):

	ctx := request.Context()
	logger := log.FromCtx(ctx)
	labels := metrics.RequestLabels{

Ideally we should initialize all the label values. (e.g the label value of type for the first error should be proto.PathSegType_unset)

labels := metrics.Labels{ifid: "0", result: success}

if err!=nil {  

A pattern that we follow is the label.WithResult()
Check this

l := metrics.LookupLabels{
Client: addrLocation(client, store.ia),
Trigger: metrics.FromCtx(ctx),
ReqType: metrics.TRCReq,
CacheOnly: opts.LocalOnly,
Result: metrics.ErrInternal,
}
trcObj, err := store.trustdb.GetTRCVersion(ctx, isd, version)
if err != nil {
metrics.Store.Lookup(l.WithResult(metrics.ErrDB)).Inc()
return nil, err
}
if trcObj != nil {
metrics.Store.Lookup(l.WithResult(metrics.OkCached)).Inc()
return trcObj, nil
}


go/path_srv/internal/segreq/handler.go, line 88 at r1 (raw file):

		logger.Error("Failed to handler request", "err", err)
		sendAck(proto.Ack_ErrCode_reject, err.Error())
		// TODO(lukedirtwalker): set result label if we have error categorization.

left?


go/path_srv/internal/segreq/handler.go, line 108 at r1 (raw file):

		logger.Error("[segReqHandler] Failed to send reply", "err", err)
		labels.Result = metrics.ErrReply
		metrics.Requests.Total(labels).Inc()

the withLabel() approach


go/path_srv/internal/segreq/handler.go, line 171 at r1 (raw file):

}

func determineReplyType(segs segfetcher.Segments) proto.PathSegType {

my preference is that the label functionality to live within the metrics package.
Unless we need to have cycle import (but we can usually avoid it).
For example https://github.com/scionproto/scion/blob/master/go/beacon_srv/internal/metrics/beaconing.go#L66

Copy link
Collaborator Author

@lukedirtwalker lukedirtwalker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @karampok)


go/path_srv/internal/metrics/requests.go, line 29 at r1 (raw file):

Previously, karampok (Konstantinos) wrote…

If it is up to me, I would not include in the description the possible label values. People can get the values from prometheus and there is only risk to forget
updating it.

If we have to keep that, I would like to have it next to the description (some line above) or in the metric/metric.go.
No strong preference though.

Done.


go/path_srv/internal/metrics/requests.go, line 33 at r1 (raw file):

Previously, karampok (Konstantinos) wrote…

type is a bit dangerous name (e.g. if in the future we want to make it unexported it will give strange errors). Maybe RequestType

Done.


go/path_srv/internal/metrics/requests.go, line 46 at r1 (raw file):

Previously, karampok (Konstantinos) wrote…

can we have only .String() here?

Done.


go/path_srv/internal/metrics/requests.go, line 69 at r1 (raw file):

Previously, karampok (Konstantinos) wrote…

maybe this could be one-liner

It's not something we usually use in our codebase.


go/path_srv/internal/metrics/requests.go, line 74 at r1 (raw file):

Previously, karampok (Konstantinos) wrote…

subsystem is missing,

Done.


go/path_srv/internal/metrics/requests.go, line 79 at r1 (raw file):

Previously, karampok (Konstantinos) wrote…

The description confuses me because it does not have the word requests.
What is requests?

Done.


go/path_srv/internal/metrics/requests.go, line 86 at r1 (raw file):

Previously, karampok (Konstantinos) wrote…

Total as a name might not be accurate, maybe Requests

Done.


go/path_srv/internal/metrics/requests_test.go, line 26 at r1 (raw file):

Previously, karampok (Konstantinos) wrote…

maybe we could follow the table test driven approach
https://github.com/scionproto/scion/blob/master/go/beacon_srv/internal/metrics/metrics_test.go#L24-L41

I don't see why, it isn't much code that is duplicated. With table driven approach we would have more lines. But I don't feel strongly about this.


go/path_srv/internal/segreq/handler.go, line 60 at r1 (raw file):

Previously, karampok (Konstantinos) wrote…

Ideally we should initialize all the label values. (e.g the label value of type for the first error should be proto.PathSegType_unset)

labels := metrics.Labels{ifid: "0", result: success}

if err!=nil {  

A pattern that we follow is the label.WithResult()
Check this

l := metrics.LookupLabels{
Client: addrLocation(client, store.ia),
Trigger: metrics.FromCtx(ctx),
ReqType: metrics.TRCReq,
CacheOnly: opts.LocalOnly,
Result: metrics.ErrInternal,
}
trcObj, err := store.trustdb.GetTRCVersion(ctx, isd, version)
if err != nil {
metrics.Store.Lookup(l.WithResult(metrics.ErrDB)).Inc()
return nil, err
}
if trcObj != nil {
metrics.Store.Lookup(l.WithResult(metrics.OkCached)).Inc()
return trcObj, nil
}

Hm yeah I like the With approach. proto.PathSegType_unset is already the default. For results I'd rather have an error default, so that I can easily see that something is wrong in the metrics if we forget to add the With somewhere.


go/path_srv/internal/segreq/handler.go, line 88 at r1 (raw file):

Previously, karampok (Konstantinos) wrote…

left?

? This can't be done easily, so yeah this will stay here.


go/path_srv/internal/segreq/handler.go, line 108 at r1 (raw file):

Previously, karampok (Konstantinos) wrote…

the withLabel() approach

Done.


go/path_srv/internal/segreq/handler.go, line 171 at r1 (raw file):

Previously, karampok (Konstantinos) wrote…

my preference is that the label functionality to live within the metrics package.
Unless we need to have cycle import (but we can usually avoid it).
For example https://github.com/scionproto/scion/blob/master/go/beacon_srv/internal/metrics/beaconing.go#L66

Done.

Copy link
Contributor

@karampok karampok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit description needs updating

Reviewed 4 of 4 files at r2.
Reviewable status: all files reviewed, 9 unresolved discussions (waiting on @lukedirtwalker)


go/path_srv/internal/metrics/metrics.go, line 35 at r2 (raw file):

const (
	Success               = prom.Success
	RegistrationNew       = "new"

= "request_new"

From doc/Metrics.md
Use values that can be searched with regex. E.g. prepend err_ for every error result.


go/path_srv/internal/metrics/requests.go, line 87 at r1 (raw file):

// Total returns the counter for requests total.
func (r Request) Total(l RequestLabels) prometheus.Counter {
	return r.total.WithLabelValues(l.Values()...)

r.total ->r.count just to be in order.


go/path_srv/internal/metrics/requests.go, line 82 at r2 (raw file):

		replySegs: prom.NewCounterVec(Namespace, subsystem, "reply_segs_total",
			"Number of segments in reply to segment requests.", RequestOkLabels{}.Labels()),
		replyRevs: prom.NewCounterVec(Namespace, subsystem, "reply_revs_total",

reply_revocation_total, kormat once asked me not to short the names.


go/path_srv/internal/metrics/requests.go, line 83 at r2 (raw file):

Number of segments in (returned) reply to path request.
One is with returned, other not


go/path_srv/internal/metrics/requests.go, line 94 at r2 (raw file):

// ReplySegsTotal returns the counter for the number of segments in a seg reply.
func (r Request) ReplySegsTotal(l RequestOkLabels) prometheus.Counter {

Suffix Total can be dropped


go/path_srv/internal/metrics/requests.go, line 100 at r2 (raw file):

// ReplyRevsTotal returns the counter for the number of revocations in a seg
// reply.
func (r Request) ReplyRevsTotal(l RequestOkLabels) prometheus.Counter {

ditto


go/path_srv/internal/metrics/requests.go, line 104 at r2 (raw file):

}

func DetermineReplyType(segs segfetcher.Segments) proto.PathSegType {

requests.go:104:1: exported function DetermineReplyType should have comment or be unexported golinter message


go/path_srv/internal/metrics/requests_test.go, line 26 at r1 (raw file):

Previously, lukedirtwalker (Lukas Vogel) wrote…

I don't see why, it isn't much code that is duplicated. With table driven approach we would have more lines. But I don't feel strongly about this.

Well, we have three preferences:
You the above, me the table-driven approach, and Dominik the map-table-driven :).
Maybe we should go to public voting!

Feel free to keep it.


go/path_srv/internal/segreq/handler.go, line 88 at r1 (raw file):

Previously, lukedirtwalker (Lukas Vogel) wrote…

? This can't be done easily, so yeah this will stay here.

It is not clear for me the task.
I would suggest that you create an issue so we can work it. Or not :)
I am not sure when we should add issue and when TODO in the code.

For me usually a TODO in the code will not really be fixed unless a person run into that TODO twice.

Copy link
Collaborator Author

@lukedirtwalker lukedirtwalker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 6 of 9 files reviewed, 9 unresolved discussions (waiting on @karampok)


go/path_srv/internal/metrics/metrics.go, line 35 at r2 (raw file):

Previously, karampok (Konstantinos) wrote…

= "request_new"

From doc/Metrics.md
Use values that can be searched with regex. E.g. prepend err_ for every error result.

Good point. Done.


go/path_srv/internal/metrics/requests.go, line 87 at r1 (raw file):

Previously, karampok (Konstantinos) wrote…

r.total ->r.count just to be in order.

Done.


go/path_srv/internal/metrics/requests.go, line 82 at r2 (raw file):

Previously, karampok (Konstantinos) wrote…

reply_revocation_total, kormat once asked me not to short the names.

Done.


go/path_srv/internal/metrics/requests.go, line 83 at r2 (raw file):

Previously, karampok (Konstantinos) wrote…

Number of segments in (returned) reply to path request.
One is with returned, other not

Done.


go/path_srv/internal/metrics/requests.go, line 94 at r2 (raw file):

Previously, karampok (Konstantinos) wrote…

Suffix Total can be dropped

Done.


go/path_srv/internal/metrics/requests.go, line 100 at r2 (raw file):

Previously, karampok (Konstantinos) wrote…

ditto

Done.


go/path_srv/internal/metrics/requests.go, line 104 at r2 (raw file):

Previously, karampok (Konstantinos) wrote…

requests.go:104:1: exported function DetermineReplyType should have comment or be unexported golinter message

Done.


go/path_srv/internal/metrics/requests_test.go, line 26 at r1 (raw file):

Previously, karampok (Konstantinos) wrote…

Well, we have three preferences:
You the above, me the table-driven approach, and Dominik the map-table-driven :).
Maybe we should go to public voting!

Feel free to keep it.

I generally like table-driven. But in this case I just don't see the advantage of it, it's very little repetition.


go/path_srv/internal/segreq/handler.go, line 88 at r1 (raw file):

Previously, karampok (Konstantinos) wrote…

It is not clear for me the task.
I would suggest that you create an issue so we can work it. Or not :)
I am not sure when we should add issue and when TODO in the code.

For me usually a TODO in the code will not really be fixed unless a person run into that TODO twice.

Done. (#3240)

Copy link
Contributor

@karampok karampok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r3.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

Adds metrics:
```
# HELP ps_requests_reply_revs_total Number of revocations returned in reply to path request.
# TYPE ps_requests_reply_revs_total counter
ps_requests_reply_revs_total{cache_only="false",dst_isd="1",type="down"} 0
# HELP ps_requests_reply_segs_total Number of segments in reply to path request.
# TYPE ps_requests_reply_segs_total counter
ps_requests_reply_segs_total{cache_only="false",dst_isd="1",type="down"} 4
# HELP ps_requests_total Number of path requests. "result" can be one of: [cached,fetched,err_crypto,err_db,err_timeout,err_reply]
# TYPE ps_requests_total counter
ps_requests_total{cache_only="false",dst_isd="1",result="ok_success",type="down"} 4
```

Contributes scionproto#3106
@lukedirtwalker lukedirtwalker merged commit c865360 into scionproto:master Oct 11, 2019
@lukedirtwalker lukedirtwalker deleted the pubPSMetrics branch October 11, 2019 09:01
@kormat kormat mentioned this pull request Oct 11, 2019
13 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c/observability Metrics, logging, tracing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants