-
Notifications
You must be signed in to change notification settings - Fork 173
TrustStore: Add metrics #3169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TrustStore: Add metrics #3169
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 18 of 18 files at r1.
Reviewable status: all files reviewed, 18 unresolved discussions (waiting on @oncilla)
go/lib/infra/modules/trust/handlers.go, line 70 at r1 (raw file):
subCtx, cancelF := context.WithTimeout(h.request.Context(), HandlerTimeout) defer cancelF() subCtx = metrics.CtxWith(subCtx, metrics.TRCReq)
Can you point me out what we do with that, on-line documentation, example? thx!
go/lib/infra/modules/trust/handlers.go, line 343 at r1 (raw file):
// addrLocation extracts the client from an infra request. If the request is nil, // "app" is returned. func addrLocation(client net.Addr, localIA addr.IA) string {
Help function about setting labels could be only inside to metrics.
(e.g. not to have long files, polute the go coverage, and sure personal taste)
Also do we want to put comment on the small letters?
go/lib/infra/modules/trust/handlers.go, line 345 at r1 (raw file):
func addrLocation(client net.Addr, localIA addr.IA) string { if client == nil { return "app"
Usually you do not put direct strings (I do not mind but usually you do ;) )
go/lib/infra/modules/trust/signhelper.go, line 165 at r1 (raw file):
} if v.ignoreSign(cpld, spld.Sign) { // Do not increase metric
why? maybe on the comment better
go/lib/infra/modules/trust/trust.go, line 207 at r1 (raw file):
} } return store.getTRCFromNetwork(ctx, &trcRequest{
It would have been more intuitive to sto increase the counter when you handle this error.
Otherwise in the future someone might just replicate it.
go/lib/infra/modules/trust/trust.go, line 256 at r1 (raw file):
func (store *Store) insertTRCHookForwarding(ctx context.Context, trcObj *trc.TRC) error { if err := store.insertTRCHookLocal(ctx, trcObj); err != nil { return err
on purpose we do not put metric here?
go/lib/infra/modules/trust/trust.go, line 388 at r1 (raw file):
} } l.Result, chain, err = store.getChainFromNetworkWithMetric(ctx, client, &chainRequest{
suggestion would be that we we not mix the labels with the logic, so we could in the future just remove the metrics part without thinking. For example
chain,err:= store.getChainFromNetwork(ctx... )
l:=getLabels(chain,err)
metrics.Store.Lookup(l).Inc()
go/lib/infra/modules/trust/trust.go, line 414 at r1 (raw file):
} _, err := store.trustdb.InsertChain(ctx, chain) if err != nil {
Do we on purpose avoid sending errors?
go/lib/infra/modules/trust/internal/metrics/db.go, line 62 at r1 (raw file):
// Values returns the label values in the order defined by Labels. func (l *QueryLabels) Values() []string {
no * better
go/lib/infra/modules/trust/internal/metrics/handler.go, line 39 at r1 (raw file):
// Values returns the label values in the order defined by Labels. func (l *HandlerLabels) Values() []string {
no *
go/lib/infra/modules/trust/internal/metrics/labels_test.go, line 24 at r1 (raw file):
) func TestQueryLabels(t *testing.T) {
can be an array?
go/lib/infra/modules/trust/internal/metrics/metrics.go, line 33 at r1 (raw file):
ASInspector = "as_inspector" Load = "load" App = "application"
I always suggest that unless we need to export, we should keep private
go/lib/infra/modules/trust/internal/metrics/metrics.go, line 54 at r1 (raw file):
DB = newDB() // Handler exposes the handler metrics. Handler = newHandler()
nit/ Handler is a name that is overloaded
go/lib/infra/modules/trust/trustdb/metrics.go, line 32 at r1 (raw file):
// WithMetrics wraps the given TrustDB into one that also exports metrics. func WithMetrics(driver string, trustDB TrustDB) TrustDB {
types / vars should be on top of the file for better reading
go/lib/infra/modules/trust/trustdb/metrics.go, line 41 at r1 (raw file):
return &metricsTrustDB{ metricsExecutor: rwDBWrapper, db: trustDB,
trustDB exists twice in the struct, can it be once?
go/lib/infra/modules/trust/trustdb/metrics.go, line 46 at r1 (raw file):
type observer struct { driver string
instead of struct/nested struct maybe w could have closure on that.
Also driver i was expected to be something fancier than a string :)
go/lib/infra/modules/trust/trustdb/metrics.go, line 55 at r1 (raw file):
Driver: o.driver, Operation: op, Result: db.ErrToMetricLabel(action(ctx)),
So basically from the code we do the db action because we want to get a value for the metrics :)
Just was super confused around here
go/lib/infra/modules/trust/trustdb/metrics.go, line 231 at r1 (raw file):
var res *cert.Chain var err error db.metrics.Observe(ctx, metrics.GetChainMax, func(ctx context.Context) error {
I am asking for a change but it would be cleaner
res,err:=db.rwDB.GetChainMaxVersion(ctx.ia)
Observe(metrics.GetChainMax,err)
return res,err
4beb125
to
e11c665
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 18 unresolved discussions (waiting on @karampok)
go/lib/infra/modules/trust/handlers.go, line 70 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
Can you point me out what we do with that, on-line documentation, example? thx!
Done.
go/lib/infra/modules/trust/handlers.go, line 343 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
Help function about setting labels could be only inside to metrics.
(e.g. not to have long files, polute the go coverage, and sure personal taste)
Also do we want to put comment on the small letters?
I don't want to put that in the metrics package.
infra
pulls in a lot of dependencies, and we do not want that in a metrics package.
go/lib/infra/modules/trust/handlers.go, line 345 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
Usually you do not put direct strings (I do not mind but usually you do ;) )
Done.
go/lib/infra/modules/trust/signhelper.go, line 165 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
why? maybe on the comment better
Done.
go/lib/infra/modules/trust/trust.go, line 207 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
It would have been more intuitive to sto increase the counter when you handle this error.
Otherwise in the future someone might just replicate it.
I don't understand, what counter and what error are you talking about?
go/lib/infra/modules/trust/trust.go, line 256 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
on purpose we do not put metric here?
yes, there is no counter we could increase here.
go/lib/infra/modules/trust/trust.go, line 388 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
suggestion would be that we we not mix the labels with the logic, so we could in the future just remove the metrics part without thinking. For example
chain,err:= store.getChainFromNetwork(ctx... ) l:=getLabels(chain,err) metrics.Store.Lookup(l).Inc()
I like the idea. 👍
Done
go/lib/infra/modules/trust/trust.go, line 414 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
Do we on purpose avoid sending errors?
Yes, there is no point of forwarding an error if we cannot insert the data.
There is nothing the receiving end could do about it, nor should it care.
(This is about forwarding, not responding to the peer.)
go/lib/infra/modules/trust/internal/metrics/db.go, line 62 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
no * better
Done.
go/lib/infra/modules/trust/internal/metrics/handler.go, line 39 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
no *
Done.
go/lib/infra/modules/trust/internal/metrics/labels_test.go, line 24 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
can be an array?
Done as a map, such that we easily spot which sub test fails.
go/lib/infra/modules/trust/internal/metrics/metrics.go, line 33 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
I always suggest that unless we need to export, we should keep private
Done.
go/lib/infra/modules/trust/internal/metrics/metrics.go, line 54 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
nit/ Handler is a name that is overloaded
Agreed, but in this case I think it fits pretty well, since those are the Handler
metrics.
go/lib/infra/modules/trust/trustdb/metrics.go, line 32 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
types / vars should be on top of the file for better reading
WithMetrics is ~constructor of metricsTrustDB -> I put it where the constructor would be.
go/lib/infra/modules/trust/trustdb/metrics.go, line 41 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
trustDB exists twice in the struct, can it be once?
they are different types.
go/lib/infra/modules/trust/trustdb/metrics.go, line 46 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
instead of struct/nested struct maybe w could have closure on that.
Also driver i was expected to be something fancier than a string :)
how would that look like?
And what would be the advantage?
go/lib/infra/modules/trust/trustdb/metrics.go, line 55 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
So basically from the code we do the db action because we want to get a value for the metrics :)
Just was super confused around here
made more explicit
go/lib/infra/modules/trust/trustdb/metrics.go, line 231 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
I am asking for a change but it would be cleaner
res,err:=db.rwDB.GetChainMaxVersion(ctx.ia) Observe(metrics.GetChainMax,err) return res,err
that is not possible because the observe also does tracing
-> it starts and stops a span.
e11c665
to
2d8f275
Compare
go/lib/infra/modules/trust/trust.go, line 207 at r1 (raw file): Previously, Oncilla wrote…
I am not sure if it applies but I mean this
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 9 of 18 files reviewed, 2 unresolved discussions (waiting on @karampok and @oncilla)
go/lib/infra/modules/trust/trustdb/metrics.go, line 46 at r1 (raw file):
Previously, Oncilla wrote…
how would that look like?
And what would be the advantage?
hard to demonstrate
func WithMetrics(driver string, trustDB TrustDB) TrustDB {
rwDBWrapper := &metricsExecutor{
rwDB: trustDB,
executeAndObserve: getObserFunc(driver),
}
return &metricsTrustDB{
metricsExecutor: rwDBWrapper,
db: trustDB,
}
}
func getObserFunc(drive string) ( func() ){
return func( .....) {
l:= metrics.QueryLabels{
....
Driver: drive
....
}
}
}
...
db.metricsExecutor.executeAndObserve( ....) instead of db.metricsExecutor.metrics.Observe
Does that make sense?
In my opinion of course, I think the code could have been simpler, but in this case the overall complexity hides this detail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 9 of 9 files at r2.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @oncilla)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @karampok)
go/lib/infra/modules/trust/trust.go, line 207 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
I am not sure if it applies but I mean this
func process() error { if err := subprocess(); err != nil { sendMetric("err_x") return err } //more processiing sendMetric("Success") //here we should send metrics return nil // because we know it is success } func subprocess() error { err := subsubproccess() if err != nil { sendMetric("err_x") //this can be duplicate } else { sendMetric("Sucess") //this can be wrong because more processing } return err }
We have metrics increments all in this method to exactly avoid the duplication of increments.
E.g. if you scatter the metrics increase around, you cannot really be sure whether metrics are always increased for every branch in the execution, and it is really hard to check.
With the current approach, you only need to check one single method to see whether metrics are updated for every branch.
go/lib/infra/modules/trust/trustdb/metrics.go, line 46 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
hard to demonstrate
func WithMetrics(driver string, trustDB TrustDB) TrustDB { rwDBWrapper := &metricsExecutor{ rwDB: trustDB, executeAndObserve: getObserFunc(driver), } return &metricsTrustDB{ metricsExecutor: rwDBWrapper, db: trustDB, } } func getObserFunc(drive string) ( func() ){ return func( .....) { l:= metrics.QueryLabels{ .... Driver: drive .... } } } ... db.metricsExecutor.executeAndObserve( ....) instead of db.metricsExecutor.metrics.Observe
Does that make sense?
In my opinion of course, I think the code could have been simpler, but in this case the overall complexity hides this detail.
tbh, I don't think we should not invest too much time into this package.
it will be removed when we move to the new CPPKI, same for trust/trust.go above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! all files reviewed, all discussions resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! all files reviewed, all discussions resolved
2d8f275
to
a683703
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r3.
Reviewable status:complete! all files reviewed, all discussions resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @oncilla)
a discussion (no related file):
- Shorten
database
todb
. - Differentiate between requests that could be locally satisfied or not
- change
truststore_outgoing_requests_total
totruststore_sent_requests_total
- change
truststore_served_requests_total
totruststore_received_requests_total
This change adds metrics to the trust store. The exposed metrics are: ``` truststore_lookups_total{client="app",local_only="false",req_type="chain_req",result="success",trigger="sig_verification"} 10 truststore_outgoing_requests_total{cache_only="false",client="app",req_type="chain_req",result="success",server="as_local",trigger="sig_verification"} 8 truststore_signature_verifications_total{result="success",type="chain"} 4 truststore_signature_verifications_total{result="success",type="signature"} 10 truststore_served_requests_total{cache_only="false",client="as_local",req_type="chain_push",result="success"} 207 ``` Additionally, move the TrustDB metrics code around and modify namespace: ``` truststore_database_queries_total{driver="sqlite",operation="get_chain",result="success"} 5 truststore_database_queries_total{driver="sqlite",operation="get_chain_max",result="success"} 2 truststore_database_queries_total{driver="sqlite",operation="get_trc",result="success"} 1 truststore_database_queries_total{driver="sqlite",operation="get_trc_max",result="success"} 2 truststore_database_queries_total{driver="sqlite",operation="insert_chain",result="success"} 208 truststore_database_queries_total{driver="sqlite",operation="insert_trc",result="success"} 52 ``` fixes scionproto#3148
a683703
to
435ad48
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 11 of 18 files reviewed, 1 unresolved discussion (waiting on @karampok and @kormat)
a discussion (no related file):
Previously, kormat (Stephen Shirley) wrote…
- Shorten
database
todb
.- Differentiate between requests that could be locally satisfied or not
- change
truststore_outgoing_requests_total
totruststore_sent_requests_total
- change
truststore_served_requests_total
totruststore_received_requests_total
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 7 of 7 files at r4.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @kormat)
expose `ok_insert` and `ok_exists` on push ``` # HELP truststore_db_queries_total Total queries to the database # TYPE truststore_db_queries_total counter truststore_db_queries_total{driver="sqlite",operation="get_chain",result="ok_success"} 13 truststore_db_queries_total{driver="sqlite",operation="get_chain_max",result="ok_success"} 2 truststore_db_queries_total{driver="sqlite",operation="get_trc",result="ok_success"} 1 truststore_db_queries_total{driver="sqlite",operation="get_trc_max",result="ok_success"} 2 truststore_db_queries_total{driver="sqlite",operation="insert_chain",result="ok_success"} 64 truststore_db_queries_total{driver="sqlite",operation="insert_trc",result="ok_success"} 10 # HELP truststore_lookups_total Number of crypto lookups in the trust store # TYPE truststore_lookups_total counter truststore_lookups_total{cache_only="false",client="application",req_type="chain_req",result="ok_cached",trigger="sig_verification"} 2295 truststore_lookups_total{cache_only="false",client="application",req_type="chain_req",result="ok_requested",trigger="sig_verification"} 8 truststore_lookups_total{cache_only="false",client="application",req_type="trc_req",result="ok_cached",trigger="sig_verification"} 7 truststore_lookups_total{cache_only="false",client="application",req_type="trc_req",result="ok_requested",trigger="sig_verification"} 1 truststore_lookups_total{cache_only="false",client="as_local",req_type="chain_req",result="ok_cached",trigger="chain_req"} 3 truststore_lookups_total{cache_only="true",client="application",req_type="chain_req",result="err_internal",trigger="load"} 1 truststore_lookups_total{cache_only="true",client="application",req_type="trc_req",result="err_not_found",trigger="load"} 1 # HELP truststore_received_requests_total Number of requests served by the trust store # TYPE truststore_received_requests_total counter truststore_received_requests_total{cache_only="false",client="as_local",req_type="chain_push",result="ok_exists"} 52 truststore_received_requests_total{cache_only="false",client="as_local",req_type="chain_push",result="ok_inserted"} 7 truststore_received_requests_total{cache_only="false",client="as_local",req_type="chain_req",result="ok_success"} 3 truststore_received_requests_total{cache_only="false",client="as_local",req_type="trc_push",result="ok_exists"} 9 truststore_received_requests_total{cache_only="false",client="isd_local",req_type="chain_push",result="ok_inserted"} 4 truststore_received_requests_total{cache_only="true",client="isd_local",req_type="chain_req",result="ok_success"} 9 # HELP truststore_sent_requests_total Number of requests initiated by the trust store # TYPE truststore_sent_requests_total counter truststore_sent_requests_total{cache_only="false",client="application",req_type="chain_req",result="ok_requested",server="isd_local",trigger="sig_verification"} 7 truststore_sent_requests_total{cache_only="false",client="application",req_type="chain_req",result="ok_requested",server="isd_remote",trigger="sig_verification"} 1 truststore_sent_requests_total{cache_only="false",client="application",req_type="trc_req",result="ok_requested",server="isd_remote",trigger="sig_verification"} 1 # HELP truststore_signature_verifications_total Number of signature verifications done by trust store # TYPE truststore_signature_verifications_total counter truststore_signature_verifications_total{result="ok_success",type="chain"} 4 truststore_signature_verifications_total{result="ok_success",type="signature"} 1759 ```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r5.
Reviewable status:complete! all files reviewed, all discussions resolved
This change adds metrics to the trust store.
The exposed metrics are:
Additionally, move the TrustDB metrics code around and modify namespace:
fixes #3148
This change is