Skip to content

Conversation

dmcgowan
Copy link
Collaborator

@dmcgowan dmcgowan commented Mar 4, 2020

earlier redis cache was updated when there was any error including any
temporary connectivity issue. This would trigger set calls which would
further increase load and possibly connectivity errors from redis
leaving the system with continuous errors and high latency. Now the
cache is updated only when it is genuine cache miss. Other errors do not
trigger a cache update.

add prometheus metrics for cache calls

@dmcgowan dmcgowan added this to the Registry/2.8 milestone Mar 4, 2020
@dmcgowan dmcgowan force-pushed the upstream-redis-fixes branch 3 times, most recently from 0b4063f to 18c1de0 Compare March 4, 2020 23:22
@manishtomar manishtomar self-requested a review March 4, 2020 23:46
Copy link
Contributor

@manishtomar manishtomar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much for this. If you don't mind, would you be willing to add unit tests for cachedblobdescriptor?

func NewPrometheusCacheProvider(wrap cache.BlobDescriptorCacheProvider, name, help string) cache.BlobDescriptorCacheProvider {
return &prometheusCacheProvider{
wrap,
// TODO: May want to have fine grained buckets since redis calls are generally <1ms and the default minimum bucket is 5ms.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is something that is required. This will require update in go-metrics API to take in the buckets.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it would require exposing the prometheus opt through go-metrics. The default bucket is reasonable, this is a good TODO but not a requirement for this.

latencyTimer metrics.LabeledTimer
}

func NewPrometheusCacheProvider(wrap cache.BlobDescriptorCacheProvider, name, help string) cache.BlobDescriptorCacheProvider {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should document all the metrics exposed by registry under docs/. Need not be this PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds like a good exercise in which I would be interested in the result

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A single /metrics output should be sufficient to start with. I've seen other projects do that.

dmcgowan and others added 5 commits March 4, 2020 17:51
Test base functionality of the cache statter

Signed-off-by: Derek McGowan <derek@mcgstyle.net>
* fix redis caching issue

earlier redis cache was updated when there was any error including any
temporary connectivity issue. This would trigger set calls which would
further increase load and possibly connectivity errors from redis
leaving the system with continuous errors and high latency. Now the
cache is updated only when it is genuine cache miss. Other errors do not
trigger a cache update.

* add back tracker Hit() and Miss() calls

*squashed commits*
(cherry picked from commit 6f3e1c10260ef59ba4e9c42e939329fad9fdd8c3)
(cherry picked from commit 6738ff3320cf82cc2df919a95a1bde2f7789a501)

Signed-off-by: Derek McGowan <derek@mcgstyle.net>
* redis metrics

it is working but metrics are not very useful since default buckets
start from 5ms and almost all of them are in that range.

* remove extra comment

(cherry picked from commit ba1a1d74e7eb047dd1056548ccf0695e8846782c)
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
* Update redis.go

If the dgst key does not exist in the cache when calling HGET, `redis.String` will return an `ErrNil` which we need to translate into `distribution.ErrBlobUnknown` so that the error being returned can be properly handled. This will ensure that `SetDescriptor` is properly called from `cachedBlobStatter::Stat` for `repositoryScopedRedisBlobDescriptorService` which will update the redis cache and be considered as a Miss rather than an Error.

cc @manishtomar

* Update suite.go

Add unit test to ensure missing blobs for scoped repo properly return ErrBlobUnknown when HGET returns redis.ErrNil.

(cherry picked from commit dca6b9526a1d30dd218a9f321c4f84ecc4b5e62e)
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Separate fix for cherry-picked code

Signed-off-by: Derek McGowan <derek@mcgstyle.net>
@dmcgowan dmcgowan force-pushed the upstream-redis-fixes branch from 18c1de0 to d5acfc4 Compare March 5, 2020 01:52
@dmcgowan dmcgowan mentioned this pull request Mar 5, 2020
Copy link
Contributor

@manishtomar manishtomar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for adding tests! LGTM

The metrics tracker in cached blob statter was replaced with prometheus
metrics and no longer needed.
Remove unused log wrapping.

Signed-off-by: Derek McGowan <derek@mcgstyle.net>
@dmcgowan dmcgowan force-pushed the upstream-redis-fixes branch from d5acfc4 to be29c05 Compare March 9, 2020 20:12
@dmcgowan dmcgowan merged commit 581be91 into distribution:master Mar 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants