Replace Redis with local in-memory peer store #270


Merged: 10 commits into master on Aug 4, 2020

Conversation

codygibb (Contributor) commented Aug 2, 2020

The use of Redis in Kraken is purely a historical artifact from when announce requests were not sharded and all Trackers shared the same Redis instance. These days, we use client-side hashing in the Agent to ensure that all announces for a particular info hash are sent to the same Tracker, so the Trackers no longer need to share peer stores. As such, Tracker deployments have generally used local Redis instances.

This PR introduces a new peerstore.Store implementation called LocalStore, which is an in-memory two-level locking map. I made a couple of performance-conscious decisions that made the implementation more complex than reviewers may be accustomed to. The main optimization is ensuring that the top-level lock and infohash-level locks are never held at the same time except for runCleanup (which only runs every hour). This ensures that reads/writes to different infohashes only ever compete for the top-level lock. The alternative is to acquire the infohash-level lock while still holding the top-level lock, which is simpler, but causes more lock contention.
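
For concreteness, here is a minimal Go sketch of the two-level scheme described above. All identifiers (localStore, peerGroup, UpdatePeer, the deleted flag) are illustrative assumptions, not the PR's exact code; the point is only that the top-level lock is released before the per-infohash lock is taken.

```go
package peerstore

import (
	"sync"
	"time"
)

// peerEntry is an illustrative per-peer record.
type peerEntry struct {
	ip        string
	port      int
	expiresAt time.Time
}

// peerGroup holds all peers for a single info hash behind its own lock.
type peerGroup struct {
	mu      sync.RWMutex
	deleted bool // set by cleanup after the group is removed from the map
	peers   map[string]peerEntry
}

// localStore is the two-level map: the top-level lock guards the groups map,
// and each peerGroup guards its own peers.
type localStore struct {
	mu     sync.RWMutex
	groups map[string]*peerGroup // keyed by info hash
}

// UpdatePeer records or refreshes an announce. The top-level lock is always
// released before the group lock is acquired, so announces for different
// info hashes only contend on the brief map lookup. If the cleanup goroutine
// removed the group after we looked it up, we retry -- the for-loop referred
// to in the review comment further down.
func (s *localStore) UpdatePeer(infoHash, peerID string, e peerEntry) {
	for {
		s.mu.Lock()
		g, ok := s.groups[infoHash]
		if !ok {
			g = &peerGroup{peers: make(map[string]peerEntry)}
			s.groups[infoHash] = g
		}
		s.mu.Unlock()

		g.mu.Lock()
		if g.deleted {
			g.mu.Unlock()
			continue // group was cleaned up underneath us; reload
		}
		g.peers[peerID] = e
		g.mu.Unlock()
		return
	}
}
```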

Here are the benchmarks for 4 hosts x 400 peers x 100 hashes announcing every 2s for 2m:

Redis:

min(ms) p50(ms) p95(ms) p99(ms) max(ms)
0.00    1.00    20.00   71.00   79.00

Local:

min(ms) p50(ms) p95(ms) p99(ms) max(ms)
0.00    1.00    15.00   22.00   25.00

codecov bot commented Aug 2, 2020

Codecov Report

Merging #270 into master will increase coverage by 0.23%.
The diff coverage is 80.95%.


@@            Coverage Diff             @@
##           master     #270      +/-   ##
==========================================
+ Coverage   66.16%   66.40%   +0.23%     
==========================================
  Files         184      186       +2     
  Lines        9098     9224     +126     
==========================================
+ Hits         6020     6125     +105     
- Misses       2319     2337      +18     
- Partials      759      762       +3     
Impacted Files Coverage Δ
tracker/peerstore/redis.go 71.73% <0.00%> (-0.79%) ⬇️
tracker/peerstore/store.go 0.00% <0.00%> (ø)
tracker/peerstore/testing.go 0.00% <0.00%> (ø)
tracker/peerstore/config.go 70.00% <33.33%> (-6.48%) ⬇️
tracker/peerstore/local.go 90.17% <90.17%> (ø)
lib/torrent/scheduler/events.go 72.59% <0.00%> (+1.44%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

// At this point, A is holding onto a peerGroup reference which has been
// deleted from the peerGroups map, and thus has no choice but to attempt to
// reload a new peerGroup. Since the cleanup interval is quite large, it is
// *extremeley* unlikely this for-loop will execute more than twice.


nit: spelling

}
g.mu.Lock()
for _, id := range expired {
	delete(g.peers, id)


So I think there's a race condition here where UpdatePeers could update the expiresAt field between the time we release the read lock and acquire the write lock. It seems like we should recheck the expiresAt value here before we actually perform the delete.

codygibb (Contributor, Author)

Indeed. Nice catch.
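
To make the fix concrete, here is a hedged sketch of the recheck-under-write-lock pattern being discussed, reusing the illustrative peerGroup/peerEntry types from the sketch above (not the PR's exact code, which uses different key types):

```go
// cleanupExpired scans for stale peers under the read lock, then re-verifies
// each candidate under the write lock before deleting it, because a
// concurrent UpdatePeer may have refreshed expiresAt in between.
func (g *peerGroup) cleanupExpired(now time.Time) {
	var expired []string
	g.mu.RLock()
	for id, e := range g.peers {
		if now.After(e.expiresAt) {
			expired = append(expired, id)
		}
	}
	g.mu.RUnlock()

	g.mu.Lock()
	for _, id := range expired {
		// Recheck: the entry may have been refreshed or already removed
		// between releasing the read lock and taking the write lock.
		if e, ok := g.peers[id]; ok && now.After(e.expiresAt) {
			delete(g.peers, id)
		}
	}
	g.mu.Unlock()
}
```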

bbarnes52-zz

LGTM

s.mu.RUnlock()

for _, h := range hashes {
	s.mu.RLock()


Would it be possible to combine this with the above for loop (to reduce the number of lock operations)?

codygibb (Contributor, Author)

Sure.

}

var expired []int
g.mu.RLock()


You could actually consider just not locking here at all. If any entries are stale, they'll be checked again (with the lock) before they're deleted.

The benefit to this is you reduce the number of lock operations. This is nice because a lock operation always requires a memory access (it can't use the processor cache). While we don't really care about the extra cycles this introduces to the GC thread, if there are threads running on other processors (e.g. readers) that are also performing lock operations, there will be contention for the memory bus which may slow down the readers.

Totally optional

codygibb (Contributor, Author)

The UpdatePeers function appends to the slice, so it's not safe to access it without a lock.
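
As a standalone illustration (not Kraken code) of why the read lock cannot simply be dropped: an unsynchronized reader racing with a writer that appends is a data race in Go, which `go run -race` will report.

```go
package main

import (
	"sync"
	"time"
)

func main() {
	entries := []time.Time{time.Now()}
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			entries = append(entries, time.Now()) // writer, like UpdatePeers
		}
	}()
	for i := 0; i < 1000; i++ {
		_ = len(entries) // unsynchronized reader: flagged by the race detector
	}
	wg.Wait()
}
```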

codygibb merged commit ecab987 into master on Aug 4, 2020
callmeyesh commented Nov 10, 2020

@codygibb Is there any documentation on how to use this? Could you also explain what you mean by "we have used client-side hashing in Agent to ensure that all Agent announces for a particular info hash are sent to the same Tracker"?

codygibb (Contributor, Author)


The local store is enabled by default. If you want to continue using Redis, you can set peerstore.redis.enabled: true.

Check out https://github.com/uber/kraken/blob/master/tracker/announceclient/client.go#L111 for how client-side hashing works.
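
For intuition only, here is a simplified sketch of the idea behind client-side hashing: every agent deterministically maps the info hash to one tracker, so all announces for the same content land on the same tracker. This assumes a plain hash-modulo over a static tracker list; the linked announceclient code is the authoritative implementation and may use a different scheme (e.g. a hash ring).

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pickTracker is illustrative only: deterministically map an info hash to a
// tracker address so that every agent announcing the same content chooses
// the same tracker, regardless of which host it runs on.
func pickTracker(trackers []string, infoHash string) string {
	h := fnv.New32a()
	h.Write([]byte(infoHash))
	return trackers[h.Sum32()%uint32(len(trackers))]
}

func main() {
	trackers := []string{"tracker-1:80", "tracker-2:80", "tracker-3:80"}
	// The same info hash always resolves to the same tracker on every agent.
	fmt.Println(pickTracker(trackers, "34311dd60b630c728d0e188dc0b7e11cd311b3d2"))
}
```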

callmeyesh

@codygibb Thank you. One last question: we are using spot instances on AWS as Kraken agents, which means AWS can kill these instances at any time. I am seeing some "Error initializing outgoing handshake" errors in the kraken-agent logs:

2020-11-13T21:18:42.811Z	INFO	scheduler/scheduler.go:407	Error initializing outgoing handshake: dial: dial tcp 10.162.97.195:16001: i/o timeout	{"peer": "e69aaeb9f8626796d049c7b6cf21275fac80fdfc", "hash": "34311dd60b630c728d0e188dc0b7e11cd311b3d2", "addr": "10.162.97.195:16001"}
2020-11-13T21:18:42.811Z	INFO	scheduler/scheduler.go:407	Error initializing outgoing handshake: dial: dial tcp 10.165.213.5:16001: i/o timeout	{"peer": "84eb169b49456cf0e1540ba456ebb63fb68425a8", "hash": "34311dd60b630c728d0e188dc0b7e11cd311b3d2", "addr": "10.165.213.5:16001"}
2020-11-13T21:18:42.811Z	INFO	connstate/state.go:210	Deleted pending conn, capacity now at 9	{"hash": "34311dd60b630c728d0e188dc0b7e11cd311b3d2", "peer": "e69aaeb9f8626796d049c7b6cf21275fac80fdfc"}
2020-11-13T21:18:42.811Z	INFO	connstate/state.go:157	Connection blacklisted for 30s	{"peer": "e69aaeb9f8626796d049c7b6cf21275fac80fdfc", "hash": "34311dd60b630c728d0e188dc0b7e11cd311b3d2"}
2020-11-13T21:18:42.811Z	INFO	connstate/state.go:210	Deleted pending conn, capacity now at 10	{"hash": "34311dd60b630c728d0e188dc0b7e11cd311b3d2", "peer": "84eb169b49456cf0e1540ba456ebb63fb68425a8"}
2020-11-13T21:18:42.811Z	INFO	connstate/state.go:157	Connection blacklisted for 30s	{"peer": "84eb169b49456cf0e1540ba456ebb63fb68425a8", "hash": "34311dd60b630c728d0e188dc0b7e11cd311b3d2"}
2020-11-13T21:18:42.918Z	INFO	scheduler/scheduler.go:407	Error initializing outgoing handshake: dial: dial tcp 10.162.97.195:16001: i/o timeout	{"peer": "e69aaeb9f8626796d049c7b6cf21275fac80fdfc", "hash": "4510c63237a54606f195bcbb9401ca7b0c25219d", "addr": "10.162.97.195:16001"}
2020-11-13T21:18:42.918Z	INFO	connstate/state.go:210	Deleted pending conn, capacity now at 9	{"hash": "4510c63237a54606f195bcbb9401ca7b0c25219d", "peer": "e69aaeb9f8626796d049c7b6cf21275fac80fdfc"}

I suspect this is because I have peer_set_window_size: 1m and max_peer_set_windows: 5, which means announce entries only expire every 5 minutes, but the agents are getting preempted before then. Is my understanding correct?

I also want to understand what an announce request means. Is it the time frame for which a tracker will keep the IP of a peer and share it with other peers?

Also, are there default egress and ingress limits for agents and origins? We are using peers with 10 Gigabit network performance and 25 Gigabit on the origins. I wasn't sure whether such a limit is applied by default on all agents.

codygibb (Contributor, Author)


Would you mind moving this to an issue? The problem you're seeing seems unrelated to this PR.
