addrman: Enable selecting addresses by network #27214

amitiuttarwar · 2023-03-06T19:47:33Z

For the full context & motivation of this patch, see #27213

This is joint work with mzumsande.

This PR adds functionality to AddrMan::Select to enable callers to specify a network they are interested in.

Along the way, it refactors the function to deduplicate the logic, updates the local variables to match modern conventions, adds test coverage for both the new and existing Select logic, and adds bench tests for the worst case performance of both the new and existing Select logic.

This functionality is used in the parent PR.

Extract the logic that decides whether the new or the tried table is going to be searched to the beginning of the function. Co-authored-by: Martin Zumsande <mzumsande@gmail.com>

DrahtBot · 2023-03-06T19:47:36Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	vasild, brunoerg, ajtowns, mzumsande

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#27213 (p2p: Diversify automatic outbound connections with respect to networks by amitiuttarwar)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

brunoerg

Concept ACK

going to review soon

src/addrman.cpp

amitiuttarwar · 2023-03-07T03:02:51Z

thanks for the review @brunoerg ! responded to all your feedback

src/bench/addrman.cpp

brunoerg

crACK 25a64a2

vasild

Approach ACK 25a64a2

Feel free to ignore the nits below. I will try to assess the performance of the new Select(network).

Thanks for working on this!

src/addrman.cpp

src/addrman.h

src/test/addrman_tests.cpp

src/bench/addrman.cpp

vasild · 2023-03-09T11:14:39Z

src/bench/addrman.cpp

+    });
+}
+
+static void AddrManSelectByNetwork(benchmark::Bench& bench)


It is excellent to see benchmark tests added! My only concern with this PR is that it will iterate through a lot of entries in addrman before finding one from the requested network. This in theory is O(addrman size). And the benchmarks will help us assess that. So, it would be nice to compare, on the same, full addrman (e.g. 70k addresses):

the speed of a regular Select()

the speed of a Select(network) where the searched for address is ~~near the end~~ rare. In other words, worst case scenario, when ~70k addresses are iterated before finding the result.

Somehow I don't see that from the added benchmarks. I will play with them to see if that's possible (~~how to put an address "at the end"?~~, that is nonsense, there is no end).

Btw the results are unstable:

:wavy_dash: `AddrManSelectByNetwork` (Unstable with ~1.2 iters. Increase `minEpochIterations` to e.g. 12)

Let me elaborate why I think comparing 1. and 2. above is important (the benchmarks in this PR don't do that):

Right now, on master, when we pick a peer to connect to 1. happens.

With p2p: Diversify automatic outbound connections with respect to networks #27213 when we pick a peer to connect to 2. happens.

I assume most of the addrmans out there are full (tens of thousands of addresses, max 80k), thus testing on an empty addrman is not representative.

So, we will add security at the cost of making it a bit slower. To assess how much slower I tweaked the benchmark as this:

benchmark tweaks

diff --git i/src/bench/addrman.cpp w/src/bench/addrman.cpp index 9fe50f4ec2..d1def9e520 100644 --- i/src/bench/addrman.cpp +++ w/src/bench/addrman.cpp @@ -12,13 +12,13 @@ #include <optional> #include <vector> /* A "source" is a source address from which we have received a bunch of other addresses. */ -static constexpr size_t NUM_SOURCES = 64; +static constexpr size_t NUM_SOURCES = 512; // fills addrman with ~55k addresses, instead of ~15k static constexpr size_t NUM_ADDRESSES_PER_SOURCE = 256; static NetGroupManager EMPTY_NETGROUPMAN{std::vector<bool>()}; static constexpr uint32_t ADDRMAN_CONSISTENCY_CHECK_RATIO{0}; static std::vector<CAddress> g_sources; @@ -137,16 +137,20 @@ static void AddrManSelectByNetwork(benchmark::Bench& bench) CAddress i2p_address(i2p_service, NODE_NONE); i2p_address.nTime = Now<NodeSeconds>(); CNetAddr source = ResolveIP("252.2.2.2"); addrman.Add({i2p_address}, source); FillAddrMan(addrman); - - bench.run([&] { - const auto& address = addrman.Select(/*new_only*/false, NET_I2P); - assert(address.first.ToStringAddrPort() == "udhdrtrcetjm5sxzskjyr5ztpeszydbh4dpl3pl4utgqqw2v4jna.b32.i2p:0"); + fprintf(stderr, "addrman size: %zu\n", addrman.Size()); + + bench.minEpochIterations(300).run([&] { +#if 0 + addrman.Select(/*new_only=*/false); +#else + addrman.Select(/*new_only=*/false, NET_I2P); +#endif }); } static void AddrManGetAddr(benchmark::Bench& bench) { AddrMan addrman{EMPTY_NETGROUPMAN, /*deterministic=*/false, ADDRMAN_CONSISTENCY_CHECK_RATIO};

The results are:

Select() takes 0.18 microseconds

Select(network) takes 6000 microseconds

Now, this is a big difference, but maybe that is ok. We don't Select() in a tight loop. However, it can be improved relatively easy. I observed that in 2. we visit between ~100 and ~600k bucket positions before finding a match. Given that there are 80k in addrman it means that we visit some multiple times, which is a waste of resources. This is avoided if we don't visit already visited buckets (but still visit buckets in random order). The patch below does that (on top of this PR):

don't visit already visited buckets

diff --git i/src/addrman.cpp w/src/addrman.cpp index 1023c3cbdb..0e196050b9 100644 --- i/src/addrman.cpp +++ w/src/addrman.cpp @@ -16,12 +16,13 @@ #include <streams.h> #include <tinyformat.h> #include <uint256.h> #include <util/check.h> #include <util/time.h> +#include <array> #include <cmath> #include <optional> /** Over how many buckets entries with tried addresses from a single group (/16 for IPv4) are spread */ static constexpr uint32_t ADDRMAN_TRIED_BUCKETS_PER_GROUP{8}; /** Over how many buckets entries with new addresses originating from a single group are spread */ @@ -745,17 +746,39 @@ std::pair<CAddress, NodeSeconds> AddrManImpl::Select_(bool new_only, std::option } else { search_tried = insecure_rand.randbool(); } const int bucket_count{search_tried ? ADDRMAN_TRIED_BUCKET_COUNT : ADDRMAN_NEW_BUCKET_COUNT}; + // We want to search buckets in random order and also prefer visiting a + // bucket that we have not visited before instead of visiting an already + // visited bucket. Thus shuffle the buckets and visit that shuffled list + // in order. + + std::array<uint16_t, std::max(ADDRMAN_TRIED_BUCKET_COUNT, ADDRMAN_NEW_BUCKET_COUNT)> + shuffled_bucket_indexes; + for (uint16_t i = 0; i < bucket_count; ++i) { + shuffled_bucket_indexes[i] = i; + } + Shuffle(shuffled_bucket_indexes.begin(), + shuffled_bucket_indexes.begin() + bucket_count, + insecure_rand); + // If we visit a bucket and find 0 matching addresses in it, mark it with + // this and never visit it again. + static constexpr uint16_t ALREADY_VISITED_AND_BORING{std::numeric_limits<uint16_t>::max()}; + + static_assert(ADDRMAN_TRIED_BUCKET_COUNT < ALREADY_VISITED_AND_BORING); + static_assert(ADDRMAN_NEW_BUCKET_COUNT < ALREADY_VISITED_AND_BORING); + // Loop through the addrman table until we find an appropriate entry double chance_factor = 1.0; - while (1) { - // Pick a bucket, and an initial position in that bucket. - int bucket = insecure_rand.randrange(bucket_count); + for (uint16_t b = 0;; b = (b + 1) % bucket_count) { + if (shuffled_bucket_indexes[b] == ALREADY_VISITED_AND_BORING) { + continue; + } + const size_t bucket{shuffled_bucket_indexes[b]}; int initial_position = insecure_rand.randrange(ADDRMAN_BUCKET_SIZE); // Iterate over the positions of that bucket, starting at the initial one, // and looping around. int i; for (i = 0; i < ADDRMAN_BUCKET_SIZE; ++i) { @@ -769,14 +792,19 @@ std::pair<CAddress, NodeSeconds> AddrManImpl::Select_(bool new_only, std::option } else { break; } } } - // If the bucket is entirely empty, start over with a (likely) different one. - if (i == ADDRMAN_BUCKET_SIZE) continue; + // Start over with a different bucket if this one is entirely empty or + // specific network was requested and it does not contain any addresses + // from that network. + if (i == ADDRMAN_BUCKET_SIZE) { + shuffled_bucket_indexes[b] = ALREADY_VISITED_AND_BORING; + continue; + } // Find the entry to return. int position = (initial_position + i) % ADDRMAN_BUCKET_SIZE; int nId = LookupAddrmanEntry(search_tried, bucket, position); const auto it_found{mapInfo.find(nId)}; assert(it_found != mapInfo.end());

It changes Select(network) to about 3000 microseconds (down from 6000), but that result from the benchmark is averaged between executions that take anything between 100 and 600k iterations. What is more important is that with the change above it never takes more than ~60k iterations, that is - anything between 100 and 60k. Worst case lowered ~10 times.

The optimization brings some complexity but IMO it is worth it.

This looks like a nice speedup in the case where we only have a few addresses - however, due to the constant overhead of building the shuffled list of buckets in the beginning I'd expect performance to go down a bit for the case where we have many addresses to choose from. Do you see this in your benchmark?
Why does this need ALREADY_VISITED_AND_BORING? If we do a for-loop through a pre-shuffled list of buckets instead of a while loop, doesn't that already guarantee that we visit each bucket at most once?

if I am understanding this thread properly, it seems like there are two concepts being discussed:

changes to the bench tests

optimization to AddrManImpl::Select_

RE 1, I'm slightly confused as to the desired coverage. Without the network parameter, I would expect the worst case of the current Select_ function to happen when addrman is practically empty, which is why we introduced AddrManSelectFromAlmostEmpty. On mainnet, this case would be unlikely, which is represented by AddrManSelect. With the network parameter, I would expect worst case to be represented by AddrManSelectByNetwork where theres just 1 I2P address on an addrman filled with other addresses. @vasild can you clarify what feels absent or misrepresentative?

also happy to discuss 2 further, but a question there - does it feel relevant to these patches to improve performance, or is it an orthogonal improvement (since we see similar worst cases right now)?

@mzumsande,

I'd expect performance to go down a bit for the case where we have many addresses to choose from. Do you see this in your benchmark?

Yes, that slows down a bit the fast Select() (any network is ok). I observed that and think it is ok because it is still super fast with or without the shuffling.

Why does this need ALREADY_VISITED_AND_BORING?

The for loop could still repeat the whole array from the start if some address was found but was skipped. ALREADY_VISITED_AND_BORING is a further optimization to completely skip that bucket on the second and further passes through the array.

@amitiuttarwar, you are right that I modified the newly added AddrManSelectByNetwork() so that, in general, it tests the same thing as AddrManSelect() which already exists in master. I did that in order to be sure that everything else is identical: the way addrman is filled, minEpochIterations(300) and the GetPort() call. Way too many times I have been bitten by chasing noise in benchmarks, so I wanted to be sure that this is the only difference in the two things I am comparing:

#if 0 // flip to 1 to test the other case addrman.Select(/*new_only=*/false); #else addrman.Select(/*new_only=*/false, NET_I2P); #endif

also happy to discuss 2 further, but a question there - does it feel relevant to these patches to improve performance, or is it an orthogonal improvement

I think the performance improvement would be a nice addition to the work done by this PR (or maybe even a "must have"?)

(since we see similar worst cases right now)?

Hmm? Now on master on mainnet it takes 0.18 microseconds and with this PR on mainnet it will take 6000 microseconds. That is the average case (averaged by the benchmark executing Select about 3500 times). The difference in the worst case is even more pronounced.

Try to asses how much is the slowdown. Use a full addrman (15k is not full) because that is what people out there are running on. If the slowdown is ok, then no further improvements are necessary on this PR or its parent.

agreed that this is the fundamental thing we are trying to evaluate - is the performance difference significant & acceptable

I will try to compare on a snapshot of an addrman from my public full node, not on artificially filled addrman from the bench. Last time I checked it had ~70k addresses and also it will have at least a bunch of addresses from each network, not just 1 address. That would be more representative.

would be awesome if you could compare based on your addrman snapshot from mainnet!

my understanding of martin's benches is its trying to observe "normal" as well as "worst case". so the examples with 1 address are aiming towards "worst case". but of course "normal" would vary widely from node to node.

What is your "Test 5" doing? Does it call 20 times Select(network)?

No, it calls Select(network) for an addrman that has 20 addresses of the selected network.

I'll try to explain the idea behind my benchmarks better:
Starting point is the observation that the suggested change makes Select slower if we already know a lot of addresses of the desired type, while making it faster if addrman has very few addresses of this type. This is true for botch Select() and Select(network), which is why it is not obvious at first sight if the suggestion is an overall performance improvement or a degradation.

So my idea was to find the cutoff - how many addresses do we need such that both algorithms are equally fast - in situations where we have more addrs than this, the change would be a slowdown, when we have less it's a speedup. This cutoff turned out to be very small for Select() (3 addresses), while being a bit larger for Select(network) - it was approximately 20. It would be helpful to know how things change with a full addrman!

So if we used an addrman with counts similar to those reported by @jonatack, my results would suggest that the pre-shufffling change suggested here would be a speedup for CJDNS-specific queries, while being a slowdown for clearnet, onion, i2p and unspecific queries (all compared to this PR).

I ran this on a mainnet peers.dat:

addrman size total: 72947 addrman size IPv4: 47252 addrman size IPv6: 11491 addrman size Tor: 13305 addrman size I2P: 893 addrman size CJDNS: 6

[patch] How to load /tmp/peers.dat from bench_bitcoin

diff --git i/src/bench/addrman.cpp w/src/bench/addrman.cpp index 8a5cab443f..b8ec4a4071 100644 --- i/src/bench/addrman.cpp +++ w/src/bench/addrman.cpp @@ -1,17 +1,22 @@ // Copyright (c) 2020-2022 The Bitcoin Core developers // Distributed under the MIT software license, see the accompanying // file COPYING or http://www.opensource.org/licenses/mit-license.php. +#include <addrdb.h> #include <addrman.h> #include <bench/bench.h> +#include <chainparams.h> +#include <chainparamsbase.h> #include <netbase.h> #include <netgroup.h> #include <random.h> #include <util/check.h> +#include <util/system.h> #include <util/time.h> +#include <util/translation.h> #include <optional> #include <vector> /* A "source" is a source address from which we have received a bunch of other addresses. */ @@ -21,12 +26,23 @@ static constexpr size_t NUM_ADDRESSES_PER_SOURCE = 256; static NetGroupManager EMPTY_NETGROUPMAN{std::vector<bool>()}; static constexpr uint32_t ADDRMAN_CONSISTENCY_CHECK_RATIO{0}; static std::vector<CAddress> g_sources; static std::vector<std::vector<CAddress>> g_addresses; +std::unique_ptr<AddrMan> LoadAddrmanFromPeersDat() +{ + SelectBaseParams("main"); + SelectParams("main"); + gArgs.ForceSetArg("-datadir", "/tmp"); + std::unique_ptr<AddrMan> addrman; + auto err = LoadAddrman(EMPTY_NETGROUPMAN, gArgs, addrman); + assert(!err.has_value()); + return addrman; +} + static void CreateAddresses() { if (g_sources.size() > 0) { // already created return; } @@ -69,19 +85,12 @@ static void FillAddrMan(AddrMan& addrman) { CreateAddresses(); AddAddressesToAddrMan(addrman); } -static CNetAddr ResolveIP(const std::string& ip) -{ - CNetAddr addr; - LookupHost(ip, addr, false); - return addr; -} - static CService ResolveService(const std::string& ip, uint16_t port = 0) { CService serv; Lookup(ip, serv, port, false); return serv; } @@ -125,26 +134,24 @@ static void AddrManSelectFromAlmostEmpty(benchmark::Bench& bench) (void)addrman.Select(); }); } static void AddrManSelectByNetwork(benchmark::Bench& bench) { - AddrMan addrman{EMPTY_NETGROUPMAN, /*deterministic=*/false, ADDRMAN_CONSISTENCY_CHECK_RATIO}; - - // add single I2P address to new table - CService i2p_service; - i2p_service.SetSpecial("udhdrtrcetjm5sxzskjyr5ztpeszydbh4dpl3pl4utgqqw2v4jna.b32.i2p"); - CAddress i2p_address(i2p_service, NODE_NONE); - i2p_address.nTime = Now<NodeSeconds>(); - CNetAddr source = ResolveIP("252.2.2.2"); - addrman.Add({i2p_address}, source); - - FillAddrMan(addrman); + auto addrman = LoadAddrmanFromPeersDat(); + std::cout << "addrman size total: " << addrman->Size() << std::endl + << "addrman size IPv4: " << addrman->Size(NET_IPV4) << std::endl + << "addrman size IPv6: " << addrman->Size(NET_IPV6) << std::endl + << "addrman size Tor: " << addrman->Size(NET_ONION) << std::endl + << "addrman size I2P: " << addrman->Size(NET_I2P) << std::endl + << "addrman size CJDNS: " << addrman->Size(NET_CJDNS) << std::endl; bench.run([&] { - (void)addrman.Select(/*new_only=*/false, NET_I2P); + //addrman->Select(/*new_only=*/false); + //addrman->Select(/*new_only=*/false, NET_I2P); + addrman->Select(/*new_only=*/false, NET_CJDNS); }); } static void AddrManGetAddr(benchmark::Bench& bench) { AddrMan addrman{EMPTY_NETGROUPMAN, /*deterministic=*/false, ADDRMAN_CONSISTENCY_CHECK_RATIO};

Results:

Select(/*new_only=*/false): 0.265 microseconds, this is what master is doing before this PR

After this PR:

Select(/*new_only=*/false, NET_IPV4): 0.5 microseconds

Select(/*new_only=*/false, NET_IPV6): 1.7 microseconds

Select(/*new_only=*/false, NET_ONION): 3.1 microseconds

Select(/*new_only=*/false, NET_I2P): 27 microseconds

Select(/*new_only=*/false, NET_CJDNS): 1600 microseconds

Is this slowdown acceptable? Maybe yes. We don't call Select in a tight loop, so maybe even the 1600 microseconds is ok.

What happens if we mark and subsequently skip buckets we have seen to contain 0 interesting addresses, without the shuffling? I guess that should bring most of the improvement from the above patch without the downside of the shuffling. Here is the change on top of this PR:

[patch] Skip boring

diff --git i/src/addrman.cpp w/src/addrman.cpp index cdfd079fcd..46f4de99be 100644 --- i/src/addrman.cpp +++ w/src/addrman.cpp @@ -747,15 +747,19 @@ std::pair<CAddress, NodeSeconds> AddrManImpl::Select_(bool new_only, std::option } const int bucket_count{search_tried ? ADDRMAN_TRIED_BUCKET_COUNT : ADDRMAN_NEW_BUCKET_COUNT}; // Loop through the addrman table until we find an appropriate entry double chance_factor = 1.0; + std::unordered_set<int> already_visited_and_boring_buckets; while (1) { // Pick a bucket, and an initial position in that bucket. int bucket = insecure_rand.randrange(bucket_count); + if (already_visited_and_boring_buckets.count(bucket) > 0) { + continue; + } int initial_position = insecure_rand.randrange(ADDRMAN_BUCKET_SIZE); // Iterate over the positions of that bucket, starting at the initial one, // and looping around. int i; for (i = 0; i < ADDRMAN_BUCKET_SIZE; ++i) { @@ -770,14 +774,19 @@ std::pair<CAddress, NodeSeconds> AddrManImpl::Select_(bool new_only, std::option } else { break; } } } - // If the bucket is entirely empty, start over with a (likely) different one. - if (i == ADDRMAN_BUCKET_SIZE) continue; + // Start over with a different bucket if this one is entirely empty or + // specific network was requested and it does not contain any addresses + // from that network. + if (i == ADDRMAN_BUCKET_SIZE) { + already_visited_and_boring_buckets.insert(bucket); + continue; + } // Find the entry to return. int position = (initial_position + i) % ADDRMAN_BUCKET_SIZE; int nId = GetEntry(search_tried, bucket, position); const auto it_found{mapInfo.find(nId)}; assert(it_found != mapInfo.end());

Result for NET_CJDNS where the slowdown is most pronounced: 1200 microseconds (vs 1600). Those numbers are averaged over many runs and they do not tell the whole story. Here are the distributions without "skip boring" (the height of a bar indicates how many addresses were checked before finding the result, e.g. if the bar that spans between 2500 and 3500 is tall 280, it means 280 of the runs checked 2500-3500 addresses before finding the result).

with "skip boring":

I don't see why 1.6 milliseconds to choose an address would be a concern.

A simpler approach to skipping already tried buckets might be something like this:

int n = 1 << 8; // power of 2 int initial_pos = random(n); int step = random(n/2) * 2 + 1; // odd number int i = initial_pos; do { if (check_bucket(i)) break; i = (i + step) % n; // oops, try a new bucket } while (i != initial_pos);

That doesn't need a full shuffle, still hits every bucket, and only hits each bucket once (provided step and n are co-prime).

Based on vasild's stats above, I think this is fine to merge without any improvement here, but it might be interesting to rebench with this approach -- performance should benefit a little from needing less randomness as well as avoiding repeated work...

If we wanted to making things actually reliably fast for cjdns-like networks lost in a sea of ipv4 addresses, then having separate addrman tables for each network might be worth exploring. That would probably help prevent an attacker from blocking valid ipv4 addresses with a mass of valid-but-sybiled tor addresses, which might be worthwhile.

amitiuttarwar · 2023-03-11T18:38:06Z

thank you for the in-depth review @vasild ! I have addressed all your review comments (incorporated most, left questions on a couple)

vasild

ACK 09d5145

Suggestions remaining (non-blocker):

Check out-of-bounds array access in AddrManImpl::GetEntry().
I can't find where the suggested test was added.
Performance wise I think the change in #27214 (comment) would be very nice to have.

Thanks!

Unused until later commit. Co-authored-by: Martin Zumsande <mzumsande@gmail.com>

in preparation for consolidating the logic for searching the new and tried tables, generalize the call paths for both Co-authored-by: Martin Zumsande <mzumsande@gmail.com>

Co-authored-by: Martin Zumsande <mzumsande@gmail.com>

-BEGIN VERIFY SCRIPT- sed -i 's/fChanceFactor/chance_factor/g' src/addrman.cpp sed -i 's/nBucketPos/initial_position/g' src/addrman.cpp sed -i 's/nBucket/bucket/g' src/addrman.cpp src/addrman_impl.h sed -i 's/newOnly/new_only/g' src/addrman.cpp src/addrman_impl.h src/addrman.h src/test/addrman_tests.cpp -END VERIFY SCRIPT- Co-authored-by: Martin Zumsande <mzumsande@gmail.com>

Add an optional parameter to the addrman Select function that allows callers to specify which network the returned address should be on. Ensure that the proper table is selected with different cases of whether the new or tried table has network addresses that match. Co-authored-by: Martin Zumsande <mzumsande@gmail.com>

this adds coverage for the 7 different cases of which table should be selected when the network is specified. the different cases are the result of new_only being true or false and whether there are network addresses on both, neither, or one of new vs tried tables. the only case not covered is when new_only is false and the only network addresses are on the new table. Co-authored-by: Martin Zumsande <mzumsande@gmail.com> Co-authored-by: Vasil Dimov <vd@FreeBSD.org>

if an addr matching the network requirements is only on the new table and select is invoked with new_only = false, ensure that the code selects the new table. in order to test this case, we use a non deterministic addrman. this means we cannot have more than one address in any addrman table, or risk sporadic failures when the second address happens to conflict. if the code chose a table at random, the test would fail 50% of the time Co-authored-by: Martin Zumsande <mzumsande@gmail.com>

Co-authored-by: Martin Zumsande <mzumsande@gmail.com>

to evaluate the worst case performance with the network parameter passed through, fill the new table with addresses then add a singular I2P address to retrieve Co-authored-by: Martin Zumsande <mzumsande@gmail.com>

the addrman select function will demonstrate it's worst case performance when it is almost empty, because it might have to linearly search several buckets. add a bench test to cover this case Co-authored-by: Martin Zumsande <mzumsande@gmail.com>

Co-authored-by: Martin Zumsande <mzumsande@gmail.com>

amitiuttarwar · 2023-03-18T01:10:45Z

two small updates:

added assertions for bounds checking in GetEntry
added the suggested test

brunoerg · 2023-03-24T14:11:43Z

src/addrman.cpp

+
+        // Iterate over the positions of that bucket, starting at the initial one,
+        // and looping around.
+        int i;


Just a suggestion to make the code a lit bit cleaner in case you have to touch it again, perhaps we don't need to create position and nId again after the loop.

diff --git a/src/addrman.cpp b/src/addrman.cpp index cdfd079fc..30ce2cadc 100644 --- a/src/addrman.cpp +++ b/src/addrman.cpp @@ -757,10 +757,10 @@ std::pair<CAddress, NodeSeconds> AddrManImpl::Select_(bool new_only, std::option // Iterate over the positions of that bucket, starting at the initial one, // and looping around. - int i; + int i, position, node_id; for (i = 0; i < ADDRMAN_BUCKET_SIZE; ++i) { - int position = (initial_position + i) % ADDRMAN_BUCKET_SIZE; - int node_id = GetEntry(search_tried, bucket, position); + position = (initial_position + i) % ADDRMAN_BUCKET_SIZE; + node_id = GetEntry(search_tried, bucket, position); if (node_id != -1) { if (network.has_value()) { const auto it{mapInfo.find(node_id)}; @@ -777,9 +777,7 @@ std::pair<CAddress, NodeSeconds> AddrManImpl::Select_(bool new_only, std::option if (i == ADDRMAN_BUCKET_SIZE) continue; // Find the entry to return. - int position = (initial_position + i) % ADDRMAN_BUCKET_SIZE; - int nId = GetEntry(search_tried, bucket, position); - const auto it_found{mapInfo.find(nId)}; + const auto it_found{mapInfo.find(node_id)}; assert(it_found != mapInfo.end()); const AddrInfo& info{it_found->second};

done in #27745

vasild

ACK 17e7054

Maybe other reviewers would be interested in the performance discussion:
#27214 (comment)

brunoerg

re-ACK 17e7054

ajtowns

ACK 17e7054

ajtowns · 2023-04-13T01:13:56Z

src/addrman.h

+     * @param[in] new_only Whether to only select addresses from the new table. Passing `true` returns
+     *                     an address from the new table or an empty pair. Passing `false` will return an
+     *                     address from either the new or tried table (it does not guarantee a tried entry).
+     * @param[in] network  Select only addresses of this network (nullopt = all)


Probably worth documenting that passing in a network here rather than nullopt can make the search significantly slower (though still fast)?

done in #27745

ajtowns · 2023-04-13T01:25:11Z

src/addrman.cpp

+            if (node_id != -1) {
+                if (network.has_value()) {
+                    const auto it{mapInfo.find(node_id)};
+                    assert(it != mapInfo.end());


Do we really want an unconditional assertion failure here, vs say:

if (Assume(it != mapInfo.end()) && it->second.GetNetwork() == *network) break;

(maybe a similar question for GetEntry)

done in #27745

ajtowns · 2023-04-13T01:32:24Z

src/test/addrman_tests.cpp

+    bool new_selected{false};
+    bool tried_selected{false};
+
+    while (!new_selected || !tried_selected) {


Seems like having:

int counter = 256; while (--counter > 0 && (!new_selected || !tried_selected)) { ... } BOOST_CHECK(new_selected); BOOST_CHECK(tried_selected);

would be better here than an infinite loop leading to CI timeout on error. A one-in-2**256 chance of failure is already the security of bitcoin as a whole, after all; and if our randomness generator is biassed enough that it fails more than that, it's probably better that the test case fails anyway? That we use the deterministic addrman presumably means any failures here will always be deterministic, no?

done in #27745

mzumsande

Code Review ACK 17e7054

I reviewed the PR again - although I'm a coauthor, so not sure about the etiquette of ack'ing.

I think there is a small bug in the non-deterministic test that could make it fail on rare occasions and should be fixed - here or in a follow-up.

src/test/addrman_tests.cpp

mzumsande · 2023-04-17T19:17:52Z

src/addrman.h

-     * @param[in] new_only Whether to only select addresses from the new table.
+     * @param[in] new_only Whether to only select addresses from the new table. Passing `true` returns
+     *                     an address from the new table or an empty pair. Passing `false` will return an
+     *                     address from either the new or tried table (it does not guarantee a tried entry).


nit: mentioning the "empty pair" only in the first case makes it sound as if this wasn't one of the possible outcomes in the second case (but it is).

done in #27745

amitiuttarwar · 2023-04-17T21:51:30Z

thanks for the reviews! I'll address the outstanding comments in a followup

fanquake · 2023-04-18T08:27:34Z

This will be merged after branch-off. It would be good to either get the follow up opened in advance, so we can avoid any time between merging, and the potential for intermittent CI failures, or, if that change is straight-forward enough, you could push an additional commit here, leaving the rest of the branch as-is.

amitiuttarwar · 2023-04-19T02:11:41Z

added an extra commit to ensure the non-deterministic test will not fail intermittently. will incorporate the outstanding feedback separately

amitiuttarwar · 2023-04-21T02:55:00Z

opened #27506 for the test fix

10a354f test: prevent intermittent failures (Amiti Uttarwar) Pull request description: Follow up to #27214 - add an address to the tried table before the new table to make sure a new table collision is not possible. ACKs for top commit: mzumsande: Code review ACK 10a354f - the fix is what I suggested [here](bitcoin/bitcoin#27214 (comment)) and should make these intermittent failures impossible. Tree-SHA512: 24099f02e1915395130065af0ef6a2a1893955d222517d156d928765541d9c427da00172a9b5a540163f4d6aae93ca3882e8267eeb35ecc595d42178abc6191c

17e7054 doc: clarify new_only param for Select function (Amiti Uttarwar) b0010c8 bench: test select for a new table with only one address (Amiti Uttarwar) 9b91aae bench: add coverage for addrman select with network parameter (Amiti Uttarwar) 22a4d14 test: increase coverage of addrman select (without network) (Amiti Uttarwar) a98e542 test: add addrman test for special case (Amiti Uttarwar) 5c8b4ba tests: add addrman_select_by_network test (Amiti Uttarwar) 6b22928 addrman: add functionality to select by network (Amiti Uttarwar) 26c3bf1 scripted-diff: rename local variables to match modern conventions (Amiti Uttarwar) 4880641 refactor: consolidate select logic for new and tried tables (Amiti Uttarwar) ca2a9c5 refactor: generalize select logic (Amiti Uttarwar) 052fbcd addrman: Introduce helper to generalize looking up an addrman entry (Amiti Uttarwar) 9bf078f refactor: update Select_ function (Amiti Uttarwar) Pull request description: For the full context & motivation of this patch, see bitcoin#27213 This is joint work with mzumsande. This PR adds functionality to `AddrMan::Select` to enable callers to specify a network they are interested in. Along the way, it refactors the function to deduplicate the logic, updates the local variables to match modern conventions, adds test coverage for both the new and existing `Select` logic, and adds bench tests for the worst case performance of both the new and existing `Select` logic. This functionality is used in the parent PR. ACKs for top commit: vasild: ACK 17e7054 brunoerg: re-ACK 17e7054 ajtowns: ACK 17e7054 mzumsande: Code Review ACK 17e7054 Tree-SHA512: e99d1ce0c44a15601a3daa37deeadfc9d26208a92969ecffbea358d57ca951102d759734ccf77eacd38db368da0bf5b6fede3cd900d8a77b3061f4adc54e52d8

10a354f test: prevent intermittent failures (Amiti Uttarwar) Pull request description: Follow up to bitcoin#27214 - add an address to the tried table before the new table to make sure a new table collision is not possible. ACKs for top commit: mzumsande: Code review ACK 10a354f - the fix is what I suggested [here](bitcoin#27214 (comment)) and should make these intermittent failures impossible. Tree-SHA512: 24099f02e1915395130065af0ef6a2a1893955d222517d156d928765541d9c427da00172a9b5a540163f4d6aae93ca3882e8267eeb35ecc595d42178abc6191c

cd8ef5b test: ensure addrman test is finite (Amiti Uttarwar) b9f1e86 addrman: change asserts to Assumes (Amiti Uttarwar) 7687707 doc: update `Select` function description (Amiti Uttarwar) 2b6bd12 refactor: de-duplicate lookups (Amiti Uttarwar) Pull request description: this PR addresses outstanding review comments from #27214 ACKs for top commit: achow101: ACK cd8ef5b mzumsande: Code Review ACK cd8ef5b brunoerg: crACK cd8ef5b Tree-SHA512: 669f67904263e3f51c39b175eabf5fa1b1e7b6841e889656afec33d0bd93fb446de9403f0a91b186ddeaf29498c8938484a0547b1188256c4e7c90db6f30bb55

cd8ef5b test: ensure addrman test is finite (Amiti Uttarwar) b9f1e86 addrman: change asserts to Assumes (Amiti Uttarwar) 7687707 doc: update `Select` function description (Amiti Uttarwar) 2b6bd12 refactor: de-duplicate lookups (Amiti Uttarwar) Pull request description: this PR addresses outstanding review comments from bitcoin#27214 ACKs for top commit: achow101: ACK cd8ef5b mzumsande: Code Review ACK cd8ef5b brunoerg: crACK cd8ef5b Tree-SHA512: 669f67904263e3f51c39b175eabf5fa1b1e7b6841e889656afec33d0bd93fb446de9403f0a91b186ddeaf29498c8938484a0547b1188256c4e7c90db6f30bb55

refactor: update Select_ function

9bf078f

Extract the logic that decides whether the new or the tried table is going to be searched to the beginning of the function. Co-authored-by: Martin Zumsande <mzumsande@gmail.com>

DrahtBot changed the title ~~addrman: Enable selecting addresses by network~~ addrman: Enable selecting addresses by network Mar 6, 2023

DrahtBot added the P2P label Mar 6, 2023

amitiuttarwar mentioned this pull request Mar 6, 2023

p2p: Diversify automatic outbound connections with respect to networks #27213

Merged

brunoerg reviewed Mar 6, 2023

View reviewed changes

src/addrman.cpp Show resolved Hide resolved

brunoerg reviewed Mar 6, 2023

View reviewed changes

src/addrman.cpp Outdated Show resolved Hide resolved

brunoerg reviewed Mar 6, 2023

View reviewed changes

src/addrman.cpp Outdated Show resolved Hide resolved

amitiuttarwar force-pushed the 2023-03-select-by-network branch from fdfeac9 to 3a93327 Compare March 7, 2023 03:01

brunoerg reviewed Mar 7, 2023

View reviewed changes

src/bench/addrman.cpp Outdated Show resolved Hide resolved

brunoerg reviewed Mar 7, 2023

View reviewed changes

src/bench/addrman.cpp Show resolved Hide resolved

amitiuttarwar force-pushed the 2023-03-select-by-network branch from 3a93327 to 25a64a2 Compare March 7, 2023 22:52

brunoerg approved these changes Mar 8, 2023

View reviewed changes

vasild reviewed Mar 9, 2023

View reviewed changes

amitiuttarwar force-pushed the 2023-03-select-by-network branch from 25a64a2 to 09d5145 Compare March 11, 2023 17:05

vasild approved these changes Mar 15, 2023

View reviewed changes

DrahtBot requested a review from brunoerg March 15, 2023 14:36

amitiuttarwar and others added 10 commits March 17, 2023 17:59

addrman: Introduce helper to generalize looking up an addrman entry

052fbcd

Unused until later commit. Co-authored-by: Martin Zumsande <mzumsande@gmail.com>

refactor: generalize select logic

ca2a9c5

in preparation for consolidating the logic for searching the new and tried tables, generalize the call paths for both Co-authored-by: Martin Zumsande <mzumsande@gmail.com>

refactor: consolidate select logic for new and tried tables

4880641

Co-authored-by: Martin Zumsande <mzumsande@gmail.com>

test: increase coverage of addrman select (without network)

22a4d14

Co-authored-by: Martin Zumsande <mzumsande@gmail.com>

bench: add coverage for addrman select with network parameter

9b91aae

to evaluate the worst case performance with the network parameter passed through, fill the new table with addresses then add a singular I2P address to retrieve Co-authored-by: Martin Zumsande <mzumsande@gmail.com>

doc: clarify new_only param for Select function

17e7054

Co-authored-by: Martin Zumsande <mzumsande@gmail.com>

amitiuttarwar force-pushed the 2023-03-select-by-network branch from 09d5145 to 17e7054 Compare March 18, 2023 01:04

brunoerg reviewed Mar 24, 2023

View reviewed changes

vasild approved these changes Mar 25, 2023

View reviewed changes

DrahtBot requested a review from brunoerg March 25, 2023 05:15

brunoerg approved these changes Mar 29, 2023

View reviewed changes

ajtowns reviewed Apr 13, 2023

View reviewed changes

mzumsande reviewed Apr 17, 2023

View reviewed changes

fanquake added this to the 26.0 milestone Apr 18, 2023

amitiuttarwar force-pushed the 2023-03-select-by-network branch from 03e996c to 17854ae Compare April 19, 2023 02:07

amitiuttarwar force-pushed the 2023-03-select-by-network branch 2 times, most recently from 5da828b to 17e7054 Compare April 19, 2023 17:35

DrahtBot added the CI failed label Apr 19, 2023

achow101 merged commit 3a93957 into bitcoin:master Apr 20, 2023

amitiuttarwar mentioned this pull request Apr 21, 2023

test: prevent intermittent failures #27506

Merged

maflcko removed the CI failed label Apr 21, 2023

amitiuttarwar mentioned this pull request May 24, 2023

addrman: select addresses by network follow-up #27745

Merged

bitcoin locked and limited conversation to collaborators May 23, 2024

addrman: Enable selecting addresses by network #27214

addrman: Enable selecting addresses by network #27214

Uh oh!

Conversation

amitiuttarwar commented Mar 6, 2023

Uh oh!

DrahtBot commented Mar 6, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews

Conflicts

Uh oh!

brunoerg left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

amitiuttarwar commented Mar 7, 2023

Uh oh!

Uh oh!

Uh oh!

brunoerg left a comment

Choose a reason for hiding this comment

Uh oh!

vasild left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

vasild Mar 9, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

amitiuttarwar commented Mar 11, 2023

Uh oh!

vasild left a comment

Choose a reason for hiding this comment

Uh oh!

amitiuttarwar commented Mar 18, 2023

Uh oh!

brunoerg Mar 24, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

vasild left a comment

Choose a reason for hiding this comment

Uh oh!

brunoerg left a comment

Choose a reason for hiding this comment

Uh oh!

ajtowns left a comment

DrahtBot commented Mar 6, 2023 •

edited

Loading

brunoerg left a comment •

edited

Loading

vasild left a comment •

edited

Loading

vasild Mar 9, 2023 •

edited

Loading

brunoerg Mar 24, 2023 •

edited

Loading

ajtowns Apr 13, 2023 •

edited

Loading