-
Notifications
You must be signed in to change notification settings - Fork 37.8k
addrman: Enable selecting addresses by network #27214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
addrman: Enable selecting addresses by network #27214
Conversation
Extract the logic that decides whether the new or the tried table is going to be searched to the beginning of the function. Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK
going to review soon
fdfeac9
to
3a93327
Compare
thanks for the review @brunoerg ! responded to all your feedback |
3a93327
to
25a64a2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
crACK 25a64a2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approach ACK 25a64a2
Feel free to ignore the nits below. I will try to assess the performance of the new Select(network)
.
Thanks for working on this!
}); | ||
} | ||
|
||
static void AddrManSelectByNetwork(benchmark::Bench& bench) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is excellent to see benchmark tests added! My only concern with this PR is that it will iterate through a lot of entries in addrman before finding one from the requested network. This in theory is O(addrman size)
. And the benchmarks will help us assess that. So, it would be nice to compare, on the same, full addrman (e.g. 70k addresses):
- the speed of a regular
Select()
- the speed of a
Select(network)
where the searched for address isnear the endrare. In other words, worst case scenario, when ~70k addresses are iterated before finding the result.
Somehow I don't see that from the added benchmarks. I will play with them to see if that's possible (how to put an address "at the end"?, that is nonsense, there is no end).
Btw the results are unstable:
:wavy_dash: `AddrManSelectByNetwork` (Unstable with ~1.2 iters. Increase `minEpochIterations` to e.g. 12)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me elaborate why I think comparing 1. and 2. above is important (the benchmarks in this PR don't do that):
- Right now, on
master
, when we pick a peer to connect to 1. happens. - With p2p: Diversify automatic outbound connections with respect to networks #27213 when we pick a peer to connect to 2. happens.
- I assume most of the addrmans out there are full (tens of thousands of addresses, max 80k), thus testing on an empty addrman is not representative.
So, we will add security at the cost of making it a bit slower. To assess how much slower I tweaked the benchmark as this:
benchmark tweaks
diff --git i/src/bench/addrman.cpp w/src/bench/addrman.cpp
index 9fe50f4ec2..d1def9e520 100644
--- i/src/bench/addrman.cpp
+++ w/src/bench/addrman.cpp
@@ -12,13 +12,13 @@
#include <optional>
#include <vector>
/* A "source" is a source address from which we have received a bunch of other addresses. */
-static constexpr size_t NUM_SOURCES = 64;
+static constexpr size_t NUM_SOURCES = 512; // fills addrman with ~55k addresses, instead of ~15k
static constexpr size_t NUM_ADDRESSES_PER_SOURCE = 256;
static NetGroupManager EMPTY_NETGROUPMAN{std::vector<bool>()};
static constexpr uint32_t ADDRMAN_CONSISTENCY_CHECK_RATIO{0};
static std::vector<CAddress> g_sources;
@@ -137,16 +137,20 @@ static void AddrManSelectByNetwork(benchmark::Bench& bench)
CAddress i2p_address(i2p_service, NODE_NONE);
i2p_address.nTime = Now<NodeSeconds>();
CNetAddr source = ResolveIP("252.2.2.2");
addrman.Add({i2p_address}, source);
FillAddrMan(addrman);
-
- bench.run([&] {
- const auto& address = addrman.Select(/*new_only*/false, NET_I2P);
- assert(address.first.ToStringAddrPort() == "udhdrtrcetjm5sxzskjyr5ztpeszydbh4dpl3pl4utgqqw2v4jna.b32.i2p:0");
+ fprintf(stderr, "addrman size: %zu\n", addrman.Size());
+
+ bench.minEpochIterations(300).run([&] {
+#if 0
+ addrman.Select(/*new_only=*/false);
+#else
+ addrman.Select(/*new_only=*/false, NET_I2P);
+#endif
});
}
static void AddrManGetAddr(benchmark::Bench& bench)
{
AddrMan addrman{EMPTY_NETGROUPMAN, /*deterministic=*/false, ADDRMAN_CONSISTENCY_CHECK_RATIO};
The results are:
Select()
takes 0.18 microsecondsSelect(network)
takes 6000 microseconds
Now, this is a big difference, but maybe that is ok. We don't Select()
in a tight loop. However, it can be improved relatively easy. I observed that in 2. we visit between ~100 and ~600k bucket positions before finding a match. Given that there are 80k in addrman it means that we visit some multiple times, which is a waste of resources. This is avoided if we don't visit already visited buckets (but still visit buckets in random order). The patch below does that (on top of this PR):
don't visit already visited buckets
diff --git i/src/addrman.cpp w/src/addrman.cpp
index 1023c3cbdb..0e196050b9 100644
--- i/src/addrman.cpp
+++ w/src/addrman.cpp
@@ -16,12 +16,13 @@
#include <streams.h>
#include <tinyformat.h>
#include <uint256.h>
#include <util/check.h>
#include <util/time.h>
+#include <array>
#include <cmath>
#include <optional>
/** Over how many buckets entries with tried addresses from a single group (/16 for IPv4) are spread */
static constexpr uint32_t ADDRMAN_TRIED_BUCKETS_PER_GROUP{8};
/** Over how many buckets entries with new addresses originating from a single group are spread */
@@ -745,17 +746,39 @@ std::pair<CAddress, NodeSeconds> AddrManImpl::Select_(bool new_only, std::option
} else {
search_tried = insecure_rand.randbool();
}
const int bucket_count{search_tried ? ADDRMAN_TRIED_BUCKET_COUNT : ADDRMAN_NEW_BUCKET_COUNT};
+ // We want to search buckets in random order and also prefer visiting a
+ // bucket that we have not visited before instead of visiting an already
+ // visited bucket. Thus shuffle the buckets and visit that shuffled list
+ // in order.
+
+ std::array<uint16_t, std::max(ADDRMAN_TRIED_BUCKET_COUNT, ADDRMAN_NEW_BUCKET_COUNT)>
+ shuffled_bucket_indexes;
+ for (uint16_t i = 0; i < bucket_count; ++i) {
+ shuffled_bucket_indexes[i] = i;
+ }
+ Shuffle(shuffled_bucket_indexes.begin(),
+ shuffled_bucket_indexes.begin() + bucket_count,
+ insecure_rand);
+ // If we visit a bucket and find 0 matching addresses in it, mark it with
+ // this and never visit it again.
+ static constexpr uint16_t ALREADY_VISITED_AND_BORING{std::numeric_limits<uint16_t>::max()};
+
+ static_assert(ADDRMAN_TRIED_BUCKET_COUNT < ALREADY_VISITED_AND_BORING);
+ static_assert(ADDRMAN_NEW_BUCKET_COUNT < ALREADY_VISITED_AND_BORING);
+
// Loop through the addrman table until we find an appropriate entry
double chance_factor = 1.0;
- while (1) {
- // Pick a bucket, and an initial position in that bucket.
- int bucket = insecure_rand.randrange(bucket_count);
+ for (uint16_t b = 0;; b = (b + 1) % bucket_count) {
+ if (shuffled_bucket_indexes[b] == ALREADY_VISITED_AND_BORING) {
+ continue;
+ }
+ const size_t bucket{shuffled_bucket_indexes[b]};
int initial_position = insecure_rand.randrange(ADDRMAN_BUCKET_SIZE);
// Iterate over the positions of that bucket, starting at the initial one,
// and looping around.
int i;
for (i = 0; i < ADDRMAN_BUCKET_SIZE; ++i) {
@@ -769,14 +792,19 @@ std::pair<CAddress, NodeSeconds> AddrManImpl::Select_(bool new_only, std::option
} else {
break;
}
}
}
- // If the bucket is entirely empty, start over with a (likely) different one.
- if (i == ADDRMAN_BUCKET_SIZE) continue;
+ // Start over with a different bucket if this one is entirely empty or
+ // specific network was requested and it does not contain any addresses
+ // from that network.
+ if (i == ADDRMAN_BUCKET_SIZE) {
+ shuffled_bucket_indexes[b] = ALREADY_VISITED_AND_BORING;
+ continue;
+ }
// Find the entry to return.
int position = (initial_position + i) % ADDRMAN_BUCKET_SIZE;
int nId = LookupAddrmanEntry(search_tried, bucket, position);
const auto it_found{mapInfo.find(nId)};
assert(it_found != mapInfo.end());
It changes Select(network)
to about 3000 microseconds (down from 6000), but that result from the benchmark is averaged between executions that take anything between 100 and 600k iterations. What is more important is that with the change above it never takes more than ~60k iterations, that is - anything between 100 and 60k. Worst case lowered ~10 times.
The optimization brings some complexity but IMO it is worth it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a nice speedup in the case where we only have a few addresses - however, due to the constant overhead of building the shuffled list of buckets in the beginning I'd expect performance to go down a bit for the case where we have many addresses to choose from. Do you see this in your benchmark?
Why does this need ALREADY_VISITED_AND_BORING
? If we do a for-loop through a pre-shuffled list of buckets instead of a while loop, doesn't that already guarantee that we visit each bucket at most once?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if I am understanding this thread properly, it seems like there are two concepts being discussed:
- changes to the bench tests
- optimization to
AddrManImpl::Select_
RE 1, I'm slightly confused as to the desired coverage. Without the network parameter, I would expect the worst case of the current Select_
function to happen when addrman is practically empty, which is why we introduced AddrManSelectFromAlmostEmpty
. On mainnet, this case would be unlikely, which is represented by AddrManSelect
. With the network parameter, I would expect worst case to be represented by AddrManSelectByNetwork
where theres just 1 I2P address on an addrman filled with other addresses. @vasild can you clarify what feels absent or misrepresentative?
also happy to discuss 2 further, but a question there - does it feel relevant to these patches to improve performance, or is it an orthogonal improvement (since we see similar worst cases right now)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd expect performance to go down a bit for the case where we have many addresses to choose from. Do you see this in your benchmark?
Yes, that slows down a bit the fast Select()
(any network is ok). I observed that and think it is ok because it is still super fast with or without the shuffling.
Why does this need
ALREADY_VISITED_AND_BORING
?
The for
loop could still repeat the whole array from the start if some address was found but was skipped. ALREADY_VISITED_AND_BORING
is a further optimization to completely skip that bucket on the second and further passes through the array.
@amitiuttarwar, you are right that I modified the newly added AddrManSelectByNetwork()
so that, in general, it tests the same thing as AddrManSelect()
which already exists in master
. I did that in order to be sure that everything else is identical: the way addrman is filled, minEpochIterations(300)
and the GetPort()
call. Way too many times I have been bitten by chasing noise in benchmarks, so I wanted to be sure that this is the only difference in the two things I am comparing:
#if 0 // flip to 1 to test the other case
addrman.Select(/*new_only=*/false);
#else
addrman.Select(/*new_only=*/false, NET_I2P);
#endif
also happy to discuss 2 further, but a question there - does it feel relevant to these patches to improve performance, or is it an orthogonal improvement
I think the performance improvement would be a nice addition to the work done by this PR (or maybe even a "must have"?)
(since we see similar worst cases right now)?
Hmm? Now on master
on mainnet it takes 0.18 microseconds and with this PR on mainnet it will take 6000 microseconds. That is the average case (averaged by the benchmark executing Select
about 3500 times). The difference in the worst case is even more pronounced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try to asses how much is the slowdown. Use a full addrman (15k is not full) because that is what people out there are running on. If the slowdown is ok, then no further improvements are necessary on this PR or its parent.
agreed that this is the fundamental thing we are trying to evaluate - is the performance difference significant & acceptable
I will try to compare on a snapshot of an addrman from my public full node, not on artificially filled addrman from the bench. Last time I checked it had ~70k addresses and also it will have at least a bunch of addresses from each network, not just 1 address. That would be more representative.
would be awesome if you could compare based on your addrman snapshot from mainnet!
my understanding of martin's benches is its trying to observe "normal" as well as "worst case". so the examples with 1 address are aiming towards "worst case". but of course "normal" would vary widely from node to node.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is your "Test 5" doing? Does it call 20 times
Select(network)
?
No, it calls Select(network)
for an addrman that has 20 addresses of the selected network.
I'll try to explain the idea behind my benchmarks better:
Starting point is the observation that the suggested change makes Select
slower if we already know a lot of addresses of the desired type, while making it faster if addrman has very few addresses of this type. This is true for botch Select()
and Select(network)
, which is why it is not obvious at first sight if the suggestion is an overall performance improvement or a degradation.
So my idea was to find the cutoff - how many addresses do we need such that both algorithms are equally fast - in situations where we have more addrs than this, the change would be a slowdown, when we have less it's a speedup. This cutoff turned out to be very small for Select()
(3 addresses), while being a bit larger for Select(network)
- it was approximately 20. It would be helpful to know how things change with a full addrman!
So if we used an addrman with counts similar to those reported by @jonatack, my results would suggest that the pre-shufffling change suggested here would be a speedup for CJDNS-specific queries, while being a slowdown for clearnet, onion, i2p and unspecific queries (all compared to this PR).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran this on a mainnet peers.dat
:
addrman size total: 72947
addrman size IPv4: 47252
addrman size IPv6: 11491
addrman size Tor: 13305
addrman size I2P: 893
addrman size CJDNS: 6
[patch] How to load /tmp/peers.dat from bench_bitcoin
diff --git i/src/bench/addrman.cpp w/src/bench/addrman.cpp
index 8a5cab443f..b8ec4a4071 100644
--- i/src/bench/addrman.cpp
+++ w/src/bench/addrman.cpp
@@ -1,17 +1,22 @@
// Copyright (c) 2020-2022 The Bitcoin Core developers
// Distributed under the MIT software license, see the accompanying
// file COPYING or http://www.opensource.org/licenses/mit-license.php.
+#include <addrdb.h>
#include <addrman.h>
#include <bench/bench.h>
+#include <chainparams.h>
+#include <chainparamsbase.h>
#include <netbase.h>
#include <netgroup.h>
#include <random.h>
#include <util/check.h>
+#include <util/system.h>
#include <util/time.h>
+#include <util/translation.h>
#include <optional>
#include <vector>
/* A "source" is a source address from which we have received a bunch of other addresses. */
@@ -21,12 +26,23 @@ static constexpr size_t NUM_ADDRESSES_PER_SOURCE = 256;
static NetGroupManager EMPTY_NETGROUPMAN{std::vector<bool>()};
static constexpr uint32_t ADDRMAN_CONSISTENCY_CHECK_RATIO{0};
static std::vector<CAddress> g_sources;
static std::vector<std::vector<CAddress>> g_addresses;
+std::unique_ptr<AddrMan> LoadAddrmanFromPeersDat()
+{
+ SelectBaseParams("main");
+ SelectParams("main");
+ gArgs.ForceSetArg("-datadir", "/tmp");
+ std::unique_ptr<AddrMan> addrman;
+ auto err = LoadAddrman(EMPTY_NETGROUPMAN, gArgs, addrman);
+ assert(!err.has_value());
+ return addrman;
+}
+
static void CreateAddresses()
{
if (g_sources.size() > 0) { // already created
return;
}
@@ -69,19 +85,12 @@ static void FillAddrMan(AddrMan& addrman)
{
CreateAddresses();
AddAddressesToAddrMan(addrman);
}
-static CNetAddr ResolveIP(const std::string& ip)
-{
- CNetAddr addr;
- LookupHost(ip, addr, false);
- return addr;
-}
-
static CService ResolveService(const std::string& ip, uint16_t port = 0)
{
CService serv;
Lookup(ip, serv, port, false);
return serv;
}
@@ -125,26 +134,24 @@ static void AddrManSelectFromAlmostEmpty(benchmark::Bench& bench)
(void)addrman.Select();
});
}
static void AddrManSelectByNetwork(benchmark::Bench& bench)
{
- AddrMan addrman{EMPTY_NETGROUPMAN, /*deterministic=*/false, ADDRMAN_CONSISTENCY_CHECK_RATIO};
-
- // add single I2P address to new table
- CService i2p_service;
- i2p_service.SetSpecial("udhdrtrcetjm5sxzskjyr5ztpeszydbh4dpl3pl4utgqqw2v4jna.b32.i2p");
- CAddress i2p_address(i2p_service, NODE_NONE);
- i2p_address.nTime = Now<NodeSeconds>();
- CNetAddr source = ResolveIP("252.2.2.2");
- addrman.Add({i2p_address}, source);
-
- FillAddrMan(addrman);
+ auto addrman = LoadAddrmanFromPeersDat();
+ std::cout << "addrman size total: " << addrman->Size() << std::endl
+ << "addrman size IPv4: " << addrman->Size(NET_IPV4) << std::endl
+ << "addrman size IPv6: " << addrman->Size(NET_IPV6) << std::endl
+ << "addrman size Tor: " << addrman->Size(NET_ONION) << std::endl
+ << "addrman size I2P: " << addrman->Size(NET_I2P) << std::endl
+ << "addrman size CJDNS: " << addrman->Size(NET_CJDNS) << std::endl;
bench.run([&] {
- (void)addrman.Select(/*new_only=*/false, NET_I2P);
+ //addrman->Select(/*new_only=*/false);
+ //addrman->Select(/*new_only=*/false, NET_I2P);
+ addrman->Select(/*new_only=*/false, NET_CJDNS);
});
}
static void AddrManGetAddr(benchmark::Bench& bench)
{
AddrMan addrman{EMPTY_NETGROUPMAN, /*deterministic=*/false, ADDRMAN_CONSISTENCY_CHECK_RATIO};
Results:
Select(/*new_only=*/false)
: 0.265 microseconds, this is whatmaster
is doing before this PR
After this PR:
Select(/*new_only=*/false, NET_IPV4)
: 0.5 microsecondsSelect(/*new_only=*/false, NET_IPV6)
: 1.7 microsecondsSelect(/*new_only=*/false, NET_ONION)
: 3.1 microsecondsSelect(/*new_only=*/false, NET_I2P)
: 27 microsecondsSelect(/*new_only=*/false, NET_CJDNS)
: 1600 microseconds
Is this slowdown acceptable? Maybe yes. We don't call Select
in a tight loop, so maybe even the 1600 microseconds is ok.
What happens if we mark and subsequently skip buckets we have seen to contain 0 interesting addresses, without the shuffling? I guess that should bring most of the improvement from the above patch without the downside of the shuffling. Here is the change on top of this PR:
[patch] Skip boring
diff --git i/src/addrman.cpp w/src/addrman.cpp
index cdfd079fcd..46f4de99be 100644
--- i/src/addrman.cpp
+++ w/src/addrman.cpp
@@ -747,15 +747,19 @@ std::pair<CAddress, NodeSeconds> AddrManImpl::Select_(bool new_only, std::option
}
const int bucket_count{search_tried ? ADDRMAN_TRIED_BUCKET_COUNT : ADDRMAN_NEW_BUCKET_COUNT};
// Loop through the addrman table until we find an appropriate entry
double chance_factor = 1.0;
+ std::unordered_set<int> already_visited_and_boring_buckets;
while (1) {
// Pick a bucket, and an initial position in that bucket.
int bucket = insecure_rand.randrange(bucket_count);
+ if (already_visited_and_boring_buckets.count(bucket) > 0) {
+ continue;
+ }
int initial_position = insecure_rand.randrange(ADDRMAN_BUCKET_SIZE);
// Iterate over the positions of that bucket, starting at the initial one,
// and looping around.
int i;
for (i = 0; i < ADDRMAN_BUCKET_SIZE; ++i) {
@@ -770,14 +774,19 @@ std::pair<CAddress, NodeSeconds> AddrManImpl::Select_(bool new_only, std::option
} else {
break;
}
}
}
- // If the bucket is entirely empty, start over with a (likely) different one.
- if (i == ADDRMAN_BUCKET_SIZE) continue;
+ // Start over with a different bucket if this one is entirely empty or
+ // specific network was requested and it does not contain any addresses
+ // from that network.
+ if (i == ADDRMAN_BUCKET_SIZE) {
+ already_visited_and_boring_buckets.insert(bucket);
+ continue;
+ }
// Find the entry to return.
int position = (initial_position + i) % ADDRMAN_BUCKET_SIZE;
int nId = GetEntry(search_tried, bucket, position);
const auto it_found{mapInfo.find(nId)};
assert(it_found != mapInfo.end());
Result for NET_CJDNS
where the slowdown is most pronounced: 1200 microseconds (vs 1600). Those numbers are averaged over many runs and they do not tell the whole story. Here are the distributions without "skip boring" (the height of a bar indicates how many addresses were checked before finding the result, e.g. if the bar that spans between 2500 and 3500 is tall 280, it means 280 of the runs checked 2500-3500 addresses before finding the result).
with "skip boring":
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why 1.6 milliseconds to choose an address would be a concern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A simpler approach to skipping already tried buckets might be something like this:
int n = 1 << 8; // power of 2
int initial_pos = random(n);
int step = random(n/2) * 2 + 1; // odd number
int i = initial_pos;
do {
if (check_bucket(i)) break;
i = (i + step) % n; // oops, try a new bucket
} while (i != initial_pos);
That doesn't need a full shuffle, still hits every bucket, and only hits each bucket once (provided step and n are co-prime).
Based on vasild's stats above, I think this is fine to merge without any improvement here, but it might be interesting to rebench with this approach -- performance should benefit a little from needing less randomness as well as avoiding repeated work...
If we wanted to making things actually reliably fast for cjdns-like networks lost in a sea of ipv4 addresses, then having separate addrman tables for each network might be worth exploring. That would probably help prevent an attacker from blocking valid ipv4 addresses with a mass of valid-but-sybiled tor addresses, which might be worthwhile.
25a64a2
to
09d5145
Compare
thank you for the in-depth review @vasild ! I have addressed all your review comments (incorporated most, left questions on a couple) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 09d5145
Suggestions remaining (non-blocker):
-
Check out-of-bounds array access in AddrManImpl::GetEntry().
-
I can't find where the suggested test was added.
-
Performance wise I think the change in #27214 (comment) would be very nice to have.
Thanks!
Unused until later commit. Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
in preparation for consolidating the logic for searching the new and tried tables, generalize the call paths for both Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
-BEGIN VERIFY SCRIPT- sed -i 's/fChanceFactor/chance_factor/g' src/addrman.cpp sed -i 's/nBucketPos/initial_position/g' src/addrman.cpp sed -i 's/nBucket/bucket/g' src/addrman.cpp src/addrman_impl.h sed -i 's/newOnly/new_only/g' src/addrman.cpp src/addrman_impl.h src/addrman.h src/test/addrman_tests.cpp -END VERIFY SCRIPT- Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
Add an optional parameter to the addrman Select function that allows callers to specify which network the returned address should be on. Ensure that the proper table is selected with different cases of whether the new or tried table has network addresses that match. Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
this adds coverage for the 7 different cases of which table should be selected when the network is specified. the different cases are the result of new_only being true or false and whether there are network addresses on both, neither, or one of new vs tried tables. the only case not covered is when new_only is false and the only network addresses are on the new table. Co-authored-by: Martin Zumsande <mzumsande@gmail.com> Co-authored-by: Vasil Dimov <vd@FreeBSD.org>
if an addr matching the network requirements is only on the new table and select is invoked with new_only = false, ensure that the code selects the new table. in order to test this case, we use a non deterministic addrman. this means we cannot have more than one address in any addrman table, or risk sporadic failures when the second address happens to conflict. if the code chose a table at random, the test would fail 50% of the time Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
to evaluate the worst case performance with the network parameter passed through, fill the new table with addresses then add a singular I2P address to retrieve Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
the addrman select function will demonstrate it's worst case performance when it is almost empty, because it might have to linearly search several buckets. add a bench test to cover this case Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
09d5145
to
17e7054
Compare
two small updates:
|
|
||
// Iterate over the positions of that bucket, starting at the initial one, | ||
// and looping around. | ||
int i; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a suggestion to make the code a lit bit cleaner in case you have to touch it again, perhaps we don't need to create position
and nId
again after the loop.
diff --git a/src/addrman.cpp b/src/addrman.cpp
index cdfd079fc..30ce2cadc 100644
--- a/src/addrman.cpp
+++ b/src/addrman.cpp
@@ -757,10 +757,10 @@ std::pair<CAddress, NodeSeconds> AddrManImpl::Select_(bool new_only, std::option
// Iterate over the positions of that bucket, starting at the initial one,
// and looping around.
- int i;
+ int i, position, node_id;
for (i = 0; i < ADDRMAN_BUCKET_SIZE; ++i) {
- int position = (initial_position + i) % ADDRMAN_BUCKET_SIZE;
- int node_id = GetEntry(search_tried, bucket, position);
+ position = (initial_position + i) % ADDRMAN_BUCKET_SIZE;
+ node_id = GetEntry(search_tried, bucket, position);
if (node_id != -1) {
if (network.has_value()) {
const auto it{mapInfo.find(node_id)};
@@ -777,9 +777,7 @@ std::pair<CAddress, NodeSeconds> AddrManImpl::Select_(bool new_only, std::option
if (i == ADDRMAN_BUCKET_SIZE) continue;
// Find the entry to return.
- int position = (initial_position + i) % ADDRMAN_BUCKET_SIZE;
- int nId = GetEntry(search_tried, bucket, position);
- const auto it_found{mapInfo.find(nId)};
+ const auto it_found{mapInfo.find(node_id)};
assert(it_found != mapInfo.end());
const AddrInfo& info{it_found->second};
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in #27745
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 17e7054
Maybe other reviewers would be interested in the performance discussion:
#27214 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re-ACK 17e7054
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 17e7054
* @param[in] new_only Whether to only select addresses from the new table. Passing `true` returns | ||
* an address from the new table or an empty pair. Passing `false` will return an | ||
* address from either the new or tried table (it does not guarantee a tried entry). | ||
* @param[in] network Select only addresses of this network (nullopt = all) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably worth documenting that passing in a network
here rather than nullopt
can make the search significantly slower (though still fast)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in #27745
if (node_id != -1) { | ||
if (network.has_value()) { | ||
const auto it{mapInfo.find(node_id)}; | ||
assert(it != mapInfo.end()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really want an unconditional assertion failure here, vs say:
if (Assume(it != mapInfo.end()) && it->second.GetNetwork() == *network) break;
(maybe a similar question for GetEntry
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in #27745
bool new_selected{false}; | ||
bool tried_selected{false}; | ||
|
||
while (!new_selected || !tried_selected) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like having:
int counter = 256;
while (--counter > 0 && (!new_selected || !tried_selected)) { ... }
BOOST_CHECK(new_selected);
BOOST_CHECK(tried_selected);
would be better here than an infinite loop leading to CI timeout on error. A one-in-2**256 chance of failure is already the security of bitcoin as a whole, after all; and if our randomness generator is biassed enough that it fails more than that, it's probably better that the test case fails anyway? That we use the deterministic addrman presumably means any failures here will always be deterministic, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in #27745
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review ACK 17e7054
I reviewed the PR again - although I'm a coauthor, so not sure about the etiquette of ack'ing.
I think there is a small bug in the non-deterministic test that could make it fail on rare occasions and should be fixed - here or in a follow-up.
* @param[in] new_only Whether to only select addresses from the new table. | ||
* @param[in] new_only Whether to only select addresses from the new table. Passing `true` returns | ||
* an address from the new table or an empty pair. Passing `false` will return an | ||
* address from either the new or tried table (it does not guarantee a tried entry). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: mentioning the "empty pair" only in the first case makes it sound as if this wasn't one of the possible outcomes in the second case (but it is).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in #27745
thanks for the reviews! I'll address the outstanding comments in a followup |
This will be merged after branch-off. It would be good to either get the follow up opened in advance, so we can avoid any time between merging, and the potential for intermittent CI failures, or, if that change is straight-forward enough, you could push an additional commit here, leaving the rest of the branch as-is. |
03e996c
to
17854ae
Compare
added an extra commit to ensure the non-deterministic test will not fail intermittently. will incorporate the outstanding feedback separately |
5da828b
to
17e7054
Compare
opened #27506 for the test fix |
10a354f test: prevent intermittent failures (Amiti Uttarwar) Pull request description: Follow up to #27214 - add an address to the tried table before the new table to make sure a new table collision is not possible. ACKs for top commit: mzumsande: Code review ACK 10a354f - the fix is what I suggested [here](bitcoin/bitcoin#27214 (comment)) and should make these intermittent failures impossible. Tree-SHA512: 24099f02e1915395130065af0ef6a2a1893955d222517d156d928765541d9c427da00172a9b5a540163f4d6aae93ca3882e8267eeb35ecc595d42178abc6191c
17e7054 doc: clarify new_only param for Select function (Amiti Uttarwar) b0010c8 bench: test select for a new table with only one address (Amiti Uttarwar) 9b91aae bench: add coverage for addrman select with network parameter (Amiti Uttarwar) 22a4d14 test: increase coverage of addrman select (without network) (Amiti Uttarwar) a98e542 test: add addrman test for special case (Amiti Uttarwar) 5c8b4ba tests: add addrman_select_by_network test (Amiti Uttarwar) 6b22928 addrman: add functionality to select by network (Amiti Uttarwar) 26c3bf1 scripted-diff: rename local variables to match modern conventions (Amiti Uttarwar) 4880641 refactor: consolidate select logic for new and tried tables (Amiti Uttarwar) ca2a9c5 refactor: generalize select logic (Amiti Uttarwar) 052fbcd addrman: Introduce helper to generalize looking up an addrman entry (Amiti Uttarwar) 9bf078f refactor: update Select_ function (Amiti Uttarwar) Pull request description: For the full context & motivation of this patch, see bitcoin#27213 This is joint work with mzumsande. This PR adds functionality to `AddrMan::Select` to enable callers to specify a network they are interested in. Along the way, it refactors the function to deduplicate the logic, updates the local variables to match modern conventions, adds test coverage for both the new and existing `Select` logic, and adds bench tests for the worst case performance of both the new and existing `Select` logic. This functionality is used in the parent PR. ACKs for top commit: vasild: ACK 17e7054 brunoerg: re-ACK 17e7054 ajtowns: ACK 17e7054 mzumsande: Code Review ACK 17e7054 Tree-SHA512: e99d1ce0c44a15601a3daa37deeadfc9d26208a92969ecffbea358d57ca951102d759734ccf77eacd38db368da0bf5b6fede3cd900d8a77b3061f4adc54e52d8
10a354f test: prevent intermittent failures (Amiti Uttarwar) Pull request description: Follow up to bitcoin#27214 - add an address to the tried table before the new table to make sure a new table collision is not possible. ACKs for top commit: mzumsande: Code review ACK 10a354f - the fix is what I suggested [here](bitcoin#27214 (comment)) and should make these intermittent failures impossible. Tree-SHA512: 24099f02e1915395130065af0ef6a2a1893955d222517d156d928765541d9c427da00172a9b5a540163f4d6aae93ca3882e8267eeb35ecc595d42178abc6191c
cd8ef5b test: ensure addrman test is finite (Amiti Uttarwar) b9f1e86 addrman: change asserts to Assumes (Amiti Uttarwar) 7687707 doc: update `Select` function description (Amiti Uttarwar) 2b6bd12 refactor: de-duplicate lookups (Amiti Uttarwar) Pull request description: this PR addresses outstanding review comments from #27214 ACKs for top commit: achow101: ACK cd8ef5b mzumsande: Code Review ACK cd8ef5b brunoerg: crACK cd8ef5b Tree-SHA512: 669f67904263e3f51c39b175eabf5fa1b1e7b6841e889656afec33d0bd93fb446de9403f0a91b186ddeaf29498c8938484a0547b1188256c4e7c90db6f30bb55
cd8ef5b test: ensure addrman test is finite (Amiti Uttarwar) b9f1e86 addrman: change asserts to Assumes (Amiti Uttarwar) 7687707 doc: update `Select` function description (Amiti Uttarwar) 2b6bd12 refactor: de-duplicate lookups (Amiti Uttarwar) Pull request description: this PR addresses outstanding review comments from bitcoin#27214 ACKs for top commit: achow101: ACK cd8ef5b mzumsande: Code Review ACK cd8ef5b brunoerg: crACK cd8ef5b Tree-SHA512: 669f67904263e3f51c39b175eabf5fa1b1e7b6841e889656afec33d0bd93fb446de9403f0a91b186ddeaf29498c8938484a0547b1188256c4e7c90db6f30bb55
For the full context & motivation of this patch, see #27213
This is joint work with mzumsande.
This PR adds functionality to
AddrMan::Select
to enable callers to specify a network they are interested in.Along the way, it refactors the function to deduplicate the logic, updates the local variables to match modern conventions, adds test coverage for both the new and existing
Select
logic, and adds bench tests for the worst case performance of both the new and existingSelect
logic.This functionality is used in the parent PR.