Skip to content

Conversation

sipa
Copy link
Member

@sipa sipa commented Apr 25, 2015

Including:

This reduces the memory consumption of a maximal setAddrKnown by +- half.

@sipa
Copy link
Member Author

sipa commented Apr 25, 2015

The HMAC for address inventory hashing is absolute overkill, but was easy to write. I doubt the performance matters.

@sipa
Copy link
Member Author

sipa commented Apr 25, 2015

Rebased on top of @gavinandresen's #6066 (so that I don't need the address hasher here).

Removed the boost::unordered_set usage for now, as it is incompatible with older boost versions (which don't have unordered_set::reserve).

{
set.clear();
queue.clear();
std::vector<iterator>(nMaxSizeIn, set.end()).swap(order);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm probably just an old fuddy-duddy...
... but I find the "use swap to assign" idiom hard to review.
Why not just:
order.assign(nMaxSizeIn, set.end());

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's horrible, but it's the only way to make a vector release its memory, afaik.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK too. Though this is only important if we expect the size to change (become smaller) dynamically.

I'm not sure I like having clear() double as set-maximum-size method. I'd prefer the maximum size to be passed to the constructor and be static over the lifetime of the object. If that's not possible, let's add a special resize() call for resizing (that happens to clear the structure too).

@gavinandresen
Copy link
Contributor

Code review ACK-- note I changed the CRollingBloomFilter implementation commit in #6066 .

Measured heap usage on OSX for a 1,000-entry mruset : 72,192 bytes.
(before this change: 100,992 bytes).

I'm running a node with these changes now.

@sipa
Copy link
Member Author

sipa commented Apr 26, 2015

Rebased on updated #6066, and addressed a nit.

@sipa
Copy link
Member Author

sipa commented Apr 26, 2015

@gavinandresen I measure around a 50 byte overhead per mruset entry now. Before, it was 50 bytes + double storage of the elements.

@sandakersmann
Copy link
Contributor

Heading in src/mruset.h should be:
// Copyright (c) 2012-2015 The Bitcoin Core developers

@sipa
Copy link
Member Author

sipa commented Apr 27, 2015

@sandakersmann Why? It's only worked on in 2012 and 2015.

@sandakersmann
Copy link
Contributor

@sipa Last update was in 2014.

@sipa
Copy link
Member Author

sipa commented Apr 27, 2015

@sandakersmann Ok, fixed.

@@ -40,6 +40,17 @@ nFlags(nFlagsIn)
{
}

// Private constructor used by CRollingBloomFilter
CBloomFilter::CBloomFilter(unsigned int nElements, double nFPRate, unsigned int nTweakIn) :
vData((unsigned int)(-1 / LN2SQUARED * nElements * log(nFPRate)) / 8),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: indentation

For when you need to keep track of the last N items
you've seen, and can tolerate some false-positives.

Rebased-by: Pieter Wuille <pieter.wuille@gmail.com>
@sipa sipa force-pushed the smallmru branch 2 times, most recently from ab42414 to 2fceed4 Compare April 30, 2015 15:12
@sipa
Copy link
Member Author

sipa commented Apr 30, 2015

Made several changes:

  • The MRU size of mruset is now a constant (set at constructor time).
  • Restyled some of the rolling bloom filter code.

I think there is a significant improvement possible to the size of the rolling bloom filter, by using a different tweak for multiple filters, and checking several filters for a 'contains'. I'll check the math and write a separate PR for that if found useful.

gavinandresen and others added 3 commits April 30, 2015 08:16
Use a probabilistic bloom filter to keep track of which addresses
we think we have given our peers, instead of a list.

This uses much less memory, at the cost of sometimes failing to
relay an address to a peer-- worst case if the bloom filter happens
to be as full as it gets, 1-in-1,000.

Measured memory usage of a full mruset setAddrKnown: 650Kbytes
Constant memory usage of CRollingBloomFilter addrKnown: 37Kbytes.

This will also help heap fragmentation, because the 37K of storage
is allocated when a CNode is created (when a connection to a peer
is established) and then there is no per-item-remembered memory
allocation.

I plan on testing by restarting a full node with an empty peers.dat,
running a while with -debug=addrman and -debug=net, and making sure
that the 'addr' message traffic out is reasonable.
(suggestions for better tests welcome)
@gavinandresen
Copy link
Contributor

@sipa: There's a good survey paper by Sasu Tarkoma et al: "Theory and Practice of Bloom Filters for Distributed Systems." Which pointed me to: https://home.kookmin.ac.kr/~mkyoon/publications/2010TKDE.pdf
... which has a more complicated scheme (maybe similar to what you're thinking).

@gavinandresen
Copy link
Contributor

ACK

@laanwj
Copy link
Member

laanwj commented May 1, 2015

I think there is a significant improvement possible to the size of the rolling bloom filter, by using a different tweak for both, and checking both filters for a 'contains'. I'll check the math and write a separate PR for that if found useful.

@sipa sounds great to me, although let's do that in a later pull, this is already a great improvement compared to the current state, so let's get this reviewed and merged.

@laanwj
Copy link
Member

laanwj commented May 1, 2015

Tested ACK

@laanwj laanwj merged commit f46a680 into bitcoin:master May 1, 2015
laanwj added a commit that referenced this pull request May 1, 2015
f46a680 Better mruset unit test (Pieter Wuille)
d4d5022 Use ring buffer of set iterators instead of deque of copies in mruset (Pieter Wuille)
d81cff3 Replace mruset setAddrKnown with CRollingBloomFilter addrKnown (Gavin Andresen)
69a5f8b Rolling bloom filter class (Gavin Andresen)
Cryptoslave added a commit to Anoncoin/oldrepo-backup that referenced this pull request Mar 8, 2016
random-zebra added a commit to PIVX-Project/PIVX that referenced this pull request Jul 1, 2020
6f41b3e [QA] Missing mempool sync in pos_coldStaking and zc_publicspends tests (random-zebra)
efaf727 net: correctly initialize nMinPingUsecTime (Wladimir J. van der Laan)
61c8ffe Do not add random inbound peers to addrman. (Gregory Maxwell)
e6a1726 Added feeler connections increasing good addrs in the tried table. (Ethan Heilman)
070b6fb Actually only use filterInventoryKnown with MSG_TX inventory messages. (Gregory Maxwell)
9ac6b28 Only use filterInventoryKnown with MSG_TX inventory messages. (Patick Strateman)
01273db Rename setInventoryKnown filterInventoryKnown (Patick Strateman)
4f11eb2 Remove mruset as it is no longer used. (Gregory Maxwell)
2e3b05c Replace setInventoryKnown with a rolling bloom filter. (Gregory Maxwell)
409aa83 Replace trickle nodes with per-node/message Poisson delays (Pieter Wuille)
93e8c46 Move recentRejects initialization to top of InitBlockIndex (Wladimir J. van der Laan)
9a59420 Keep track of recently rejected transactions (Peter Todd)
34ee777 Only use randomly created nonces in CRollingBloomFilter. (Pieter Wuille)
338d346 Make CRollingBloomFilter set nTweak for you (Peter Todd)
dcd15bc Reuse vector hashing code for uint256 (Pieter Wuille)
3230143 Add uint256 support to CRollingBloomFilter (Peter Todd)
128d644 Better mruset unit test (Pieter Wuille)
89740ed Use ring buffer of set iterators instead of deque of copies in mruset (Pieter Wuille)
14c88ee Replace mruset setAddrKnown with CRollingBloomFilter addrKnown (Gavin Andresen)
e0bebbd Rolling bloom filter class (Gavin Andresen)
7c03bd5 Add correct bool combiner for net signals (Pieter Wuille)
5cb5fd6 Stop exporting ConnectNode (Fuzzbawls)
819295d Stop using ConnectNode in layer 2 code (Fuzzbawls)
851b6b4 net: No need to export DumpBanlist (Cory Fields)
4486d4e net: make Ban/Unban/ClearBan functionality consistent (Cory Fields)
a5e278d net: Drop CNodeRef for AttemptToEvictConnection (Cory Fields)
9fd357d net: use the exposed GetNodeSignals() rather than g_signals directly (Cory Fields)
7962bcc net: remove unused set (Cory Fields)
fabf358 Use network group instead of CNetAddr in final pass to select node to disconnect (Patrick Strateman)
7f030fe Fix comment (Patrick Strateman)
18af800 Acquire cs_vNodes before changing refrence counts (Patrick Strateman)
7aa827f CNodeRef copy constructor and assignment operator (Patrick Strateman)
b3f95e7 Return false early if vEvictionCandidates is empty (Patrick Strateman)
85886c9 Better support for nodes with non-standard nMaxConnections (Patrick Strateman)
9c9e55b RAII wrapper for CNode* (Patrick Strateman)
e92780d Add comments to AttemptToEvictConnection (Patrick Strateman)
0ca7ce3 Prefer to disconnect peers in favor of whitelisted peers (Patrick Strateman)
a1c4aaf AttemptToEvictConnection (Patrick Strateman)
aa7ce9b Record nMinPingUsecTime (Patrick Strateman)
fd7bab0 Refactor: Move failure conditions to the top of AcceptConnection (Patrick Strateman)
fcb732b Refactor: Bail early in AcceptConnection (Patrick Strateman)
411766d Refactor: AcceptConnection (Patrick Strateman)

Pull request description:

  This is a culmination of several upstream PRs touching the P2P/Networking code, and resulting in a state just prior to the P2P/Network encapsulation, which will be it's own PR.

  Backported upstream PRs included here:
  - bitcoin#5859
  - bitcoin#6064
  - bitcoin#6374
  - bitcoin#6498
  - bitcoin#6636
  - bitcoin#7133
  - bitcoin#7125
  - bitcoin#7906
  - bitcoin#8282
  - bitcoin#8594

ACKs for top commit:
  random-zebra:
    Great job. ACK 6f41b3e and merging...
  furszy:
    Have been running it the past days and all is looking good, ACK 6f41b3e .

Tree-SHA512: 1cc4b1271516f000a06141b5b069f3ee00f3eb77056e40d2c021c484f749d9d8db2b76ce490f63572372705b646fad342666f6f90ca5fc69abcacf7b207d058f
@bitcoin bitcoin deleted a comment from curtcurt87 Feb 4, 2021
@bitcoin bitcoin locked and limited conversation to collaborators Feb 4, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants