-
Notifications
You must be signed in to change notification settings - Fork 37.8k
Replace setInventoryKnown with a rolling bloom filter. #7100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Mruset setInventoryKnown was reduced to a remarkably small 1000 entries as a side effect of sendbuffer size reductions in 2012. This removes setInventoryKnown filtering from merkleBlock responses because false positives there are especially unattractive and also because I'm not sure if there aren't race conditions around the relay pool that would cause some transactions there to be suppressed. (Also, ProcessGetData was accessing setInventoryKnown without taking the required lock.)
@gmaxwell with this change, |
@gmaxwell how large is the filter? |
1.37 MIB per peer...
|
@sipa in that case utACK |
hm. Is it that big? My math must be wrong, I was thinking it was more like 350k. In any case, I didn't want to make it smaller than the maximum INV size, since having a single INV blow it out would be unfortunate. |
Rolling bloom filters are 4 times as large as normal ones.
|
Eh, messed up my math. This should result in 700 KiB per peer. |
Confirmed with a sneaky |
utACK. Also, mruset can go away indeed. |
utACK |
700kiB per peer is quite a lot for a P2P application in itself, but more relevant is how it compares to |
An mruset for the same size (last 50000 entries) would be almost 4 MiB. Of
course, they are only size 1000 currently...
|
Yes, that's why I mentioned "functionally equivalent", comparing against the current state isn't fair as it's broken. |
I do agree 700 KiB is pretty high per peer.
Do we really need a 1/1000000 false positive rate? An FP will just cause us
to not relay a particular transaction to one peer, but we'll still relay it
to others (they have other nonces).
Also, a data structure exists that is 25% more efficient than the current
CRollingBloomFilter.
|
@sipa Care to share? :) |
Ideally what we'd want here is the opposite of a bloom filter. False positives are bad, as they cause peers to miss out on transactions, but the occasional false negative is ok, as they might get it INVed another time. |
Agree... unfortunately I don't know of any better way to do that than an mruset... Anyway, #7113 reduces this to 526 KiB per peer. |
@@ -2370,6 +2370,7 @@ CNode::CNode(SOCKET hSocketIn, const CAddress& addrIn, const std::string& addrNa | |||
nSendOffset = 0; | |||
hashContinue = uint256(); | |||
nStartingHeight = -1; | |||
setInventoryKnown.reset(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't this undo the setInventoryKnown(50000, 0.000001) initialization?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it only wipes the set, not the parameters.
utACK |
@sipa I'm not really excited with having false positives at all here; mostly due to the case where you only have one peer, a FP means you may not relay your own transaction. In a universe where software complexity had no cost, I'd give MRUsets to (or even just the first few) outbound peers, and higher FP rolling bloom filters to all the others. |
utACK |
needs rename of setInventoryKnown to filterInventoryKnown won't a false positive here break block relaying? |
Yes very good point, though I think that after #6494 we may be able to get
rid of using inventory for blocks entirely.
|
there doesn't seem to be a strong reason to filter block inv messages i say just send them without filtering |
Previously this logic could erroneously filter a MSG_BLOCK inventory message.
ACK |
Updated #7125 to not use setInventoryKnown for blocks anymore. |
@@ -492,8 +491,9 @@ class CNode | |||
{ | |||
{ | |||
LOCK(cs_inventory); | |||
if (!setInventoryKnown.count(inv)) | |||
vInventoryToSend.push_back(inv); | |||
if (inv.type == MSG_TX && filterInventoryKnown.contains(inv.hash)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is not enough. SendMessages checks the filter again.
Previously this logic could erroneously filter a MSG_BLOCK inventory message.
Needs rebase on top of #7129. |
See #7133. |
replaced by #7133. |
Mruset setInventoryKnown was reduced to a remarkably small 1000
entries as a side effect of sendbuffer size reductions in 2012.
This removes setInventoryKnown filtering from merkleBlock responses
because false positives there are especially unattractive and
also because I'm not sure if there aren't race conditions around
the relay pool that would cause some transactions there to
be suppressed. (Also, ProcessGetData was accessing
setInventoryKnown without taking the required lock.)