-
Notifications
You must be signed in to change notification settings - Fork 37.7k
doc: make it easier to work out size of bloom filter #19130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Looks right, but perhaps it should go in the .h file (which has an easier approximate formula already). |
An alternative would be to run the code in massif and show the size of the object. 😬 |
To have an idea of the approximation, the formula "4.328 bits per element and per bit of fprate" is at most 0.115% off for better than 8-bit fprate (fprate < 1/256). For 10-bit, at most 0.074%. For 12-bit, at most 0.052%. For 20-bit, 0.019%. |
I agree. In general, this kind of documentation would be better in the header file. This also gets included in doxygen etc. |
@ajtowns did you want to follow up here and move this into the header? |
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns) Pull request description: Based on #19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`: - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate" - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter) - Reconcile the newly added code with the existing approximation ACKs for top commit: laanwj: ACK d9141a0 Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns) Pull request description: Based on bitcoin#19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`: - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate" - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter) - Reconcile the newly added code with the existing approximation ACKs for top commit: laanwj: ACK d9141a0 Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
This is really helpful. I'd love to see it merged into the header file at some point. |
#19968 did that, i think ? |
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns) Pull request description: Based on bitcoin#19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`: - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate" - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter) - Reconcile the newly added code with the existing approximation ACKs for top commit: laanwj: ACK d9141a0 Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns) Pull request description: Based on bitcoin#19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`: - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate" - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter) - Reconcile the newly added code with the existing approximation ACKs for top commit: laanwj: ACK d9141a0 Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns) Pull request description: Based on bitcoin#19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`: - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate" - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter) - Reconcile the newly added code with the existing approximation ACKs for top commit: laanwj: ACK d9141a0 Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns) Pull request description: Based on bitcoin#19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`: - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate" - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter) - Reconcile the newly added code with the existing approximation ACKs for top commit: laanwj: ACK d9141a0 Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns) Pull request description: Based on bitcoin#19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`: - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate" - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter) - Reconcile the newly added code with the existing approximation ACKs for top commit: laanwj: ACK d9141a0 Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns) Pull request description: Based on bitcoin#19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`: - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate" - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter) - Reconcile the newly added code with the existing approximation ACKs for top commit: laanwj: ACK d9141a0 Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns) Pull request description: Based on bitcoin#19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`: - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate" - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter) - Reconcile the newly added code with the existing approximation ACKs for top commit: laanwj: ACK d9141a0 Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns) Pull request description: Based on bitcoin#19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`: - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate" - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter) - Reconcile the newly added code with the existing approximation ACKs for top commit: laanwj: ACK d9141a0 Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns) Pull request description: Based on bitcoin#19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`: - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate" - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter) - Reconcile the newly added code with the existing approximation ACKs for top commit: laanwj: ACK d9141a0 Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns) Pull request description: Based on bitcoin#19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`: - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate" - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter) - Reconcile the newly added code with the existing approximation ACKs for top commit: laanwj: ACK d9141a0 Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
I've manually traced through the bloom filter code to work out how big a filter ends up being a few times now; this just adds some python code as a comment so it's easy to do that by cutting and pasting into an interpreter instead.