Skip to content

Conversation

ajtowns
Copy link
Contributor

@ajtowns ajtowns commented Jun 1, 2020

I've manually traced through the bloom filter code to work out how big a filter ends up being a few times now; this just adds some python code as a comment so it's easy to do that by cutting and pasting into an interpreter instead.

@fanquake fanquake added the Docs label Jun 1, 2020
@sipa
Copy link
Member

sipa commented Jun 1, 2020

Looks right, but perhaps it should go in the .h file (which has an easier approximate formula already).

@maflcko
Copy link
Member

maflcko commented Jun 1, 2020

An alternative would be to run the code in massif and show the size of the object. 😬

@sipa
Copy link
Member

sipa commented Jun 1, 2020

To have an idea of the approximation, the formula "4.328 bits per element and per bit of fprate" is at most 0.115% off for better than 8-bit fprate (fprate < 1/256). For 10-bit, at most 0.074%. For 12-bit, at most 0.052%. For 20-bit, 0.019%.

@laanwj
Copy link
Member

laanwj commented Jul 22, 2020

Looks right, but perhaps it should go in the .h file (which has an easier approximate formula already).

I agree. In general, this kind of documentation would be better in the header file. This also gets included in doxygen etc.

@fanquake
Copy link
Member

@ajtowns did you want to follow up here and move this into the header?

@fanquake fanquake closed this Sep 15, 2020
laanwj added a commit that referenced this pull request Nov 19, 2020
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns)

Pull request description:

  Based on #19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`:

  - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate"
  - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter)
  - Reconcile the newly added code with the existing approximation

ACKs for top commit:
  laanwj:
    ACK d9141a0

Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Nov 19, 2020
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns)

Pull request description:

  Based on bitcoin#19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`:

  - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate"
  - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter)
  - Reconcile the newly added code with the existing approximation

ACKs for top commit:
  laanwj:
    ACK d9141a0

Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
@jnewbery
Copy link
Contributor

This is really helpful. I'd love to see it merged into the header file at some point.

@laanwj
Copy link
Member

laanwj commented Feb 14, 2021

#19968 did that, i think ?

@jnewbery
Copy link
Contributor

#19968 did that, i think ?

Oh! I remembered there was a python snippet to calculate this and couldn't find it in master. I see now that #19968 included the formula without the script. That's good.

PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jun 27, 2021
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns)

Pull request description:

  Based on bitcoin#19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`:

  - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate"
  - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter)
  - Reconcile the newly added code with the existing approximation

ACKs for top commit:
  laanwj:
    ACK d9141a0

Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jun 28, 2021
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns)

Pull request description:

  Based on bitcoin#19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`:

  - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate"
  - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter)
  - Reconcile the newly added code with the existing approximation

ACKs for top commit:
  laanwj:
    ACK d9141a0

Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jun 29, 2021
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns)

Pull request description:

  Based on bitcoin#19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`:

  - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate"
  - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter)
  - Reconcile the newly added code with the existing approximation

ACKs for top commit:
  laanwj:
    ACK d9141a0

Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jul 1, 2021
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns)

Pull request description:

  Based on bitcoin#19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`:

  - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate"
  - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter)
  - Reconcile the newly added code with the existing approximation

ACKs for top commit:
  laanwj:
    ACK d9141a0

Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jul 1, 2021
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns)

Pull request description:

  Based on bitcoin#19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`:

  - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate"
  - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter)
  - Reconcile the newly added code with the existing approximation

ACKs for top commit:
  laanwj:
    ACK d9141a0

Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jul 15, 2021
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns)

Pull request description:

  Based on bitcoin#19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`:

  - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate"
  - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter)
  - Reconcile the newly added code with the existing approximation

ACKs for top commit:
  laanwj:
    ACK d9141a0

Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jul 15, 2021
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns)

Pull request description:

  Based on bitcoin#19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`:

  - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate"
  - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter)
  - Reconcile the newly added code with the existing approximation

ACKs for top commit:
  laanwj:
    ACK d9141a0

Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jul 16, 2021
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns)

Pull request description:

  Based on bitcoin#19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`:

  - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate"
  - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter)
  - Reconcile the newly added code with the existing approximation

ACKs for top commit:
  laanwj:
    ACK d9141a0

Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
gabriel-bjg pushed a commit to gabriel-bjg/dash that referenced this pull request Jul 16, 2021
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns)

Pull request description:

  Based on bitcoin#19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`:

  - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate"
  - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter)
  - Reconcile the newly added code with the existing approximation

ACKs for top commit:
  laanwj:
    ACK d9141a0

Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
gades pushed a commit to cosanta/cosanta-core that referenced this pull request Apr 25, 2022
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns)

Pull request description:

  Based on bitcoin#19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`:

  - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate"
  - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter)
  - Reconcile the newly added code with the existing approximation

ACKs for top commit:
  laanwj:
    ACK d9141a0

Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Aug 16, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants