-
Notifications
You must be signed in to change notification settings - Fork 37.7k
bloom: use Span instead of std::vector for insert
and contains
[ZAP3]
#18985
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Do you have a good way to quantify the impact of this change? That would be interesting to see :) |
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsNo conflicts as of last run. |
@practicalswift this is something I want to do once I figure out how to use the benchmarking suite. This one might be nontrivial though since it's networking code. I might just do the benchmarking in #18849 to start. One piece of data that I can show here is the number of allocations before and after which I'll do |
Concept ACK, I think this is the right approach. In general I think that most functions that take either a const vector/prevector as input, or begin/end pointer, or a begin pointer + size, should be replaced with versions that just take in a Span. It would be worthwhile to make Nit: uint256 is not a vector internally, but an array. |
I've noticed that we consistently pass
|
Concept ACK. Will review as soon as this is rebased.
@practicalswift: Good point. I guess seen from a performance perspective it doesn't really make a difference, considering how lightweight spans are. In any case I agree that we should agree on one passing type and use it consistently in the code-base (my personal preference would be to always pass it by value, though). |
Sebastian Falbesoner <notifications@github.com> writes:
Concept ACK. Will review as soon as this is rebased.
> I've noticed that we consistently pass Span by const-ref instead of
> by value in the project (this PR follows that convention): is that an
> intentional choice? Just curious :)
@practicalswift: Good point. I guess seen from a performance
perspective it doesn't really make a difference, considering how
lightweight spans are. In any case I agree that we should agree on one
passing type and use it consistently in the code-base (my personal
preference would be to always pass it by value, though).
It looks like they are being passed by value in other PRs, I don't mind
switching to that style here.
|
@theStack @jb55 FWIW, throughout the C++ Core Guidelines
Same goes for |
a6d8f9f
to
1466b65
Compare
rebased and switched from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK.
Sebastian Falbesoner <notifications@github.com> writes:
Could simply use the recently introduced `MakeUCharSpan()` helper here and on other occurences of the commit.
oh nice, I didn't know that was a thing. I'll do that.
|
1466b65
to
897f5be
Compare
pushed 897f5be
much cleaner 🤩 diff --git a/src/bloom.cpp b/src/bloom.cpp
index 436e913aa5..c19d6e7d7d 100644
--- a/src/bloom.cpp
+++ b/src/bloom.cpp
@@ -59,12 +59,12 @@ void CBloomFilter::insert(const COutPoint& outpoint)
{
CDataStream stream(SER_NETWORK, PROTOCOL_VERSION);
stream << outpoint;
- insert(Span<const unsigned char>((const unsigned char*)stream.data(), stream.size()));
+ insert(MakeUCharSpan(stream));
}
void CBloomFilter::insert(const uint256& hash)
{
- insert(Span<const unsigned char>(hash.begin(), hash.end()));
+ insert(MakeUCharSpan(hash));
}
bool CBloomFilter::contains(Span<const unsigned char> vKey) const
@@ -85,13 +85,12 @@ bool CBloomFilter::contains(const COutPoint& outpoint) const
{
CDataStream stream(SER_NETWORK, PROTOCOL_VERSION);
stream << outpoint;
- return contains(Span<const unsigned char>((const unsigned char*)stream.data(),
- stream.size()));
+ return contains(MakeUCharSpan(stream));
}
bool CBloomFilter::contains(const uint256& hash) const
{
- return contains(Span<const unsigned char>(hash.begin(), hash.end()));
+ return contains(MakeUCharSpan(hash));
}
bool CBloomFilter::IsWithinSizeConstraints() const
@@ -241,7 +240,7 @@ void CRollingBloomFilter::insert(Span<const unsigned char> vKey)
void CRollingBloomFilter::insert(const uint256& hash)
{
- insert(Span<const unsigned char>(hash.begin(), hash.end()));
+ insert(MakeUCharSpan(hash));
}
bool CRollingBloomFilter::contains(Span<const unsigned char> vKey) const
@@ -260,7 +259,7 @@ bool CRollingBloomFilter::contains(Span<const unsigned char> vKey) const
bool CRollingBloomFilter::contains(const uint256& hash) const
{
- return contains(Span<const unsigned char>(hash.begin(), hash.end()));
+ return contains(MakeUCharSpan(hash));
}
void CRollingBloomFilter::reset()
diff --git a/src/test/bloom_tests.cpp b/src/test/bloom_tests.cpp
index 1ad5607429..9cf6e352e4 100644
--- a/src/test/bloom_tests.cpp
+++ b/src/test/bloom_tests.cpp
@@ -91,7 +91,7 @@ BOOST_AUTO_TEST_CASE(bloom_create_insert_key)
CBloomFilter filter(2, 0.001, 0, BLOOM_UPDATE_ALL);
filter.insert(vchPubKey);
uint160 hash = pubkey.GetID();
- filter.insert(Span<unsigned char>(hash.begin(), hash.end()));
+ filter.insert(MakeUCharSpan(hash));
CDataStream stream(SER_NETWORK, PROTOCOL_VERSION);
stream << filter; |
We can avoid many unnecessary std::vector allocations by changing CBloomFilter to take Spans instead of std::vector's for the `insert` and `contains` operations. CBloomFilter currently converts types such as CDataStream and uint256 to std::vector on `insert` and `contains`. This is unnecessary because CDataStreams and uint256 are already std::vectors internally. We just need a way to point to the right data within those types. Span gives us this ability. Signed-off-by: William Casarin <jb55@jb55.com>
897f5be
to
be8c3b4
Compare
rebased |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK be8c3b4
Theoretically, this could be a good improvement because it removes the smart pointer from the game, but I never use it in practice the span and I'm curious to see how much improvement brings this game.
On Tue, May 04, 2021 at 03:28:59PM -0700, Vincenzo Palazzo wrote:
Theoretically, this could be a good improvement because it removes the smart pointer from the game, but I never use it in practice the span and I'm curious to see how much improvement brings this game.
See my comment here:
This code came up as a place where many allocations occur. Mainly due to allocations of CService::GetKey which is passed to these functions, but this PR is needed before we get to that.
This PR is a stepping stone to:
[ZAP4] netaddress: return a prevector from CService::GetKey() [1]
Which removes a bunch of unnecessary heap allocations during network operations.
[1] #18849
|
Concept ACK. Have you figured out a way to benchmark this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs rebase
void insert(const uint256& hash); | ||
bool contains(const std::vector<unsigned char>& vKey) const; | ||
bool contains(Span<const unsigned char> vKey) const; | ||
bool contains(const uint256& hash) const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can remove this method now
void insert(const COutPoint& outpoint); | ||
void insert(const uint256& hash); | ||
|
||
bool contains(const std::vector<unsigned char>& vKey) const; | ||
bool contains(Span<const unsigned char> vKey) const; | ||
bool contains(const COutPoint& outpoint) const; | ||
bool contains(const uint256& hash) const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
@@ -83,7 +83,7 @@ BOOST_AUTO_TEST_CASE(bloom_create_insert_key) | |||
CBloomFilter filter(2, 0.001, 0, BLOOM_UPDATE_ALL); | |||
filter.insert(vchPubKey); | |||
uint160 hash = pubkey.GetID(); | |||
filter.insert(std::vector<unsigned char>(hash.begin(), hash.end())); | |||
filter.insert(MakeUCharSpan(hash)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for MakeUCharSpan
, filter.insert(hash);
works now too
@@ -59,17 +59,15 @@ void CBloomFilter::insert(const COutPoint& outpoint) | |||
{ | |||
CDataStream stream(SER_NETWORK, PROTOCOL_VERSION); | |||
stream << outpoint; | |||
std::vector<unsigned char> data(stream.begin(), stream.end()); | |||
insert(data); | |||
insert(MakeUCharSpan(stream)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for MakeUCharSpan
I think DrahtBot screwed up here. No rebase happened in between. |
:( |
Any specific reason for closing? |
lost interest. It's fine with me if someone else wants to pick it up. |
@jb55, I can be interested in it. I love your optimization, and maybe can be also a good starting point to push some code on bitcoin :-) |
Rebased, with comments addressed in #23115. |
…rt` and `contains` a11da75 bloom: cleanup includes (fanquake) f1ed1d3 bloom: use constexpr where appropriate (fanquake) 2ba4ddf bloom: use Span instead of std::vector for `insert` and `contains` (William Casarin) Pull request description: This is bitcoin#18985 rebased, with the most recent comments addressed. > We can avoid many unnecessary std::vector allocations by changing CBloomFilter to take Spans instead of std::vector's for the `insert` and `contains` operations. > CBloomFilter currently converts types such as CDataStream and uint256 to std::vector on `insert` and `contains`. This is unnecessary because CDataStreams and uint256 are already std::vectors internally. We just need a way to point to the right data within those types. Span gives us this ability. ACKs for top commit: sipa: Code review ACK a11da75 laanwj: Code review ACK a11da75 Tree-SHA512: ee9ba02c9588daa1ff51782d1953fd060839dd15aa85861b2633b6ff2398320188ddd00f01d0c99442224485364ede9f8322366de4239fc7831ebfa06bd34659
This builds off #18468 since it simplifies this PR a bunch. Review that first.
We can avoid many unnecessary
std::vector
allocations by changingCBloomFilter
to takeSpan
s instead ofstd::vector
s for theinsert
and
contains
operations.CBloomFilter
currently converts types such asCDataStream
anduint256
to
std::vector
oninsert
andcontains
. This is unnecessary becauseCDataStream
s anduint256
s are alreadystd::vectors
internally. We justneed a way to point to the right data within those types.
Span
givesus this ability.
This is a part of the Zero Allocations Project #18849 (ZAP3). This code came up as a place where many allocations occur. Mainly due to allocations of
CService::GetKey
which is passed to these functions, but this PR is needed before we get to that.