speed up Unserialize_impl for prevector #12324
Thanks for adding benchmarks! That's the way to do optimization PRs.
Please squash.
src/test/prevector_tests.cpp
Outdated
@@ -183,6 +183,20 @@ class prevector_tester {
pre_vector = pre_vector_alt;
}

void append(realtype values) {
for(auto v : values) {
Nit, space after `for`.
src/test/prevector_tests.cpp
Outdated
}
auto p = pre_vector.size();
auto f = [&]() {
for(auto v : values) {
Nit, space after `for`.
@promag Thank you for your review. I fixed them and squashed the commits.
@fanquake Fortunately, I think there was no collision or adverse effect between #10785 and #12324. Also, #12324 is still useful even though #10785 has been merged. Confirmation summary:
Concept ACK. I like the idea of making this faster (I've seen this taking a lot of time in my profiles), but using a template to pass a lambda seems unnecessarily complex. If it gets inlined it will be fast. But if it ends up not being inlined, the lambda will turn into a full std::function object (since it captures) and that will be slow. Can you either:
@eklitzke thank you for your comment. I will try it.
Adding But I agree that lambdas and callbacks are a bit complex here. I personally think a caveat note on a method in
/**
 * Grow the size of the prevector by b bytes.
 * NOTE: The added capacity must be overwritten, or it will contain garbage data.
 */
Agree with @kallewoof. It seems the goal of using a callback here is to avoid having a public method that brings the vector in a (partially) undefined state. However, the result is that now we have a callback that needs to run in this state. I would either:
@eklitzke @kallewoof @sipa Thank you for the suggestions.
This looks good! Just some typos in the comments.
src/prevector.h
Outdated
@@ -382,6 +382,20 @@ class prevector {
}
}

inline void resize_uninitialized(size_type new_size) {
// resize_uninitialized change the size of the prevector but dose not initialize.
nit: s/dose/does/
src/prevector.h
Outdated
@@ -382,6 +382,20 @@ class prevector {
}
}

inline void resize_uninitialized(size_type new_size) {
// resize_uninitialized change the size of the prevector but dose not initialize.
// If size < new_size, the added elements must be initialized explicitly after it return.
nit: s/return/returns/
@eklitzke Thank you for pointing out my typos. Fixed them.
This looks good, although you still need to squash. I'm curious: do you still see the speedup from your initial benchmark? I know we changed other logic in this file since then. On master:
With your branch:
Could it make sense to squash and rebase on master to ease benchmarking?
Squashed and rebased. My environment: iMac late 2013 (macOS 10.13.3/i5 2.9GHz/mem 16GB/SSD)
[my PR]
utACK d85530db45b327eecf408bc8e9636fa60e886208
Thanks for checking. utACK d85530db45b327eecf408bc8e9636fa60e886208
Rebased and added a bench.
PrevectorDeserializeNontrivial => 3% faster
My environment: iMac late 2013 (macOS 10.13.3/i5 2.9GHz/mem 16GB/SSD)
[on master] commit 0212187
[my PR] commit 23afe7acfa7908905e826f09601c9564ff685be0
Could add the bench in a separate commit/pull request to make it easier to check for the speedup.
@MarcoFalke Thank you for your suggestion.
Re-ordered commits.
46340b3 [bench] Add benchmark for unserialize prevector (Akio Nakamura)

Pull request description:
This PR adds benchmarks for the unserialization of the prevector. Note: Separated from #12324.

Tree-SHA512: c055a283328cc2634c01eb60f26604a8665939bbf77d367b6ba6b4e01e77d4511fab69cc3ddb1e62969adb3c48752ed870f45ceba153eee192302601341e18a7
@kallewoof I understand your concern about the slowdown caused by randomness.
Cool about #10321, didn't realize that change was made. utACK b91962ecf0a9b90c989068e3f12e5699bc90ef6f [Edited to fix commit reference.]
utACK b91962ecf0a9b90c989068e3f12e5699bc90ef6f
void resize_uninitialized(realtype values) {
size_t s = real_vector.size() / 2;
real_vector.resize(s);
pre_vector.resize_uninitialized(s);
Perhaps here, before entering the loop, you could reserve space for values.size() more elements to optimize the vector.
@shahzadlone Thank you for your review.
Addressed what you pointed out.
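To illustrate the reserve-before-the-loop suggestion above, here is a minimal sketch. It uses `std::vector<int>` rather than the test harness's `realtype` and `prevector`, and the helper name `append_all` is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the PR): reserve once, then append in a loop.
inline void append_all(std::vector<int>& dst, const std::vector<int>& values) {
    // Any reallocation needed for the whole batch happens here, at most once.
    dst.reserve(dst.size() + values.size());
    for (int v : values) {
        dst.push_back(v); // capacity is already sufficient, so no reallocation
    }
}
```

Without the `reserve()` call the loop is still correct; it may just reallocate several times as the vector grows.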
The unserializer for prevector uses resize() to reserve the area, but reserve() is preferable because resize() has the overhead of calling the element constructor many times. However, reserve() does not change the value of "_size" (a private member of prevector). This PR introduces resize_uninitialized() to prevector, which is similar to resize() but does not call constructors; the added elements are explicitly initialized in Unserialize_impl().
The changes are as follows:
1. prevector.h
Add a public member function named 'resize_uninitialized'. This function works like resize() but does not call constructors, so added elements need to be explicitly initialized after it returns.
2. serialize.h
In the following two functions:
Unserialize_impl(Stream& is, prevector<N, T>& v, const unsigned char&)
Unserialize_impl(Stream& is, prevector<N, T>& v, const V&)
call resize_uninitialized() instead of resize().
3. test/prevector_tests.cpp
Add a test for resize_uninitialized().
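The idea in the description can be sketched roughly as follows. This is not the real prevector: `TinyVec`, `read_bytes`, and the 4096-byte chunk size are hypothetical names and values chosen for illustration, and the sketch only supports trivially copyable element types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical, heap-only stand-in for prevector (illustration only).
template <typename T>
class TinyVec {
    static_assert(std::is_trivially_copyable<T>::value,
                  "this sketch supports trivially copyable types only");
    std::unique_ptr<T[]> data_;
    std::size_t size_ = 0;
    std::size_t capacity_ = 0;

    // Ensure capacity for `wanted` elements, keeping the first size_ elements.
    void grow(std::size_t wanted) {
        if (wanted <= capacity_) return;
        std::size_t new_capacity = std::max(wanted, capacity_ * 2);
        // Default-initialization: trivial element types are not zeroed.
        std::unique_ptr<T[]> fresh(new T[new_capacity]);
        if (size_ > 0) std::memcpy(fresh.get(), data_.get(), size_ * sizeof(T));
        data_ = std::move(fresh);
        capacity_ = new_capacity;
    }

public:
    std::size_t size() const { return size_; }
    T* data() { return data_.get(); }
    const T& operator[](std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }

    // Conventional resize: every added element is value-initialized.
    void resize(std::size_t new_size) {
        grow(new_size);
        for (std::size_t i = size_; i < new_size; ++i) data_[i] = T();
        size_ = new_size;
    }

    // Changes the size but does not initialize added elements.
    // The caller must overwrite them before reading them.
    void resize_uninitialized(std::size_t new_size) {
        grow(new_size);
        size_ = new_size;
    }
};

// Copies `count` bytes from `src` into `v` chunk by chunk, writing each
// chunk directly into the storage that resize_uninitialized() exposed.
inline void read_bytes(TinyVec<unsigned char>& v, const unsigned char* src,
                       std::size_t count) {
    constexpr std::size_t kChunk = 4096; // hypothetical chunk size
    std::size_t done = 0;
    v.resize_uninitialized(0);
    while (done < count) {
        const std::size_t blk = std::min(count - done, kChunk);
        v.resize_uninitialized(done + blk);
        std::memcpy(v.data() + done, src + done, blk);
        done += blk;
    }
}
```

The saving comes from skipping the fill loop in `resize()` for elements that are about to be overwritten anyway; the cost is that the container briefly holds elements with indeterminate values.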
utACK 86b47fa
86b47fa speed up Unserialize_impl for prevector (Akio Nakamura)

Pull request description:
The unserializer for prevector uses `resize()` to reserve the area, but `reserve()` is preferable because `resize()` has the overhead of calling the element constructor many times. However, `reserve()` does not change the value of `_size` (a private member of prevector). This PR moves the logic that reads from the stream into a callback function; prevector initializes the new values with that callback and adjusts the value of `_size`.

The changes are as follows:
1. prevector.h
Add a public member function named 'append'. This function has 2 parameters: the number of elements to append and a callback function that initializes the newly appended values.
2. serialize.h
In the following two functions:
- `Unserialize_impl(Stream& is, prevector<N, T>& v, const unsigned char&)`
- `Unserialize_impl(Stream& is, prevector<N, T>& v, const V&)`
make a callback function from the original logic that reads values from the stream, and call prevector's `append()`.
3. test/prevector_tests.cpp
Add a test for `append()`.

## A benchmark result is following:
[Machine]
MacBook Pro (macOS 10.13.3/i7 2.2GHz/mem 16GB/SSD)
[result]
DeserializeAndCheckBlockTest => 22% faster
DeserializeBlockTest => 29% faster
[before PR]
# Benchmark, evals, iterations, total, min, max, median
DeserializeAndCheckBlockTest, 60, 160, 94.4901, 0.0094644, 0.0104715, 0.0098339
DeserializeBlockTest, 60, 130, 65.0964, 0.00800362, 0.00895134, 0.00824187
[After PR]
# Benchmark, evals, iterations, total, min, max, median
DeserializeAndCheckBlockTest, 60, 160, 77.1597, 0.00767013, 0.00858959, 0.00805757
DeserializeBlockTest, 60, 130, 49.9443, 0.00613926, 0.00691187, 0.00635527

ACKs for top commit:
laanwj: utACK 86b47fa

Tree-SHA512: 62ea121ccd45a306fefc67485a1b03a853435af762607dae2426a87b15a3033d802c8556e1923727ddd1023a1837d0e5f6720c2c77b38196907e750e15fbb902
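For comparison, the earlier callback-based `append()` design described in this commit message might look roughly like the sketch below. The class name `CallbackVec` and the callback signature `(T* first, std::size_t n)` are assumptions; the message only says that `append()` takes a count and a callback that initializes the new values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for prevector (illustration only).
template <typename T>
class CallbackVec {
    static_assert(std::is_trivially_copyable<T>::value,
                  "this sketch supports trivially copyable types only");
    std::unique_ptr<T[]> data_;
    std::size_t size_ = 0;
    std::size_t capacity_ = 0;

    // Ensure capacity for `wanted` elements, keeping existing elements.
    void reserve(std::size_t wanted) {
        if (wanted <= capacity_) return;
        std::size_t new_capacity = std::max(wanted, capacity_ * 2);
        std::unique_ptr<T[]> fresh(new T[new_capacity]);
        if (size_ > 0) std::memcpy(fresh.get(), data_.get(), size_ * sizeof(T));
        data_ = std::move(fresh);
        capacity_ = new_capacity;
    }

public:
    std::size_t size() const { return size_; }
    const T& operator[](std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }

    // Appends n elements. The callback receives a pointer to the first new
    // slot and the count, and must fill all n slots. The callable's type is
    // a template parameter, so no std::function object is constructed.
    template <typename Init>
    void append(std::size_t n, Init&& init) {
        reserve(size_ + n);
        T* first = data_.get() + size_;
        init(first, n);
        size_ += n; // the size changes only after the callback has run
    }
};
```

In this shape the uninitialized state never escapes the container, at the price of running caller code while the container is in that intermediate state, which is the trade-off the reviewers discussed.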
Has anyone checked that this actually improves the benchmark as claimed in the OP? It does not for me.
I ran the benchmarks on two linux machines (one pretty powerful (GCO) and one not so (Lefty)), and a MacBook Pro. Raw numbers at bottom. I see improvements in master compared to e2182b0 in (Sorry, I wasn't sure how to make graphs for benchmarks. Raw data below.)
@MarcoFalke @kallewoof Thank you for the comments and for reporting the benchmark results. #12549, an improvement to prevector that was merged on 1 Mar 2018, is very efficient, so the improvement from this PR is relatively hidden in the Deserialize*BlockTest benchmarks. So I had to change the PR description to refer to PrevectorDeserializerivial instead of DeserializeBlockTest to clarify where the improvement is.
Can verify I'm seeing speedups in this branch (as rebased onto master) relative to the commit that came before its merge into master, but only for gcc: e2182b0 vs. unserialize (relative)
These changes do not notably affect IBD time though.
Summary:
> The unserializer for prevector uses resize() to reserve the area,
> but reserve() is preferable because resize() has the overhead
> of calling the element constructor many times.
>
> However, reserve() does not change the value of "_size"
> (a private member of prevector).
>
> This PR introduces resize_uninitialized() to prevector, which is similar to
> resize() but does not call constructors; added elements are
> explicitly initialized in Unserialize_impl().
>
> The changes are as follows:
> 1. prevector.h
> Add a public member function named 'resize_uninitialized'.
> This function works like resize() but does not call constructors,
> so added elements need to be explicitly initialized after it returns.
>
> 2. serialize.h
> In the following two functions:
> Unserialize_impl(Stream& is, prevector<N, T>& v, const unsigned char&)
> Unserialize_impl(Stream& is, prevector<N, T>& v, const V&)
> call resize_uninitialized() instead of resize().
>
> 3. test/prevector_tests.cpp
> Add a test for resize_uninitialized().

Benchmark details provided in the PR discussion:
> [Machine]
> MacBook Pro (macOS 10.13.3/i7 2.2GHz/mem 16GB/SSD)
>
> [result]
> DeserializeAndCheckBlockTest => 22% faster
> DeserializeBlockTest => 29% faster

This is a backport of Core [[bitcoin/bitcoin#12324 | PR12324]]

Test Plan: `ninja all check-all`
Reviewers: #bitcoin_abc, majcosta
Reviewed By: #bitcoin_abc, majcosta
Differential Revision: https://reviews.bitcoinabc.org/D8367
b886a4c Fix header guards using reserved identifiers (Dan Raviv)
b389b3f speed up Unserialize_impl for prevector (Akio Nakamura)
13d2102 Minimal fix to slow prevector tests as stopgap measure (Jeremy Rubin)

Pull request description:
Backports bitcoin#12324. The `DeserializeAndCheckBlock` benchmark (introduced in #2146) shows a speedup of about 4% (not as much as the upstream PR, due to the optimizations already included in #2083).
Cherry-picks also
- bitcoin#8671 (with minimal changes to the random context, due to bitcoin#8914 and bitcoin#9792 being already ported out of order).
- bitcoin#11151

ACKs for top commit:
Fuzzbawls: utACK b886a4c
furszy: utACK b886a4c and merging..

Tree-SHA512: f5de40e5acfb0b875d413d8995d71dd90489730fe4853f0be03d76a1c44ec95eaeb28c0c40d8e91906f23529ad26501bda4f9779ce466cd8603ed97f1662ca98
Summary
===
This is a backport of bitcoin/bitcoin#12324

The unserializer for prevector uses `resize()` for reserving the area, but it's preferred to use `reserve()` because `resize()` has the overhead of calling the element constructor many times. However, `reserve()` does not change the value of `_size` (a private member of prevector). This PR introduces `resize_uninitialized()` to prevector, which is similar to `resize()` but does not call constructors; added elements are explicitly initialized in `Unserialize_impl()`.

The changes are as follows:
1. prevector.h
Add a public member function named `resize_uninitialized`. This function works like `resize()` but does not call constructors, so added elements need to be explicitly initialized after it returns.
2. serialize.h
In the following two functions:
`Unserialize_impl(Stream &is, prevector<N, T> &v, const uint8_t &)`
`Unserialize_impl(Stream &is, prevector<N, T> &v, const V &)`
call `resize_uninitialized()` instead of `resize()`.
3. test/prevector_tests.cpp
Add a test for `resize_uninitialized()`.

Benchmark
===
Before
---
DeserializeAndCheckBlockTest_1MB, 10, 130, 13.9896, 0.0100612, 0.011033, 0.0109143
DeserializeAndCheckBlockTest_32MB, 10, 2, 13.0196, 0.614051, 0.705692, 0.64535
DeserializeBlockTest_1MB, 10, 160, 13.2404, 0.00795686, 0.00876105, 0.00827999
DeserializeBlockTest_32MB, 10, 3, 12.1832, 0.392285, 0.432704, 0.40708

After
---
DeserializeAndCheckBlockTest_1MB, 10, 130, 13.5135, 0.0100882, 0.0110394, 0.0104667
DeserializeAndCheckBlockTest_32MB, 10, 2, 12.47, 0.601044, 0.655857, 0.624506
DeserializeBlockTest_1MB, 10, 160, 13.2542, 0.00799794, 0.00857411, 0.00829794
DeserializeBlockTest_32MB, 10, 3, 11.8423, 0.380222, 0.405202, 0.399713

Test plan
===
* `ninja check-all`
* `ninja bench_bitcoin`
* `./src/bench/bench_bitcoin -filter='Deser.*' --evals=10`
The unserializer for prevector uses `resize()` to reserve the area, but `reserve()` is preferable because `resize()` has the overhead of calling the element constructor many times.
However, `reserve()` does not change the value of `_size` (a private member of prevector).
This PR moves the logic that reads from the stream into a callback function; prevector initializes the new values with that callback and adjusts the value of `_size`.
The changes are as follows:
1. prevector.h
Add a public member function named 'append'.
This function has 2 parameters: the number of elements to append and a callback function that initializes the newly appended values.
2. serialize.h
In the following two functions:
Unserialize_impl(Stream& is, prevector<N, T>& v, const unsigned char&)
Unserialize_impl(Stream& is, prevector<N, T>& v, const V&)
make a callback function from the original logic that reads values from the stream, and call prevector's `append()`.
3. test/prevector_tests.cpp
Add a test for `append()`.
The benchmark results are as follows:
[Machine]
MacBook Pro (macOS 10.13.3/i7 2.2GHz/mem 16GB/SSD)
[result]
DeserializeAndCheckBlockTest => 22% faster
DeserializeBlockTest => 29% faster
[before PR]
# Benchmark, evals, iterations, total, min, max, median
DeserializeAndCheckBlockTest, 60, 160, 94.4901, 0.0094644, 0.0104715, 0.0098339
DeserializeBlockTest, 60, 130, 65.0964, 0.00800362, 0.00895134, 0.00824187
[After PR]
# Benchmark, evals, iterations, total, min, max, median
DeserializeAndCheckBlockTest, 60, 160, 77.1597, 0.00767013, 0.00858959, 0.00805757
DeserializeBlockTest, 60, 130, 49.9443, 0.00613926, 0.00691187, 0.00635527
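The "22% faster" and "29% faster" figures appear consistent with the ratio of the median times in the two tables. That the medians were the basis is an assumption, and the helper name `speedup_percent` is hypothetical. A small sketch of the arithmetic:

```cpp
#include <cassert>

// Percentage by which "after" is faster than "before", from per-iteration times.
// speedup = (before / after - 1) * 100
inline double speedup_percent(double before_time, double after_time) {
    assert(after_time > 0.0);
    return (before_time / after_time - 1.0) * 100.0;
}
```

With the medians above, `speedup_percent(0.0098339, 0.00805757)` lands between 22 and 23, and `speedup_percent(0.00824187, 0.00635527)` lands between 29 and 30.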