refactor: pass CTxMemPool and CFeeRate in-param objects by const reference #23076

jonatack · 2021-09-23T14:51:40Z

Rationale

maintainability: no need to change further as these classes evolve
clarity for developers reading the code and this description
performance in some cases

Excerpts from Scott Meyers' "Effective C++, Third Edition" (2005), Item 20, pages 86-94, and "Effective Modern C++" (2015), Item 41, pages 281-292:

Avoid passing objects of user-defined types by value.
One might conclude that all small types are good candidates for pass-by-value, even if they're user-defined. However, just because an object is small doesn't mean that calling its copy constructor is inexpensive. Many objects--most STL containers among them--contain little more than a pointer, but copying such objects entails copying everything they point to. That can be very expensive.
Even when small objects have inexpensive copy constructors, there can be performance issues. Some compilers treat built-in and user-defined types differently, even if they have the same underlying representation. When that type of thing happens, you can be better off passing such objects by reference, because compilers will certainly put pointers (the implementation of references) into registers.
Another reason why small user-defined types are not necessarily good pass-by-value candidates is that, being user-defined, their size is subject to change. A type that's small now may be bigger in the future if its internal implementation changes. Things can even change when switching to a different C++ implementation.
In general, the only types for which you can reasonably assume that pass-by-value is inexpensive are built-in types and STL iterator and function object types. For everything else, prefer pass-by-reference-to-const over pass-by-value.
Exception: possibly consider pass-by-value of user-defined types for copyable parameters that are cheap to move, if they are always copied anyway in their implementation.
When there are chains of function calls, each of which passes-by-value, the cost for the entire chain of calls accumulates and may not be something you can tolerate. With by-reference passing, chains of calls don't incur this kind of accumulated overhead.
The most practical approach may be to adopt a "guilty until proven innocent" policy for pass-by-value of user-defined types.

JeremyRubin · 2021-09-24T05:17:32Z

Is there any point in passing cfeerate by const reference vs just const value?

…

On Thu, Sep 23, 2021, 7:52 AM Jon Atack ***@***.***> wrote: and add missing feerate and txmempool headers. ------------------------------ You can view, comment on, or merge this pull request online at: #23076 Commit Summary - Pass CFeeRate and CTxMemPool in-params by reference to const <2e4a6cd> File Changes - *M* src/policy/rbf.cpp <https://github.com/bitcoin/bitcoin/pull/23076/files#diff-fa5cb2d84034ff72cdb9d479b17cf8c744a9bf3fc932b3a77c1a017edd767dfa> (8) - *M* src/policy/rbf.h <https://github.com/bitcoin/bitcoin/pull/23076/files#diff-a46953a84664bb5ac541df026cd5a22e88c50672f551c5e6d7825497c6535c85> (8) - *M* src/wallet/coinselection.h <https://github.com/bitcoin/bitcoin/pull/23076/files#diff-38f1a8db124a979cb6dd76ce263f7aae0053d6967ee909e6356115fa0402dc8c> (4) Patch Links: - https://github.com/bitcoin/bitcoin/pull/23076.patch - https://github.com/bitcoin/bitcoin/pull/23076.diff — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#23076>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAGYN67CNE6N4Z6HYN73JHTUDM5J7ANCNFSM5ET4AYJA> . Triage notifications on the go with GitHub Mobile for iOS <https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675> or Android <https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.

laanwj · 2021-09-24T11:05:33Z

Is there any point in passing cfeerate by const reference vs just const

Agree. A CFeeRate is basically a wrapper around a CAmount, so 8 bytes. Seems for constant use it's better to pass it by value, instead of adding a level of indirection by reference.

jonatack · 2021-09-24T14:10:16Z

Yes a CAmount is 8 bytes, and https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rf-in suggests a threshold of 2-3 words, e.g. 2 * sizeof(void*).

So, I wondered why elsewhere than this patch CFeeRate params are passed by reference to const. For instance, in feerate.h, fees.{h,cpp}, policy.{h,cpp} andtxmempool.{h,cpp} (granted, it could all be an oversight).

I found more in-depth discussion in Scott Meyers' "Effective C++, Third Edition" (2005), Item 20, pages 86-94, and "Effective Modern C++" (2015), Item 41, pages 281-292. Maybe it is somewhat dated. Apologies if it is obvious info. Some excerpts:

Avoid passing objects of user-defined types by value.
One might conclude that all small types are good candidates for pass-by-value,
even if they're user-defined. However, just because an object is small doesn't
mean that calling its copy constructor is inexpensive. Many objects--most STL
containers among them--contain little more than a pointer, but copying such
objects entails copying everything they point to. That can be very expensive.
Even when small objects have inexpensive copy constructors, there can be
performance issues. Some compilers treat built-in and user-defined types
differently, even if they have the same underlying representation. When that
type of thing happens, you can be better off passing such objects by
reference, because compilers will certainly put pointers (the implementation
of references) into registers.
Another reason why small user-defined types are not necessarily good
pass-by-value candidates is that, being user-defined, their size is subject to
change. A type that's small now may be bigger in the future if its internal
implementation changes. Things can even change when switching to a different
C++ implementation.
In general, the only types for which you can reasonably assume that
pass-by-value is inexpensive are built-in types and STL iterator and function
object types. For everything else, prefer pass-by-reference-to-const over
pass-by-value.
Exception: possibly consider pass-by-value of user-defined types for copyable
parameters that are cheap to move, if they are always copied anyway in their
implementation.
When there are chains of function calls, each of which passes-by-value, the
cost for the entire chain of calls accumulates and may not be something you can
tolerate. With by-reference passing, chains of calls don't incur this kind of
accumulated overhead.
The most practical approach may be to adopt a "guilty until proven innocent"
policy for pass-by-value of user-defined types.

jonatack · 2021-09-24T14:22:28Z

It looks like there are 18 lines of CFeeRate params passed by reference to const, and 7 in this patch passed by value. If the 18 should be changed rather than these 7, it would remain a reasonably small change.

DrahtBot · 2021-09-25T15:29:49Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#23475 (wallet: add config to prioritize a solution that doesn't create change in coin selection by brunoerg)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

laanwj · 2021-09-27T13:20:07Z

Yes I think the main thing to consider is maintainability: if CFeeRate might grow larger at some point in the future, we do not want to change all of these again. FWIW, the story behind CAmount historically being passed by-reference is that it makes it easier to share code with coins that have non-POD amounts (at the time, Freicoin). I don't think we actually care about this, but for the info.

Anyhow:

I think the chance of more fields being added to CFeeRate is relatively small
however, CFeeRate is never passed to really performance critical functions where copying a few more bytes would really make a difference. Nor the extra level of indirection, for that matter.

So to be honest I don't feel strongly either way. It would be somewhat nice to be consistent but that's all.

JeremyRubin · 2021-09-27T18:18:12Z

i don't particularly care either just figured it might be better to pass by const value.

jonatack · 2021-09-29T09:41:43Z

Yes, ISTM the advantages of this change are small amounts (pun intended) of:

maintainability (no need to change further as CFeeRate evolves)
clarity for developers reading the code and this discussion
and perhaps even performance...

With by-reference passing, chains of calls don't incur this kind of accumulated overhead.

...as the CFeeRate comparison operators in feerate.h are reference to const, when feerates are passed into functions to be compared by invoking one of those operators, like in policy/rbf.{h,cpp} in this diff, performance might be not worse and indeed (very slightly) better if the call chain remains reference to const rather than introducing a needless copy in the middle. At least, performance is a wash or shouldn't be an issue.

I tried writing some benchmarks as a sanity check but I didn't see anything conclusive (the difference may be too small to see and my machine isn't very stable for benchmarking anyway): https://github.com/jonatack/bitcoin/commits/bench-passing-CAmount-and-CFeeRate

jonatack · 2021-09-29T10:16:34Z

FWIW, the story behind CAmount historically being passed by-reference is that it makes it easier to share code with coins that have non-POD amounts (at the time, Freicoin).

Thanks for the context! Been wondering for some time why that was.

PastaPastaPasta

utACK 2e4a6cd

This is a simply change, const is good, and based on the discussion I think that passing by reference is also fine/good. Additionally, it seems this PR has none / maybe 1 conflicting PR, so imo we should just merge it.

jonatack · 2022-03-24T13:14:07Z

Rebased. git range-diff e7b6272 2e4a6cd 4f9c44b

w0xlt

Code Review ACK 4f9c44b.

I agree that it's a better approach to pass user-defined types by reference.

jonatack · 2022-04-29T13:41:00Z

Rebased per git range-diff 26296eb 4f9c44b 3c7a3c1.

jonatack · 2022-07-30T13:04:11Z

Rebased, 2 previous ACKs by @PastaPastaPasta and @w0xlt (thanks!)

maflcko · 2022-07-30T13:30:13Z

src/policy/rbf.cpp

@@ -56,7 +56,7 @@ RBFTransactionState IsRBFOptInEmptyMempool(const CTransaction& tx)
 }

 std::optional<std::string> GetEntriesForConflicts(const CTransaction& tx,
-                                                  CTxMemPool& pool,
+                                                  const CTxMemPool& pool,


& doesn't create a copy of CTxMemPool, so I think the pull title is wrong.

Also, I wonder if this is something that can be enforced with clang-tidy, if we want it. Maybe https://clang.llvm.org/extra/clang-tidy/checks/readability/non-const-parameter.html ?

Thanks Marco, updated the title and commit message and testing your clang-tidy idea.

Dropped the tidy check as it proposed making a pointer to const value that needs to be mutable.

Yeah, I don't know if there is a tidy check for this. But when the reason for the changes in this pull are "consistency" #23076 (comment) , we should enforce this "consistency" to avoid incremental fixups to existing code.

Same for the CFeeRate change.

Many changes seem to be based on copying or grepping the existing code. For example, if CFeeRate in-params are all passed in the same fashion rather than, say, 2/3 by const ref and 1/3 by value, ISTM that alone may help to maintain consistency.

How do you prevent a change from introducing this tomorrow? And then a follow-up pull to that change in tomorrow+n days?

Removed consistency from the pull description summary (I think there is ample remaining rationale).

My comment was meant generally, I only filled in a random rationale as an example.

When the reason for the changes in this pull are "$rationale", we should ideally enforce this "$rationale" to avoid incremental fixups to existing or future code.

How do you prevent a change from introducing this tomorrow? And then a follow-up pull to that change in tomorrow+n days?

This is true for many changes that are merged.

There was plenty of rationale provided in the PR description and it seems pointless to continue to spend more time here.

…erence as they are classes / user-defined types. Rationale: - maintainability, no need to change further as these classes evolve - consistency - clarity for developers reading the code and this description - performance in some cases Excerpts from: - Scott Meyers, "Effective C++, Third Edition" (2005), Item 20, pages 86-94 - Scott Meyers, "Effective Modern C++" (2015), Item 41, pages 281-292 - Avoid passing objects of user-defined types by value. - One might conclude that all small types are good candidates for pass-by-value, even if they're user-defined. However, just because an object is small doesn't mean that calling its copy constructor is inexpensive. Many objects--most STL containers among them--contain little more than a pointer, but copying such objects entails copying everything they point to. That can be very expensive. - Even when small objects have inexpensive copy constructors, there can be performance issues. Some compilers treat built-in and user-defined types differently, even if they have the same underlying representation. When that type of thing happens, you can be better off passing such objects by reference, because compilers will certainly put pointers (the implementation of references) into registers. - Another reason why small user-defined types are not necessarily good pass-by-value candidates is that, being user-defined, their size is subject to change. A type that's small now may be bigger in the future if its internal implementation changes. Things can even change when switching to a different C++ implementation. - In general, the only types for which you can reasonably assume that pass-by-value is inexpensive are built-in types and STL iterator and function object types. For everything else, prefer pass-by-reference-to-const over pass-by-value. - Exception: possibly consider pass-by-value of user-defined types for copyable parameters that are cheap to move, if they are always copied anyway in their implementation. - When there are chains of function calls, each of which passes-by-value, the cost for the entire chain of calls accumulates and may not be something you can tolerate. With by-reference passing, chains of calls don't incur this kind of accumulated overhead. - The most practical approach may be to adopt a "guilty until proven innocent" policy for pass-by-value of user-defined types.

jonatack · 2022-07-30T16:02:23Z

Closing as this simple change has been open with ACKs for almost a year without merge. If it was desired, it would have been merged.

ryanofsky

Code review ACK f5cf045, but I also agree with closing the issue since there doesn't seem to have been agreement on it.

FWIW, I prefer passing passing by const reference just because it clearly expresses semantics of "this argument will only be read" and like reserving passing by value for meaning "this argument might be moved/copied/inserted somewhere." But there are cases where functions are called in loops and copying by value is more performant. Funny enough, I just saw something about how this conflict is supposed to be avoided in google's experimental c++ replacement by some new language rule https://www.foonathan.net/2022/07/carbon-calling-convention/

laanwj added the Refactoring label Sep 23, 2021

DrahtBot mentioned this pull request Sep 26, 2021

refactor: Nuke policy/fees->mempool circular dependencies #17786

Merged

DrahtBot mentioned this pull request Jan 6, 2022

wallet: add config to prioritize a solution that doesn't create change in coin selection #23475

Closed

PastaPastaPasta approved these changes Jan 31, 2022

View reviewed changes

This was referenced Mar 8, 2022

wallet: generate random change target for each tx for better privacy #24494

Merged

wallet: Use single FastRandomContext when creating a wallet tx #24560

Merged

DrahtBot added the Needs rebase label Mar 23, 2022

jonatack force-pushed the feerate-and-txmempool-fixups branch from 2e4a6cd to 4f9c44b Compare March 24, 2022 13:13

DrahtBot added Needs rebase and removed Needs rebase labels Mar 24, 2022

w0xlt approved these changes Mar 26, 2022

View reviewed changes

jonatack force-pushed the feerate-and-txmempool-fixups branch from 4f9c44b to 3c7a3c1 Compare April 29, 2022 13:38

DrahtBot removed the Needs rebase label Apr 29, 2022

DrahtBot mentioned this pull request Jun 1, 2022

Move minRelayTxFee to policy/settings #25254

Merged

DrahtBot added the Needs rebase label Jun 7, 2022

jonatack changed the title ~~Pass CFeeRate and CTxMemPool in-params by reference to const~~ refactor: avoid unnecessarily copying CTxMemPool and CFeeRate objects Jul 30, 2022

jonatack force-pushed the feerate-and-txmempool-fixups branch from 3c7a3c1 to f5b094d Compare July 30, 2022 11:37

DrahtBot removed the Needs rebase label Jul 30, 2022

maflcko reviewed Jul 30, 2022

View reviewed changes

jonatack changed the title ~~refactor: avoid unnecessarily copying CTxMemPool and CFeeRate objects~~ refactor: pass CTxMemPool and CFeeRate in-param objects by const reference Jul 30, 2022

jonatack force-pushed the feerate-and-txmempool-fixups branch 3 times, most recently from 0afba19 to f5cf045 Compare July 30, 2022 15:39

jonatack closed this Jul 30, 2022

ryanofsky reviewed Aug 1, 2022

View reviewed changes

bitcoin locked and limited conversation to collaborators Aug 1, 2023

refactor: pass CTxMemPool and CFeeRate in-param objects by const reference #23076

refactor: pass CTxMemPool and CFeeRate in-param objects by const reference #23076

Uh oh!

Conversation

jonatack commented Sep 23, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

JeremyRubin commented Sep 24, 2021 via email

Uh oh!

laanwj commented Sep 24, 2021

Uh oh!

jonatack commented Sep 24, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jonatack commented Sep 24, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DrahtBot commented Sep 25, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Conflicts

Uh oh!

laanwj commented Sep 27, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

JeremyRubin commented Sep 27, 2021

Uh oh!

jonatack commented Sep 29, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jonatack commented Sep 29, 2021

Uh oh!

PastaPastaPasta left a comment

Choose a reason for hiding this comment

Uh oh!

jonatack commented Mar 24, 2022

Uh oh!

w0xlt left a comment

Choose a reason for hiding this comment

Uh oh!

jonatack commented Apr 29, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jonatack commented Jul 30, 2022

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jonatack commented Jul 30, 2022

Uh oh!

ryanofsky left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jonatack commented Sep 23, 2021 •

edited

Loading

jonatack commented Sep 24, 2021 •

edited

Loading

jonatack commented Sep 24, 2021 •

edited

Loading

DrahtBot commented Sep 25, 2021 •

edited

Loading

laanwj commented Sep 27, 2021 •

edited

Loading

jonatack commented Sep 29, 2021 •

edited

Loading

jonatack commented Apr 29, 2022 •

edited

Loading