Skip to content

Conversation

jonatack
Copy link
Member

@jonatack jonatack commented Sep 23, 2021

Rationale

  • maintainability: no need to change further as these classes evolve
  • clarity for developers reading the code and this description
  • performance in some cases

Excerpts from Scott Meyers' "Effective C++, Third Edition" (2005), Item 20, pages 86-94, and "Effective Modern C++" (2015), Item 41, pages 281-292:

  • Avoid passing objects of user-defined types by value.

  • One might conclude that all small types are good candidates for pass-by-value, even if they're user-defined. However, just because an object is small doesn't mean that calling its copy constructor is inexpensive. Many objects--most STL containers among them--contain little more than a pointer, but copying such objects entails copying everything they point to. That can be very expensive.

  • Even when small objects have inexpensive copy constructors, there can be performance issues. Some compilers treat built-in and user-defined types differently, even if they have the same underlying representation. When that type of thing happens, you can be better off passing such objects by reference, because compilers will certainly put pointers (the implementation of references) into registers.

  • Another reason why small user-defined types are not necessarily good pass-by-value candidates is that, being user-defined, their size is subject to change. A type that's small now may be bigger in the future if its internal implementation changes. Things can even change when switching to a different C++ implementation.

  • In general, the only types for which you can reasonably assume that pass-by-value is inexpensive are built-in types and STL iterator and function object types. For everything else, prefer pass-by-reference-to-const over pass-by-value.

  • Exception: possibly consider pass-by-value of user-defined types for copyable parameters that are cheap to move, if they are always copied anyway in their implementation.

  • When there are chains of function calls, each of which passes-by-value, the cost for the entire chain of calls accumulates and may not be something you can tolerate. With by-reference passing, chains of calls don't incur this kind of accumulated overhead.

  • The most practical approach may be to adopt a "guilty until proven innocent" policy for pass-by-value of user-defined types.

@JeremyRubin
Copy link
Contributor

JeremyRubin commented Sep 24, 2021 via email

@laanwj
Copy link
Member

laanwj commented Sep 24, 2021

Is there any point in passing cfeerate by const reference vs just const

Agree. A CFeeRate is basically a wrapper around a CAmount, so 8 bytes. Seems for constant use it's better to pass it by value, instead of adding a level of indirection by reference.

@jonatack
Copy link
Member Author

jonatack commented Sep 24, 2021

Yes a CAmount is 8 bytes, and https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rf-in suggests a threshold of 2-3 words, e.g. 2 * sizeof(void*).

So, I wondered why elsewhere than this patch CFeeRate params are passed by reference to const. For instance, in feerate.h, fees.{h,cpp}, policy.{h,cpp} andtxmempool.{h,cpp} (granted, it could all be an oversight).

I found more in-depth discussion in Scott Meyers' "Effective C++, Third Edition" (2005), Item 20, pages 86-94, and "Effective Modern C++" (2015), Item 41, pages 281-292. Maybe it is somewhat dated. Apologies if it is obvious info. Some excerpts:

  • Avoid passing objects of user-defined types by value.

  • One might conclude that all small types are good candidates for pass-by-value,
    even if they're user-defined. However, just because an object is small doesn't
    mean that calling its copy constructor is inexpensive. Many objects--most STL
    containers among them--contain little more than a pointer, but copying such
    objects entails copying everything they point to. That can be very expensive.

  • Even when small objects have inexpensive copy constructors, there can be
    performance issues. Some compilers treat built-in and user-defined types
    differently, even if they have the same underlying representation. When that
    type of thing happens, you can be better off passing such objects by
    reference, because compilers will certainly put pointers (the implementation
    of references) into registers.

  • Another reason why small user-defined types are not necessarily good
    pass-by-value candidates is that, being user-defined, their size is subject to
    change. A type that's small now may be bigger in the future if its internal
    implementation changes. Things can even change when switching to a different
    C++ implementation.

  • In general, the only types for which you can reasonably assume that
    pass-by-value is inexpensive are built-in types and STL iterator and function
    object types. For everything else, prefer pass-by-reference-to-const over
    pass-by-value.

  • Exception: possibly consider pass-by-value of user-defined types for copyable
    parameters that are cheap to move, if they are always copied anyway in their
    implementation.

  • When there are chains of function calls, each of which passes-by-value, the
    cost for the entire chain of calls accumulates and may not be something you can
    tolerate. With by-reference passing, chains of calls don't incur this kind of
    accumulated overhead.

  • The most practical approach may be to adopt a "guilty until proven innocent"
    policy for pass-by-value of user-defined types.

@jonatack
Copy link
Member Author

jonatack commented Sep 24, 2021

It looks like there are 18 lines of CFeeRate params passed by reference to const, and 7 in this patch passed by value. If the 18 should be changed rather than these 7, it would remain a reasonably small change.

@DrahtBot
Copy link
Contributor

DrahtBot commented Sep 25, 2021

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #23475 (wallet: add config to prioritize a solution that doesn't create change in coin selection by brunoerg)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@laanwj
Copy link
Member

laanwj commented Sep 27, 2021

Yes I think the main thing to consider is maintainability: if CFeeRate might grow larger at some point in the future, we do not want to change all of these again. FWIW, the story behind CAmount historically being passed by-reference is that it makes it easier to share code with coins that have non-POD amounts (at the time, Freicoin). I don't think we actually care about this, but for the info.

Anyhow:

  • I think the chance of more fields being added to CFeeRate is relatively small
  • however, CFeeRate is never passed to really performance critical functions where copying a few more bytes would really make a difference. Nor the extra level of indirection, for that matter.

So to be honest I don't feel strongly either way. It would be somewhat nice to be consistent but that's all.

@JeremyRubin
Copy link
Contributor

i don't particularly care either just figured it might be better to pass by const value.

@jonatack
Copy link
Member Author

jonatack commented Sep 29, 2021

Yes, ISTM the advantages of this change are small amounts (pun intended) of:

  • maintainability (no need to change further as CFeeRate evolves)
  • clarity for developers reading the code and this discussion
  • and perhaps even performance...

With by-reference passing, chains of calls don't incur this kind of accumulated overhead.

...as the CFeeRate comparison operators in feerate.h are reference to const, when feerates are passed into functions to be compared by invoking one of those operators, like in policy/rbf.{h,cpp} in this diff, performance might be not worse and indeed (very slightly) better if the call chain remains reference to const rather than introducing a needless copy in the middle. At least, performance is a wash or shouldn't be an issue.

I tried writing some benchmarks as a sanity check but I didn't see anything conclusive (the difference may be too small to see and my machine isn't very stable for benchmarking anyway): https://github.com/jonatack/bitcoin/commits/bench-passing-CAmount-and-CFeeRate

@jonatack
Copy link
Member Author

FWIW, the story behind CAmount historically being passed by-reference is that it makes it easier to share code with coins that have non-POD amounts (at the time, Freicoin).

Thanks for the context! Been wondering for some time why that was.

Copy link
Contributor

@PastaPastaPasta PastaPastaPasta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK 2e4a6cd

This is a simply change, const is good, and based on the discussion I think that passing by reference is also fine/good. Additionally, it seems this PR has none / maybe 1 conflicting PR, so imo we should just merge it.

@jonatack
Copy link
Member Author

Rebased. git range-diff e7b6272 2e4a6cd 4f9c44b

Copy link
Contributor

@w0xlt w0xlt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review ACK 4f9c44b.

I agree that it's a better approach to pass user-defined types by reference.

@jonatack jonatack force-pushed the feerate-and-txmempool-fixups branch from 4f9c44b to 3c7a3c1 Compare April 29, 2022 13:38
@jonatack
Copy link
Member Author

jonatack commented Apr 29, 2022

Rebased per git range-diff 26296eb 4f9c44b 3c7a3c1.

@jonatack jonatack changed the title Pass CFeeRate and CTxMemPool in-params by reference to const refactor: avoid unnecessarily copying CTxMemPool and CFeeRate objects Jul 30, 2022
@jonatack jonatack force-pushed the feerate-and-txmempool-fixups branch from 3c7a3c1 to f5b094d Compare July 30, 2022 11:37
@jonatack
Copy link
Member Author

Rebased, 2 previous ACKs by @PastaPastaPasta and @w0xlt (thanks!)

@@ -56,7 +56,7 @@ RBFTransactionState IsRBFOptInEmptyMempool(const CTransaction& tx)
}

std::optional<std::string> GetEntriesForConflicts(const CTransaction& tx,
CTxMemPool& pool,
const CTxMemPool& pool,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

& doesn't create a copy of CTxMemPool, so I think the pull title is wrong.

Also, I wonder if this is something that can be enforced with clang-tidy, if we want it. Maybe https://clang.llvm.org/extra/clang-tidy/checks/readability/non-const-parameter.html ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Marco, updated the title and commit message and testing your clang-tidy idea.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropped the tidy check as it proposed making a pointer to const value that needs to be mutable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't know if there is a tidy check for this. But when the reason for the changes in this pull are "consistency" #23076 (comment) , we should enforce this "consistency" to avoid incremental fixups to existing code.

Same for the CFeeRate change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many changes seem to be based on copying or grepping the existing code. For example, if CFeeRate in-params are all passed in the same fashion rather than, say, 2/3 by const ref and 1/3 by value, ISTM that alone may help to maintain consistency.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you prevent a change from introducing this tomorrow? And then a follow-up pull to that change in tomorrow+n days?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed consistency from the pull description summary (I think there is ample remaining rationale).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment was meant generally, I only filled in a random rationale as an example.

When the reason for the changes in this pull are "$rationale", we should ideally enforce this "$rationale" to avoid incremental fixups to existing or future code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you prevent a change from introducing this tomorrow? And then a follow-up pull to that change in tomorrow+n days?

This is true for many changes that are merged.

There was plenty of rationale provided in the PR description and it seems pointless to continue to spend more time here.

@jonatack jonatack changed the title refactor: avoid unnecessarily copying CTxMemPool and CFeeRate objects refactor: pass CTxMemPool and CFeeRate in-param objects by const reference Jul 30, 2022
…erence

as they are classes / user-defined types.

Rationale:
- maintainability, no need to change further as these classes evolve
- consistency
- clarity for developers reading the code and this description
- performance in some cases

Excerpts from:
- Scott Meyers, "Effective C++, Third Edition" (2005), Item 20, pages 86-94
- Scott Meyers, "Effective Modern C++" (2015), Item 41, pages 281-292

- Avoid passing objects of user-defined types by value.

- One might conclude that all small types are good candidates for pass-by-value,
  even if they're user-defined. However, just because an object is small doesn't
  mean that calling its copy constructor is inexpensive.  Many objects--most STL
  containers among them--contain little more than a pointer, but copying such
  objects entails copying everything they point to. That can be very expensive.

- Even when small objects have inexpensive copy constructors, there can be
  performance issues. Some compilers treat built-in and user-defined types
  differently, even if they have the same underlying representation. When that
  type of thing happens, you can be better off passing such objects by
  reference, because compilers will certainly put pointers (the implementation
  of references) into registers.

- Another reason why small user-defined types are not necessarily good
  pass-by-value candidates is that, being user-defined, their size is subject to
  change. A type that's small now may be bigger in the future if its internal
  implementation changes.  Things can even change when switching to a different
  C++ implementation.

- In general, the only types for which you can reasonably assume that
  pass-by-value is inexpensive are built-in types and STL iterator and function
  object types. For everything else, prefer pass-by-reference-to-const over
  pass-by-value.

- Exception: possibly consider pass-by-value of user-defined types for copyable
  parameters that are cheap to move, if they are always copied anyway in their
  implementation.

- When there are chains of function calls, each of which passes-by-value, the
  cost for the entire chain of calls accumulates and may not be something you can
  tolerate. With by-reference passing, chains of calls don't incur this kind of
  accumulated overhead.

- The most practical approach may be to adopt a "guilty until proven innocent"
  policy for pass-by-value of user-defined types.
@jonatack jonatack force-pushed the feerate-and-txmempool-fixups branch 3 times, most recently from 0afba19 to f5cf045 Compare July 30, 2022 15:39
@jonatack
Copy link
Member Author

Closing as this simple change has been open with ACKs for almost a year without merge. If it was desired, it would have been merged.

@jonatack jonatack closed this Jul 30, 2022
Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK f5cf045, but I also agree with closing the issue since there doesn't seem to have been agreement on it.

FWIW, I prefer passing passing by const reference just because it clearly expresses semantics of "this argument will only be read" and like reserving passing by value for meaning "this argument might be moved/copied/inserted somewhere." But there are cases where functions are called in loops and copying by value is more performant. Funny enough, I just saw something about how this conflict is supposed to be avoided in google's experimental c++ replacement by some new language rule https://www.foonathan.net/2022/07/carbon-calling-convention/

@bitcoin bitcoin locked and limited conversation to collaborators Aug 1, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants