Skip to content

Conversation

jgarzik
Copy link
Contributor

@jgarzik jgarzik commented Aug 7, 2014

Compiles and passes tests. Could use some refinement.

Compiles and passes tests.  Could use some refinement.
@jgarzik
Copy link
Contributor Author

jgarzik commented Aug 7, 2014

Updated such that each message is a method. It is recommended to review each commit rather than the sum of all commits, as this patch series follows the "equivalent transformation" method of refactoring code into a more useful form.

This should be suitable for updating to use signals or similar (possibly moving back outside of the class if necessary).

@BitcoinPullTester
Copy link

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/p4646_aa9ad2895e32dcaedba5e1511f5edead4664c029/ for binaries and test log.
This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at http://jenkins.bluematt.me/pull-tester/current/
Contact BlueMatt on freenode if something looks broken.

@jtimon
Copy link
Contributor

jtimon commented Sep 3, 2014

Oh, yes, please!
untested and undiffed ACK. The diffed ack will come later. Maybe the last un-indent commit can be replaced with a "Pass clang to the new files". Now that we have a way to automatically format the code, it can be useful for after-diff-sensible commits like that one, should be faster to review just by reproducing it locally.

@sipa
Copy link
Member

sipa commented Sep 3, 2014

Idea ACK, but this code is pretty much in flux still (askfor/headersfirst/sendmessages/...); I'd propose to delay this until close to the next major release.

@jtimon
Copy link
Contributor

jtimon commented Oct 18, 2014

After headersfirst has been merged, can we reopen this?

@laanwj laanwj reopened this Oct 19, 2014
@sipa
Copy link
Member

sipa commented Oct 19, 2014

No objection against just moving things like this, but things like #4831 conflict probably with less, and actually go further, as it actually splits processing up completely, rather than just moving it as a whole.

@sipa
Copy link
Member

sipa commented Oct 19, 2014

Things like ping/pong handling and alert handling can probably be split off in a way similar to #4831.

@@ -17,6 +17,7 @@
#include "txmempool.h"
#include "ui_interface.h"
#include "util.h"
#include "procmsg.h"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Alphabetical ordering.

@Diapolo
Copy link

Diapolo commented Oct 19, 2014

Suggestion: Why not let the clang-script fix coding style for this?

@jtimon
Copy link
Contributor

jtimon commented Oct 23, 2014

I really like how jgarzik@78edf5c (Split up P2P message processing into MessageEngine methods) improves the readability of the p2p messages.

@jgarzik
Copy link
Contributor Author

jgarzik commented Oct 23, 2014

I felt that this was a good step towards the direction we want to go longer term, which is registering processing methods (signals/hooks), so that the message processing implementation is more de-coupled from network message dispatch.

It sounded like @sipa did not like this as an intermediate step, though.

@sipa
Copy link
Member

sipa commented Oct 23, 2014

I would absolutely be in favor of this as intermediate step if not for the review overhead and the fact that it will break existing pull requests that already go further.

Afterwards I think we can definitely do this for the non-trivial pieces after the trivial ones have been moved out.

@jtimon
Copy link
Contributor

jtimon commented Jun 21, 2015

Needs rebase...
Would this be easier to merge if it only contained one commit equivalent to jgarzik@78edf5c but doing the separation directly in main.cpp and using regular functions instead of method of a new class?
They can even be static inline functions in which case the commit should produce an identical build unless I'm missing something (meaning no risk with low review overhead).

@jgarzik
Copy link
Contributor Author

jgarzik commented Jul 23, 2015

Closing. I still think this is worth merging and @jtimon seems to agree. However it has not gained sufficient momentum to get merged for whatever reason.

As usual, it is easy to re-open a PR if this turns out to be in error, ACKs suddenly appear, etc. Applying the "close old PRs, easy to reopen" pattern.

This code movement change is too large to continue rebasing if it is not getting merged.

@jgarzik jgarzik closed this Jul 23, 2015
@jtimon
Copy link
Contributor

jtimon commented Jul 24, 2015

I really think this kind of PR opens the door to many modularity improvements. And I understand that code movements are a burden to reviewers because they need to keep up to date with them after they're merged. This is a very big move on main.cpp that everybody is touching.
But I seriously don't understand why breaking breaking an overly-nested switch into new functions of a too-large function like ProcessMessage is so difficult when (as you have proven) it can be done with a minimal and review-friendly single commit (jgarzik@78edf5c ). I believe this can be done in a way that results in an identical gitian just prefixing all the new functions with static inline, but I've never tested it. When I tried to do something similar in #5153 for EvalScript @gmaxwell tested it and the build hash wasn't identical. I didn't add that prefix because, I thought, theoretically, the compiler should be smart enough to tell by itself, but as @sipa explained me later, non-static (or in an anonymous namespace) things are effectively "public" to the compiler, because it doesn't have a per-file way to tell whether you're using extern somewhere else or not.

I really think we should break both EvalScript and ProcessMessage with an identical-build single commit. With an identical build you don't even need any tested ACK, utACKS should be enough.
If I remember correctly @petertodd started a bounty on a python script to check whether the last commit produces an identical build hash or not. I proposed it and I didn't needed the bounty as motivation, but I never did it.This is the tutorial by @laanwj that I was going to use: #4180 (comment)

@jgarzik
Copy link
Contributor Author

jgarzik commented Jul 24, 2015

@jtimon I agree. It is however difficult to produce identical hashes due to minor compiler build differences such as the ones that sipa mentions.

You could probably perform some #include tricks to accomplish the first file movement step with equivalent hashes, if you keep ProcessMessage() intact. Once the individual code blocks move to their own functions/methods, the hashes will unavoidably differ.

A line-by-line source code comparison tool would be better than a hash check.

@sipa
Copy link
Member

sipa commented Jul 24, 2015 via email

@jtimon
Copy link
Contributor

jtimon commented Jul 24, 2015

You could probably perform some #include tricks to accomplish the first file movement step with equivalent hashes, if you keep ProcessMessage() intact.

I was talking about doing the equivalent to that commit without moving anything first.

Once the individual code blocks move to their own functions/methods, the hashes will unavoidably differ.

Shouldn't static inline be enough to in practice keep the code where it was (in ProcessMessage)?

A line-by-line source code comparison tool would be better than a hash check.

We already have that with git. And by not correcting the indentation at first, the commit can be very readable (specially if the new class is not introduced). But the identical hash eliminates the need for testing, which is nice.

@sipa would you oppose to just doing the function separation in main.cpp ? That wouldn't create any circular dependencies and I believe it would help with the later modularization that you want (apart from making ProcessMessage much, much, much more readable [I don't use eclipse anymore but I believe some checkstyle plugins can segfault just by trying to analyze this function {if you first disable the typical "too long function" check, obviously}</bad joke>]).

@jtimon
Copy link
Contributor

jtimon commented Jul 24, 2015

I'm talking about something along this lines: bitcoin:master...jtimon:process-message-0.12.99
But I now realize that for the build to be identical the return true; at the end of every new function should be introduced before that commit. That little preparation would also be trivial to review, but I'm not sure having an identical build commit would be a great benefit anymore...

EDIT:
Updated the branch, now there's a preparation commit and a second one that I believe should result in an identical hash, so please, correct me if I'm wrong. @laanwj assuming it produces an identical hash, would that make it more mergeable or it would be better to squash it with the preparation commit from the beginning?

@sipa
Copy link
Member

sipa commented Jul 24, 2015 via email

@jtimon
Copy link
Contributor

jtimon commented Jul 24, 2015

@sipa Great, I will eventually open what I started there then, although I believe it would be good to make more things like #6163 first, so that we don't have to work more later (having to add more parameters to the new functions).

Thoughts on the identical build? Could d8e5ff4 theoretically produce an identical build? Is it worth it (given that you need a previous commit at least for the return trues)?

@sipa
Copy link
Member

sipa commented Jul 24, 2015

I think that for refactors like this, you shouldn't aim for identical builds.

The only way to achieve that is by forcing the compiler to inline things, which for such large blocks of code is probably a bad idea.

@jtimon
Copy link
Contributor

jtimon commented Jul 25, 2015

Thanks, @sipa that's useful. But I don't see why inlining all those new static functions is a bad idea. First, it is no worse than what we currently have, and second, the inline can be removed in a fixup! commit (to be squashed together after "testing", just like the preparation commit).

In any case, maybe is still not worth it, but what about doing something similar in consensus code, more concretely in EvalScript? To remind you, #5153 was considered too risky (#5153 (comment)).

@sipa
Copy link
Member

sipa commented Jul 25, 2015

@jtimon I mean forcing the compiler to inline things - just putting the 'inline' keyword may not be enough for that (it may need compiler flags, for example), and even then, the order in the binary may end up being different.

Just saying that IMHO that's not a good use of your time, but if you think you can make the build identical, so much the better.

I'm much less concerned about changes here than in script evaluation, as getting some state/variables/processing wrong in message handling should result in immediate failure (some messages not being processed), in script evaluation it could lead to very subtle differences.

@jgarzik
Copy link
Contributor Author

jgarzik commented Jul 26, 2015

Basically the inlining doesn't produce any value. It doesn't generate identical hashes. Might as well make them separate functions/methods.

@jtimon
Copy link
Contributor

jtimon commented Jul 26, 2015

Thank you very much both of you: this is very helpful.
When (if nobody does it before me) I reopen the discussed subset of this PR, I will forget about the identical build. I guess I was too much worried about reusing the IsIdenticalBuild tool for #5153 (comment) , but this seems definitely less risky and more of a priority.

But, as said, I would like to see more of those Params() disappear, and some of those const CChainParams& and const CPolicy& moving up first.
The most current and finished thing I have on this will be in https://github.com/jtimon/bitcoin/commits/process-message-0.12.99 until I open it as a PR in case you want to take a look or cherry pick for finishing.

@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants