Skip to content

Conversation

fjahr
Copy link
Contributor

@fjahr fjahr commented Jun 4, 2020

The semantics of setup_clean_chain are off in my opinion. If it is set to true, there is actually no real chain setup, we only start with the Genesis block. That is not what I would expect 'clean' to mean. The 200 blocks chain that we set up if it is set to false could also be called 'clean' since there are no transactions or reorgs.

I think this is also the reason that other people have been confused. The documentation for test_shell is not correct and the default has been re-declared in several tests. This PR fixes the documentation and removes settings that don't change the default. Additionally I suggest renaming setup_clean_chain to skip_chain_setup.

@maflcko
Copy link
Member

maflcko commented Jun 4, 2020

The first commit can also be a scripted diff with the sed d action?

@maflcko
Copy link
Member

maflcko commented Jun 4, 2020

ACK 6e1a6ad

@fjahr
Copy link
Contributor Author

fjahr commented Jun 4, 2020

The first commit can also be a scripted diff with the sed d action?

Yepp, done.

Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My issue with this setting is not how it's named, but the lack of clarity on whether the default setting (e.g. setup_clean_chain=False) is the efficient one, and if it is not, why is the inefficient setting the default and what the real-world tradeoffs and differences are -- and to clarify all this better in the README and the example test. I don't see this renaming making that clearer but would love for the docs to be clearer.

@maflcko
Copy link
Member

maflcko commented Jun 4, 2020

@jonatack If you need the blocks in the chain, then loading them from the cache is efficient. If you don't need them, then you may skip the cache, but it wouldn't hurt if you forget to skip the cache.

Setting the default the other way round would risk that tests that need blocks generate them instead of loading them from the cache.

@jonatack
Copy link
Member

jonatack commented Jun 4, 2020

Thanks @MarcoFalke, so IIUC the default is for safety. @fjahr from my personal perspective, adding helpful discussion in the docs about the tradeoffs and reasons would be very helpful (I'm less enthusiastic about the current renaming proposal, esp. if it is loading a cached chain rather than doing setup).

@maflcko
Copy link
Member

maflcko commented Jun 4, 2020

I like the rename, but naming is always subjective to some extent

@jonatack
Copy link
Member

jonatack commented Jun 4, 2020

Maybe I'm thoroughly confused, but isn't the new name inaccurate? :) ISTM it ought to be strictly better to justify a change. For the docs, there is also this recent conversation #18638 (comment) about when to use this. It's very late here but will revisit this at a better hour and try to come up with a clear addition to the docs that everyone agrees with, unless @fjahr you're keen to do it (which would be great).

@fjahr
Copy link
Contributor Author

fjahr commented Jun 4, 2020

Maybe I'm thoroughly confused, but isn't the new name inaccurate? :) ISTM it ought to be strictly better to justify a change. For the docs, there is also this recent conversation #18638 (comment) about when to use this. It's very late here but will revisit this at a better hour and try to come up with a clear addition to the docs that everyone agrees with, unless @fjahr you're keen to do it (which would be great).

It is a different abstraction layer. The person writing a test using BitcoinTestFramework should not have to learn every little detail of how the framework works internally. That's why it is a framework. I mentioned why I think the name change is better: because the current name is very confusing. I think most people do not understand 'clean' to be what it is supposed to mean here. I am trying to use a name that is easy to understand and appropriate for the abstraction layer and I think setup is a neutral term that does not imply mining or loading from cache.

Feel free to open an additional PR for more docs updates or give me a hint where you think I could extend this here and I will add it.

@fjahr fjahr force-pushed the test_chain branch 3 times, most recently from 8637214 to d2183c7 Compare June 4, 2020 22:18
@fjahr
Copy link
Contributor Author

fjahr commented Jun 4, 2020

Added some more small adjustments to the docs based on feedback from @jonatack to make sure users understand performance implications.

@DrahtBot
Copy link
Contributor

DrahtBot commented Jun 4, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this renaming is unwise and am NACK on it. Why:

  • Semi-subjective: The new name is more confusing. "Set to true to not do something" is not an improvement in clarity.

  • Objective: The change has consequences that are strictly worse. Changing all these files is noise and churn, messing with git history and erasing all usability of git blame for these settings across the functional test suite. These aren't copyright headers but test settings that are important enough that the convention has been to incude them explicitly in most every file. The only wise reason to change these lines is to change the setting.

Concept ACK on improving the documentation. Frameworks increase, not decrease, the need for good documentation and clear explanation of when and why to use settings. Links to source code or terms to grep should be added. Reviewers and contributors aren't beginners to hide details from; they need to thoroughly understand what is happening under the hood. That is why this documentation exists.

# By default the test loads a pre-mined chain of 200 blocks from cache.
# Set skip_chain_setup to True to skip this and start from the Genesis
# block.
self.skip_chain_setup = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.skip_chain_setup = True
self.use_cached_chain = False

The double negative is a good point. What about this name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me, happy to change it to that if other reviewers prefer it. :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me, happy to change it to that if other reviewers prefer it. :)

It would make me feel less guilty of picking the "clean chain" name in fac9349, so I hope that other reviewers take that into consideration 😅

@fjahr
Copy link
Contributor Author

fjahr commented Jun 8, 2020

  • Semi-subjective: The new name is more confusing. "Set to true to not do something" is not an improvement in clarity.

I don't understand this argument. The naming itself is misleading. So you would like it to be clearly misleading rather than not so clearly non-misleading? Anyway, I think @MarcoFalke 's suggested renaming would solve this.

  • Objective: The change has consequences that are strictly worse. Changing all these files is noise and churn, messing with git history and erasing all usability of git blame for these settings across the functional test suite. These aren't copyright headers but test settings that are important enough that the convention has been to include them explicitly in almost every file. The only wise reason to change these lines is to change the setting.

If we would follow this rule blindly we would never do any refactorings. However, we do them sometimes if there is an added benefit. The benefit here is clarity of what this setting is doing. There is no doubt it's confusing as I have laid out before. And in this case, it is also less critical because this line never stands on its own. It is setup code that enables other stuff to work but is never a standalone thing that is relevant without the other code it makes work. The other will be the lines that developers will want to git blame, not the setup code. At least that is my impression of the code I have seen in the files I am changing. Feel free to give counterexamples if I am wrong.

@maflcko
Copy link
Member

maflcko commented Jun 8, 2020

btw, git blame on recent versions of git can ignore specific commits (like refactoring scripted diffs)

@jonatack
Copy link
Member

jonatack commented Jun 8, 2020

  • Objective: The change has consequences that are strictly worse. Changing all these files is noise and churn, messing with git history and erasing all usability of git blame for these settings across the functional test suite. These aren't copyright headers but test settings that are important enough that the convention has been to include them explicitly in almost every file. The only wise reason to change these lines is to change the setting.

If we would follow this rule blindly we would never do any refactorings.

I am not suggesting to follow this blindly, but to follow it in this case for the reasons provided.

-BEGIN VERIFY SCRIPT-
git grep -l "self.setup_clean_chain = False" test/functional/*.py | xargs sed -i "/self.setup_clean_chain = False/d";
-END VERIFY SCRIPT-
Also switched the effect of the bool value, i.e. setup_clean_chain=True is now use_cached_chain=False.

-BEGIN VERIFY SCRIPT-
git grep -l "self.setup_clean_chain = True" test/functional | xargs sed -i "s/self.setup_clean_chain = True/self.use_cached_chain = False/g";

git grep -l "self.setup_clean_chain = False" test/functional/test_framework | xargs sed -i "s/self.setup_clean_chain = False/self.use_cached_chain = True/g";

git grep -l "not self.setup_clean_chain" test/functional/test_framework | xargs sed -i "s/not self.setup_clean_chain/self.use_cached_chain/g";

git grep -l "self.setup_clean_chain" test/functional/test_framework | xargs sed -i "s/self.setup_clean_chain/not self.use_cached_chain/g";

git grep -l "setup_clean_chain" test/functional/test_framework | xargs sed -i "s/setup_clean_chain/use_cached_chain/g";
-END VERIFY SCRIPT-
@fjahr
Copy link
Contributor Author

fjahr commented Nov 29, 2020

herwaardeer (rebased in Afrikaans)

@DrahtBot
Copy link
Contributor

🐙 This pull request conflicts with the target branch and needs rebase.

Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a "draft".

@fjahr
Copy link
Contributor Author

fjahr commented Jan 31, 2021

Closing this since the renaming seems to be too controversial/is not interesting enough to most reviewers. I opened a new PR without the renaming and some improved doc cleanups in #21042.

@fjahr fjahr closed this Jan 31, 2021
maflcko pushed a commit that referenced this pull request Feb 4, 2021
590bda7 scripted-diff: Remove setup_clean_chain if default is not changed (Fabian Jahr)
98892f3 doc: Improve setup_clean_chain documentation (Fabian Jahr)

Pull request description:

  The first commit improves documentation on setup_clean_chain which is misunderstood quite frequently. Most importantly it fixes the TestShell docs which are simply incorrect.

  The second commit removes the instances of `setup_clean_clain` in functional tests where it is not changing the default.

  This used to be part of #19168 which also sought to rename`setup_clean_chain`.

ACKs for top commit:
  jonatack:
    ACK 590bda7

Tree-SHA512: a7881186e65d31160b8f84107fb185973b37c6e50f190a85c6e2906a13a7472bb4efa9440bd37fe0a9ac5cd2d1e8559870a7e4380632d9a249eca8980b945f3e
glozow added a commit to glozow/bitcoin that referenced this pull request Feb 23, 2021
PR bitcoin#19168 introduced this function but it always returns an empty vector.
glozow added a commit to glozow/bitcoin that referenced this pull request Feb 23, 2021
PR bitcoin#19168 introduced this function but it always returns an empty vector.
glozow added a commit to glozow/bitcoin that referenced this pull request Feb 23, 2021
PR bitcoin#19168 introduced this function but it always returns an empty vector.
glozow added a commit to glozow/bitcoin that referenced this pull request Mar 31, 2021
PR bitcoin#19168 introduced this function but it always returns an empty vector.
glozow added a commit to glozow/bitcoin that referenced this pull request Apr 5, 2021
PR bitcoin#19168 introduced this function but it always returns an empty vector.
glozow added a commit to glozow/bitcoin that referenced this pull request Apr 8, 2021
PR bitcoin#19168 introduced this function but it always returns an empty vector.
windsok pushed a commit to windsok/bitcoin that referenced this pull request Apr 23, 2021
PR bitcoin#19168 introduced this function but it always returns an empty vector.
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Aug 16, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants