Skip to content

Conversation

jonatack
Copy link
Member

@jonatack jonatack commented Mar 18, 2022

Following the core dev IRC discussion today, this is a straightforward step per the suggestion at https://www.erisian.com.au/bitcoin-core-dev/log-2022-03-17.html#l-652. Users can still opt in to adjusted time by setting -maxtimeadjustment with a value greater than zero. This allows gaining feedback as to what extent user space depends on adjusted time.

If we want to remove the adjusted time code down the road, it makes sense to give users the fallback option to use adjusted time for a release or two. This also reduces our code reversion risk in the case that users rely on it. The release notes would describe the change and suggest that the option is being deprecated for future removal.

See #7573 and #4521 for further context.

@jonatack
Copy link
Member Author

Will add a release note if concept/approach here is good.

@DrahtBot DrahtBot added the Tests label Mar 18, 2022
@jonatack
Copy link
Member Author

@jonatack jonatack force-pushed the default-zero-maxtimeadjustment branch from 26d54ad to 5c81837 Compare March 18, 2022 12:23
@jonatack
Copy link
Member Author

Dropped the doc commit (proposed separately in #24609 to potentially be added to #24512) and moving this out of draft.

@jonatack jonatack marked this pull request as ready for review March 18, 2022 12:26
@jonatack jonatack mentioned this pull request Mar 18, 2022
Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

The time data comes from untrusted peers over cleartext, unauthenticated and unencrypted, so it could be bogus. However, I guess, the assumption here is that most of the nodes are honest. Note that we choose those 200 peers randomly from our addrman, sort their datas and only take the median. So, an evil actor who wants to mess with our perceived time has to control >100 of those peers, or >50% of our addrman entries.

What is the most useful thing that could happen if the system clock is wrong indeed during bitcoind startup and it gets adjusted by those 200 honest samples (in a good way)? Or in other words - what will go wrong in that case if we refuse to adjust the clock?

@@ -16,7 +16,16 @@

#include <boost/test/unit_test.hpp>

BOOST_FIXTURE_TEST_SUITE(timedata_tests, BasicTestingSetup)
/**
* Set -maxtimeadjustment to 70 minutes for testing purposes; this was
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to test with a non-default value? Or should we test both old (70) and the new (0)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a few of the tests it might make sense, if this approach goes forward.

@maflcko
Copy link
Member

maflcko commented Mar 18, 2022

I am not sure this is the right way to go. If we think the changes are ok to make, we should just remove adjusted time. If we don't, then we shouldn't do the changes here either.

No objection to the changes here, but to me it seems to combine the worst of everything:

  • It disables the clumsy code by default, making it effectively dead code, since almost no user changes the default values.
  • It doesn't give any benefits to modules that use adjusted time, as they still need to assume adjusted time may be wrong, even though the system time is correct.
  • User that realize they might need this setting (and change the default) might easier just set the correct system time. Or are you going to write release notes saying: "If your system clock is wrong, please make sure to set -maxtimeadjustment to the value your clock is wrong at most." Instead of: "If your clock is wrong, correct it".

@luke-jr
Copy link
Member

luke-jr commented Mar 19, 2022

Concept ACK. Seems like a good one-release-only stage before complete removal?

@jonatack
Copy link
Member Author

@luke-jr yes, this seems like the right thing to do for users.

@jonatack
Copy link
Member Author

jonatack commented Mar 23, 2022

@MarcoFalke indeed, this change doesn't preclude removing the code permanently afterward. if I understand correctly, this code has been in the codebase since 2016, roughly six years now the beginning (corrected). If there's no urgent need to rip it all out, which may be convenient for us but is a larger change, it may make sense to give users the fallback option to use adjusted time for a release or two. This also reduces code reversion risk in the unexpected case that users rely on it (though I don't feel competent to second-guess what users are doing or not doing). The release notes would describe the change and suggest that the option is being deprecated for future removal.

@maflcko
Copy link
Member

maflcko commented Mar 23, 2022

If there is any user using this with a valid use case in mind, we shouldn't remove it nor deprecate it. In that case, we should probably fix it so that users with a proper system clock aren't negatively affected by adjusted time and the users with an improper system clock are covered in all/most cases where it is needed. So it would be good to know if any users are using this, and I think the most effective way to find out is by removing it (User will see an init error on startup, if they are using it). I am not sure how many users would notice with just the default value changed and a release note snippet. An alternative would be to add a -settingdeprecated= setting, similar to -rpcdeprecated=?

Again, I am not against the changes here. They look fine and I don't see any harm in them. I am only wondering how much benefit they will bring.

@mzumsande
Copy link
Contributor

if I understand correctly, this code has been in the codebase since 2016, roughly six years now

The concept of timedata was already introduced by Satoshi, it's historical 😄
git log -S "Never go to sea with two chronometers; take one or three" points to the first commit!

Concept ACK on removing it though.

So there are three possibilities discussed:

  1. Remove the use of GetAdjustedTime() step by step (address relay, validation), making -maxtimeadjustment setting a null-op and deprecating it afterwards.
  2. Changing default -maxtimeadjustment to 0 and removing the use of GetAdjustedTime() afterwards.
  3. Changing default -maxtimeadjustment to 0 but keeping the functionality.

While there seems to be consensus that 3) doesn't make sense, is there much of a difference between 1) and 2) if it all happens for the 24.0 release (which seems realistic?).

@michaelfolkson
Copy link

If there is any user using this with a valid use case in mind, we shouldn't remove it nor deprecate it.

If this is indeed the approach of the project today nothing should ever be deprecated (including say the legacy wallet in #20160) because you can never be sure a user isn't using it. I'm not sure when this switch to never deprecating anything happened. There are various examples (e.g. #8780 and more recent examples) in the past of things being deprecated over multiple versions. Of course giving time to developers/users to prepare for the changes and good communication are important.

Maybe something to bring up in the Core dev meeting this week.

@jonatack
Copy link
Member Author

jonatack commented Mar 23, 2022

@michaelfolkson let's please stay on this topic. RPC result fields (and sometimes input params) are still deprecated and a config option may possibly be removed (having a deprecation process for this may be an improvement). Removing an RPC outright seems much more rare to avoid needlessly breaking the API / user space, and sometimes an RPC is made hidden (not returned in the CLI help) but not removed.

@vasild
Copy link
Contributor

vasild commented Mar 28, 2022

@mzumsande, good points! I think 1. and 2. can be done in parallel. If 1. is fully completed before 24.x then 2. becomes irrelevant. Having 2. now could help with some testing by people that use pre-release, and don't change the defaults, I guess.

@laanwj
Copy link
Member

laanwj commented Apr 1, 2022

Concept ACK.

@maflcko maflcko added this to the 24.0 milestone Apr 2, 2022
@jonatack
Copy link
Member Author

jonatack commented Apr 7, 2022

So it would be good to know if any users are using this, and I think the most effective way to find out is by removing it.

The least risky way to find out is to do this pull for a release or so, because it is trivial to revert and gives users optionality.

is there much of a difference between 1) and 2) if it all happens for the 24.0 release (which seems realistic?)

Project and user space risk, and ease of reversion.

@maflcko
Copy link
Member

maflcko commented Apr 7, 2022

The least risky way to find out is to do this pull for a release or so, because it is trivial to revert and gives users optionality.

If users change the default setting, they won't find out if this pull is merged. But maybe I am just overthinking this, as almost no one changes the default values.

@jonatack
Copy link
Member Author

jonatack commented Apr 7, 2022

True, in any case we need to communicate about it.

Devil's advocate, if there is a potential attack vector with the current code that should be closed, that would be a different scenario.

@fanquake
Copy link
Member

fanquake commented Apr 7, 2022

I think I agree with @MarcoFalke & @mzumsande and would rather move ahead with refactoring the code, i.e #24662 + validation changes, rather than a sort of interim / soft deprecation period of just changing the default. I think we have a long enough window prior to the 24.x branch off to get the refactoring done, and if for some reason that didn't happen, we could always, if still wanted, make this change just prior to branch off.

I would also be very surprised if anyone is actually using this setting in production.

@jonatack jonatack force-pushed the default-zero-maxtimeadjustment branch from 5c81837 to e91d403 Compare April 7, 2022 12:58
@mruddy
Copy link
Contributor

mruddy commented Apr 17, 2022

I just went through the old conversation for #7573 where this flag was added and I found the reason why we did not set the default to zero back then: "I chose not to disable the time-offset logic entirely because I've gotten feedback about people liking to get the warning message "Please check that your computer's date and time are correct!" if none of the peers are within 5 minutes just in-case."
And: "Right now, if the median offset is more than 70 minutes, the offset is set to zero and a warning is logged if none of the nodes are within five minutes. So, there is some protection against an attack that attempts to change your node's perspective on time a lot. If your node disagrees substantially with every other node, then you get a warning. But you'll continue to exchange transactions and blocks etc..."
This was the justification for setting the flag to zero, when it is used: #7573 (comment)
But, this comment also supports not setting the default to zero: #7573 (comment)

In summary, I tend to Concept NACK this changing of default for the reasons mentioned.

@jonatack
Copy link
Member Author

Hi @mruddy, this change is proposed within the context of a process to remove adjusted time completely.

@mruddy
Copy link
Contributor

mruddy commented Apr 17, 2022

When I think about removing adjusted time completely, this quote from sipa comes to mind:
"I am not convinced that the behaviour of not correcting at all is less harmful than correcting your clock based on what random peers tell you when chosen by someone who has confidence in their own clock configuration."
#7573 (comment)

I think there is some subtle wisdom in sipa's sentiment at the time and I think it still applies. Initially, when I proposed the PR that led to the creation of the -maxtimeadjustment flag, my thought was something similar to, "I just want to be able to ignore what other nodes are telling me, because I know better, and I've configured my system." Atfer considering what sipa and gmaxwell wrote in that PR's comments, I changed my mind to be more accepting that adjusted time could still be useful for a lot of less actively/professionally managed nodes. I think that sipa's and gmaxwell's wisdom were based on experiences of NTP failing silently, NTP being insecure and spoofable, lack of node upkeep/monitoring, and probably more.

I still think that adjusted time can serve the same purpose as it did in 2016. I still think that the justification for the -maxtimeadjustment flag is still valid too (#7573 (comment)). So, I still lean to the Concept NACK on this. But, I'm open to change my mind if there's a new persuasive reason to nuke adjusted time.

@mruddy
Copy link
Contributor

mruddy commented Apr 21, 2022

I think that if we want to progress with removing the mechanism that adjusts our node's view of time, that we should still keep a mechanism that warns about being out of alignment with peer times.
Thus, we'd warn, but not correct (similar to how we warn when we do not adjust, and no other peer is within 5 minutes of our clock).
We'd want to avoid being too verbose when logging this so that we avoid hostile p2p version messages that are designed just to spam our logs.

That would allow us to keep clock sync as a local system problem.
When the system clock is off, one might run into a bunch of different problems. And, it's a bigger issue than just syncing or checking NTP, for example https://tails.boum.org/contribute/design/Time_syncing/
Also, here's some data, more current than when #7573 was worked, on how far off clocks on internet hosts can be https://labs.apnic.net/?p=1186

So, if this PR is moving in that direction, then I'd be ok with concept ack'ing. But, getting to the actual change, only changing the default to 0, seems kind of lacking. It gets closer to the end goal, but it leaves a lot of clumsy code to get log messages. So, I tend to agree with this #24606 (comment)

@maflcko
Copy link
Member

maflcko commented Apr 21, 2022

Not sure how useful it is to look at "internet hosts" to get data points, which are off by several days. (I haven't checked, but) Bitcoin Core should refuse to start up if the clock is behind the latest block saved on disk (or will refuse to download it). It is certainly not uncommon for some clocks to be broken. I've run into this myself, see for example https://bugs.launchpad.net/cloud-images/+bug/1896443 . However, Bitcoin Core will already fail to function properly on those systems and adjusted time doesn't correct offsets in the order of days anyway.

I agree on the warnings. Someone should check that Bitcoin Core raises an InitError when the clock is behind the latest block saved on disk (or maybe the latest data point in the chainTxData). I proposed more aggressive warnings or errors in a recent IRC meeting, but there was no agreement that more aggressive warnings are warranted. So I think we should check that the existing warnings are properly sent out even if the default is set to 0 or the setting removed.

@maflcko
Copy link
Member

maflcko commented Apr 21, 2022

There is a meta issue if anyone feels like contributing data points: #24805

@mzumsande
Copy link
Contributor

I think it would also help to understand the (worst-case) consequences of having an incorrect time in cases where AdjustedTime might have ideally helped before. I had a look at the places where adjusted time is used - there are just a few: I could think of the following items:

  1. (gossip) address relay is very sensitive to the correct time, because addrs will not be relayed for more than 10 minutes after nTime of the original self-advertisement. A node with a wrong time can easily become a "black hole" for addr relay. This means it's not helping the network by relaying received addrs further along, but that's not really a problem for the node itself because the acceptance of addrs and the process of making outgoing connections to them is not affected.

  2. a node with a wrong time will find it harder to find inbound peers (because its own self-advertisement will have a nTime that is off, and therefore likely not be relayed by others, see previous point). The ability to find outbound peers is not affected.

  3. the nTime field in blocks is set based on the maximum of AdjustedTime / MedianTimePast in the mining code. Validation will not allow blocks more than 2 hours in the future (MAX_FUTURE_BLOCK_TIME). Adjusted Time cannot correct more than 70 minutes though and wouldn't correct anything if the the time was off by more, so I can't really think of a scenario in which it would have saved us to keep with the best chain before.

  4. header sync timeout and CanDirectFetch() in net_processing. An incorrect time may stall the header timeout / slow down IBD.

Am I missing anything important?

@maflcko
Copy link
Member

maflcko commented Apr 21, 2022

so I can't really think of a scenario in which it would have saved us to keep with the best chain before.

One theoretical example might be where one clock is behind by one hour and ten min and another is ahead by one hour and 10 min. For both, adjusted time would fix the offset, but without adjusted time the fallen-back node wouldn't accept the blocks of the advanced node.

@maflcko
Copy link
Member

maflcko commented Apr 24, 2022

Another idea for improving the warning: We could calculate the running median offset of the last 10 outbound peer connections instead of the current offset, which locks in after (and takes into account) 200 outbound connections?

@DrahtBot
Copy link
Contributor

DrahtBot commented May 26, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

No conflicts as of last run.

@jonatack
Copy link
Member Author

jonatack commented Aug 5, 2022

This won't be merged, closing.

@jonatack jonatack closed this Aug 5, 2022
@fanquake fanquake removed this from the 24.0 milestone Sep 13, 2022
@bitcoin bitcoin locked and limited conversation to collaborators Sep 13, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants