-
Notifications
You must be signed in to change notification settings - Fork 37.7k
p2p: delete anchors.dat after trying to connect to that peers #24034
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
55cf0d2
to
c9893a6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of comments after skimming the diff, feel free to ignore for now; will review.
Perhaps remove "refactor" from the PR title as there seems to be a change of behavior. |
Done! |
69d4184
to
c24194c
Compare
c24194c
to
fb0c332
Compare
force-pushed addressing @jonatack's comments |
fb0c332
to
831377b
Compare
cc: @hebasto |
Do you mean the normal shutdown, i.e., |
src/net.cpp
Outdated
} else if (m_anchors.size() == 0) { | ||
DeleteAnchorsFile(gArgs.GetDataDirNet() / ANCHORS_DATABASE_FILENAME); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because if there aren't anchors we can delete the file at that point, no need to wait to try connect them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We talked about it earlier but I just remembered I can't remove it because in https://github.com/bitcoin/bitcoin/pull/24034/files#diff-00021eed586a482abdb09d6cdada1d90115abe988a91421851960e26658bed02R2029 if m_anchors
is empty the anchors.dat won't be deleted.
Yes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure the current approach works for the goal you're trying to achieve.
In case of clean shutdown, CConMan::Close()
is always (I think?) going to be called, which means that DumpAnchors()
is called too. DumpAnchors()
will just overwrite any existing anchors.dat
file, regardless of whether or not that file exists. If no block-only connections have been initialized yet, DumpAnchors()
will simply create an empty file. So I think in case of clean shutdown, you need to control when DumpAnchors()
is called instead of deleting anchors.dat
.
In case of forced shutdown, I think this PR moves in the right direction by not immediately removing anchors.dat
, because then DumpAnchors()
may not be called and it could be helpful to restart from the previously existing anchors.dat
. However, I'm unsure (from a security or resilience perspective) if that's something to be desired or not, I don't fully understand the big picture yet here. On the one hand, you don't want to make it too easy for an attacker to purge existing anchors. However, you should properly handle e.g. a corrupt anchors file too.
src/addrdb.cpp
Outdated
return anchors; | ||
} | ||
|
||
void DeleteAnchorsFile(const fs::path& anchors_db_path) { fs::remove(anchors_db_path); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about making anchors_db_path
optional? If you apply the same pattern to DumpAnchors
, ReadAnchors
and DeleteAnchorsFile
, you can move the definition of ANCHORS_DATABASE_FILENAME
to addrdb.cpp
. Personally I think it's more logical to define the default path logic within these functions since they're called DeleteAnchorsFile
, not DeleteFile
.
void DeleteAnchorsFile(const fs::path& anchors_db_path) { fs::remove(anchors_db_path); } | |
void DeleteAnchorsFile(std::optional<const fs::path> anchors_db_path) | |
{ | |
fs::path path{ anchors_db_path.value_or(gArgs.GetDataDirNet() / ANCHORS_DATABASE_FILENAME) }; | |
fs::remove(path); | |
} |
ab509a0
to
fe350d6
Compare
Thanks, @stickies-v. Added a flag to control if a connection with all peers from anchors.dat was tried. So, if the connections with all of them have not been tried yet we don't allow E.g:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Light Concept ACK
This seems to be a definitive improvement, but there might be unthought drawbacks or consequences that I'm not aware of yet. Looking forward to today's review club to shed some more light on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems arbitrary that we preserve our block-relay-only peers across restarts if the node shuts down cleanly but not if the node shuts down uncleanly. Would a better approach be to never delete anchors.dat
but instead rewrite it (truncate and write) immediately whenever the set of block-relay-only peers changes (that is, make it a write-through cache), instead of only on a clean shutdown?
If this set of peers gets updated multiple times close in time, then for better performance, use CScheduler.scheduleFromNow()
to implement a write-back cache (check every few seconds if anchors.dat
is out of date, and if so, rewrite it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approach ACK, but I think the code can be simplified.
delete_anchors_file
and tried_connect_anchors
may not be needed.
And there may also be no need to change CConnman::Start
.
I added some suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems arbitrary that we preserve our block-relay-only peers across restarts if the node shuts down cleanly but not if the node shuts down uncleanly. Would a better approach be to never delete
anchors.dat
but instead rewrite it (truncate and write) immediately whenever the set of block-relay-only peers changes (that is, make it a write-through cache), instead of only on a clean shutdown?
There was a lot of discussion on this point in the original PR #17428 which introduced the anchors, see e.g. this summary of some of the aspects.
If I understand it correctly, this PR changes behavior in the special case where a clean shutdown is happening in the time window between startup and connecting to the anchors.
Is there a realistic scenario for how this could happen? If the shutdown was caused maliciously (by DoS, power shutdown, etc.), I think it would likely not be clean.
The only scenario I could think of is that the node operator themselves initiate a clean shutdown (maybe they forgot to specify a startup option, send a SIGINT and try again).
This PR covers both scenarios (clean and unclean). See the functional test this PR modifies, I just added (last force-pushed) a coverage for un clean shutdown using the same logic as
Yes, you're right, I don't think there is a realistic/explicit scenario as an unclean has. It could be caused by a node operator, maybe a social engineering, but there are many more scenarios for unclean shutdown as you mentioned (DoS, power shutdown, etc.). |
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
ac11edf
to
e495983
Compare
e495983
to
41ae2b4
Compare
41ae2b4
to
658cfb1
Compare
Just discovered another thing this PR may fix.
Since there is no enough time to establish the 2 connections from |
void DeleteAnchorsFile() | ||
{ | ||
const fs::path path{gArgs.GetDataDirNet() / ANCHORS_DATABASE_FILENAME}; | ||
fs::remove(path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now if we fail to remove the file we crash fairly early on startup, so the user can investigate why they are unable to remove it. With this change we will get pretty far into normal operation before crashing. Should we catch and log any error if we fail to remove, or is it a fatal error if we don't remove?
There hasn't been much activity lately. What is the status here? Finding reviewers may take time. However, if the patch is no longer relevant, please close this pull request. If the author lost interest or time to work on this, please close it and mark it 'Up for grabs' with the label, so that it can be picked up in the future. |
🐙 This pull request conflicts with the target branch and needs rebase. |
There hasn't been much activity lately and the patch still needs rebase. What is the status here?
|
⌛ There hasn't been much activity lately and the patch still needs rebase. What is the status here?
|
🤔 There hasn't been much activity lately and the CI seems to be failing. If no one reviewed the current pull request by commit hash, a rebase can be considered. While the CI failure may be a false positive, the CI hasn't been running for some time, so there may be a real issue hiding as well. A rebase triggers the latest CI and makes sure that no silent merge conflicts have snuck in. |
Closing for now. 4 pings and 0 followup. From the discussion here it's not even clear that there is buy-in from other contributers that this is a change we want/is ultimately beneficial. |
In the current scenario,
anchors.dat
is deleted right after reading it, however, if my node stops before trying to connect to the anchors peers, it will create an emptyanchors.dat
. This PR changes it to deleteanchors.dat
after trying to connect to the peers from it. So, if my node stops (in a clean way) before trying to connect to that peers, theanchors.dat
file will be preserved and a dump won't happen.