-
Notifications
You must be signed in to change notification settings - Fork 37.7k
Assumeutxo: Sanitize block height in metadata #30516
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code CoverageFor detailed information about the code coverage, see the test coverage report. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
if (m_base_blockheight > static_cast<uint32_t>(std::numeric_limits<int>::max())) { | ||
throw std::ios_base::failure("Block height is out of range."); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fix isn't enough. The file may be completely untrusted and corrupt, so just because the block height is a signed 32-bit, but positive integer, doesn't mean that the height is correct.
You'll have to check that the block height corresponds to the block hash, otherwise I fail to see how it can be used without adding bugs or risks or confusion. However, if you do the check, you may as well discard the deserialized value and just use the value from the check.
So again, my recommendation would be to remove it, because either:
- The value is untrusted, thus can not be used
- The value is checked, in which case it is redundant with the value from the check
An alternative would be to write a full file-content hash in the metadata as the second field (after the file magic), covering the whole file after the second field. This way, the file-content hash would be retrieved from the trusted chainparams, then calculated and compared. Afterwards the code can assume that all fields in the checked file are trusted to some extent, which would allow using the block height value.
However, that is a more involved fix, so my recommendation for now would be to drop the value.
Suggested diff:
diff --git a/src/node/utxo_snapshot.h b/src/node/utxo_snapshot.h
index a7c4135787..bd5b68ba30 100644
--- a/src/node/utxo_snapshot.h
+++ b/src/node/utxo_snapshot.h
@@ -28,16 +28,17 @@ class Chainstate;
namespace node {
//! Metadata describing a serialized version of a UTXO set from which an
//! assumeutxo Chainstate can be constructed.
+//! All metadata fields come from an untrusted file, so must be validated
+//! before being used. Thus, new fields should be added only if needed.
class SnapshotMetadata
{
- const uint16_t m_version{1};
- const std::set<uint16_t> m_supported_versions{1};
+ static const uint16_t VERSION{2};
+ const std::set<uint16_t> m_supported_versions{VERSION};
const MessageStartChars m_network_magic;
public:
//! The hash of the block that reflects the tip of the chain for the
//! UTXO set contained in this snapshot.
uint256 m_base_blockhash;
- uint32_t m_base_blockheight;
//! The number of coins in the UTXO set contained in this snapshot. Used
@@ -54,15 +55,13 @@ public:
uint64_t coins_count) :
m_network_magic(network_magic),
m_base_blockhash(base_blockhash),
- m_base_blockheight(base_blockheight),
m_coins_count(coins_count) { }
template <typename Stream>
inline void Serialize(Stream& s) const {
s << SNAPSHOT_MAGIC_BYTES;
- s << m_version;
+ s << VERSION;
s << m_network_magic;
- s << m_base_blockheight;
s << m_base_blockhash;
s << m_coins_count;
}
@@ -98,7 +97,6 @@ public:
}
}
- s >> m_base_blockheight;
s >> m_base_blockhash;
s >> m_coins_count;
}
diff --git a/src/rpc/blockchain.cpp b/src/rpc/blockchain.cpp
index 9899a13a1e..ee72fb2c20 100644
--- a/src/rpc/blockchain.cpp
+++ b/src/rpc/blockchain.cpp
@@ -2862,10 +2862,12 @@ static RPCHelpMan loadtxoutset()
throw JSONRPCError(RPC_INTERNAL_ERROR, strprintf("Unable to load UTXO snapshot: %s. (%s)", util::ErrorString(activation_result).original, path.utf8string()));
}
+ CBlockIndex& snap{*CHECK_NONFATAL(*activation_result)};
+
UniValue result(UniValue::VOBJ);
result.pushKV("coins_loaded", metadata.m_coins_count);
- result.pushKV("tip_hash", metadata.m_base_blockhash.ToString());
- result.pushKV("base_height", metadata.m_base_blockheight);
+ result.pushKV("tip_hash", snap.GetBlockHash().ToString());
+ result.pushKV("base_height", snap.nHeight);
result.pushKV("path", fs::PathToString(path));
return result;
},
diff --git a/src/validation.cpp b/src/validation.cpp
index 2b8f64e81a..0b101d1551 100644
--- a/src/validation.cpp
+++ b/src/validation.cpp
@@ -5633,31 +5633,31 @@ Chainstate& ChainstateManager::InitializeChainstate(CTxMemPool* mempool)
return destroyed && !fs::exists(db_path);
}
-util::Result<void> ChainstateManager::ActivateSnapshot(
+util::Result<CBlockIndex*> ChainstateManager::ActivateSnapshot(
AutoFile& coins_file,
const SnapshotMetadata& metadata,
bool in_memory)
{
uint256 base_blockhash = metadata.m_base_blockhash;
- int base_blockheight = metadata.m_base_blockheight;
if (this->SnapshotBlockhash()) {
return util::Error{Untranslated("Can't activate a snapshot-based chainstate more than once")};
}
+ CBlockIndex* snapshot_start_block{};
+
{
LOCK(::cs_main);
if (!GetParams().AssumeutxoForBlockhash(base_blockhash).has_value()) {
auto available_heights = GetParams().GetAvailableSnapshotHeights();
std::string heights_formatted = util::Join(available_heights, ", ", [&](const auto& i) { return util::ToString(i); });
- return util::Error{strprintf(Untranslated("assumeutxo block hash in snapshot metadata not recognized (hash: %s, height: %s). The following snapshot heights are available: %s"),
+ return util::Error{strprintf(Untranslated("assumeutxo block hash in snapshot metadata not recognized (hash: %s). The following snapshot heights are available: %s"),
base_blockhash.ToString(),
- base_blockheight,
heights_formatted)};
}
- CBlockIndex* snapshot_start_block = m_blockman.LookupBlockIndex(base_blockhash);
+ snapshot_start_block = m_blockman.LookupBlockIndex(base_blockhash);
if (!snapshot_start_block) {
return util::Error{strprintf(Untranslated("The base block header (%s) must appear in the headers chain. Make sure all headers are syncing, and call loadtxoutset again"),
base_blockhash.ToString())};
@@ -5668,7 +5668,7 @@ util::Result<void> ChainstateManager::ActivateSnapshot(
return util::Error{strprintf(Untranslated("The base block header (%s) is part of an invalid chain"), base_blockhash.ToString())};
}
- if (!m_best_header || m_best_header->GetAncestor(base_blockheight) != snapshot_start_block) {
+ if (!m_best_header || m_best_header->GetAncestor(snapshot_start_block->nHeight) != snapshot_start_block) {
return util::Error{_("A forked headers-chain with more work than the chain with the snapshot base block header exists. Please proceed to sync without AssumeUtxo.")};
}
@@ -5782,7 +5782,7 @@ util::Result<void> ChainstateManager::ActivateSnapshot(
m_snapshot_chainstate->CoinsTip().DynamicMemoryUsage() / (1000 * 1000));
this->MaybeRebalanceCaches();
- return {};
+ return snapshot_start_block;
}
static void FlushSnapshotToDisk(CCoinsViewCache& coins_cache, bool snapshot_loaded)
diff --git a/src/validation.h b/src/validation.h
index 08e672c620..0638bbe983 100644
--- a/src/validation.h
+++ b/src/validation.h
@@ -1098,7 +1098,7 @@ public:
//! faking nTx* block index data along the way.
//! - Move the new chainstate to `m_snapshot_chainstate` and make it our
//! ChainstateActive().
- [[nodiscard]] util::Result<void> ActivateSnapshot(
+ [[nodiscard]] util::Result<CBlockIndex*> ActivateSnapshot(
AutoFile& coins_file, const node::SnapshotMetadata& metadata, bool in_memory);
//! Once the background validation chainstate has reached the height which
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot to mention this earlier, but to reply to your comment on IRC that this is only used in "uncritical" places:
It is currently used in validation.cpp
, added as part of #30320.
This means:
- The check accidentally serves as a validity check for the untrusted or possibly corrupt metadata value. (I don't think this was intentional by the author of the pull request; If it was, it would be good to document it)
- This may be used in new places in the future without proper checks, leading to more issues down the line.
- In an otherwise fully valid utxo snapshot dump, with a single bit flip in the block height field in the metadata, the error message for the user will not be: "Corrupt metadata, block height corrupt", but instead "A forked headers-chain with more work than the chain with the snapshot base block header exists. Please proceed to sync without AssumeUtxo."
This seems confusing at best.
Steps to reproduce should be something like:
diff --git a/test/functional/feature_assumeutxo.py b/test/functional/feature_assumeutxo.py
index da7cabf87c..0c4eb126ed 100755
--- a/test/functional/feature_assumeutxo.py
+++ b/test/functional/feature_assumeutxo.py
@@ -68,8 +68,9 @@ class AssumeutxoTest(BitcoinTestFramework):
parsing_error_code = -22
bad_magic = 0xf00f00f000
with open(bad_snapshot_path, 'wb') as f:
- f.write(bad_magic.to_bytes(5, "big") + valid_snapshot_contents[5:])
+ f.write(valid_snapshot_contents[:11]+(1).to_bytes(4, "little")+valid_snapshot_contents[15:])
assert_raises_rpc_error(parsing_error_code, "Unable to parse metadata: Invalid UTXO set snapshot magic bytes. Please check if this is indeed a snapshot file or if you are using an outdated snapshot format.", node.loadtxoutset, bad_snapshot_path)
+ assert False
self.log.info(" - snapshot file with unsupported version")
for version in [0, 2]:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I don't think this was intentional by the author of the pull request;
No, it wasn't intentional, I didn't consider the possibility of incorrect metadata - might simply use snapshot_start_block->nHeight
instead of base_blockheight
though I think?
From the issue:
How? I could see how situations described in #30288 would require to add the block hash in a hypothetical scenario where we only saved the height - but we are in the reverse scenario where we already have the block hash, so I can't really think of a scenario where haveing the block height would help. I also didn't find any discussion about the reason for the block height in #29612 on a quick glance (but didn't expand every comment). |
The height in the metadata clearly tells us which height the hash is expected at. Not having the same hash at that height clearly tells us that the node is on a different chain than was expected by the metadata. That is better than just saying "we don't know this hash for whatever reason".
It was suggested here and when I added it along with the other suggested values, all reviewers ACK-ed it.
Do you mean the serialized tx hash here? That's different from the file content hash. We do have the serialized hash in the trusted assumeutxo data but not the file hash. I don't think we want to have it in there because we want to force users of the data to calculate it themselves. And if there is a mismatch it's just the same as with the block height where it seems to be an issue for you that it may mismatch. Or are you arguing that the file hash should be added to the chainparams as well? It's unclear to me, honestly.
The block height is intended as additional information that can help users debug what the issue is, like the error message in the suggested change above where it is removed. It doesn't need to be checked for that. If the file is corrupt or the network is found in an unexpected state, it can be helpful as additional information as described above.
I don't think it's fair criticism to say "this is confusing when I introduce a magic bitflip in this exact place". There are literally infinite places all over the code base where that is true and we won't get anywhere if this is the bar we need to meet. If I can just flip any bit in your block storage or utxo set, we are completely screwed and most importantly you probably wouldn't even see an error until it's too late while in the case discussed here, there is a failure right away (if we check) or there is no issue at all because the value is not used for anything critical as outlined before. |
The arguments here don't convince me yet that the height really has to be dropped necessarily but I also seem to be less passionate about this issue than @maflcko et al. So I opened the suggested approach of removing the block height in #30598 and I pinged other reviewers of the original metadata PR to weigh in shortly. Ultimately what matters most to me is that we finally get this feature into the hands of users and I am fine to go along with either approach if it avoids further blocking of that. |
Ah, ok, my point is that the height is implied by the block hash, together with the block index db of the node. If the block tree db doesn't contain a header with that block hash, then we abort anyway. If it contains the index but we don't have a registered snapshot for this height in the chainparams, we can look the attempted height up from the block index, e.g. for a user-facing error message. |
Not sure if this is a convincing argument for anyone, but I can at least see a point of keeping the block height for purely informative purposes, in case the snapshot is processed by external tools which don't have access to the block index, e.g. #27432 (or, at some point even the https://linux.die.net/man/1/file utility, which can not only determine a file type, but also emit a short meta-data summary, if properly implemented in the magic file). Wouldn't it be an alternative to simply keep the format as-is, but use the block height only for an immediate sanity check in Bitcoin Core, i.e. throw if it doesn't match the included block hash? |
Just to clarify, I am happy to review and ack this approach here, but I haven't acked the current state of the pull request, because it leaves bugs unfixed. I haven't tried it, but #30516 (comment) (the error message for the user will not be: "Corrupt metadata, block height corrupt", but instead "A forked headers-chain with more work than the chain with the snapshot base block header exists. Please proceed to sync without AssumeUtxo.") should still be reproducible on top of this branch currently. If you keep the metadata, it should be renamed to Other than that, I agree this is mostly a bike-shedding discussions whether or not external tools (or error messages) display the height and hash, or just the hash. |
@maflcko: I think the simplest (and IMHO most obvious) solution would be to just verify the height as well, then all of the problems you described should be solved? E.g. with something like diff --git a/src/validation.cpp b/src/validation.cpp
index d8c1c27aae..612bb960e5 100644
--- a/src/validation.cpp
+++ b/src/validation.cpp
@@ -5660,10 +5660,11 @@ util::Result<void> ChainstateManager::ActivateSnapshot(
{
LOCK(::cs_main);
- if (!GetParams().AssumeutxoForBlockhash(base_blockhash).has_value()) {
+ auto maybe_assumeutxo_data = GetParams().AssumeutxoForBlockhash(base_blockhash);
+ if (!maybe_assumeutxo_data.has_value() || maybe_assumeutxo_data->height != base_blockheight) {
auto available_heights = GetParams().GetAvailableSnapshotHeights();
std::string heights_formatted = util::Join(available_heights, ", ", [&](const auto& i) { return util::ToString(i); });
- return util::Error{strprintf(Untranslated("assumeutxo block hash in snapshot metadata not recognized (hash: %s, height: %s). The following snapshot heights are available: %s"),
+ return util::Error{strprintf(Untranslated("assumeutxo block hash/height in snapshot metadata not recognized (hash: %s, height: %s). The following snapshot heights are available: %s"),
base_blockhash.ToString(),
base_blockheight,
heights_formatted)}; |
I haven't tried to compile that suggestion, but it'll probably fail |
Indeed compiling this emits warning
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested this on master
for bad_block_hash in [bogus_block_hash, prev_block_hash]: | ||
for bad_height in [bogus_height, signed_overflow_height]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the description: "The first commit shows how to reproduce the error shown in #30514 and how it could be fixed with a minimal change. The second commit adds an explicit sanitizer that will probably be the preferred approach. I can squash the commits but first I would like to hear conceptual feedback since #30514 suggested to remove the block height instead."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I was still hunting for the conceptual consensus and didn't change this yet. I think we are coming to an end though (see my comment below). It would be great if you could indicate where you stand on this though @ismaelsadeeq . If more support for the approach here I will squash the commits and implement the suggested improvements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fjahr, I reviewed the code primarily to address PRs targeting issues in the v28.0 milestone. I haven’t gone through all the conceptual discussions here, as I’m still getting up to speed with the direction of the assumeutxo project.
@@ -5639,7 +5639,7 @@ util::Result<void> ChainstateManager::ActivateSnapshot( | |||
bool in_memory) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AssumeutxoData::height
is int
, IMO would be better for them to be the same type, so should we consider changing it also to uint32_t
? this will avoid the issue #30516 (comment) during comparison or having to explicitly cast one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We would probably have to cast somewhere else then because nHeight
is int
.
Thanks, that was not clear to me indeed.
I don't understand how we can fix that issue to work perfectly for all cases. IMO it is unfixable because we can not know if a value that doesn't match is corrupt or not a match. If there is a bitflip in the version field we will show an error message that the version doesn't match and not that the version field is corrupt. If there is a bitflip in the network magic we will show an error that the network doesn't match and not that the network magic is corrupt and so on... |
A little recap so far: While I still don’t find the arguments against having the height included convincing I am currently leaning towards going with the approach in #30598 and removing it for now. My reasoning is as follows:
The primary downside to removing the height seems to be that we need to produce a new torrent but I hope @Sjors would be fine with that and I am currently the only reviewer of #28553 as well. I will keep this PR here open for discussion for at least a few more hours at least since there may be some contributors in other timezones that were not able to comment yet. But if no new findings arise I will move forward with #30598 for reasons outlined above. |
I don't mind making a new torrent. I'm still puzzled why all of a sudden a single integer in a serialized file is such a big deal that removing it is simpler than fixing it. But I haven't reviewed this PR yet. It's true that we can add it back later, but it seems wasteful if we first have to review #30598 only to revert it later and still having to review the changes here. Even if we never want it back, it seems very likely we'll add an integer to some other serialized thing in the future. We should just be able to do that. But again, maybe there's something really special about the integer here that I missed. Approach wise, it makes sense to me to have two commits. |
This PR fixes the undefined behavior that the sanitizer picked up in #30514. I think that's sufficient for now. We can improve handling of corrupt and/or malicious files later, which (unless I missed something) is what @maflcko's critique is about. Afaik any corrupt (accidentally or otherwise) snapshot file will ultimately be rejected by the node because Assuming I didn't miss anything in my reasoning above: ACK 51f197b Using 98e119d I was able to reproduce the test failure by undoing the change in For the second commit maybe add to the commit message:
Though it's easier if you change the first commit in a way that avoids this code churn in the second commit, i.e. by copy-pasting rather than expanding the loop. |
@@ -99,6 +99,9 @@ class SnapshotMetadata | |||
} | |||
|
|||
s >> m_base_blockheight; | |||
if (m_base_blockheight > static_cast<uint32_t>(std::numeric_limits<int>::max())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
51f197b maybe use int32_t
instead of int
for clarity?
std::numeric_limits<int32_t>::max())
I don't think this is UB. Personally I couldn't care less about a implicit fuzz integer issue, and I don't think any production user cares either, because fuzz tests aren't run in production. I don't see the value in only silencing the fuzz target, but leaving the other reported bugs unfixed. If silencing the fuzz target was a goal, it could be done with a trivial temporary one-line patch to the suppressions file. Once there is a real fix, the temporary suppression can be removed. I just don't see the point of changing real code temporarily, and then revisit it in a "follow-up". Either the reported bugs are worthy to fix, in which case they should be fixed, or they are not, in which case they shouldn't be fixed and this can just be replaced with a sanitizer suppression. |
It doesn't seem like we can come to an agreement quickly here, so I am closing this in favor of #30598 so we can move forward with the mainnet params. I would appreciate your reviews there, thank you! |
00618e8 assumeutxo: Drop block height from metadata (Fabian Jahr) Pull request description: Fixes #30514 which has more context and shows how the issue can be reproduced. Since the value in question is removed, there is no test to add to reproduce anything. This is an alternative approach to #30516 with much of the [code being suggested there](#30516 (comment)). ACKs for top commit: maflcko: re-ACK 00618e8 🎌 achow101: ACK 00618e8 theStack: Code-review ACK 00618e8 ismaelsadeeq: Re-ACK 00618e8 mzumsande: ACK 00618e8 Tree-SHA512: db9575247bae838ad7742a27a216faaf55bb11e022f9afdd05752bb09bbf9614717d0ad64304ff5722a16bf41d8dea888af544e4ae26dcaa528c1add0269a4a8
Fixes #30514 which has more context.
The block height in the assumeutxo metadata was implicitly cast to
int
which could cause it to be interpreted as signed.The first commit shows how to reproduce the error shown in #30514 and how it could be fixed with a minimal change. The second commit adds an explicit sanitizer that will probably be the preferred approach. I can squash the commits but first I would like to hear conceptual feedback since #30514 suggested to remove the block height instead.