-
Notifications
You must be signed in to change notification settings - Fork 37.7k
wallet: Implement independent BDB parser #26606
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code CoverageFor detailed information about the code coverage, see the test coverage report. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
e18fdbf
to
f135c19
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some review nibbles.
src/streams.h
Outdated
@@ -530,6 +530,14 @@ class AutoFile | |||
// | |||
// Stream subset | |||
// | |||
void seek(int64_t offset, int origin) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1d093bb: Note for other reviewers: yes
src/streams.h
Outdated
if (!file) | ||
throw std::ios_base::failure("CAutoFile::read: file handle is nullptr"); | ||
if (fseek(file, offset, origin) != 0) | ||
throw std::ios_base::failure(feof(file) ? "CAutoFile::seek: end of file" : "CAutoFile::seek: fseek failed"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1d093bb: maybe log the exact error code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's kind of annoying to do that, so I'll leave it as is. The other functions do the same.
// file COPYING or http://www.opensource.org/licenses/mit-license.php. | ||
|
||
#ifndef BITCOIN_WALLET_MIGRATE_H | ||
#define BITCOIN_WALLET_MIGRATE_H |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1224f45: maybe call this file bdb_ro_wallet_db.h
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The intent is to have all of the migration stuff in these files, so I will leave it as is.
Concept ACK. Have you looked into a fuzz target for this parser? |
I haven't. If someone would like to write one, I'd be happy to add it. |
2d61534
to
f2dc015
Compare
Adding a fuzz target that compares this to the real thing seems useful. Given that this PR changes It seems a bit daunting to compare the low level DBD parsing code in this PR to the real thing. Having a fuzzer also reduces the need to be very thorough in that. I'm not sure how much confidence I have in the existing test coverage for dumpwallet. |
944a020
to
f108c6b
Compare
Byteswapped databases make it easier to test opening and deserializing other endian databases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re-ACK d51fbab
reACK d51fbab |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reACK d51fbab
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re-ACK d51fbab
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK d51fbab
Convinced myself that this parser works reliably partly by looking at the original code, partly by extending the test framework's BDB parser to match this PR's logic (see #30125), which helped a lot in understanding the structure. Also tested with a few random signet legacy wallets that I still had around, though those were all quite small.
Btw, I couldn't manage to find or create a BDB wallet where the DELETE flag is set in a record header (would be nice to exercise that in a test, though it's hard to verify that it's really set without inspecting manually, or e.g. adding debug messages on deserialization), but the code to handle that is simple enough and looks correct.
error.original == "Page number mismatch" || | ||
error.original == "Bad btree level" || | ||
error.original == "Bad page size" || | ||
error.original == "File size is not a multiple of page size" || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: isn't thrown in the parser (commented out), hence can be removed here
error.original == "File size is not a multiple of page size" || |
error.original == "Unexpected number of entries in outer database root page" || | ||
error.original == "Subdatabase has an unexpected name" || | ||
error.original == "Subdatabase page number has unexpected length" || | ||
error.original == "Unexpected inner database page type" || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: isn't thrown anywhere, can be removed
error.original == "Unexpected inner database page type" || |
@@ -24,11 +24,14 @@ class ToolWalletTest(BitcoinTestFramework): | |||
def add_options(self, parser): | |||
self.add_wallet_options(parser) | |||
parser.add_argument("--bdbro", action="store_true", help="Use the BerkeleyRO internal parser when dumping a Berkeley DB wallet file") | |||
parser.add_argument("--swap-bdb-endian", action="store_true",help="When making Legacy BDB wallets, always make then byte swapped internally") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo nit
parser.add_argument("--swap-bdb-endian", action="store_true",help="When making Legacy BDB wallets, always make then byte swapped internally") | |
parser.add_argument("--swap-bdb-endian", action="store_true",help="When making Legacy BDB wallets, always make them byte swapped internally") |
I wonder if it makes sense to collect the fuzz inputs to qa-assets, from testers that still have them. However, the fuzz CI does not compile with BDB, so that'll probably need to be adjusted first. |
…l page sizes) This aims to complete our test framework BDB parser to reflect our read-only BDB parser in the wallet codebase. This could be useful both for making review of bitcoin#26606 easier and to also possibly improve our functional tests for the BDB parser by comparing with an alternative implementation.
Ported to the CMake-based build system in hebasto#220. |
6825523 connection: run async cleanups in LIFO not FIFO order (Ryan Ofsky) Pull request description: Run Connection class asynchronous cleanups in LIFO not FIFO order, which is more natural ordering and prevents a bitcoin wallet shutdown deadlock when the connection to the node process is broken. Also add comments better documenting cleanup order. This change fixes one of two shutdown deadlocks in the bitcoin wallet when the node process is killed in bitcoin/bitcoin#10102 as of bitcoin/bitcoin@3f12b43. The other deadlock is caused by the `ProxyServerCustom<WalletLoader>` destructor in that PR and is described in commit efe42cc from #100. The bitcoin wallet shutdown deadlocks have probably been around for some time, but were not encountered because before bitcoin/bitcoin@0b75315 from bitcoin/bitcoin#26606 there weren't any tests which killed the bitcoin node process and required the bitcoin-wallet process to shut down gracefully. Top commit has no ACKs. Tree-SHA512: cbbedcc71e4c3ebb541d95f2444151c4058fff3b6357bed5c1b2929d5311fd2a621aeed1af7f56a6159bb3c1ac8abf051acc22830f748edaa6b61ecd557f5074
…thout requiring BDB 8ce3739 test: verify wallet is still active post-migration failure (furszy) 771bc60 wallet: Use LegacyDataSPKM when loading (Ava Chow) 61d872f wallet: Move MigrateToDescriptor and DeleteRecords to LegacyDataSPKM (Ava Chow) b231f4d wallet: Move LegacyScriptPubKeyMan::IsMine to LegacyDataSPKM (Ava Chow) 7461d0c wallet: Move LegacySPKM data storage and handling to LegacyDataSPKM (Ava Chow) 517e204 Change MigrateLegacyToDescriptor to reopen wallet as BERKELEY_RO (Ava Chow) Pull request description: #26606 introduced `BerkeleyRODatabase` which is an independent parser for BDB files. This PR uses this in legacy wallet migration so that migration will continue to work once the legacy wallet and BDB are removed. `LegacyDataSPKM` is introduced to have the minimum data and functions necessary for a legacy wallet to be loaded for migration. ACKs for top commit: cbergqvist: ACK 8ce3739 theStack: Code-review ACK 8ce3739 furszy: Code review ACK 8ce3739 Tree-SHA512: dccea12d6c597de15e3e42f97ab483cfd069e103611200279a177e021e8e9c4e74387c4f45d2e58b3a1e7e2bdb32a1d2d2060b1f8086c03eeaa0c68579d9d54e
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like an inconsistency in BDB itself as the failure is on BDB loading the database file. Additionally, since this test case passes on linux, I'm not entirely sure what to do about it. This will also be resolved once BDB is removed. |
…l page sizes) This aims to complete our test framework BDB parser to reflect our read-only BDB parser in the wallet codebase. This could be useful both for making review of bitcoin#26606 easier and to also possibly improve our functional tests for the BDB parser by comparing with an alternative implementation.
…s, support all page sizes) d45eb39 test: compare BDB dumps of test framework parser and wallet tool (Sebastian Falbesoner) 01ddd9f test: complete BDB parser (handle internal/overflow pages, support all page sizes) (Sebastian Falbesoner) Pull request description: This PR adds missing features to our test framework's BDB parser with the goal of hopefully being able to read all legacy wallets that are created with current and past versions of Bitcoin Core. This could be useful both for making review of #26606 easier and to also possibly improve our functional tests for the wallet BDB-ro parser by additionally validating it with an alternative implementation. The second commits introduces a test that create a legacy wallet with huge label strings (in order to create overflow pages, i.e. pages needed for key/value data than is larger than the page size) and compares the dump outputs of wallet tool and the extended test framework BDB parser. It can be exercised via `$ ./test/functional/tool_wallet.py --legacy`. BDB support has to be compiled in (obviously). For some manual tests regarding different page sizes, the following patch can be used: ```diff diff --git a/src/wallet/bdb.cpp b/src/wallet/bdb.cpp index 38cca32..1bf39323d3 100644 --- a/src/wallet/bdb.cpp +++ b/src/wallet/bdb.cpp @@ -395,6 +395,7 @@ void BerkeleyDatabase::Open() DB_BTREE, // Database type nFlags, // Flags 0); + pdb_temp->set_pagesize(1<<9); /* valid BDB pagesizes are from 1<<9 (=512) to <<16 (=65536) */ if (ret != 0) { throw std::runtime_error(strprintf("BerkeleyDatabase: Error %d, can't open database %s", ret, strFile)); ``` I verified that the newly introduced test passes with all valid page sizes between 512 and 65536. ACKs for top commit: achow101: ACK d45eb39 furszy: utACK d45eb39 brunoerg: code review ACK d45eb39 Tree-SHA512: 9f8ac80452545f4fcd24a17ea6f9cf91b487cfb1fcb99a0ba9153fa4e3b239daa126454e26109fdcb72eb1c76a4ee3b46fd6af21dc318ab67bd12b3ebd26cfdd
Split from #26596
This PR adds
BerkeleyRODatabase
which is an independent implementation of a BDB file parser. It provides read only access to a BDB file, and can therefore be used as a read only database backend for wallets. This will be used for dumping legacy wallet records and migrating legacy wallets without the need for BDB itself.Wallettool's
dump
command is changed to useBerkeleyRODatabase
instead ofBerkeleyDatabase
(andCWallet
itself) to demonstrate that this parser works and to test it against the existing wallettool functional tests.