wallet: Load database records in a particular order #24914

achow101 · 2022-04-18T21:18:28Z

Currently when we load a wallet, we just iterate through all of the records in the database and add them completely statelessly. However we have some records which do rely on other records being loaded before they are. To deal with this, we use CWalletScanState to hold things temporarily until all of the records have been read and then we load the stateful things.

However this can be slow, and with some future improvements, can cause some pretty drastic slowdowns to retain this pattern. So this PR changes the way we load records by choosing to load the records in a particular order. This lets us do things such as loading a descriptor record, then finding and loading that descriptor's cache and key records. In the future, this will also let us use IsMine when loading transactions as then IsMine will actually be working as we now always load keys and descriptors before transactions.

In order to get records of a specific type, this PR includes some refactors to how we do database cursors. Functionality is also added to retrieve a cursor that will give us records beginning with a specified prefix.

Lastly, one thing that iterating the entire database let us do was to find unknown records. However even if unknown records were found, we would not do anything with this information except output a number in a log line. With this PR, we would no longer be aware of any unknown records. This does not change functionality as we don't do anything with unknown records, and having unknown records is not an error. Now we would just not be aware that unknown records even exist.

DrahtBot · 2022-04-19T01:59:23Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	MarcoFalke, furszy, ryanofsky
Concept ACK	theStack, pk-b2, laanwj
Approach ACK	w0xlt

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#27920 (wallet: bugfix, always use apostrophe for spkm descriptor ID by furszy)
#27865 (wallet: Track no-longer-spendable TXOs separately by achow101)
#27286 (wallet: Keep track of the wallet's own transaction outputs in memory by achow101)
#26836 (wallet: simplify addressbook migration and encapsulate access by furszy)
#26728 (wallet: Have the wallet store the key for automatically generated descriptors by achow101)
#26596 (wallet: Migrate legacy wallets to descriptor wallets without requiring BDB by achow101)
#25907 (wallet: rpc to add automatically generated descriptors by achow101)
#22341 (rpc: add getxpub by Sjors)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

theStack · 2022-05-02T08:13:14Z

Concept ACK

w0xlt

Approach ACK.
Getting rid of CWalletScanState and ReadKeyValue and moving the read logic to WalletBatch::LoadWallet makes the code clearer and easier to understand.

pk-b2 · 2022-05-06T20:52:28Z

Concept ACK

pk-b2 · 2022-05-06T21:00:55Z

src/wallet/sqlite.cpp

+    std::unique_ptr<SQLiteCursor> cursor = std::make_unique<SQLiteCursor>(start_range, end_range);
+    if (!cursor) return nullptr;
+
+    const char* stmt_text = "SELECT key, value FROM main WHERE key >= ? AND key < ?";


> explain query plan SELECT key, value FROM main WHERE key >= ? AND key < ?

Verified that this uses an index avoiding a table scan

SEARCH TABLE main USING INDEX sqlite_autoindex_main_1 (key>? AND key<?)`

src/wallet/sqlite.cpp

laanwj · 2022-05-31T16:51:08Z

Concept ACK on querying specifically what is needed from the database when loading. Maybe some day we don't even have to read all records into memory.

ryanofsky

Code review ACK 973d910

Looks good, probably ready for merge with a third ACK.

src/wallet/walletdb.cpp

ryanofsky

Code review ACK 7cb7740. Just more suggested error handling tweaks since last review

furszy

reACK 7cb7740

src/wallet/walletdb.cpp

maflcko

lgtm ACK 7cb7740 🍖

Show signature

Signature:

untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
trusted comment: lgtm ACK 7cb774048c5f04b3e20d95e794c5a350ea4eff97 🍖
wuc4npVuTb3Yb6tNza4MK7G2Z6qzHiajO9EsxCW7ib4DxRzWDxs552TGpGVPJxZ5qafuYSW65Z4/WLbHadprCA==

maflcko · 2023-06-26T14:40:59Z

src/wallet/walletdb.cpp

+    result = std::max(result, locked_utxo_res.m_result);
+
+    // Load orderposnext record
+    // Note: There should only be one ORDERPOSNEXT record with nothing trailng the type


dc1ef26: trailing (typo)

maflcko · 2023-06-27T10:28:17Z

src/wallet/walletdb.cpp

+struct LoadResult
+{
+    DBErrors m_result{DBErrors::LOAD_OK};
+    std::string m_error{};


nit in 20d1fc5: I think this is still wrong? Currently this only collects the first error (in cases where assigning the error is guarded by an empty check). Also, it is unused by the caller. Suggested fix:

diff --git a/src/wallet/walletdb.cpp b/src/wallet/walletdb.cpp index 1fab6a7f9c..e466c1c477 100644 --- a/src/wallet/walletdb.cpp +++ b/src/wallet/walletdb.cpp @@ -754,7 +754,6 @@ static DBErrors LoadWalletFlags(CWallet* pwallet, DatabaseBatch& batch) EXCLUSIV struct LoadResult { DBErrors m_result{DBErrors::LOAD_OK}; - std::string m_error{}; int m_records{0}; }; @@ -786,9 +785,10 @@ static LoadResult LoadRecords(CWallet* pwallet, DatabaseBatch& batch, const std: std::string type; ssKey >> type; assert(type == key); - DBErrors record_res = load_func(pwallet, ssKey, ssValue, result.m_error); + std::string log_error{}; + DBErrors record_res = load_func(pwallet, ssKey, ssValue, log_error); if (record_res != DBErrors::LOAD_OK) { - pwallet->WalletLogPrintf("%s\n", result.m_error); + pwallet->WalletLogPrintf("%s\n", log_error); } result.m_result = std::max(result.m_result, record_res); ++result.m_records;

Also, the commit description seems to imply behavior changed where it didn't, no?

Suggested range diff:

8: 20d1fc5061 ! 8: 0bf8188513 walletdb: Refactor legacy wallet record loading into its own function @@ Commit message Exceptions are handled on a per-record type basis, rather than globally. For private keys in a legacy wallet and the encryption keys, - a deserialization error will result in an early return rather than + a deserialization error will result in a contined loading of the remaining keys, with the resulting wallet - failing to load. For other record types, errors that were previously + failing to load, unchanged from previous behavior. For other record types, errors that were previously handled by the global exception handler will result in loading failure + , also skipping the continued loading of remaining keys, instead of a non-critical error.

For "other record types", could also unify the section with the section in the next commit, which should be the same?

Exception handling for these records changes to a per-record type basis, rather than globally. This results in some records now failing with a critical error rather than a non-critical one.

Also, could mention that for other record types, previous early returns from LoadWallet like UNEXPECTED_LEGACY_ENTRY would now be scheduled for later, and it continues to load more prefix-keys?

Implemented the suggested changes.

Also, could mention that for other record types, previous early returns from LoadWallet like UNEXPECTED_LEGACY_ENTRY would now be scheduled for later, and it continues to load more prefix-keys?

I don't think that's necessarily true? I'm planning on revisiting the error handling in a followup.

You can see the early return in LoadWallet in master here:

bitcoin/src/wallet/walletdb.cpp

Line 863 in 7ee4121

return DBErrors::UNEXPECTED_LEGACY_ENTRY;

In this pull you can see that it calls std::max and then continues with LoadWallet:

bitcoin/src/wallet/walletdb.cpp

Lines 1165 to 1168 in 3c83b1d

result = std::max(LoadLegacyWalletRecords(pwallet, *m_batch, last_client), result);

// Load descriptors

result = std::max(LoadDescriptorWalletRecords(pwallet, *m_batch, last_client), result);

What I meant was that there are records that now early return but didn't previously, but I suppose that's the opposite of what you were talking about.

src/wallet/walletdb.cpp

maflcko · 2023-06-27T12:26:43Z

Also, it may be good to add tests for currently untested code: https://marcofalke.github.io/b-c-cov/total.coverage/src/wallet/wallet.cpp.gcov.html#2926

ryanofsky · 2023-06-27T13:56:39Z

With 3 up-to-date acks, and all the review that's happened so far, I think this looks ready to merge now. But will wait to see if @achow101 wants to respond to newest comments.

I like Marco's error handling suggestions since they simplify code and just print all the errors. If it turns out there are this cases where this repeats too many errors, it seems like it would be easy to add a bit of code to LoadRecords that just compares new error messages to previous and adds [Repeated ## times] to the log.

It would also be great to to look at coverage link #24914 (comment) and list places where we should try to add missing test coverage.

maflcko · 2023-06-27T14:00:51Z

If it turns out there are this cases where this repeats too many errors

I think that'd be the case on current master as well. My comment was that this pull may repeat the first error message, even though there may be a second one.

ryanofsky · 2023-06-27T14:12:16Z

I think that'd be the case on current master as well. My comment was that this pull may repeat the first error message, even though there may be a second one.

Makes sense, I missed that. I thought the code was only trying to print one error message, and didn't notice it would print it multiple times. I like your fix of just printing all the messages instead of printing one, but either way to fix this seems fine.

furszy · 2023-06-27T14:48:55Z

Also, it may be good to add tests for currently untested code: https://marcofalke.github.io/b-c-cov/total.coverage/src/wallet/wallet.cpp.gcov.html#2926

I actually have a PR planned to remove most of those lines. The string errors there are duplicated from the internal walletdb errors. We can just bubble up the error instead of swallowing it.

Also, in the unit tests, we usually call CWallet::LoadWallet instead of CWallet::Create, thus why some of those paths are untested (like the UNKNOWN_DESCRIPTOR one, which is tested here).

That being said, totally agree on adding much more coverage to the area.

Instead of loading legacy wallet records as we come across them when iterating the database, load them explicitly. Exception handling for these records changes to a per-record type basis, rather than globally. This results in some records now failing with a critical error rather than a non-critical one.

Instead of loading descriptor wallet records as we come across them when iterating the database, loading them explicitly. Exception handling for these records changes to a per-record type basis, rather than globally. This results in some records now failing with a critical error rather than a non-critical one.

Instead of loading address book records as we come across them when iterating the database, load them explicitly Due to exception handling changes, deserialization errors are now treated as critical. The error message for noncritical errors has also been updated to reflect that there's more data that could be missing than just address book entries and tx data.

Instead of loading tx records as we come across them when iterating the database, load them explicitly.

Instead of loading active spkm records as we come across them when iterating the database, load them explicitly. Due to exception handling changes, deserialization errors are now treated as critical.

Instead of dealing with these records when iterating the entire database, find and handle them explicitly. Loading of OLD_KEY records is bumped up to a LOAD_FAIL error as we will not be able to use these types of keys which can lead to users missing funds.

Instead of loading decryption keys as we iterate the database, load them explicitly.

Instead of iterating the database to load the wallet, we now load particular kinds of records in an order that we want them to be loaded. So it is no longer necessary to iterate the entire database to load the wallet.

Co-Authored-By: Ryan Ofsky <ryan@ofsky.org>

maflcko · 2023-06-27T15:19:39Z

re-ACK 3c83b1d 🍤

Show signature

Signature:

untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
trusted comment: re-ACK 3c83b1d884b419adece95b335b6e956e7459a7ef 🍤
LjPNOXHR9a4Og46x/6TZNAoki0QrUodvQbWJAyOV1qwc7WRRNkI99Gn13HSmcRTEqR3pSz6zwG+lAATKwQF8CA==

furszy

reACK 3c83b1d

ryanofsky

Code review ACK 3c83b1d. Just Marco's suggested error handling fixes since last review

Thompson1985

Veiwed

achow101 added the Wallet label Apr 18, 2022

achow101 force-pushed the wallet-load-order branch from 1597068 to f160c0c Compare April 20, 2022 18:19

achow101 mentioned this pull request Apr 26, 2022

[Idea] Wallet UTXO Pruning #19942

Open

DrahtBot mentioned this pull request Apr 26, 2022

scripted-diff: rename BytePtr to AsBytePtr #24989

Merged

DrahtBot added the Needs rebase label Apr 26, 2022

achow101 force-pushed the wallet-load-order branch from f160c0c to 3a27791 Compare April 26, 2022 19:16

DrahtBot removed the Needs rebase label Apr 26, 2022

w0xlt reviewed May 2, 2022

View reviewed changes

pk-b2 reviewed May 6, 2022

View reviewed changes

src/wallet/sqlite.cpp Outdated Show resolved Hide resolved

DrahtBot added the Needs rebase label May 12, 2022

achow101 force-pushed the wallet-load-order branch 2 times, most recently from 2fd3b53 to bbd849d Compare May 16, 2022 14:56

DrahtBot removed the Needs rebase label May 16, 2022

achow101 force-pushed the wallet-load-order branch from bbd849d to 7b42ca1 Compare May 16, 2022 15:39

DrahtBot mentioned this pull request May 16, 2022

tidy: add modernize-use-default-member-init #25108

Merged

achow101 force-pushed the wallet-load-order branch from 7b42ca1 to 69cbe3c Compare May 17, 2022 22:54

DrahtBot added the Needs rebase label May 18, 2022

achow101 force-pushed the wallet-load-order branch from 69cbe3c to c23dfc2 Compare May 18, 2022 18:07

DrahtBot removed the Needs rebase label May 18, 2022

DrahtBot mentioned this pull request Jun 8, 2022

Add DataStream without ser-type and ser-version and use it where possible #25296

Closed

ryanofsky approved these changes Jun 26, 2023

View reviewed changes

src/wallet/walletdb.cpp Show resolved Hide resolved

achow101 force-pushed the wallet-load-order branch from 973d910 to 7cb7740 Compare June 26, 2023 22:05

ryanofsky approved these changes Jun 26, 2023

View reviewed changes

DrahtBot requested a review from furszy June 26, 2023 22:35

furszy approved these changes Jun 26, 2023

View reviewed changes

src/wallet/walletdb.cpp Outdated Show resolved Hide resolved

maflcko approved these changes Jun 27, 2023

View reviewed changes

achow101 and others added 9 commits June 27, 2023 11:00

walletdb: refactor tx loading

6fabb7f

Instead of loading tx records as we come across them when iterating the database, load them explicitly.

walletdb: refactor active spkm loading

c978c6d

Instead of loading active spkm records as we come across them when iterating the database, load them explicitly. Due to exception handling changes, deserialization errors are now treated as critical.

walletdb: refactor decryption key loading

cd211b3

Instead of loading decryption keys as we iterate the database, load them explicitly.

walletdb: Remove loading code where the database is iterated

2636844

Instead of iterating the database to load the wallet, we now load particular kinds of records in an order that we want them to be loaded. So it is no longer necessary to iterate the entire database to load the wallet.

doc: Add release note for wallet loading changes

3c83b1d

Co-Authored-By: Ryan Ofsky <ryan@ofsky.org>

achow101 force-pushed the wallet-load-order branch from 7cb7740 to 3c83b1d Compare June 27, 2023 15:08

DrahtBot requested review from furszy and ryanofsky June 27, 2023 15:19

furszy approved these changes Jun 27, 2023

View reviewed changes

ryanofsky approved these changes Jun 27, 2023

View reviewed changes

ryanofsky merged commit d9c7c2f into bitcoin:master Jun 27, 2023

sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Jun 30, 2023

Merge bitcoin#24914: wallet: Load database records in a particular order

23bcef6

Thompson1985 reviewed Oct 29, 2023

View reviewed changes

bitcoin locked and limited conversation to collaborators Oct 29, 2024

	result = std::max(LoadLegacyWalletRecords(pwallet, *m_batch, last_client), result);

	// Load descriptors
	result = std::max(LoadDescriptorWalletRecords(pwallet, *m_batch, last_client), result);

wallet: Load database records in a particular order #24914

wallet: Load database records in a particular order #24914

Conversation

achow101 commented Apr 18, 2022

Uh oh!

DrahtBot commented Apr 19, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews

Conflicts

Uh oh!

theStack commented May 2, 2022

Uh oh!

w0xlt left a comment

Choose a reason for hiding this comment

Uh oh!

pk-b2 commented May 6, 2022

Uh oh!

pk-b2 May 6, 2022

Choose a reason for hiding this comment

Uh oh!

Uh oh!

laanwj commented May 31, 2022

Uh oh!

ryanofsky left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ryanofsky left a comment

Choose a reason for hiding this comment

Uh oh!

furszy left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

maflcko left a comment

Choose a reason for hiding this comment

Uh oh!

maflcko Jun 26, 2023

Choose a reason for hiding this comment

Uh oh!

maflcko Jun 27, 2023

Choose a reason for hiding this comment

Uh oh!

maflcko Jun 27, 2023

Choose a reason for hiding this comment

Uh oh!

achow101 Jun 27, 2023

Choose a reason for hiding this comment

Uh oh!

maflcko Jun 27, 2023

Choose a reason for hiding this comment

Uh oh!

achow101 Jun 27, 2023

Choose a reason for hiding this comment

Uh oh!

Uh oh!

maflcko commented Jun 27, 2023

Uh oh!

ryanofsky commented Jun 27, 2023

Uh oh!

maflcko commented Jun 27, 2023

Uh oh!

ryanofsky commented Jun 27, 2023

Uh oh!

furszy commented Jun 27, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

maflcko commented Jun 27, 2023

Uh oh!

furszy left a comment

Choose a reason for hiding this comment

Uh oh!

ryanofsky left a comment

Choose a reason for hiding this comment

Uh oh!

Thompson1985 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

DrahtBot commented Apr 19, 2022 •

edited

Loading

furszy commented Jun 27, 2023 •

edited

Loading