-
Notifications
You must be signed in to change notification settings - Fork 954
Bonsai Archive with State Proofs #8918
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Matthew Whitehead <matthew.whitehead@kaleido.io>
a5b4d22
to
23eafc1
Compare
Signed-off-by: Matt Whitehead <matthew.whitehead@kaleido.io>
Signed-off-by: Matthew Whitehead <matthew.whitehead@kaleido.io>
Signed-off-by: Matthew Whitehead <matthew.whitehead@kaleido.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
first round of my review, added some comments and asked some questions
: storageFormat == DataStorageFormat.BONSAI | ||
? DataStorageConfiguration.DEFAULT_BONSAI_CONFIG | ||
: DataStorageConfiguration.DEFAULT_BONSAI_ARCHIVE_CONFIG) | ||
: storageFormat == DataStorageFormat.X_BONSAI_ARCHIVE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe a switch will e better in this case ?
* @param number The height of the block whose hash should be retrieved. | ||
* @return The hash of the block at the given height. | ||
*/ | ||
Optional<Hash> getBlockHashByNumberSafe(long number); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the diff with getBlockHashByNumber ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I think this change can be removed. The intention was to add a synchronized
version of getBlockHashByNumber()
to resolve a race condition I was observing testing the state-proof code, similar to those fixed in #6344 and #6140. But I refactored the calling code and I don't think the synchronized
version of getBlockHashByNumber()
is currently required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The getBlockHeaderSafe()
I added is still required and called from BonsaiArchiveProofsWorldStateProvider
so I'll leave that in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tidied this up in the latest commits
private Optional<BlockHeader> getCheckpointStateStartBlock( | ||
final Blockchain blockchain, final Hash targetHash) { | ||
long nearestCheckpointBlock = | ||
(((blockchain.getBlockHeader(targetHash).get().getNumber() + trieNodeCheckpointInterval) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we sure the block will be always here ?
if (checkpointBlock.isPresent()) { | ||
return cachedWorldStorageManager | ||
.getWorldState(checkpointBlock.get().getHash()) | ||
.or(() -> cachedWorldStorageManager.getHeadWorldState(blockchain::getBlockHeaderSafe)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this case possible ?
final BlockHeader checkpointBlock, | ||
final Hash targetBlockHash) { | ||
|
||
// "Create" (fake) a world state representation at the next checkpoint block after the target |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure to understand what is the fake worldstate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just setting the block; like that we are reading the right trie nodes ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I think I could clarify the comment wording here. "Fake" perhaps isn't the correct term. It's really just taking the current world state for the current chain head, and then starting the roll back process by asserting "this world state is for block X". The rollback logic interprets the asserted block number correctly when determining which trie logs it needs to roll. I think I will refactor the function names & comments to be clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the function name in the latest commit, and reworded the comment.
this.trieNodeCheckpointInterval = trieNodeCheckpointInterval; | ||
} | ||
|
||
private Optional<BonsaiContext> getStateTrieArchiveContextForWrite( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should have another strategy or rename the class, flat is for the flat part not the trie
Bytes keyNearest = | ||
calculateArchiveKeyWithMaxSuffix( | ||
getStateArchiveContextForRead(storage), | ||
Bytes.concatenate(accountHash, location).toArrayUnsafe()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe pushing Bytes.concatenate(accountHash, location).toArrayUnsafe()) into a byte[] field and use it for the size
Using tuweni is impacting performance so if we can avoid some call it's better
accountHash.size()
- location.size()
+ 8)) // TODO - change for CONST length | ||
.filter( | ||
found -> | ||
Bytes.concatenate(accountHash, location).commonPrefixLength(found.key()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here we are doing the concatenate multiple time and this one is really slow
final WorldStateProofProvider worldStateProofProvider = | ||
new WorldStateProofProvider( | ||
new WorldStateStorageCoordinator(ws.getWorldStateStorage())); | ||
return mapper.apply( | ||
worldStateProofProvider.getAccountProof( | ||
ws.getWorldStateRootHash(), accountAddress, accountStorageKeys)); | ||
blockHeader.getStateRoot(), accountAddress, accountStorageKeys)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so the getWorldStateRootHash will not be the good one ?
.map( | ||
hash -> | ||
hash.equals(rootHash) | ||
|| (blockHash != null && trieLogStorage.containsKey(blockHash.toArrayUnsafe()))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this modification necessary ?
Signed-off-by: Matthew Whitehead <matthew.whitehead@kaleido.io>
Signed-off-by: Matthew Whitehead <matthew.whitehead@kaleido.io>
da1096e
to
f120446
Compare
Signed-off-by: Matthew Whitehead <matthew.whitehead@kaleido.io>
PR description
(Replaces previous draft PR #8669 after having completed a fairly involved merge.)
Known TODOs:
X_BONSAI_ARCHIVE_PROOFS
andX_BONSAI_ARCHIVE_PROOFS
as discrete DB format types, or whether we should just have proofs enabled for all bonsai archive DBsThis PR follows on from the first Bonsai Archive PR and enhances it to provide full state proofs for Bonsai Archive state.
X_BONSAI_ARCHIVE_PROOFS
X_BONSAI_ARCHIVE
andX_BONSAI_ARCHIVE_PROOFS
options are merged into a singleBONSAI_ARCHIVE
data storage format. Currently I think it will be useful to have separate options while in experimental to make it easier to recreate any issues with and without the state proof behaviour.The aim of this feature is to provide feature parity with FOREST DB and allow us to finally start the process of removing FOREST DB from Besu.
Testing
Aside from the updated tests in the PR which exercise the new flat DB format by adding
X_BONSAI_ARCHIVE_PROOFS
to various existing tests, I have run a number of QBFT chains using a combination ofFOREST
,BONSAI
, andX_BONSAI ARCHIVE_PROOFS
nodes and exercised a variety of state update & state proof requests.I have configured a node to sync with Ethereum mainnet and it is currently at block 7m without issues:
In addition to syncing with mainnet, I've created 2 test scripts which exercise
eth_getTransactionCount
,eth_getStorageAt
, andeth_getProof
for known accounts/states in the first 3m blocks. Theeth_getProof
script uses proofs obtained from aFOREST
node synced up to 3+m blocks, and then useseth_getProof
against theBONSAI
archive node to check that the proofs returned byBONSAI
match those returned byFOREST
.Script for eth_getProof testing on mainnet
Script for testing historic account states on mainnet
#!/bin/bash nonceResponseMatches () { echo -n "Checking nonce == $2 for account $3, block $4 - " if [[ "${1^^}" == "${2^^}" ]]; then echo OK else echo "Unexpected JSON/RPC response. $1 != $2" exit 1 fi } storageResponseMatches () { echo -n "Checking storage slot $1 == $3 for account $4, block $5 - " if [[ "${2^^}" == "${3^^}" ]]; then echo OK else echo "Unexpected JSON/RPC response. $2 != $3" exit 1 fi } # Retrive the transaction count for a number of different accounts in the first 1,000,000 blocks of Ethereum L1 # Some of these have been specfically selected for accounts that change several times in a block, or that change in several contiguous blocks. # Block 150003 - 2 transactions from the same sender. Check nonce for blocks 150002, 150003, and 150004 ACCOUNT="0x32Be343B94f860124dC4fEe278FDCBD38C102D88" BLOCK="0x249F2" NONCE=`curl -s -X POST --data '{"jsonrpc":"2.0","method":"eth_getTransactionCount","params":["'$ACCOUNT'","'$BLOCK'"],"id":1}' http://127.0.0.1:8545 | jq .result -r` nonceResponseMatches $NONCE "0x805" $ACCOUNT $BLOCK BLOCK="0x249F3" NONCE=`curl -s -X POST --data '{"jsonrpc":"2.0","method":"eth_getTransactionCount","params":["'$ACCOUNT'","'$BLOCK'"],"id":1}' http://127.0.0.1:8545 | jq .result -r` nonceResponseMatches $NONCE "0x807" $ACCOUNT $BLOCK BLOCK="0x249F4" NONCE=`curl -s -X POST --data '{"jsonrpc":"2.0","method":"eth_getTransactionCount","params":["'$ACCOUNT'","'$BLOCK'"],"id":1}' http://127.0.0.1:8545 | jq .result -r` nonceResponseMatches $NONCE "0x807" $ACCOUNT $BLOCK # Blocks 138719 and 138720 - transactions from the same sender in 2 contiguous blocks. Check blocks 138718, 138719, 138720 and 138721 ACCOUNT="0x1DCb8d1F0FCc8CbC8C2d76528E877F915e299fbE" BLOCK="0x21DDE" NONCE=`curl -s -X POST --data '{"jsonrpc":"2.0","method":"eth_getTransactionCount","params":["'$ACCOUNT'","'$BLOCK'"],"id":1}' http://127.0.0.1:8545 | jq .result -r` nonceResponseMatches $NONCE "0x59" $ACCOUNT $BLOCK BLOCK="0x21DDF" NONCE=`curl -s -X POST --data '{"jsonrpc":"2.0","method":"eth_getTransactionCount","params":["'$ACCOUNT'","'$BLOCK'"],"id":1}' http://127.0.0.1:8545 | jq .result -r` nonceResponseMatches $NONCE "0x5a" $ACCOUNT $BLOCK BLOCK="0x21DE0" NONCE=`curl -s -X POST --data '{"jsonrpc":"2.0","method":"eth_getTransactionCount","params":["'$ACCOUNT'","'$BLOCK'"],"id":1}' http://127.0.0.1:8545 | jq .result -r` nonceResponseMatches $NONCE "0x5b" $ACCOUNT $BLOCK BLOCK="0x21DE1" NONCE=`curl -s -X POST --data '{"jsonrpc":"2.0","method":"eth_getTransactionCount","params":["'$ACCOUNT'","'$BLOCK'"],"id":1}' http://127.0.0.1:8545 | jq .result -r` nonceResponseMatches $NONCE "0x5b" $ACCOUNT $BLOCK # Some storage lookups for slots that are known to change # Blocks 2018260, 2020000, 2300000 for a specific smart contract ACCOUNT="0x684282178b1d61164FEbCf9609cA195BeF9A33B5" BLOCK="0x1ECBD4" SLOT="0x5" STORAGE_VAL=`curl -s -X POST --data '{"jsonrpc":"2.0","method":"eth_getStorageAt","params":["'$ACCOUNT'","'$SLOT'","'$BLOCK'"],"id":1}' http://127.0.0.1:8545 | jq .result -r` storageResponseMatches "$SLOT" $STORAGE_VAL "0x0000000000000000000000000000000000000000000000000000000000000003" $ACCOUNT $BLOCK SLOT="0x7" STORAGE_VAL=`curl -s -X POST --data '{"jsonrpc":"2.0","method":"eth_getStorageAt","params":["'$ACCOUNT'","'$SLOT'","'$BLOCK'"],"id":1}' http://127.0.0.1:8545 | jq .result -r` storageResponseMatches "$SLOT" $STORAGE_VAL "0x0000000000000000000000000000000000000000000000000000000000000001" $ACCOUNT $BLOCK SLOT="0xa" STORAGE_VAL=`curl -s -X POST --data '{"jsonrpc":"2.0","method":"eth_getStorageAt","params":["'$ACCOUNT'","'$SLOT'","'$BLOCK'"],"id":1}' http://127.0.0.1:8545 | jq .result -r` storageResponseMatches "$SLOT" $STORAGE_VAL "0x0000000000000000000000000000000000000000000000000000000000000000" $ACCOUNT $BLOCK BLOCK="0x1ED2A0" SLOT="0x5" STORAGE_VAL=`curl -s -X POST --data '{"jsonrpc":"2.0","method":"eth_getStorageAt","params":["'$ACCOUNT'","'$SLOT'","'$BLOCK'"],"id":1}' http://127.0.0.1:8545 | jq .result -r` storageResponseMatches "$SLOT" $STORAGE_VAL "0x0000000000000000000000000000000000000000000000000000000000000004" $ACCOUNT $BLOCK SLOT="0x7" STORAGE_VAL=`curl -s -X POST --data '{"jsonrpc":"2.0","method":"eth_getStorageAt","params":["'$ACCOUNT'","'$SLOT'","'$BLOCK'"],"id":1}' http://127.0.0.1:8545 | jq .result -r` storageResponseMatches "$SLOT" $STORAGE_VAL "0x0000000000000000000000000000000000000000000000000000000000000001" $ACCOUNT $BLOCK SLOT="0xa" STORAGE_VAL=`curl -s -X POST --data '{"jsonrpc":"2.0","method":"eth_getStorageAt","params":["'$ACCOUNT'","'$SLOT'","'$BLOCK'"],"id":1}' http://127.0.0.1:8545 | jq .result -r` storageResponseMatches "$SLOT" $STORAGE_VAL "0x0000000000000000000000000000000000000000000000000000000000000000" $ACCOUNT $BLOCK BLOCK="0x231860" SLOT="0x5" STORAGE_VAL=`curl -s -X POST --data '{"jsonrpc":"2.0","method":"eth_getStorageAt","params":["'$ACCOUNT'","'$SLOT'","'$BLOCK'"],"id":1}' http://127.0.0.1:8545 | jq .result -r` storageResponseMatches "$SLOT" $STORAGE_VAL "0x0000000000000000000000000000000000000000000000000000000000000005" $ACCOUNT $BLOCK SLOT="0x7" STORAGE_VAL=`curl -s -X POST --data '{"jsonrpc":"2.0","method":"eth_getStorageAt","params":["'$ACCOUNT'","'$SLOT'","'$BLOCK'"],"id":1}' http://127.0.0.1:8545 | jq .result -r` storageResponseMatches "$SLOT" $STORAGE_VAL "0x0000000000000000000000000000000000000000000000000000000000000001" $ACCOUNT $BLOCK SLOT="0xa" STORAGE_VAL=`curl -s -X POST --data '{"jsonrpc":"2.0","method":"eth_getStorageAt","params":["'$ACCOUNT'","'$SLOT'","'$BLOCK'"],"id":1}' http://127.0.0.1:8545 | jq .result -r` storageResponseMatches "$SLOT" $STORAGE_VAL "0x0000000000000000000000000000000000000000000000000000000000000001" $ACCOUNT $BLOCK # Some other random checks for nonce on accounts at later blocks # Block 2000000 ACCOUNT="0x32Be343B94f860124dC4fEe278FDCBD38C102D88" BLOCK="0x1E8480" NONCE=`curl -s -X POST --data '{"jsonrpc":"2.0","method":"eth_getTransactionCount","params":["'$ACCOUNT'","'$BLOCK'"],"id":1}' http://127.0.0.1:8545 | jq .result -r` nonceResponseMatches $NONCE "0x1EFC6" $ACCOUNT $BLOCK # Block 3000000 ACCOUNT="0xEA674fdDe714fd979de3EdF0F56AA9716B898ec8" BLOCK="0x2DC6C0" NONCE=`curl -s -X POST --data '{"jsonrpc":"2.0","method":"eth_getTransactionCount","params":["'$ACCOUNT'","'$BLOCK'"],"id":1}' http://127.0.0.1:8545 | jq .result -r` nonceResponseMatches $NONCE "0x10AA05" $ACCOUNT $BLOCK exit 0