
Conversation

@matthew1001 (Contributor) commented Aug 16, 2024

PR description

Introduces a new (experimental) "Bonsai Archive" DB mode which creates a full archive of the chain it syncs with. This allows JSON-RPC calls to be made with historic blocks as context, for example eth_getBalance to get the balance of an account at a historic block, or eth_call to simulate a transaction at a given block in history.

The PR is intended to provide part of the functionality currently offered by the (now deprecated) FOREST DB mode. Specifically, it allows state to be queried at an arbitrary block in history, but does not currently offer eth_getProof for that state. A subsequent PR will implement eth_getProof for historic blocks.
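
For illustration, querying historic state over JSON-RPC might look like the following (a minimal sketch, assuming the node exposes JSON-RPC on the default port 8545; the account and block number reuse the example from the design summary below, 37834 = 0x93ca):

curl -X POST --data '{"jsonrpc":"2.0","method":"eth_getBalance","params":["0x0e79065B5F11b5BD1e62B935A600976ffF3754B9","0x93ca"],"id":1}' http://localhost:8545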

Summary of the overall design & changes

This PR builds on PR #5865 which proved the basic concept of archiving state in the Bonsai flat DB by suffixing entries with the block in which they were changed.

For example the state for account 0x0e79065B5F11b5BD1e62B935A600976ffF3754B9 at block 37834 is stored as

<account-hash><block-num-hex> = 0x9ab656e8fa2a1029964289c9a189083db258ca4b46ebaa374477e069b8f47dec00000000000093ca
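
A minimal sketch of how such a key can be composed (illustrative helper, not code from the PR, using the Tuweni Bytes API that appears elsewhere in this changeset):

```java
import org.apache.tuweni.bytes.Bytes;
import org.apache.tuweni.bytes.Bytes32;

// Illustrative: append the 8-byte big-endian block number to the 32-byte
// account hash. Bytes.ofUnsignedLong(37834) yields 0x00000000000093ca,
// matching the suffix in the example key above.
static Bytes archiveKey(final Bytes32 accountHash, final long blockNumber) {
  return Bytes.concatenate(accountHash, Bytes.ofUnsignedLong(blockNumber));
}
```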

In order to minimise performance degradation over time, historic state and storage entries in the DB are "archived" by moving them into a separate DB segment.

Where account state is stored in segment ACCOUNT_INFO_STATE, state that has been archived is stored in ACCOUNT_INFO_STATE_ARCHIVE. Likewise where storage is held in segment ACCOUNT_STORAGE_STORAGE, archived storage entries are stored in ACCOUNT_STORAGE_ARCHIVE.

An example Rocks DB query to retrieve the state of the example account above would be:

ldb --db=. get --column_family=ACCOUNT_INFO_STATE_ARCHIVE --key_hex --value_hex 0x9ab656e8fa2a1029964289c9a189083db258ca4b46ebaa374477e069b8f47dec00000000000093ca

Creating a Bonsai Archive node

The PR introduces an entirely new data storage format (as opposed to making it a configuration option of the existing BONSAI storage format).

To create a Bonsai archive node, simply set --data-storage-format=x_bonsai_archive when creating it.

An existing FOREST or BONSAI node cannot be migrated to BONSAI_ARCHIVE mode.
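
For example, a minimal launch command might look like this (a sketch only; the network flag is a placeholder, and FULL sync is discussed under "Sync time" below):

besu --network=holesky --sync-mode=FULL --data-storage-format=x_bonsai_archive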

Storage requirements

An archive node intrinsically requires more storage space than a non-archive node, since every state update is retained in the archive DB segments as outlined above. An archive node for the Holesky testnet, as of raising this PR, requires approximately 160 GiB of storage.

Sync time

In order to create an archive of an entire chain, FULL sync mode must be used. This PR does not prevent SNAP syncing an archive node, but this will result in only a partial archive of the chain.

While the node is performing a FULL sync with the chain, it is also migrating entries from the regular DB segments to the archive DB segments. Overall this increases the time it takes to create the archive node; for a public chain, completing syncing and archiving might take a week or more.

@matthew1001 force-pushed the multi-version-flat-db-rebase branch 2 times, most recently from 88f3968 to 7d4a524 on August 20, 2024 10:19
@matthew1001 changed the title from "Multi version flat db rebase" to "Bonsai archive feature" on Sep 4, 2024
@matthew1001 force-pushed the multi-version-flat-db-rebase branch 7 times, most recently from 782ae60 to 5752732 on October 2, 2024 17:10
@matthew1001 force-pushed the multi-version-flat-db-rebase branch 4 times, most recently from 5b06b50 to dce531e on October 4, 2024 16:11
@matthew1001 marked this pull request as ready for review on October 7, 2024 16:31
jframe and others added 15 commits October 8, 2024 08:38
Signed-off-by: Jason Frame <jason.frame@consensys.net>
…se constructor that reuses worldStateStorage so that we don't lose values in the EvmToolSpecTests

Signed-off-by: Jason Frame <jason.frame@consensys.net>
Signed-off-by: Jason Frame <jason.frame@consensys.net>
Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
…d state, and freeze it

Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
…ten for blocks and move account state to new DB segment

Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
…t block state has been frozen for

Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
…age from the freezer segment

Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
macfarla added a commit to macfarla/besu that referenced this pull request Mar 18, 2025
Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
macfarla added a commit that referenced this pull request Mar 28, 2025
* check for null

Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>

* more nuanced change with help from #7475

Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>

* changelog

Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>

---------

Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>

This PR is stale because it has been open for 30 days with no activity.

@github-actions bot added the Stale label on Apr 11, 2025

This PR was closed because it has been inactive for 14 days since being marked as stale.

@matthew1001 force-pushed the multi-version-flat-db-rebase branch from 17c2d8b to 9079602 on May 29, 2025 09:08
@matkt (Contributor) left a comment

Just added some small comments,

I would prefer to have only BONSAI with an archive flag, but maybe we can do that in another PR.

For me it's OK; I will wait for feedback from @garyschulte

.getComposedWorldStateStorage()
.startTransaction();
genesisStateTX.put(
TRIE_BRANCH_STORAGE, WORLD_BLOCK_NUMBER_KEY, Bytes.ofUnsignedLong(0).toArrayUnsafe());
Contributor

Is there a risk that this value is initialized to zero for a node that was already fully synced before this PR and then restarted with the new code?

Maybe instead of checking if this specific field is empty, we should check if the node is already synced? That way, a previously synced node would just update the value during the next block import, avoiding incorrect initialization.

Contributor Author

I'm not sure the code currently supports starting an already-synced node with a new data storage format. If the node previously started with --data-storage-format=BONSAI, starting it again with X_BONSAI_ARCHIVE would fail. So I believe this is safe for any node that is using the new storage format.

Contributor

I'm talking about a node started with BONSAI and restarted with BONSAI plus this PR. It seems we are calling ensureGenesisArchiveContext in BONSAI mode too, no?

Contributor Author

I see what you mean. I don't think we'd be calling writeAccountsTo() on genesis state for an already-synced Bonsai node, would we?

.map(MutableWorldState::disableTrie)
.flatMap(
worldState ->
rollMutableArchiveStateToBlockHash( // This is a tiny action for archive
Contributor

👌

accountFound =
storage
.getNearestBefore(ACCOUNT_INFO_STATE_ARCHIVE, keyNearest)
.filter(found -> accountHash.commonPrefixLength(found.key()) >= accountHash.size())
Contributor

Maybe we can compute accountHash.commonPrefixLength(found.key()) only once?

Contributor Author

You mean generate accountHash.commonPrefixLength(found.key()) before calling .getNearestBefore, so the code becomes something like:

int comPrefLength = accountHash.commonPrefixLength(found.key());
...

      .filter(found -> comPrefLength >= accountHash.size())

Contributor Author

Just realised that's not possible since found doesn't exist until we're streaming the results. So I'm not quite sure what change you're suggesting?

.filter(
found ->
!Arrays.areEqual(
DELETED_ACCOUNT_VALUE, found.value().orElse(DELETED_ACCOUNT_VALUE)))
Contributor

Is it important to keep the deleted value if we have ACCOUNT_INFO_STATE_ARCHIVE?

Contributor Author

Is it important to keep the deleted value if we have ACCOUNT_INFO_STATE_ARCHIVE?

No, I think that's a good point. I've updated the logic in a new commit so the deleted key filter is applied regardless of which DB table the entry is retrieved from.
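
The resulting shape is roughly the following (a condensed sketch reusing the identifiers from the snippets quoted above, and assuming the lookups return Optional, as the chained filters suggest; this is an illustration of the fix, not the committed code):

```java
// Sketch: consult the primary segment first, fall back to the archive segment,
// then apply the deleted-entry (tombstone) filter once so it covers both paths.
final Optional<NearestKeyValue> accountFound =
    storage
        .getNearestBefore(ACCOUNT_INFO_STATE, keyNearest)
        .filter(found -> accountHash.commonPrefixLength(found.key()) >= accountHash.size())
        .or(
            () ->
                storage
                    .getNearestBefore(ACCOUNT_INFO_STATE_ARCHIVE, keyNearest)
                    .filter(
                        found ->
                            accountHash.commonPrefixLength(found.key()) >= accountHash.size()))
        .filter(
            found ->
                !Arrays.areEqual(
                    DELETED_ACCOUNT_VALUE, found.value().orElse(DELETED_ACCOUNT_VALUE)));
```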

@matthew1001 (Contributor Author)

I would prefer to have only BONSAI with an archive flag, but maybe we can do that in another PR.

Yeah I think there was some discussion about that early on in the archive work and IIRC Gary favoured a specific data storage format rather than a config option on BONSAI.

@matkt (Contributor) left a comment

Just waiting for Gary's approval also before merging

@matthew1001 (Contributor Author)

Just waiting for Gary's approval also before merging

@garyschulte what do you think?

Signed-off-by: Matthew Whitehead <matthew.whitehead@kaleido.io>
@garyschulte (Contributor) left a comment

Looks really great. I left comments and questions regarding feature/function choices, but all of the changes seem to be compartmentalized to the new experimental archive storage mode, so we can address here or in a future PR IMO.

For the record, I think we need a separate storage format for archive, as you have it in this PR. The archive segments, metadata, and behavior are significantly different to BONSAI. If we don't differentiate between them as separate formats the potential to corrupt a very-expensive-to-sync archive is just one accidental config change away.

I have my misgivings about communicating the block context via a worldstate storage key (rather than treating it as an ephemeral parameter), BUT it seems the implications of that behavior are limited only to X_BONSAI_ARCHIVE. We can discover how this works in operational testing.

FWIW, I have a couple of test nodes using this PR: a full sync Bonsai node that is right around block 12 million, and an archive node started at the same time that is around block 6 million. There are probably opportunities for performance improvement in future PRs.

Thanks for keeping this PR updated and waiting for prague 🙏

@@ -118,6 +118,7 @@ public void run() {
switch (dataStorageFormat) {
case FOREST -> 1;
case BONSAI -> 2;
case X_BONSAI_ARCHIVE -> 3;
Contributor

This is probably OK, but we should never have a Bonsai archive node with the prior metadata format, since the metadata change predates the Bonsai archive feature. It might make sense to just throw here.

Comment on lines +68 to +72
} else if (dataStorageConfiguration.getDataStorageFormat()
== DataStorageFormat.X_BONSAI_ARCHIVE) {
LOG.info("setting FlatDbStrategy to ARCHIVE");
transaction.put(
TRIE_BRANCH_STORAGE, FLAT_DB_MODE, FlatDbMode.ARCHIVE.getVersion().toArrayUnsafe());
Contributor

Do we actually want to do this? If we are transitioning from partial to archive, we implicitly do not have a real archive node. Perhaps there could be some value in running an archive node "since" a block height, but I would imagine an archive sync strategy would be a better fit than a snap sync strategy that transitions to archive when complete 🤔

Contributor Author

Yeah, the archive + proofs PR #8669 does some refactoring in this area, because I also wasn't totally happy with the way partial/full/archive were handled here. So for now I'm going to leave it unchanged in this PR, because I think the follow-on PR much improves things.

Comment on lines +139 to +145
// If there isn't a match look in the archive DB segment
if (nearestAccount.isEmpty()) {
accountFound =
storage
.getNearestBefore(ACCOUNT_INFO_STATE_ARCHIVE, keyNearest)
.filter(found -> accountHash.commonPrefixLength(found.key()) >= accountHash.size());

Contributor

If we are looking in the context of MAX_BLOCK_SUFFIX, we shouldn't have to defer to the archive, right? Current/unchanged account values should not be moved to the archive unless something has gone wrong with archiving, right? If we can skip this step we should see substantially faster "zero read" cases for non-historical queries.

Contributor Author

Yes, agreed. I think calculateArchiveKeyWithMaxSuffix() needs renaming, since it only uses MAX_BLOCK_SUFFIX if an archive context block isn't set. So what's needed is some additional logic to check whether we ended up defaulting to MAX_BLOCK_SUFFIX, and if so skip the archive lookup (plus the possible function rename). If it's OK, I'd like to start getting other people looking at the code and tracking archive bugs etc. as standalone issues, so for this and the other comment I'm going to raise an issue rather than fix it in this PR.
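
A rough sketch of the guard being described (the usedMaxBlockSuffix flag and the method name are hypothetical, introduced here purely to illustrate the proposed skip; identifiers otherwise follow the quoted snippets):

```java
// Hypothetical guard: skip the archive segment when the key defaulted to
// MAX_BLOCK_SUFFIX, i.e. when this is a head-state rather than a historical query.
private Optional<NearestKeyValue> findNearestAccount(
    final Bytes keyNearest, final Hash accountHash, final boolean usedMaxBlockSuffix) {
  final Optional<NearestKeyValue> nearestAccount =
      storage
          .getNearestBefore(ACCOUNT_INFO_STATE, keyNearest)
          .filter(found -> accountHash.commonPrefixLength(found.key()) >= accountHash.size());
  if (nearestAccount.isPresent() || usedMaxBlockSuffix) {
    return nearestAccount; // head state answered the query, or nothing to archive-check
  }
  // Genuinely historical lookup: defer to the archive segment.
  return storage
      .getNearestBefore(ACCOUNT_INFO_STATE_ARCHIVE, keyNearest)
      .filter(found -> accountHash.commonPrefixLength(found.key()) >= accountHash.size());
}
```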

Contributor

Yup, definitely reasonable to defer to a subsequent issue. I think we can skip the archive lookup not only for MAX_BLOCK_SUFFIX, but also for block heights above the high-water mark for archiving. If we expose the current archive block number via the bonsai context, we can make that decision very cheaply and get a significant boost for zero reads 👍

Contributor Author

I've raised #8850 to track this fix

Comment on lines +333 to +334
// If there isn't a match look in the archive DB segment
if (nearestStorage.isEmpty()) {
Contributor

Same as the comment above regarding zero-read performance.

Comment on lines +198 to +199
// TODO - rename calculateRootHash() to be clearer that it updates state, it doesn't just
// calculate a hash
Contributor

This is only half true, since an Optional.empty() state updater would mean no update. But yeah, it is kinda side-effect-y.

Signed-off-by: Matthew Whitehead <matthew.whitehead@kaleido.io>
Signed-off-by: Matt Whitehead <matthew.whitehead@kaleido.io>
Signed-off-by: Matthew Whitehead <matthew.whitehead@kaleido.io>
@matthew1001 enabled auto-merge (squash) on June 23, 2025 09:35
@matthew1001 merged commit a9957ce into hyperledger:main on Jun 23, 2025
48 checks passed