High-level tracking issue for general storage optimization efforts. This issue can be expanded over time.
At present (mid-2023), depending on their configuration, Tendermint-based nodes use large quantities of storage space, which has significant cost implications for operators. We aim to implement strategies that reduce and/or offload stored data in order to lower operators' costs.
The two main problems in the CometBFT storage layer are:
- A very large storage footprint
- Querying stored data (whether serving RPC queries or Comet retrieving its own consensus data structures) is not optimized and has in some cases proven to be very inefficient
To address these problems, we first need to build an understanding of:
- Workloads: what we store, how frequently we access it, and the characteristics of the stored data (this list will be expanded).
- The database backend: its features, design goals, and optimization possibilities.
The work to be done can be broken down into the following main subsections:
- Understand and simplify CometBFT database backend #48
The end result of this work should be CometBFT optimized for a single storage backend, ultimately yielding a significant reduction in both storage access time and on-disk storage footprint.
To reach this goal, we envision the following steps:
- Define external users and use cases for the CometBFT storage layer #68
Preliminary investigation to identify the users and workloads of the storage backend (how they query the nodes, their common pain points with regard to storage, and a collection of issues to address).
- Establish the baseline and future requirements for the storage backend #63
- Benchmark current storage improvements #1044
- Understand the workload: what data we store and in what format
- Establish and implement the relevant metrics to understand storage workloads #46 (see the metrics sketch after this list)
- Storage evaluation baseline: measure and report current storage related behaviour of CometBFT #67
- Evaluate database engines according to requirements and decide which one to optimize #64
- Refactor CometBFT to use a single underlying database #1039
- Add support for users to migrate to the chosen backend
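As a possible starting point for the metrics work (#46), here is a minimal sketch of storage-workload metrics using the Prometheus Go client. The metric names, labels, and buckets are illustrative assumptions, not CometBFT's existing metric set:

```go
// Hypothetical storage-workload metrics; names and labels are illustrative.
package storagemetrics

import (
	"github.com/prometheus/client_golang/prometheus"
)

// Metrics counts reads/writes per store and tracks value sizes so that
// access patterns can be characterised before choosing a backend.
type Metrics struct {
	Reads      *prometheus.CounterVec
	Writes     *prometheus.CounterVec
	ValueBytes *prometheus.HistogramVec
}

func NewMetrics(reg prometheus.Registerer) *Metrics {
	m := &Metrics{
		Reads: prometheus.NewCounterVec(prometheus.CounterOpts{
			Namespace: "cometbft",
			Subsystem: "storage",
			Name:      "reads_total",
			Help:      "Number of read operations, per store.",
		}, []string{"store"}), // e.g. "blockstore", "state", "tx_index"
		Writes: prometheus.NewCounterVec(prometheus.CounterOpts{
			Namespace: "cometbft",
			Subsystem: "storage",
			Name:      "writes_total",
			Help:      "Number of write operations, per store.",
		}, []string{"store"}),
		ValueBytes: prometheus.NewHistogramVec(prometheus.HistogramOpts{
			Namespace: "cometbft",
			Subsystem: "storage",
			Name:      "value_size_bytes",
			Help:      "Size distribution of written values, per store.",
			Buckets:   prometheus.ExponentialBuckets(64, 4, 10), // 64 B .. ~16 MB
		}, []string{"store"}),
	}
	reg.MustRegister(m.Reads, m.Writes, m.ValueBytes)
	return m
}

// Instrumenting a write path could then look like:
//   metrics.Writes.WithLabelValues("blockstore").Inc()
//   metrics.ValueBytes.WithLabelValues("blockstore").Observe(float64(len(value)))
```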
Tune CometBFT to address storage-related bottlenecks
Part of this section covers addressing issues found during the benchmarking and investigation process outlined above; another part addresses concrete issues reported by users. While some of these issues cannot be fully addressed before the analysis above, certain optimizations can be performed on CometBFT as it is today; these are marked with *.
- storage: Alternative representation of the genesis file. #1037 *
The genesis file can be large and surpass internal DB file size limitations (3 GB for RocksDB).
- Batch commits to the state and block store #1040 * (see the sketch after this list)
- Reconsider representation of keys for state and block store #1041
- Reconstruct state using iterators rather than storing it as a single entry (depends on the previous point)
- Pruning of the block store is not reflected in the storage used
- Investigate why Tendermint disk storage keeps growing over time informalsystems/interchain#1
- Add in-process compaction support to databases #49
- storage+indexer: Indexer is not pruned #169
For users, pruning the indexer appears to be neither as high a priority nor as well understood as reducing its footprint and mitigating the potential DoS vector that querying it presents.
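To make the batching and iterator items above concrete, here is a minimal sketch using the cometbft-db API: several height-keyed entries are flushed in one synced batch and then read back through a range iterator. The key layout and values are illustrative, not CometBFT's actual block-store schema:

```go
// Sketch: one fsync'd batch write instead of one write per key, and a range
// iterator instead of a single large entry. Keys/values are placeholders.
package main

import (
	"fmt"

	dbm "github.com/cometbft/cometbft-db"
)

func main() {
	db, err := dbm.NewDB("blockstore-demo", dbm.GoLevelDBBackend, "/tmp/demo")
	if err != nil {
		panic(err)
	}
	defer db.Close()

	// Write several height-keyed entries atomically in one batch,
	// syncing to disk once instead of once per key.
	batch := db.NewBatch()
	defer batch.Close()
	for h := int64(1); h <= 3; h++ {
		key := []byte(fmt.Sprintf("H:%020d", h)) // zero-padded so byte order == height order
		if err := batch.Set(key, []byte(fmt.Sprintf("block-meta-%d", h))); err != nil {
			panic(err)
		}
	}
	if err := batch.WriteSync(); err != nil {
		panic(err)
	}

	// Read the same range back with an iterator rather than loading one big value.
	it, err := db.Iterator([]byte("H:"), []byte("H;")) // ';' is the byte after ':'
	if err != nil {
		panic(err)
	}
	defer it.Close()
	for ; it.Valid(); it.Next() {
		fmt.Printf("%s => %s\n", it.Key(), it.Value())
	}
	if err := it.Error(); err != nil {
		panic(err)
	}
}
```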
CometBFT stores and allows querying of data not essential for consensus
We need to identify the functionalities we want to support within Tendermint and offload non-critical data and functionality.
- Implement ADR-101 PoC targeting `main` #816
This implementation provides users with an API to implement their own event indexing and to prune the full nodes that currently store events.
- Write a data companion based on ADR 101 (sketched below)
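A data companion along the lines of ADR 101 could, as a rough sketch, tail the node over RPC and ingest block results into its own store before asking the node to prune them. The sketch below assumes the v0.37/v0.38-era rpc/client/http package; the actual pruning call is left as a comment because its API is defined by ADR 101 / #816, not here:

```go
// Sketch of a data companion loop: poll the node, pull block results,
// and (in a real companion) persist them externally before signalling
// that the node may prune them.
package main

import (
	"context"
	"log"
	"time"

	rpchttp "github.com/cometbft/cometbft/rpc/client/http"
)

func main() {
	client, err := rpchttp.New("http://localhost:26657", "/websocket")
	if err != nil {
		log.Fatal(err)
	}
	ctx := context.Background()

	var next int64 = 1
	for {
		status, err := client.Status(ctx)
		if err != nil {
			log.Printf("status: %v", err)
			time.Sleep(time.Second)
			continue
		}
		latest := status.SyncInfo.LatestBlockHeight

		for ; next <= latest; next++ {
			h := next
			res, err := client.BlockResults(ctx, &h)
			if err != nil {
				log.Printf("block results at %d: %v", h, err)
				break
			}
			// In a real companion: persist res (events, tx results) to external
			// storage here, then advance a retain height on the node so it can
			// prune what has been safely ingested (API per ADR 101).
			log.Printf("ingested height %d (%d txs)", res.Height, len(res.TxsResults))
		}

		time.Sleep(time.Second)
	}
}
```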
CometBFT currently maintains its own [WAL](https://github.com/cometbft/cometbft/blob/101bf50e715d6a10c8135392166c35bdae94972e/consensus/wal.go) - is this even necessary, given that the underlying database should actually be taking care of this? It is another source of complexity and potential point of failure in the system that the team has to maintain.
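To illustrate the durability argument, the toy sketch below uses a synchronous cometbft-db write as the fsync-on-write primitive that the consensus WAL otherwise has to provide; it is only an illustration of the claim, not a proposal to drop the WAL:

```go
// Toy illustration: SetSync returns only after the backend has flushed the
// write to disk, so the value would survive a crash immediately afterwards.
package main

import (
	dbm "github.com/cometbft/cometbft-db"
)

func main() {
	db, err := dbm.NewDB("wal-demo", dbm.GoLevelDBBackend, "/tmp/wal-demo")
	if err != nil {
		panic(err)
	}
	defer db.Close()

	// Hypothetical key/value standing in for a consensus message that would
	// otherwise be appended to the WAL for crash recovery.
	msg := []byte("consensus message to be replayed after a crash")
	if err := db.SetSync([]byte("cs:last-msg"), msg); err != nil {
		panic(err)
	}
}
```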
Original issue: tendermint/tendermint#9881