High-level tracking issue for general storage optimization efforts. This issue can be expanded over time.
At present (mid-2023), depending on their configuration, Tendermint-based nodes use large quantities of storage space, which has significant cost implications for operators. We aim to implement strategies that reduce and/or offload stored data in order to lower operators' costs.
The two main problems in the CometBFT storage layer are:
- A very large storage footprint
- Querying stored data (whether serving RPC queries or Comet retrieving its own consensus data structures) is not optimized and has in some cases proven to be very inefficient
To address these problems, we first need to build an understanding of:
- Workloads: what we store, how frequently we access it, and the characteristics of the stored data (this list will be expanded).
- The database backend: its features, design goals, and optimization possibilities.
The work to be done can be broken down into the following main subsections:
- Understand and simplify CometBFT database backend #48
The end result of this work should be CometBFT optimized for a single storage backend, ultimately yielding a significant reduction in both storage access time and on-disk storage footprint.
To reach this goal, we envision the following steps:
- Define external users and use cases for the CometBFT storage layer #68
Preliminary investigation to identify the users and workloads of the storage backend (how they query the nodes, their common pain points with regard to storage, and a collection of issues to address).
- Establish the baseline and future requirements for the storage backend #63
- Benchmark current storage improvements #1044
- Understand the workload: what data we store and in what format
- Establish and implement the relevant metrics to understand storage workloads #46 (see the metrics sketch after this list)
- Storage evaluation baseline: measure and report current storage related behaviour of CometBFT #67
- Evaluate database engines according to requirements and decide which one to optimize #64
- Refactor CometBFT to use a single underlying database #1039
- Add support for users to migrate to the chosen backend
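As a possible starting point for the metrics work (#46), here is a minimal sketch of storage-workload metrics using the Prometheus Go client. The metric names, labels, and buckets are illustrative assumptions, not CometBFT's existing metric set:

```go
// Hypothetical storage-workload metrics; names and labels are illustrative.
package storagemetrics

import (
	"github.com/prometheus/client_golang/prometheus"
)

// Metrics counts reads/writes per store and tracks value sizes so that
// access patterns can be characterised before choosing a backend.
type Metrics struct {
	Reads      *prometheus.CounterVec
	Writes     *prometheus.CounterVec
	ValueBytes *prometheus.HistogramVec
}

func NewMetrics(reg prometheus.Registerer) *Metrics {
	m := &Metrics{
		Reads: prometheus.NewCounterVec(prometheus.CounterOpts{
			Namespace: "cometbft",
			Subsystem: "storage",
			Name:      "reads_total",
			Help:      "Number of read operations, per store.",
		}, []string{"store"}), // e.g. "blockstore", "state", "tx_index"
		Writes: prometheus.NewCounterVec(prometheus.CounterOpts{
			Namespace: "cometbft",
			Subsystem: "storage",
			Name:      "writes_total",
			Help:      "Number of write operations, per store.",
		}, []string{"store"}),
		ValueBytes: prometheus.NewHistogramVec(prometheus.HistogramOpts{
			Namespace: "cometbft",
			Subsystem: "storage",
			Name:      "value_size_bytes",
			Help:      "Size distribution of written values, per store.",
			Buckets:   prometheus.ExponentialBuckets(64, 4, 10), // 64 B .. ~16 MB
		}, []string{"store"}),
	}
	reg.MustRegister(m.Reads, m.Writes, m.ValueBytes)
	return m
}

// Instrumenting a write path could then look like:
//   metrics.Writes.WithLabelValues("blockstore").Inc()
//   metrics.ValueBytes.WithLabelValues("blockstore").Observe(float64(len(value)))
```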
Tune CometBFT to address storage-related bottlenecks
Part of this section covers addressing issues found during the benchmarking and investigation process outlined above; another part addresses concrete issues reported by users. While some of these issues cannot be fully addressed before the analysis above, certain optimizations can be performed on CometBFT as it is today; these are marked with *.
- storage: Alternative representation of the genesis file. #1037 *
The genesis file can be large and surpass internal DB file size limitations (3 GB for RocksDB).
- Batch commits to the state and block store #1040 * (see the sketch after this list)
- Reconsider representation of keys for state and block store #1041
- Reconstruct state using iterators rather than storing it as a single entry (depends on the previous point)
- Pruning of the block store is not reflected in the storage used
- Investigate why Tendermint disk storage keeps growing over time informalsystems/interchain#1
- Add in-process compaction support to databases #49
- storage+indexer: Indexer is not pruned #169
For users, pruning the indexer appears to be neither as high a priority nor as well understood as reducing its footprint and mitigating the potential DoS vector that querying it presents.
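To make the batching and iterator items above concrete, here is a minimal sketch using the cometbft-db API: several height-keyed entries are flushed in one synced batch and then read back through a range iterator. The key layout and values are illustrative, not CometBFT's actual block-store schema:

```go
// Sketch: one fsync'd batch write instead of one write per key, and a range
// iterator instead of a single large entry. Keys/values are placeholders.
package main

import (
	"fmt"

	dbm "github.com/cometbft/cometbft-db"
)

func main() {
	db, err := dbm.NewDB("blockstore-demo", dbm.GoLevelDBBackend, "/tmp/demo")
	if err != nil {
		panic(err)
	}
	defer db.Close()

	// Write several height-keyed entries atomically in one batch,
	// syncing to disk once instead of once per key.
	batch := db.NewBatch()
	defer batch.Close()
	for h := int64(1); h <= 3; h++ {
		key := []byte(fmt.Sprintf("H:%020d", h)) // zero-padded so byte order == height order
		if err := batch.Set(key, []byte(fmt.Sprintf("block-meta-%d", h))); err != nil {
			panic(err)
		}
	}
	if err := batch.WriteSync(); err != nil {
		panic(err)
	}

	// Read the same range back with an iterator rather than loading one big value.
	it, err := db.Iterator([]byte("H:"), []byte("H;")) // ';' is the byte after ':'
	if err != nil {
		panic(err)
	}
	defer it.Close()
	for ; it.Valid(); it.Next() {
		fmt.Printf("%s => %s\n", it.Key(), it.Value())
	}
	if err := it.Error(); err != nil {
		panic(err)
	}
}
```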
CometBFT stores and allows querying of data not essential for consensus
We need to identify the functionalities we want to support within Tendermint and offload non-critical data and functionality.
- Implement ADR-101 PoC targeting `main` #816
This implementation provides users with an API to implement their own event indexing and to prune the full nodes that currently store events.
- Write a data companion based on ADR 101 (sketched below)
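A data companion along the lines of ADR 101 could, as a rough sketch, tail the node over RPC and ingest block results into its own store before asking the node to prune them. The sketch below assumes the v0.37/v0.38-era rpc/client/http package; the actual pruning call is left as a comment because its API is defined by ADR 101 / #816, not here:

```go
// Sketch of a data companion loop: poll the node, pull block results,
// and (in a real companion) persist them externally before signalling
// that the node may prune them.
package main

import (
	"context"
	"log"
	"time"

	rpchttp "github.com/cometbft/cometbft/rpc/client/http"
)

func main() {
	client, err := rpchttp.New("http://localhost:26657", "/websocket")
	if err != nil {
		log.Fatal(err)
	}
	ctx := context.Background()

	var next int64 = 1
	for {
		status, err := client.Status(ctx)
		if err != nil {
			log.Printf("status: %v", err)
			time.Sleep(time.Second)
			continue
		}
		latest := status.SyncInfo.LatestBlockHeight

		for ; next <= latest; next++ {
			h := next
			res, err := client.BlockResults(ctx, &h)
			if err != nil {
				log.Printf("block results at %d: %v", h, err)
				break
			}
			// In a real companion: persist res (events, tx results) to external
			// storage here, then advance a retain height on the node so it can
			// prune what has been safely ingested (API per ADR 101).
			log.Printf("ingested height %d (%d txs)", res.Height, len(res.TxsResults))
		}

		time.Sleep(time.Second)
	}
}
```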
CometBFT currently maintains its own [WAL](https://github.com/cometbft/cometbft/blob/101bf50e715d6a10c8135392166c35bdae94972e/consensus/wal.go) - is this even necessary, given that the underlying database should actually be taking care of this? It is another source of complexity and potential point of failure in the system that the team has to maintain.
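To illustrate the durability argument, the toy sketch below uses a synchronous cometbft-db write as the fsync-on-write primitive that the consensus WAL otherwise has to provide; it is only an illustration of the claim, not a proposal to drop the WAL:

```go
// Toy illustration: SetSync returns only after the backend has flushed the
// write to disk, so the value would survive a crash immediately afterwards.
package main

import (
	dbm "github.com/cometbft/cometbft-db"
)

func main() {
	db, err := dbm.NewDB("wal-demo", dbm.GoLevelDBBackend, "/tmp/wal-demo")
	if err != nil {
		panic(err)
	}
	defer db.Close()

	// Hypothetical key/value standing in for a consensus message that would
	// otherwise be appended to the WAL for crash recovery.
	msg := []byte("consensus message to be replayed after a crash")
	if err := db.SetSync([]byte("cs:last-msg"), msg); err != nil {
		panic(err)
	}
}
```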
Original issue: tendermint/tendermint#9881