Skip to content

Conversation

TheQuantumPhysicist
Copy link
Contributor

@TheQuantumPhysicist TheQuantumPhysicist commented Apr 19, 2022

This PR separates the storage abstraction layer (e.g., leveldb layer) from the bitcoin core application layer. This has many benefits:

  • Easier testing for the bitcoin core application layer without requiring an open database
  • Easier replacement for the storage library in case it's decided to do it in the future with a simple implementation of an interface
  • Ensure that any other future implementations can live to the same standard we have now by testing different implementations the same way
  • Better interface description to understand the expectations from the implementations of the interface
  • Much easier bench-marking separate from serialization

This change has caused zero modification to the bitcoin core application source code, as I deliberately maintained everything from the outside. This change is a drop-in replacement to everything that uses leveldb.

Things I failed to do without breaking backwards compatibility, so I skipped:

  • Use DBIndex to decide the prefix of the storage. The problem here is that some key/value database systems provide other mechanisms for separating between different database "tables", so to say. In our case, we use a char as a prefix. For example, lmdb provides an integer index. The problem here is that when keys are submitted as a Span, prepending a prefix is a relatively expensive operation as it requires copying the whole key. I don't see how we can do this without that. This is a conflict between the serialization and the storage layer that I couldn't resolve.

Things I'd love to achieve in the next PR(s):

  • Create an in-memory database that can easily replace leveldb and assist in tests (Given that tests use in-memory db as was pointed out in the comments, we may or may not do this)
  • Move the block and block-undo storage behind this interface and use DBIndex as a way to tell the implementation how to store the data (in files or in leveldb)

@jamesob
Copy link
Contributor

jamesob commented Apr 19, 2022

Create an in-memory database that can easily replace leveldb and assist in tests

FWIW, unittests already run leveldb in-memory: https://github.com/bitcoin/bitcoin/blob/master/src/test/util/setup_common.cpp#L207-L208

@DrahtBot
Copy link
Contributor

DrahtBot commented Apr 19, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #25202 (log: Use severity-based logging for leveldb/libevent messages, reverse LogPrintLevel order by laanwj)
  • #25172 (refactor: use std:: prefix for std lib funcs by fanquake)
  • #24232 (assumeutxo: add init and completion logic by jamesob)
  • #22663 (dbwrapper: properly destroy objects in case CDBWrapper ctor throws by Crypt-iQ)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@TheQuantumPhysicist
Copy link
Contributor Author

TheQuantumPhysicist commented Apr 19, 2022

The functional test wallet_send.py seems to be flaky, unrelated to this PR.

@luke-jr
Copy link
Member

luke-jr commented Apr 19, 2022

Not sure if this is desirable. LevelDB is de facto part of the consensus layer, and may not be safe to just substitute willy-nilly.

@TheQuantumPhysicist
Copy link
Contributor Author

LevelDB is de facto part of the consensus layer

In fact, this PR is the first step to solve that problem. LevelDB should be only a key/value store. While I totally understand your concern, this PR will make it possible to quantify our reliance on LevelDB specifics, if any. In no way is this PR meant to advocate for substituting LevelDB in a negligent fashion. We advocate only for a correct design to better understand the bitcoin protocol.

@luke-jr
Copy link
Member

luke-jr commented Apr 19, 2022

It's NOT solvable. Adding another layer in the middle can only introduce problems, not solve it.

@TheQuantumPhysicist
Copy link
Contributor Author

It's NOT solvable. Adding another layer in the middle can only introduce problems, not solve it.

Even though I understand your concerns, the claim that this is not solvable requires evidence, which is lacking here. I'm not even fully onboard on the idea that LevelDB is part of the "consensus layer", but I understand that LevelDB is part of the consensus library; after all it's a key/value store and it should be just that. The solution to this fear, however, is not keeping it untouched (because it was touched before, quite often over the last decade). But the solution is to quantify how it works and make it part of the protocol and run proper tests, like it has always been done in this repo.

And I'm happy to do more to make this PR more appealing and address any concerns, whether it's tests or something else.

@DrahtBot
Copy link
Contributor

🐙 This pull request conflicts with the target branch and needs rebase.

Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a "draft".

@achow101
Copy link
Member

Closing this as it has not had any activity in a while. If you are interested in continuing work on this, please leave a comment so that it can be reopened.

@achow101 achow101 closed this Oct 12, 2022
@TheQuantumPhysicist
Copy link
Contributor Author

Closing this as it has not had any activity in a while. If you are interested in continuing work on this, please leave a comment so that it can be reopened.

I am interested. But no one seems to have said anything for me to react.

@ryanofsky
Copy link
Contributor

Reopening so this can be discussed. The code change seems very simple, but more reviewers need to weigh in on whether having this abstraction is good or bad.

@TheQuantumPhysicist, since the first listed benefit of the change is "Easier testing for the bitcoin core application layer" maybe you could motivate this change a little more by adding new tests that take advantage of the abstraction.

@ryanofsky ryanofsky reopened this Oct 12, 2022
@TheQuantumPhysicist
Copy link
Contributor Author

Reopening so this can be discussed. The code change seems very simple, but more reviewers need to weigh in on whether having this abstraction is good or bad.

@TheQuantumPhysicist, since the first listed benefit of the change is "Easier testing for the bitcoin core application layer" maybe you could motivate this change a little more by adding new tests that take advantage of the abstraction.

I'm more than happy to add tests once there's agreement on the PR. Quite frankly I'm a little disappointed that no one cared, which is OK, but let's not get ahead of ourselves adding more code and work before people show the least interest.

And to get on the practicality train, the reviewer can see now there's abstract interfaces now for storage and iteration over storage, which can be used to define behavior or even mock the storage to simulate errors with something like gmock. It's not hard to see how interface-behavior can be well-defined and tested.

Cheers!

@achow101
Copy link
Member

I am interested. But no one seems to have said anything for me to react.

This PR has needed rebase for a while. Often reviewers will not look at PRs that need rebasing as such review will be invalidated following rebase, and may also no longer be applicable.

@TheQuantumPhysicist
Copy link
Contributor Author

I am interested. But no one seems to have said anything for me to react.

This PR has needed rebase for a while. Often reviewers will not look at PRs that need rebasing as such review will be invalidated following rebase, and may also no longer be applicable.

I'll take a look and see if It's not needed anymore.

@bitcoin bitcoin locked and limited conversation to collaborators Oct 29, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants