-
Notifications
You must be signed in to change notification settings - Fork 37.7k
RPC: add new listmempooltransactions
#29016
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Allow bitcoin-core clients to get more interesting information out of the mempool more easily, including by filtering for 'latest txs'. Eventually would be a good interface for grabbing all mempool transaction data, including information about evicted transactions (which would be useful for avoiding situations where removed transactions contained data important for bitcoin applications). This would require the addition of a mempool tx log, however the `mempool_sequence` id plus this call and the use of a ring buffer(?) would allow for clients to get complete scans of anything a bitcoin core node had seen in its mempool. That's future work. For now we start with the interface over existing information.
Make the `listmempooltransactions` rpc available via the REST api
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code Coverage & BenchmarksFor details see: https://corecheck.dev/bitcoin/bitcoin/pulls/29016. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
|
||
if (verbose) { | ||
// We could also calculate fees etc for this transaction, but yolo. | ||
TxToUniv(e.GetTx(), /*block_hash=*/uint256::ZERO, /*entry=*/txentry, /*inclue_hex=*/false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like the tidy ci is getting upset here
rpc/mempool.cpp:462:82: error: argument name 'inclue_hex' in comment does not match parameter name 'include_hex' [bugprone-argument-comment,-warnings-as-errors]
462 | TxToUniv(e.GetTx(), /*block_hash=*/uint256::ZERO, /*entry=*/txentry, /*inclue_hex=*/false);
| ^~~~~~~~~~~~~~~
| /*include_hex=*/
./core_io.h:57:88: note: 'include_hex' declared here
57 | void TxToUniv(const CTransaction& tx, const uint256& block_hash, UniValue& entry, bool include_hex = true, bool without_witness = false, const CTxUndo* txundo = nullptr, TxVerbosity verbosity = TxVerbosity::SHOW_DETAILS);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, missing test coverage for the new command:
Uncovered RPC commands:
- listmempooltransactions
This seems like the best option imo. If we want to return the transaction hex (iiuc that should suffice if you are looking to get the tx data itself) we could add another level of verbosity to
Not sure I understand what the difficulty was?
Assuming you mean the sequence emitted from |
Can you say more about this divergence or point me to a doc that contains more info? I should add that one problem with the existing I attempted to find a rationale for this but was unsuccessful (some code reorgs had removed the original code commit in the blame). Re: returning the raw tx: that suffices but is suboptimal for an application that is scanning for particular output creation or input inclusion in a proposed tx as it creates additional work on the client. Speaking of computation, note that this approach omits the computation required for finding ancestor/descendent txs in the mempool; that seemed like unnecessary work for clients that are solely interested in just the tx info. |
Yes, will add test coverage when this approach gets a concept ack. Instead, working usage of this proposed code change has been included in the original PR |
Concept ACK |
1 similar comment
Concept ACK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming you mean the sequence emitted from Transaction{AddedTo,RemovedFrom}Mempool, note that this is not actually the case (they're usually off by 1 or more due to when the sequence increments happen).
Can you say more about this divergence or point me to a doc that contains more info?
It's more that they just happen to use the same counter, but that doesn't mean they match.
The mempool entry's sequence number is assigned here;
Lines 854 to 856 in fcacbab
const uint64_t entry_sequence = bypass_limits ? 0 : m_pool.GetSequence(); | |
entry.reset(new CTxMemPoolEntry(ptx, ws.m_base_fees, nAcceptTime, m_active_chainstate.m_chain.Height(), entry_sequence, | |
fSpendsCoinbase, nSigOpsCost, lock_points.value())); |
And the notification fires along with a counter increment here:
Line 1272 in fcacbab
GetMainSignals().TransactionAddedToMempool(tx_info, m_pool.GetAndIncrementSequence()); |
This means the notification is +1 of the entry sequence, +n if it's in a package. edit: Woops I might be wrong and that's the same number. A better example is the entry sequences are all the same but are incremented for the notifications in a package. The entry sequence is also hard coded to 0 for a transaction came from a reorg.
It doesn't look like they were meant to match up exactly, though this potential confusion was acknowledged prior to merge.
I should add that one problem with the existing
getrawmempool
RPC semantics is that currently requesting the mempool_sequence and asking for verbose results is specifically disallowed.I attempted to find a rationale for this but was unsuccessful (some code reorgs had removed the original code commit in the blame).
I'm not sure why that's the case either, the only context I can see is the edits in this comment after pushing those lines, which indicate it was causing a test failure. I can't open the travis link anymore. Pinging @instagibbs in case he remembers what happened 3+ years ago. Maybe there was a subtle bug, or maybe it was the easiest way to make the tests pass and return the results to the client without adding more params.
Re: returning the raw tx: that suffices but is suboptimal for an application that is scanning for particular output creation or input inclusion in a proposed tx as it creates additional work on the client.
Note that you can use gettxspendingprevout
to search for mempool transactions spending a particular output, since the mempool does index by that. A watchonly wallet will also scan for its transactions in mempool as they appear.
Speaking of computation, note that this approach omits the computation required for finding ancestor/descendent txs in the mempool; that seemed like unnecessary work for clients that are solely interested in just the tx info.
Don't worry, there's no extra computation, it's all cached in the mempool entry which is already pulled in.
Anyway, I think it makes sense to have more params/levels to control filters and results for getrawmempool
. It seems reasonable to have a higher verbosity level or more granular controls to give you the results you want.
Can you describe in more detail what args you want to pass and what results you want returned? I have no objection to making life easier for clients, it's just that the PR as it looks right now seems like we'd have 2 RPCs that do very similar things, and uses mempool entry sequence in a way that might be buggy.
// We skip anything not requested. | ||
if (e.GetSequence() < sequence_start) | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do note that a mempool tx from a recently disconnected block would be skipped here
I have some bad news, I do not. |
🐙 This pull request conflicts with the target branch and needs rebase. |
Are you still working on this? Looks like there are outstanding questions. |
Yes, haven't had time recently due to bitcoin++ work.
My use case is to scan for new transactions depositing to addresses or new scripts pertaining to a wallet which is not managed by core. Please see linked PR for CLN.
As mentioned in the PR, the data model that this returns is distinct from
Found an exception to this: package acceptance sequencing is broken and breaks this assumption. After some research, it turns out that we can't expose the mempool_sequence until we fix an existing bug in the package acceptance's mempool sequence assignment code. Currently, we assign every tx in a package the same sequence number, but notify via ZMQ using an incrmenting one. The problem stems from copying the behavior the single tx acceptance used for sequence assignment (which is quite well done) to the package context, which broke assumptions the single tx was able to rely on. This should be fixed regardless of the status of this PR, and should be considered a blocker imo for this being shipped. If someone else wants to fix it, that'd be great. |
Allow bitcoin-core clients to get more interesting information out of the mempool more easily, including by filtering for 'latest txs'. Eventually would be a good interface for grabbing all mempool transaction data, including information about evicted transactions (which would be useful for avoiding situations where removed transactions contained data important for bitcoin applications). This would require the addition of a mempool tx log, however the `mempool_sequence` id plus this call and the use of a ring buffer(?) would allow for clients to get complete scans of anything a bitcoin core node had seen in its mempool. That's future work. For now we start with the interface over existing information. Github-Pull: bitcoin#29016 Rebased-From: cb466f4
Make the `listmempooltransactions` rpc available via the REST api Github-Pull: bitcoin#29016 Rebased-From: 0700847
Are you still working on this? |
Yes but won't have any progress to show til May or so. Thanks |
⌛ There hasn't been much activity lately and the patch still needs rebase. What is the status here?
|
1 similar comment
⌛ There hasn't been much activity lately and the patch still needs rebase. What is the status here?
|
I've reviewed and tested the listmempooltransactions RPC implementation. Here's what I did:
The test verifies:
Feel free to cherry-pick the test from my branch. Let me know if you'd like me to make any adjustments to the test coverage. |
⌛ There hasn't been much activity lately and the patch still needs rebase. What is the status here?
|
Closing for now, due to lack of progress for more than a year. Please leave a comment, if you want this reopened. Anyone may also open a new pull request, referencing this one, and including a short summary of any relevant discussion comments from here. |
Allow bitcoin-core clients to get more interesting information out of the mempool more easily, including by filtering for 'latest txs'. Eventually would be a good interface for grabbing all mempool transaction data, including information about evicted transactions (which would be useful for avoiding situations where removed transactions contained data important for bitcoin applications). This would require the addition of a mempool tx log, however the `mempool_sequence` id plus this call and the use of a ring buffer(?) would allow for clients to get complete scans of anything a bitcoin core node had seen in its mempool. That's future work. For now we start with the interface over existing information. Github-Pull: bitcoin#29016 Rebased-From: cb466f4
Make the `listmempooltransactions` rpc available via the REST api Github-Pull: bitcoin#29016 Rebased-From: 0700847
Allow bitcoin-core clients to get more interesting information out of the mempool more easily, including by filtering for 'latest txs'. Eventually would be a good interface for grabbing all mempool transaction data, including information about evicted transactions (which would be useful for avoiding situations where removed transactions contained data important for bitcoin applications). This would require the addition of a mempool tx log, however the `mempool_sequence` id plus this call and the use of a ring buffer(?) would allow for clients to get complete scans of anything a bitcoin core node had seen in its mempool. That's future work. For now we start with the interface over existing information. Github-Pull: bitcoin#29016 Rebased-From: cb466f4
Make the `listmempooltransactions` rpc available via the REST api Github-Pull: bitcoin#29016 Rebased-From: 0700847
Proposed Update
Add a new RPC endpoint,
listmempooltransactions
. Takes as args astart_sequence
andverbose
.Returns:
entry_sequence
where each entry'sentry_sequence
>= the provided `start_sequence.entry_sequence
.Builds on work done in #19572.
Rationale
The current mempool RPCs are lacking an ability to scan for updates in a more efficient manner. You can subscribe for updates via the ZMQ pipeline, but even this is inefficient to recover from if your consumer app falls offline.
ZMQ also is a push protocol, and doesn't provide a rate-limiting mechanism.
In the case of core-lightning, we don't assume access to bitcoin-core/ZMQ, so we're unable to do efficient mempool querying. There have been some recent CVEs come to light where having optional access to mempool updates may prove useful.
Paging, filtering by last seen, and the addition of full tx data in the call makes mempool data more readily available to any client app. This is good for self-sovereignty as it reduces the need for additional dependencies and app data sources to efficiently query and parse mempool data.
Other Things I Considered
My initial attempt at this modified
getrawmempool
tostart_sequence
to allow for filtering, andThat was pretty ugly however, given that the current data model for
getrawmempool
is around a concept of "mempool entry" data. This new RPC in contrast only returns transaction data (and does not report on information about other mempool dependencies etc; for that you should still callgetrawmempool
orgetmempoolentry
).You could add a
start_sequence
to the existinggetrawmempool
to help with entry data paging/querying but that's secondary to my goals for fetching relevant txdata from the mempool.Additions/Changes
This PR could be further improved by
page_size
argument which allows a calling application to limit the number of results returned.Future Work
The current RPC only supports finding added mempool transactions. It'd be interesting to experiment with keeping track of evicted/not mined transactions and adding them to the results.
This would require:
sequence_action
field for results, which indicates whether the sequence is for the tx'saddition
oreviction
.You could also keep track of tx movements into blocks, but this seems less useful/urgent in general.
Usage
You can see an example of this implemented and used in CLN for finding paid onchain invoices in this branch: ElementsProject/lightning@master...niftynei:lightning:nifty/mempool-scan
Note that the CLN implementation doesn't currently keep the mempool_sequence in disk, so it'll reload/rescan the mempool at start. This may be desirable?
Here's how a caller would use this
Note that this works well with ZMQ as the
mempool_sequence
number is identical.