Skip to content

Conversation

niftynei
Copy link
Contributor

@niftynei niftynei commented Dec 6, 2023

Proposed Update

Add a new RPC endpoint, listmempooltransactions. Takes as args a start_sequence and verbose.

Returns:

  • if verbose false, list of txids + their entry_sequence where each entry's entry_sequence >= the provided `start_sequence.
  • if verbose true, raw tx output info including each entry's entry_sequence.

Builds on work done in #19572.

Rationale

The current mempool RPCs are lacking an ability to scan for updates in a more efficient manner. You can subscribe for updates via the ZMQ pipeline, but even this is inefficient to recover from if your consumer app falls offline.

ZMQ also is a push protocol, and doesn't provide a rate-limiting mechanism.

In the case of core-lightning, we don't assume access to bitcoin-core/ZMQ, so we're unable to do efficient mempool querying. There have been some recent CVEs come to light where having optional access to mempool updates may prove useful.

Paging, filtering by last seen, and the addition of full tx data in the call makes mempool data more readily available to any client app. This is good for self-sovereignty as it reduces the need for additional dependencies and app data sources to efficiently query and parse mempool data.

Other Things I Considered

My initial attempt at this modified getrawmempool to

  • take a start_sequence to allow for filtering, and
  • adding the txdata (inputs outpoints + outputs (amount + scriptPubKeys)).

That was pretty ugly however, given that the current data model for getrawmempool is around a concept of "mempool entry" data. This new RPC in contrast only returns transaction data (and does not report on information about other mempool dependencies etc; for that you should still call getrawmempool or getmempoolentry).

You could add a start_sequence to the existing getrawmempool to help with entry data paging/querying but that's secondary to my goals for fetching relevant txdata from the mempool.

Additions/Changes

This PR could be further improved by

  • Add a page_size argument which allows a calling application to limit the number of results returned.

Future Work

The current RPC only supports finding added mempool transactions. It'd be interesting to experiment with keeping track of evicted/not mined transactions and adding them to the results.

This would require:

  • An additional memory buffer (perhaps a configurable memory limited ring buffer?) for evicted tx data and the mempool sequence of the eviction.
  • Adding a sequence_action field for results, which indicates whether the sequence is for the tx's addition or eviction.

You could also keep track of tx movements into blocks, but this seems less useful/urgent in general.

Usage

You can see an example of this implemented and used in CLN for finding paid onchain invoices in this branch: ElementsProject/lightning@master...niftynei:lightning:nifty/mempool-scan

Note that the CLN implementation doesn't currently keep the mempool_sequence in disk, so it'll reload/rescan the mempool at start. This may be desirable?

Here's how a caller would use this

1) start node
2) listmempooltransactions 0 to get backlog
3) call listmempooltransactions again with `start_sequence` set to `mempool_sequence` result from last call

Note that this works well with ZMQ as the mempool_sequence number is identical.

Allow bitcoin-core clients to get more interesting information out
of the mempool more easily, including by filtering for 'latest txs'.

Eventually would be a good interface for grabbing all mempool
transaction data, including information about evicted transactions
(which would be useful for avoiding situations where removed
transactions contained data important for bitcoin applications). This
would require the addition of a mempool tx log, however the
`mempool_sequence` id plus this call and the use of a ring buffer(?)
would allow for clients to get complete scans of anything a bitcoin
core node had seen in its mempool.

That's future work. For now we start with the interface over existing
information.
Make the `listmempooltransactions` rpc available via the REST api
@DrahtBot
Copy link
Contributor

DrahtBot commented Dec 6, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/29016.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK kristapsk, bordalix

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #24748 (test/BIP324: functional tests for v2 P2P encryption by stratospher)
  • #19463 (Prune locks by luke-jr)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.


if (verbose) {
// We could also calculate fees etc for this transaction, but yolo.
TxToUniv(e.GetTx(), /*block_hash=*/uint256::ZERO, /*entry=*/txentry, /*inclue_hex=*/false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like the tidy ci is getting upset here

rpc/mempool.cpp:462:82: error: argument name 'inclue_hex' in comment does not match parameter name 'include_hex' [bugprone-argument-comment,-warnings-as-errors]
  462 |             TxToUniv(e.GetTx(), /*block_hash=*/uint256::ZERO, /*entry=*/txentry, /*inclue_hex=*/false);
      |                                                                                  ^~~~~~~~~~~~~~~
      |                                                                                  /*include_hex=*/
./core_io.h:57:88: note: 'include_hex' declared here
   57 | void TxToUniv(const CTransaction& tx, const uint256& block_hash, UniValue& entry, bool include_hex = true, bool without_witness = false, const CTxUndo* txundo = nullptr, TxVerbosity verbosity = TxVerbosity::SHOW_DETAILS);

Copy link
Member

@pablomartin4btc pablomartin4btc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, missing test coverage for the new command:

Uncovered RPC commands:
  - listmempooltransactions

@glozow
Copy link
Member

glozow commented Dec 18, 2023

My initial attempt at this modified getrawmempool to

* take a `start_sequence` to allow for filtering, and

* adding the txdata (inputs outpoints + outputs (amount + scriptPubKeys)).

This seems like the best option imo. If we want to return the transaction hex (iiuc that should suffice if you are looking to get the tx data itself) we could add another level of verbosity to getrawmempool / getmempoolentry.

That was pretty ugly however, given that the current data model for getrawmempool is around a concept of "mempool entry" data. This returns transaction data (and does not report on information about other mempool dependencies etc; for that you should still call getrawmempool or getmempoolentry).

Not sure I understand what the difficulty was? getrawmempool should give dependencies if you use verbose (see the "depends" field).

Note that this works well with ZMQ as the mempool_sequence number is identical.

Assuming you mean the sequence emitted from Transaction{AddedTo,RemovedFrom}Mempool, note that this is not actually the case (they're usually off by 1 or more due to when the sequence increments happen).

@DrahtBot DrahtBot mentioned this pull request Dec 19, 2023
@niftynei
Copy link
Contributor Author

niftynei commented Jan 7, 2024

Assuming you mean the sequence emitted from Transaction{AddedTo,RemovedFrom}Mempool, note that this is not actually the case (they're usually off by 1 or more due to when the sequence increments happen).

Can you say more about this divergence or point me to a doc that contains more info?

I should add that one problem with the existing getrawmempool RPC semantics is that currently requesting the mempool_sequence and asking for verbose results is specifically disallowed.

I attempted to find a rationale for this but was unsuccessful (some code reorgs had removed the original code commit in the blame).

Re: returning the raw tx: that suffices but is suboptimal for an application that is scanning for particular output creation or input inclusion in a proposed tx as it creates additional work on the client.

Speaking of computation, note that this approach omits the computation required for finding ancestor/descendent txs in the mempool; that seemed like unnecessary work for clients that are solely interested in just the tx info.

@niftynei
Copy link
Contributor Author

niftynei commented Jan 7, 2024

Also, missing test coverage for the new command:

Yes, will add test coverage when this approach gets a concept ack.

Instead, working usage of this proposed code change has been included in the original PR

@kristapsk
Copy link
Contributor

Concept ACK

1 similar comment
@bordalix
Copy link

bordalix commented Jan 7, 2024

Concept ACK

Copy link
Member

@glozow glozow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming you mean the sequence emitted from Transaction{AddedTo,RemovedFrom}Mempool, note that this is not actually the case (they're usually off by 1 or more due to when the sequence increments happen).

Can you say more about this divergence or point me to a doc that contains more info?

It's more that they just happen to use the same counter, but that doesn't mean they match.
The mempool entry's sequence number is assigned here;

bitcoin/src/validation.cpp

Lines 854 to 856 in fcacbab

const uint64_t entry_sequence = bypass_limits ? 0 : m_pool.GetSequence();
entry.reset(new CTxMemPoolEntry(ptx, ws.m_base_fees, nAcceptTime, m_active_chainstate.m_chain.Height(), entry_sequence,
fSpendsCoinbase, nSigOpsCost, lock_points.value()));

And the notification fires along with a counter increment here:
GetMainSignals().TransactionAddedToMempool(tx_info, m_pool.GetAndIncrementSequence());

This means the notification is +1 of the entry sequence, +n if it's in a package. edit: Woops I might be wrong and that's the same number. A better example is the entry sequences are all the same but are incremented for the notifications in a package. The entry sequence is also hard coded to 0 for a transaction came from a reorg.
It doesn't look like they were meant to match up exactly, though this potential confusion was acknowledged prior to merge.

I should add that one problem with the existing getrawmempool RPC semantics is that currently requesting the mempool_sequence and asking for verbose results is specifically disallowed.

I attempted to find a rationale for this but was unsuccessful (some code reorgs had removed the original code commit in the blame).

I'm not sure why that's the case either, the only context I can see is the edits in this comment after pushing those lines, which indicate it was causing a test failure. I can't open the travis link anymore. Pinging @instagibbs in case he remembers what happened 3+ years ago. Maybe there was a subtle bug, or maybe it was the easiest way to make the tests pass and return the results to the client without adding more params.

Re: returning the raw tx: that suffices but is suboptimal for an application that is scanning for particular output creation or input inclusion in a proposed tx as it creates additional work on the client.

Note that you can use gettxspendingprevout to search for mempool transactions spending a particular output, since the mempool does index by that. A watchonly wallet will also scan for its transactions in mempool as they appear.

Speaking of computation, note that this approach omits the computation required for finding ancestor/descendent txs in the mempool; that seemed like unnecessary work for clients that are solely interested in just the tx info.

Don't worry, there's no extra computation, it's all cached in the mempool entry which is already pulled in.

Anyway, I think it makes sense to have more params/levels to control filters and results for getrawmempool. It seems reasonable to have a higher verbosity level or more granular controls to give you the results you want.

Can you describe in more detail what args you want to pass and what results you want returned? I have no objection to making life easier for clients, it's just that the PR as it looks right now seems like we'd have 2 RPCs that do very similar things, and uses mempool entry sequence in a way that might be buggy.

Comment on lines +454 to +456
// We skip anything not requested.
if (e.GetSequence() < sequence_start)
continue;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do note that a mempool tx from a recently disconnected block would be skipped here

@instagibbs
Copy link
Member

what happened 3+ years ago

I have some bad news, I do not.

@DrahtBot
Copy link
Contributor

🐙 This pull request conflicts with the target branch and needs rebase.

@maflcko
Copy link
Member

maflcko commented Feb 22, 2024

Are you still working on this? Looks like there are outstanding questions.

@niftynei
Copy link
Contributor Author

niftynei commented Feb 23, 2024

Yes, haven't had time recently due to bitcoin++ work.

Note that you can use gettxspendingprevout to search for mempool transactions spending a particular output

My use case is to scan for new transactions depositing to addresses or new scripts pertaining to a wallet which is not managed by core. Please see linked PR for CLN.

to have more params/levels to control filters and results for getrawmempool.

As mentioned in the PR, the data model that this returns is distinct from getrawmempool. This is specifically for transaction model data; getrawmempool's data model is around mempool entries.

Note that this works well with ZMQ as the mempool_sequence number is identical.

Found an exception to this: package acceptance sequencing is broken and breaks this assumption.

After some research, it turns out that we can't expose the mempool_sequence until we fix an existing bug in the package acceptance's mempool sequence assignment code.

Currently, we assign every tx in a package the same sequence number, but notify via ZMQ using an incrmenting one. The problem stems from copying the behavior the single tx acceptance used for sequence assignment (which is quite well done) to the package context, which broke assumptions the single tx was able to rely on. This should be fixed regardless of the status of this PR, and should be considered a blocker imo for this being shipped. If someone else wants to fix it, that'd be great.

luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this pull request Mar 13, 2024
Allow bitcoin-core clients to get more interesting information out
of the mempool more easily, including by filtering for 'latest txs'.

Eventually would be a good interface for grabbing all mempool
transaction data, including information about evicted transactions
(which would be useful for avoiding situations where removed
transactions contained data important for bitcoin applications). This
would require the addition of a mempool tx log, however the
`mempool_sequence` id plus this call and the use of a ring buffer(?)
would allow for clients to get complete scans of anything a bitcoin
core node had seen in its mempool.

That's future work. For now we start with the interface over existing
information.

Github-Pull: bitcoin#29016
Rebased-From: cb466f4
luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this pull request Mar 13, 2024
Make the `listmempooltransactions` rpc available via the REST api

Github-Pull: bitcoin#29016
Rebased-From: 0700847
@achow101
Copy link
Member

achow101 commented Apr 9, 2024

Are you still working on this?

@niftynei
Copy link
Contributor Author

Yes but won't have any progress to show til May or so. Thanks

@fanquake fanquake marked this pull request as draft April 11, 2024 07:16
@DrahtBot
Copy link
Contributor

DrahtBot commented Jul 9, 2024

⌛ There hasn't been much activity lately and the patch still needs rebase. What is the status here?

  • Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
  • Is it no longer relevant? ➡️ Please close.
  • Did the author lose interest or time to work on this? ➡️ Please close it and mark it 'Up for grabs' with the label, so that it can be picked up in the future.

1 similar comment
@DrahtBot
Copy link
Contributor

DrahtBot commented Oct 6, 2024

⌛ There hasn't been much activity lately and the patch still needs rebase. What is the status here?

  • Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
  • Is it no longer relevant? ➡️ Please close.
  • Did the author lose interest or time to work on this? ➡️ Please close it and mark it 'Up for grabs' with the label, so that it can be picked up in the future.

@AndreaDiazCorreia
Copy link

I've reviewed and tested the listmempooltransactions RPC implementation. Here's what I did:

The test verifies:

  • Basic mempool sequence behavior
  • Transaction tracking
  • Sequence filtering functionality
  • Results structure in both normal and verbose modes

Feel free to cherry-pick the test from my branch.

Let me know if you'd like me to make any adjustments to the test coverage.

@DrahtBot
Copy link
Contributor

⌛ There hasn't been much activity lately and the patch still needs rebase. What is the status here?

  • Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
  • Is it no longer relevant? ➡️ Please close.
  • Did the author lose interest or time to work on this? ➡️ Please close it and mark it 'Up for grabs' with the label, so that it can be picked up in the future.

@maflcko
Copy link
Member

maflcko commented Feb 17, 2025

Closing for now, due to lack of progress for more than a year. Please leave a comment, if you want this reopened. Anyone may also open a new pull request, referencing this one, and including a short summary of any relevant discussion comments from here.

@maflcko maflcko closed this Feb 17, 2025
luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this pull request Feb 22, 2025
Allow bitcoin-core clients to get more interesting information out
of the mempool more easily, including by filtering for 'latest txs'.

Eventually would be a good interface for grabbing all mempool
transaction data, including information about evicted transactions
(which would be useful for avoiding situations where removed
transactions contained data important for bitcoin applications). This
would require the addition of a mempool tx log, however the
`mempool_sequence` id plus this call and the use of a ring buffer(?)
would allow for clients to get complete scans of anything a bitcoin
core node had seen in its mempool.

That's future work. For now we start with the interface over existing
information.

Github-Pull: bitcoin#29016
Rebased-From: cb466f4
luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this pull request Feb 22, 2025
Make the `listmempooltransactions` rpc available via the REST api

Github-Pull: bitcoin#29016
Rebased-From: 0700847
luke-jr pushed a commit to luke-jr/bitcoin that referenced this pull request Jun 6, 2025
Allow bitcoin-core clients to get more interesting information out
of the mempool more easily, including by filtering for 'latest txs'.

Eventually would be a good interface for grabbing all mempool
transaction data, including information about evicted transactions
(which would be useful for avoiding situations where removed
transactions contained data important for bitcoin applications). This
would require the addition of a mempool tx log, however the
`mempool_sequence` id plus this call and the use of a ring buffer(?)
would allow for clients to get complete scans of anything a bitcoin
core node had seen in its mempool.

That's future work. For now we start with the interface over existing
information.

Github-Pull: bitcoin#29016
Rebased-From: cb466f4
luke-jr pushed a commit to luke-jr/bitcoin that referenced this pull request Jun 6, 2025
Make the `listmempooltransactions` rpc available via the REST api

Github-Pull: bitcoin#29016
Rebased-From: 0700847
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.