-
Notifications
You must be signed in to change notification settings - Fork 37.7k
Add feerate histogram to getmempoolinfo #15836
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
2f27f15
to
80fbf80
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK.
I think that out-of-the-box we can expose some stats like this.
I think it may be useful to include the current timestamp in the response - some client can just run a cron script to call getmempoolinfo
and store it (without changing the JSON response).
throw std::runtime_error( | ||
RPCHelpMan{"getmempoolinfo", | ||
"\nReturns details on the active state of the TX memory pool.\n", | ||
{}, | ||
{ | ||
{"with_fee_histogram", RPCArg::Type::BOOL, /* default */ "false", "True for including the fee histogram in the response"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of this parameter, it could have fee_histogram_bins
(that defaults to []
which means no histogram is included in the response). This would replace the above feelimits
and also avoids breaking clients implementation.
src/rpc/blockchain.cpp
Outdated
std::vector<uint64_t> count(feelimits.size(), 0); | ||
std::vector<uint64_t> fees(feelimits.size(), 0); | ||
|
||
LOCK(pool.cs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we should move this up (done in #15474).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That pull was merged, please rebase and remove this lock.
@@ -629,7 +639,8 @@ static const struct { | |||
{"/rest/block/notxdetails/", rest_block_notxdetails}, | |||
{"/rest/block/", rest_block_extended}, | |||
{"/rest/chaininfo", rest_chaininfo}, | |||
{"/rest/mempool/info", rest_mempool_info}, | |||
{"/rest/mempool/info", rest_mempool_info_basic}, | |||
{"/rest/mempool/info/with_fee_histogram", rest_mempool_info_with_fee_histogram}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we just start to use query parameters?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would eventually be better but not scope of this PR (following the current scheme).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still voting against query parameters. With REST the general preference seems to be to turn parameters into URL segments, and query parameters tend to be avoided because they look ugly and are hard to remember.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there's a "standard" here but with REST usually the URL path identifies a resource, a collection of resources, or an action - the verb is also relevant. But parameters are usually set in the URL query, order independent and can be optional. I also think this is more flexible, for instance, you could support ...?verbose=true
in all endpoints (just an example).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This must be before the above line (order is important) otherwise rest_mempool_info_with_fee_histogram
is never called.
|
||
// distribute feerates into feelimits | ||
for (size_t i = 0; i < feelimits.size(); i++) { | ||
if (feeperbyte >= feelimits[i] && (i == feelimits.size() - 1 || feeperbyte < feelimits[i + 1])) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct me if I'm wrong but if feelimits
is sorted then && (i == feelimits.size() - 1 || feeperbyte < feelimits[i + 1])
is not necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Beside, it could avoid linear search by using std::find
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct me if I'm wrong but if
feelimits
is sorted then&& (i == feelimits.size() - 1 || feeperbyte < feelimits[i + 1])
is not necessary.
Yes, but then then for loop on line 1532 should be in reverse order: for (int i = feelimits.size() - 1; i >= 0; i--) {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK. Useful addition 👍 . Tested RPC output and help man output. Agree with @promag on the current timestamp. Perhaps go with a name-based argument e.g. fee_histogram=true
from the start?
|
||
// distribute feerates into feelimits | ||
for (size_t i = 0; i < feelimits.size(); i++) { | ||
if (feeperbyte >= feelimits[i] && (i == feelimits.size() - 1 || feeperbyte < feelimits[i + 1])) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be efficient to memoize feelimits.size() - 1
? (if the compiler doesn't optimize it automatically, my C++ is rusty)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If && (i == feelimits.size() - 1 || feeperbyte < feelimits[i + 1])
can be removed, the dependency on feelimits
being sorted would need a regression test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be efficient to memoize
feelimits.size() - 1
? (if the compiler doesn't optimize it automatically, my C++ is rusty)
It shouldn't impact performance either way.
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
@@ -1496,16 +1496,76 @@ UniValue MempoolInfoToJSON(const CTxMemPool& pool) | |||
ret.pushKV("mempoolminfee", ValueFromAmount(std::max(pool.GetMinFee(maxmempool), ::minRelayTxFee).GetFeePerK())); | |||
ret.pushKV("minrelaytxfee", ValueFromAmount(::minRelayTxFee.GetFeePerK())); | |||
|
|||
if (with_fee_histogram) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe move this directly into getmempoolinfo
? Or another helper?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about another call, but extending mempoolinfo
with an option for "more data" seems to be most allied with other calls where one can get more extended infos on option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean just have the code outside this function. The RPC would then call both MempoolInfoToJSON
and also JSONMempoolInfoAddHistogram
(or whatever this code gets called)
80fbf80
to
c97a9dd
Compare
c97a9dd
to
5d47656
Compare
Concept ACK / tACK 9ef9325 |
Concept ACK. This gives me a new warning on build:
|
5d47656
to
2f97b31
Compare
Fixed the lock issue. |
2f97b31
to
b94292a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -0,0 +1,2 @@ | |||
// Add predefined macros for your project here. For example: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove these files and maybe update .gitignore?
@@ -629,7 +639,8 @@ static const struct { | |||
{"/rest/block/notxdetails/", rest_block_notxdetails}, | |||
{"/rest/block/", rest_block_extended}, | |||
{"/rest/chaininfo", rest_chaininfo}, | |||
{"/rest/mempool/info", rest_mempool_info}, | |||
{"/rest/mempool/info", rest_mempool_info_basic}, | |||
{"/rest/mempool/info/with_fee_histogram", rest_mempool_info_with_fee_histogram}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This must be before the above line (order is important) otherwise rest_mempool_info_with_fee_histogram
is never called.
Needs rebase |
info_sub.pushKV("from_feerate", feelimits[i]); | ||
info_sub.pushKV("to_feerate", i == feelimits.size() - 1 ? std::numeric_limits<int64_t>::max() : feelimits[i + 1]); | ||
total_fees += fees[i]; | ||
info.pushKV(std::to_string(feelimits[i]), info_sub); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should change this to use ToString
Concept ACK, it would be super useful |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK. This would be useful.
I tested the rpc, help man output and created a plot for the output from my node: mempool.png
If anyone else wants to try this i used this script: https://gist.github.com/dergoegge/ec73e0de2d858d2e75bf31a6a7b3e6b2
@jonasschnelli I also rebased this for you here: https://github.com/dergoegge/bitcoin/tree/histogram_rebase
} | ||
} | ||
CAmount total_fees = 0; //track total amount of available fees in mempool | ||
UniValue info(UniValue::VOBJ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JSON objects are unordered collections, so maybe using an array for "fee_histogram" would make more sense since it would always stay sorted.
Concept ACK, I will be using this if available for both bwt and esplora/electrs. Electrum Personal Server can also benefit from it. |
@jonasschnelli This https://github.com/kiminuo/bitcoin/tree/feature/2021-03-Feerate-histogram is an attempt to do the rebase work and apply a few review comments: Applied review comments
Test commands $ ./bitcoin-cli -testnet getmempoolinfo true # To test the new behavior $ test/functional/test_runner.py mempool_fee_histogram.py # To run the new test $ ./bitcoin-cli -testnet help getmempoolinfo # bitcoind has to run for this command to succeed :(
getmempoolinfo ( with_fee_histogram )
Returns details on the active state of the TX memory pool.
Arguments:
1. with_fee_histogram (boolean, optional, default=false) True for including the fee histogram in the response
Result:
{ (json object)
"loaded" : true|false, (boolean) True if the mempool is fully loaded
"size" : n, (numeric) Current tx count
"bytes" : n, (numeric) Sum of all virtual transaction sizes as defined in BIP 141. Differs from actual serialized size because witness data is discounted
"usage" : n, (numeric) Total memory usage for the mempool
"total_fee" : n, (numeric) Total fees for the mempool in BTC, ignoring modified fees through prioritizetransaction
"maxmempool" : n, (numeric) Maximum memory usage for the mempool
"mempoolminfee" : n, (numeric) Minimum fee rate in BTC/kB for tx to be accepted. Is the maximum of minrelaytxfee and minimum mempool fee
"minrelaytxfee" : n, (numeric) Current minimum relay fee for transactions
"unbroadcastcount" : n, (numeric) Current number of transactions that haven't passed initial broadcast yet
"fee_histogram" : { (json object)
"<feerate-group>" : { (json object) Object per feerate group
"sizes" : n, (numeric) Cumulated size of all transactions in feerate group
"count" : n, (numeric) Amount of transactions in feerate group
"fees" : n, (numeric) Cumulated fee of all transactions in feerate group
"from_feerate" : n, (numeric) Group contains transaction with feerates equal or greater than this value
"to_feerate" : n (numeric) Group contains transaction with feerates less than than this value
},
"total_fees" : n (numeric) Total available fees in mempool
}
}
Examples:
> bitcoin-cli getmempoolinfo
> curl --user myusername --data-binary '{"jsonrpc": "1.0", "id": "curltest", "method": "getmempoolinfo", "params": []}' -H 'content-type: text/plain;' http://127.0.0.1:8332/ Output on testnet (2021-03-07) ./bitcoin-cli -testnet getmempoolinfo true JSON output {
"loaded": true,
"size": 73,
"bytes": 19620,
"usage": 108816,
"total_fee": 0.00833952,
"maxmempool": 300000000,
"mempoolminfee": 0.00001000,
"minrelaytxfee": 0.00001000,
"unbroadcastcount": 0,
"fee_histogram": {
"1": {
"sizes": 6615,
"count": 38,
"fees": 7817,
"from_feerate": 1,
"to_feerate": 2
},
"2": {
"sizes": 1553,
"count": 5,
"fees": 3852,
"from_feerate": 2,
"to_feerate": 3
},
"3": {
"sizes": 251,
"count": 2,
"fees": 784,
"from_feerate": 3,
"to_feerate": 4
},
"4": {
"sizes": 285,
"count": 2,
"fees": 1356,
"from_feerate": 4,
"to_feerate": 5
},
"5": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 5,
"to_feerate": 6
},
"6": {
"sizes": 166,
"count": 1,
"fees": 1130,
"from_feerate": 6,
"to_feerate": 7
},
"7": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 7,
"to_feerate": 8
},
"8": {
"sizes": 225,
"count": 1,
"fees": 2000,
"from_feerate": 8,
"to_feerate": 10
},
"10": {
"sizes": 168,
"count": 1,
"fees": 1808,
"from_feerate": 10,
"to_feerate": 12
},
"12": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 12,
"to_feerate": 14
},
"14": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 14,
"to_feerate": 17
},
"17": {
"sizes": 1581,
"count": 1,
"fees": 31200,
"from_feerate": 17,
"to_feerate": 20
},
"20": {
"sizes": 332,
"count": 2,
"fees": 8040,
"from_feerate": 20,
"to_feerate": 25
},
"25": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 25,
"to_feerate": 30
},
"30": {
"sizes": 2037,
"count": 4,
"fees": 64410,
"from_feerate": 30,
"to_feerate": 40
},
"40": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 40,
"to_feerate": 50
},
"50": {
"sizes": 2768,
"count": 4,
"fees": 143913,
"from_feerate": 50,
"to_feerate": 60
},
"60": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 60,
"to_feerate": 70
},
"70": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 70,
"to_feerate": 80
},
"80": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 80,
"to_feerate": 100
},
"100": {
"sizes": 1079,
"count": 7,
"fees": 110042,
"from_feerate": 100,
"to_feerate": 120
},
"120": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 120,
"to_feerate": 140
},
"140": {
"sizes": 1998,
"count": 3,
"fees": 300000,
"from_feerate": 140,
"to_feerate": 170
},
"170": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 170,
"to_feerate": 200
},
"200": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 200,
"to_feerate": 250
},
"250": {
"sizes": 371,
"count": 1,
"fees": 100000,
"from_feerate": 250,
"to_feerate": 300
},
"300": {
"sizes": 191,
"count": 1,
"fees": 57600,
"from_feerate": 300,
"to_feerate": 400
},
"400": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 400,
"to_feerate": 500
},
"500": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 500,
"to_feerate": 600
},
"600": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 600,
"to_feerate": 700
},
"700": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 700,
"to_feerate": 800
},
"800": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 800,
"to_feerate": 1000
},
"1000": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 1000,
"to_feerate": 1200
},
"1200": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 1200,
"to_feerate": 1400
},
"1400": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 1400,
"to_feerate": 1700
},
"1700": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 1700,
"to_feerate": 2000
},
"2000": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 2000,
"to_feerate": 2500
},
"2500": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 2500,
"to_feerate": 3000
},
"3000": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 3000,
"to_feerate": 4000
},
"4000": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 4000,
"to_feerate": 5000
},
"5000": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 5000,
"to_feerate": 6000
},
"6000": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 6000,
"to_feerate": 7000
},
"7000": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 7000,
"to_feerate": 8000
},
"8000": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 8000,
"to_feerate": 10000
},
"10000": {
"sizes": 0,
"count": 0,
"fees": 0,
"from_feerate": 10000,
"to_feerate": 9223372036854775807
},
"total_fees": 833952
}
} |
I have forked this PR: #21422 and I'm willing to continue working on that. |
UniValue info(UniValue::VOBJ); | ||
for (size_t i = 0; i < feelimits.size(); i++) { | ||
UniValue info_sub(UniValue::VOBJ); | ||
info_sub.pushKV("sizes", sizes[i]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is sizes? I've noticed that when adding up all the sizes when the mempool is full, that the number doesn't stay fixed as I would have expected when the mempool is full - so it's not the number of bytes used in memory to store the tx - so, what is it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the histogram is based on fee rates intervals. The histogram is modeled using three vectors:
std::vector<uint64_t> sizes(feelimits.size(), 0);
std::vector<uint64_t> count(feelimits.size(), 0);
std::vector<uint64_t> fees(feelimits.size(), 0);
where sizes[0]
represents cumulative size of txs belonging to the first fee rate interval [1, 2)
, sizes[1]
represents cumulative size of txs belonging to the second fee rate interval [2, 3)
, etc.
Line 1527 shows that we add size
to sizes[i]
where size
is defined as int size = (int)e.GetTxSize();
which is defined as:
size_t CTxMemPoolEntry::GetTxSize() const
{
return GetVirtualTransactionSize(nTxWeight, sigOpCost);
}
What is virtual size of a transaction? This is explained here: https://bitcoin.stackexchange.com/questions/92689/how-is-the-size-of-a-bitcoin-transaction-calculated.
HTH!
promo: You may have a look at #21422 too. :)
I'm chasing "concept ACK"s for the reborn version of this PR - namely #21422. Any other feedback is welcome, I have time to do modifications if needed to increase the chance of getting the PR to be merged. |
@jonasschnelli the simple plot example doesn't display (seems the website is down). |
@rebroad The link redirects to https://bitcoin.jonasschnelli.ch/mempool-histogram/ but there is a bug (missing slash). Anyway, there are no data at the moment. |
Closing this, given it's been taken over in #21422. |
This follows the approach of adding statistical information to Bitcoin Core that would otherwise be inefficient to calculate outside of the codebase.
Adds an optional feerate histogram to
getmempoolinfo
.The concept and code is heavily inspired by the stats @jhoenicke runs (https://github.com/jhoenicke/mempool).
If someone has a good idea how to make the feerate-groups dynamic but also semi-constant for similar fee environments, please comment.
If this is feature we'd like to have in master (concept ACKs), I'd continue this with writing tests.
A simple plot of the data is here.
RPC output sample is here.