-
Notifications
You must be signed in to change notification settings - Fork 37.7k
Add fee_est tool for debugging fee estimation code #10443
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
43fa494
to
ec5f6ec
Compare
This is interesting but still no comment. Concept ACK I'll play around with this a bit.
|
When estimate log doesn't have enough data, |
For perfect output, there should be 3rd dimension - time or current block height when the tx was first seen :-) |
Concept ACK. I think this could be a useful tool. Could this be split into two PRs to aid reviewers? The first PR would cover the Parsing and graphing json files seems like a problem that has probably been solved many times before. If I was approaching this, I'd look at implementing this as a script in /contrib or a separate repository. Was there a particular reason you chose to implement this as a new C++ program? |
da0e959
to
88e26a1
Compare
87e104b
to
f49e21d
Compare
I think this is useful. However I'm not sure the analysis tool belongs in this repository. As it's specific to developers debugging the fee estimation code it's not something we want to ship with the distribution, or install by default. One place it could be is e.g. https://github.com/bitcoin-core/bitcoin-maintainer-tools . You could still document or refer to it from here. |
Agree that fee_est tool shouldn't be installed, since it's a tool specifically for made for modifying and debugging fee estimation code. But it would be awkward to use and maintain from a separate repository because it links and calls into the fee estimation code. (The tool works by piping historical data into the fee estimator so it's possible to make experimental changes to fee estimation and see how those changes affect its output and internal state.) As far as build / distribution is concerned I think it makes sense to think of it more like a unit test or benchmark than like a maintainer tool. |
Yes, I agree, that's unfortunately true. That part needs to stay in the repository with the rest of the code.
I agree - especially with the references to cloudflare CDN and such. It feels a bit ugly to have that in the C++ code. Better to spit out e.g. JSON or CSV or whatever is convenient, then use a separate script for pretty formatting. This would also keep open the option of using different (non-web) visualization tools. |
The last travis run for this pull request was 54 days ago and is thus outdated. To trigger a fresh travis build, this pull request should be closed and re-opened. |
Haven't dug into the code, but I'm leaning toward Concept ACK, Approach NACK As someone who is planning to do fee estimation research in the near future, I don't think I'd use something like this the way it's currently written. I would much prefer something that logs Bitcoin Core's live fee rate estimations, or some option that would allow me to tell Bitcoin Core it's at a certain chain height, feed it some mempool data, and then ask it for fee estimations. I also agree that the tool for analyzing the data shouldn't be in this repo. Much better if we can figure out what data is missing for people who want to do fee estimation analysis and provide an option for logging that data. Perhaps a good next step would be to talk to people who are currently doing fee estimation work and see what data would help with their projects/PRs? @ClaraShk @darosior @xekyo ? |
@josibake thanks for looking at this. I don't think I understand what the idea behind an approach NACK would be based on what you are saying, though. It sounds more like you're asking for an additional feature which would be easy to implement on top of the first commit, not pointing to a drawback of the approach. If you want to gather live fee estimates, the first commit outputs a structured jsonl log of fee estimation events, so it would be easy to add the actual fee estimates to this log, or add internal metrics used to produce the fee estimates, or add any other data points that might be interesting. It is also possible to take the existing log files this PR already generates, and feed them to the fee estimation algorithm by modifying the tool in the second commit to produce live fee estimates or a history of estimates over time. If you just think I should drop the second commit, or trim it down, I think that would be reasonable. The point of the second commit is mainly to provide example code that demonstrates how to read the JSONL log file and how to call the fee estimator API. The idea is to avoid needing reinvent the wheel each time you want to monitor fee estimation behavior, or reproduce a bug, or determine the effects of a change.
This is actually what the second commit provides. The .jsonl log contains chain and mempool data. You can feed it to the test program in the second commit, and it calls fee estimation code to produce fee estimates. It can produce a graph of the fee estimates, and calculate how accurate they were by looking at all the transactions that got confirmed and outputting the mean squared error of how many blocks it actually took the transactions to be confirmed vs how many blocks the fee estimation algorithm predicted it would take to get them confirmed based on their feerate. This PR is a few years old, so I would probably implement the second commit a little differently today. Instead of adding a standalone c++ executable that calls fee estimator code and produces graphs and metrics, I might want to just add a python module that exposes the |
Thanks for the thorough response @ryanofsky. I realize I didn't explain myself very well in my initial comment.
Since we already record mempool events in the debug log file (this is what @jamesob and I are using to log mempool events in bmon), it feels a bit strange to have special code for logging transactions again here for fee estimation. It's possible I'm not understanding something here, though: is there some special information in the log you are writing that we can't get from the debug log file?
I think what I was envisioning was even simpler: have a flag to make Bitcoin Core log |
I think so. The json log captures all inputs to fee estimator to used update its internal state. It includes not just mempool events but also information about which transactions are included in blocks, flush events, and contents of
I think I don't actually understand the proposal or don't understand the intended use-case. Use-cases for PR this are to:
Is your proposal basically to save a const UniValue& block = value["block"];
if (block.isObject()) {
AutoFile file{fsbridge::fopen(strprintf("fee_estimates-%07d.dat", block["height"].getInt<int>());, "wb")};
estimator.Write(file);
} Another advantage of this change is that it decouples Lines 189 to 200 in 7b45d17
to: Lines 191 to 201 in 26240e1
There are probably lots of changes that could be made to this PR to improve it, or trim it down if it is doing too much, but in general I think takes a good first step of pulling things out of Footnotes |
🐙 This pull request conflicts with the target branch and needs rebase. |
There hasn't been much activity lately and the patch still needs rebase. What is the status here?
|
Going to close this because I think @josibake is working on a different approach which will try replace current fee estimator, instead of just instrumenting it like this PR. There's also another PR #28368 which decouples the fee estimator from the mempool, and might make it easier to replay fee estimation events like the changes in this PR, even though it only provides an API, not a CLI tool. |
This PR adds an
-estlog
option for saving live fee estimation data from a bitcoin node, and afee_est
command line tool for processing the data and testing fee estimation code.The idea is to make it easier to test improvements to fee estimation like #10199 in a more systematic and reproducible way.
Some documentation is in
src/test/fee_est/README.md
Sample log file:
est.log.xz
(65M)Sample
fee_est.cpp
graph output: