-
Notifications
You must be signed in to change notification settings - Fork 37.7k
coinselection: Optimize BnB exploration #32150
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code Coverage & BenchmarksFor details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32150. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsNo conflicts as of last run. LLM Linter (✨ experimental)Possible typos and grammar issues:
drahtbot_id_4_m |
c70f1d3
to
613df18
Compare
Concept ACK |
I ran benchmarking on master and benchmarking on this branch since it's fairly large change. master:
looks like a marginal improvement in bechmarked performance. |
613df18
to
5ac0d02
Compare
Rebased on #29532 |
5ac0d02
to
44bab80
Compare
TestBnBSuccess("Select biggest UTXO", utxo_pool, /*selection_target=*/5 * CENT, /*expected_input_amounts=*/{5 * CENT}, /*expected_attempts=*/2); | ||
TestBnBSuccess("Select two UTXOs", utxo_pool, /*selection_target=*/4 * CENT, /*expected_input_amounts=*/{1 * CENT, 3 * CENT}, /*expected_attempts=*/6); | ||
TestBnBSuccess("Select all UTXOs", utxo_pool, /*selection_target=*/9 * CENT, /*expected_input_amounts=*/{1 * CENT, 3 * CENT, 5 * CENT}, /*expected_attempts=*/6); | ||
TestBnBSuccess("Select smallest UTXO", utxo_pool, /*selection_target=*/1 * CENT, /*expected_input_amounts=*/{1 * CENT}, /*expected_attempts=*/3); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, it looks like the backtrack iterations are no longer recorded which effectively makes this probably more of an optimization than a refactor. Now it's possible more solutions can be found since the number of times the loop will run is effectively more improving search resolution. May be worth adding a note of that to the commit message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I renamed the PR and added more explanation to the opening comment.
If I remember right the Coin Selection benchmark is rather non-representative. I have a todo to update it somewhere in my backlog. |
Good point. I took a quick look, and the current benchmark runs BnB for only Instead of keeping this in your backlog, shouldn't there be an issue opened so it doesn't get overlooked? Maybe someone else has the bandwidth to pick it up. Possibly I could but I also have a large backlog of I'd like to work on. Also, I note there is also
|
9cfab3b
to
155d6d9
Compare
155d6d9
to
4475517
Compare
The preceding PR #29532 got merged, so this is now ready for review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the last 2 commits "opt: Skip evaluation of equivalent input sets" and "opt: Skip evaluation of equivalent input sets" could be combined. The change the last commit makes is to the comment introduced in the commit before it, and removing a line introduced in the commit before it. We generally don't want to add code just to immediately remove it, so I think these can be squashed together.
result.RecalculateWaste(cost_of_change, cost_of_change, CAmount{0}); | ||
assert(best_waste == result.GetWaste()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In 9580c0e "coinselection: rewrite BnB in CoinGrinder-style"
Why is the waste calculation being dropped? result
will be returned with an incorrect waste value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The waste calculation here is semantically incorrect and it is unnecessary, because the waste of all coin selection results is recalculated after they are collected, before they are compared to correctly assess bump fees.
I don’t think we want to pull the bump fee calculation into each coin selection algorithm, but maybe we can pass the entire coin selection parameters object, so that we can calculate a semantically correct waste score for coin selection results, although it may still need to be amended for bump fees afterwards.
Let’s fix this in a follow-up properly. And also add some tests for the waste results of coin selection algorithms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I addressed everything, except the issue for the follow-up.
result.RecalculateWaste(cost_of_change, cost_of_change, CAmount{0}); | ||
assert(best_waste == result.GetWaste()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The waste calculation here is semantically incorrect and it is unnecessary, because the waste of all coin selection results is recalculated after they are collected, before they are compared to correctly assess bump fees.
I don’t think we want to pull the bump fee calculation into each coin selection algorithm, but maybe we can pass the entire coin selection parameters object, so that we can calculate a semantically correct waste score for coin selection results, although it may still need to be amended for bump fees afterwards.
Let’s fix this in a follow-up properly. And also add some tests for the waste results of coin selection algorithms.
51ae802
to
cf36d1a
Compare
Ready for Review |
Sorry, I forgot to reply to this comment:
I think of the last two commits as two separate optimizations. The idea that you only have to explore all combinations with the first of several clones in itself can use explaining. The separate step to present the realization that we can skip any UTXO that ties the predecessor on effective value due to the way the UTXO sorting is tie-broken by minimal waste seemed easier to review. The second commit also adds six more lines of documentation and shows per the test that the iteration count is further reduced. I prefer to keep them as two commits. |
ACK cf36d1a |
The expected iteration count demonstrates how the following improvements reduce iterations will help catch any regressions in the future.
cf36d1a
to
792ed0d
Compare
Fixed the merge conflict with #33060. |
🚧 At least one of the CI tasks failed. HintsTry to run the tests locally, according to the documentation. However, a CI failure may still
Leave a comment here, if you need help tracking down a confusing failure. |
In the original implementation of BnB, the state of the search is backtracked by explicitly walking back to the omission branch and then testing again. This retests an equivalent candidate set as before, e.g., after backtracking from {ABC}, it would evaluate {AB_}, before trying {AB_D}, but {AB_} is equivalent to {AB} which was tested before. CoinGrinder tracks the state of the search instead by remembering which UTXO was last added and explicitly shifting from that UTXO directly to the next, so after {ABC}, it will immediately move on to {AB_D}. We replicate this approach here. The description of the two optimizations is removed from the documentation as they will only be implented in a later commit.
BnB may not be able to exhaustively search all potentially interesting combinations for large UTXO pools, so we keep track of whether the search was terminated by the iteration limit.
At high feerates adding more inputs will increase the waste score. If the current waste is already higher than the best selection’s we cannot improve upon the best selection. All solutions that include the current selection with more additional inputs must be worse than the best selection so far: SHIFT This optimization only works at high feerates, because at low feerates, adding more inputs decreases waste, so this condition would exit prematurely. We would never attempt input sets with higher weight than the prior best selection, even though we would prefer those at low feerates.
Introduces a dedicated data structure to track the total effective_value available in the remaining UTXOs at each index of the UTXO pool. In contrast to the original approach in BnB, this allows us to immediately jump to a lower index instead of visiting every UTXO to add back their eff_value to the lookahead.
When two successive UTXOs match in effective value and weight, we can skip the second if the prior is not selected: adding it would create an equivalent input set to a previously evaluated. E.g. if we have three UTXOs with effective values {5, 3, 3} of the same weight each, we want to evaluate {5, _, _}, {5, 3, _}, {5, 3, 3}, {_, 3, _}, {_, 3, 3}, but skip {5, _, 3}, and {_, _, 3}, because the first 3 is not selected, and we therefore do not need to evaluate the second 3 at the same position in the input set. If we reach the end of the branch, we must SHIFT the previously selected UTXO group instead.
When two successive UTXOs differ in waste but match in effective value, we can skip the second if the first is not selected, because all input sets we can generate by swapping out a less wasteful UTXOs with a more wastefull UTXO of matching effective value would be strictly worse. Also expand documentation of Branch and Bound.
792ed0d
to
cdc25b4
Compare
size_t curr_try = 0; | ||
SelectionResult result(selection_target, SelectionAlgorithm::BNB); | ||
bool is_done = false; | ||
bool is_feerate_high = utxo_pool.at(0).fee > utxo_pool.at(0).long_term_fee; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit this ought to be is_fee_high
? also this could just be a function on utxo utxo_pool.at(0).is_fee_high()
alternatively .is_fee_expensive()
if (next_utxo >= utxo_pool.size() - 1) { | ||
// Reached end of UTXO pool skipping clones: SHIFT instead | ||
should_shift = true; | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm could this be is_done = true;
instead of break? In general I think it would be better to replace all is_done = true;
with break
and then change the loop condition to while curr_try < TOTAL_TRIES
. That way, the entire condition which checks that the curr_try > TOTAL_TRIES
can be removed. In short, the combination of the usage of break
with is_done
could be simplified by always using break and then change the loop invariant to while .. < TOTAL_TRIES
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see, this breaks the inner while loop. Disregard my previous comment.
I reviewed this and so far it looks good. I plan to give it one more pass over shortly to finish review. |
* exploration would cause redundant work: e.g., given the UTXOs A, A', and B, where A and A' are | ||
* clones, naive exploration would combine (read underscore to as omission): | ||
* [{}, {A}, {A, A'}, {A, A', B}, {A, _, B}, {_, A'}, {_, A', B}, {_, _, B}]. | ||
* In this case the input set candidates {A} and {A'} als well as {A, B} and {A', B} are |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in cdc25b4 s/als/as
// Adding more UTXOs only increases fees and cannot be better: SHIFT | ||
should_shift = true; | ||
// The amount exceeding the selection_target (the "excess"), would be dropped to the fees: it is waste. | ||
CAmount curr_excess = curr_amount - selection_target; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in 7596e49. Now that the set is sorted by waste as tie breaker, if the curr_excess
is zero, there will be no future solution with a better waste score. Therefore adding a condition testing for zero would stop the search sooner.
while (true) { | ||
bool should_shift{false}, should_cut{false}; | ||
// Select `next_utxo` | ||
OutputGroup& utxo = utxo_pool[next_utxo]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in 7596e49 is there a mechanism up the stack that prevents this from being a utxo_pool length of zero? I notice in coin-grinder also that a utxo_pool of no length here would cause an exception.
// find any solutions. Redirect to exploring the Omission branch of the penultimate selected UTXO (i.e. | ||
// set `next_utxo` to one after the penultimate selected, then deselect the last two selected UTXOs) | ||
deselect_last(); | ||
should_shift = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in 7596e49
nit: extra space here before the =. Same issue on L156
@@ -55,11 +55,11 @@ struct { | |||
* cost of creating and spending a change output. The algorithm uses a depth-first search on a binary | |||
* tree. In the binary tree, each node corresponds to the inclusion or the omission of a UTXO. UTXOs | |||
* are sorted by their effective values and the tree is explored deterministically per the inclusion | |||
* branch first. At each node, the algorithm checks whether the selection is within the target range. | |||
* branch first. For each new input set candidate, the algorithm checks whether the selection is within the target range. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in 7596e49
nit: line length here and in L62 breaks from the rest of the paragraph.
* At that point, the last included UTXO is deselected and the corresponding omission branch explored | ||
* instead. The search ends after the complete tree has been searched or after a limited number of tries. | ||
* instead starting by adding the subsequent UTXO. The search ends after the complete tree has been searched or after a limited number of tries. | ||
* | ||
* The search continues to search for better solutions after one solution has been found. The best |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this is not part of the diff, but consider rewording this sentence to "The search continues for better solutions.." so that search isn't used twice redundantly in the same sentence.
ACK cdc25b4 I also tested the updates on secondary group of tests I wrote and there where no unexpected failures. I still have mild preference to using a loop condition of while As for variable naming on cosmetic note, I made a note of some alternate suggestions where I feel like it's shorter and/or more concise. (just a suggestion):
|
should_shift = true; | ||
// The amount exceeding the selection_target (the "excess"), would be dropped to the fees: it is waste. | ||
CAmount curr_excess = curr_amount - selection_target; | ||
CAmount curr_waste = curr_selection_waste + curr_excess; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In 7596e49: FYI my mutation testing bot reported the following mutant as not killed:
@@ -201,7 +201,7 @@ util::Result<SelectionResult> SelectCoinsBnB(std::vector<OutputGroup>& utxo_pool
should_shift = true;
// The amount exceeding the selection_target (the "excess"), would be dropped to the fees: it is waste.
CAmount curr_excess = curr_amount - selection_target;
- CAmount curr_waste = curr_selection_waste + curr_excess;
+ CAmount curr_waste = curr_selection_waste - curr_excess;
if (curr_waste <= best_waste) {
// New best solution
This PR rewrites the implementation of the BnB coinselection algorithm
to skip the duplicate evaluation of previously visited input selections.
In the original implementation of BnB, the state of the search is
backtracked by explicitly walking back to the omission branch and then
testing again. This retests an equivalent candidate set as before, e.g.,
after backtracking from {ABC}, it would evaluate {AB_}, before trying
{AB_D}, but {AB_} is equivalent to {AB} which was tested before.
CoinGrinder tracks the state of the search instead by remembering which
UTXO was last added and explicitly shifting from that UTXO directly to
the next, so after {ABC}, it will immediately move on to {AB_D}. We
replicate this approach here.
As fewer nodes are visited, this approach will enumerate more possible
combinations than the original implementation given the same limit for
iterations.