test: verify spend from 999-of-999 taproot multisig wallet #28212

eriknylund · 2023-08-03T16:26:06Z

Verify that a 999-of-999 taproot multisig wallet is possible and can spend from it, and that neither a 1-of-0 nor 1-of-1000 is allowed.

The tests will require some time to run. On my Mac M1 the new tests run in about 40 seconds.

Had to bump fee rate to resolve assertions in sendtoaddress and psbt methods:

code -5: assert rpc_online.gettransaction(txid)["confirmations"] > 0
code -26: min relay fee not met

Fixes #28179

DrahtBot · 2023-08-03T16:26:09Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.
A summary of reviews will appear here.

Conflicts

No conflicts as of last run.

l0rinc · 2023-08-03T16:49:57Z

test/functional/wallet_taproot.py

+        # Test 1-of-1000, should not succeed
+        try:
+            self.do_test_k_of_n(1, 1000)
+        except JSONRPCException as e:


won't this pass if no exception is raised?
Shouldn't we add something like either:

try: self.do_test_k_of_n(1, 1000) assert False, "Expected exception was not raised for 1-of-1000 test" except JSONRPCException as e:

or via unittest:

with self.assertRaises(JSONRPCException) as cm: self.do_test_k_of_n(1, 1000) self.assertEqual(cm.exception.error["code"], -5) self.assertIn("Cannot have 1000 keys in multi_a; must have between 1 and 999 keys, inclusive", cm.exception.error["message"])

?

Thanks for pointing this out 🙏 I agree it shouldn't be possible for this test to succeed so an exception should be raised.

assert_raises_rpc_error is probably what you want

@paplorinc I like the second approach, it seems a bit cleaner and clearer to me. However, it doesn't seem to be used in other functional tests so maybe the first approach is better to be more consistent?

@instagibbs Thanks! I was looking at that option too, however I couldn't see a way to use it in this context without having to change a lot of the code, but maybe I'm missing something obvious? 🤔

if something should raise an error and a specific code/message, you use assert_raises_rpc_error. Just take a look on how it's used elsewhere?

# Test that 1-of-1000 and 999-of-1000 raise exception for k in [1, 999]: assert_raises_rpc_error(-5, "Cannot have 1000 keys in multi_a; must have between 1 and 999 keys, inclusive", self.do_test_k_of_n, k, 1000)

or maybe even:

for k in [0, 1, 999, 1000]:

if that makes sense.

I much appreciate the input @paplorinc, this was the syntax I was looking for 🙏 I saw the first version of your message and immediately thought about doing the other part ([0, 1, 999, 1000]). You read my mind, this makes very much sense! ~~Also, doing [1, 999] would be quite time consuming because each run lasts about a second so in total it would need 1000 seconds to complete.~~ Nvm, I was thinking about testing the whole range when in fact it's just two tests. I'll update the PR to support the version with four tests.

eriknylund · 2023-08-04T05:45:02Z

test/functional/wallet_taproot.py

@@ -387,6 +388,14 @@ def do_test_psbt(self, comment, pattern, privmap, treefn, keys_pay, keys_change)
        psbt_online.unloadwallet()
        psbt_offline.unloadwallet()

+    def do_test_k_of_n(self, k, n):


@Sjors Given my basic understanding of output descriptors, I'd much appreciate your thoughts on this block of code, to make sure it meets the issue criterias and if this is a good way to test it.

I understand descriptors, but Python less well :-) What is this function trying to do?

It's a function that allows setup and testing of any k-of-n wallet. It builds on the idea of the existing test that @darosior pointed to, which starts on L495

bitcoin/test/functional/wallet_taproot.py

Line 495 in 64440bb

rnd_pos = random.randrange(MAX_PUBKEYS_PER_MULTI_A)

where a random number of n is used.

The part that differs is how the [KEY_1,...,KEY_N] array inside the multi_a is built - where the previous function uses ([H_POINT] * rnd_pos) up until k1 and then ([H_POINT] * (MAX_PUBKEYS_PER_MULTI_A - 1 - rnd_pos) after, the new function uses k1 for the whole array of.

However, because this previous test is not documented, I'm not entirely sure about the H_POINT parts do and if I need them in my test somehow, or if the way I've written it is enough to verify the 999-of-999 and 1-of-1000 cases.

l0rinc · 2023-08-04T07:12:47Z

test/functional/wallet_taproot.py

-            assert e.error["code"] == -5 and "Cannot have 1000 keys in multi_a; must have between 1 and 999 keys, inclusive" in e.error["message"]
+        # Test that all of the combinations 0-of-1000, 1-of-1000 and 999-of-1000 and 1000-of-1000 raise exception
+        for k in [0, 1, 999, 1000]:
+            assert_raises_rpc_error(-5, "Cannot have 1000 keys in multi_a; must have between 1 and 999 keys, inclusive", self.do_test_k_of_n, k, 1000)


isn't it weird that the error message states "Cannot have 1000 keys" and k is ignored?

bitcoin/src/script/descriptor.cpp

Line 1381 in 7edce77

error = strprintf("Cannot have %u keys in multi_a; must have between 1 and %d keys, inclusive", providers.size(), MAX_PUBKEYS_PER_MULTI_A);

Based on the error the k doesn't even matter (it fails for 500 as well), so is it even important to test multiple k values here?

Maybe keep it simple and just test 1 of 1000.

In that case I also don't think you need the k_of_n helper function.

I guess that would do the trick, however I'm unsure if by doing so we modify the test to test the assertion, which seems a bit incomplete/unintuitive to me, rather than testing the edge cases in k-of-N wallets? 🤔 If you understand what I mean 😄 If you don't think it's an issue, I'm all for keeping it simple and just test 1-of-1000.

Well the "edge" here is n = 999, but this test is varying k, not n.

Ah you're right. I'll simplify the test to only do 1-of-1000.

Would it make sense to test the other edge, 1-of-0?

maflcko · 2023-08-04T07:18:53Z

The linter is failing. Also, please squash your commits according to https://github.com/bitcoin/bitcoin/blob/master/CONTRIBUTING.md#squashing-commits

Sjors · 2023-08-04T09:33:26Z

test/functional/wallet_taproot.py

@@ -300,7 +304,7 @@ def do_test_sendtoaddress(self, comment, pattern, privmap, treefn, keys_pay, key
            test_balance = int(rpc_online.getbalance() * 100000000)
            ret_amnt = random.randrange(100000, test_balance)
            # Increase fee_rate to compensate for the wallet's inability to estimate fees for script path spends.
-            res = rpc_online.sendtoaddress(address=self.boring.getnewaddress(), amount=Decimal(ret_amnt) / 100000000, subtractfeefromamount=True, fee_rate=200)
+            res = rpc_online.sendtoaddress(address=self.boring.getnewaddress(), amount=Decimal(ret_amnt) / 100000000, subtractfeefromamount=True, fee_rate=500)


Why are you changing the fee rate?

If I didn't bump the fee rate I would run into the two error codes mentioned in the PR message:

Had to bump fee rate to resolve assertions in sendtoaddress and psbt methods: code -5: assert rpc_online.gettransaction(txid)["confirmations"] > 0 code -26: min relay fee not met

The previous fee_rate = 200 limit seems to be sufficient up until n ~ 500. I assume this is because the script size increases with higher numbers of n.

Fee rate is the fee per vbyte, so the fee goes up as the script size goes up. But this may be related to #26573

l0rinc · 2023-08-04T12:42:30Z

test/functional/wallet_taproot.py

+    def do_test_k_of_n(self, k, n):
+        self.do_test(
+            f"tr(XPUB,multi_a({k},XPRV_1,...,XPRV_N)",
+            f"tr($2/*,multi_a({k}" + (",$1/*" * n) + "))",


Slightly unrelated: are there any tests that check multisig with different keys rather than repeating the same one, to make it more representative of a real-world scenario?
I'm just getting familiar with the codebase, I'm not sure whether it's important or not.

I think it's very related and also my concern, is this test as good as one with K different keys. I'm hoping @Sjors can provide some insight here. My understanding of the other tests in this file is limited because they are not documented, but they seem to be doing something similar to this. On the other hand I could of course do what the previous test did, to use the "H_POINT" approach and just throw one XPRV in there somewhere. If that makes the test more real-world like or improves the test in some other way I'm all for it.

Testing different keys makes sense, but let's keep this PR simple and leave it to a followup.

SambhavsCreation

I think it looks good. I think the concerns about testing with different keys are not of critical importance right now. Something could be done about it in the future.

maflcko · 2023-08-17T09:23:42Z

test/functional/wallet_taproot.py

-from test_framework.util import assert_equal
+from test_framework.util import (
+    assert_equal,
+    assert_raises_rpc_error


Suggested change

assert_raises_rpc_error

assert_raises_rpc_error,

style-nit: Add the missing comma here? Otherwise someone will have to touch this line again when appending something.

A (force) push should also make CI less red.

Thanks for the comments. 🙏 I added the comma, rebased onto master and force pushed to trigger a new build to get the mac CI through.

maflcko · 2023-09-15T06:26:43Z

Merge commits are not allowed in pull requests, please reset or rebase your changes.

Sjors

Thanks for the rebase. At least one of the CI failures is due to a timeout, see my inline comment.

Sjors · 2023-12-14T11:23:39Z

test/functional/wallet_taproot.py

+    def do_test_k_of_n(self, k, n):
+        self.do_test(
+            f"tr(XPUB,multi_a({k},XPRV_1,...,XPRV_N)",
+            f"tr($2/*,multi_a({k}" + (",$1/*" * n) + "))",


Testing different keys makes sense, but let's keep this PR simple and leave it to a followup.

Sjors · 2023-12-14T11:38:15Z

test/functional/wallet_taproot.py

+        self.do_test(
+            f"tr(XPUB,multi_a({k},XPRV_1,...,XPRV_N)",
+            f"tr($2/*,multi_a({k}" + (",$1/*" * n) + "))",
+            [True, False],


Note to other reviewers: this privmap argument determines for which keys an xpriv is inserted into the descriptor.

Sjors · 2023-12-14T11:47:00Z

test/functional/wallet_taproot.py

@@ -499,6 +510,20 @@ def run_test(self):
            [True, False],
            lambda k1, k2: (key(k2), [multi_a(1, ([H_POINT] * rnd_pos) + [k1] + ([H_POINT] * (MAX_PUBKEYS_PER_MULTI_A - 1 - rnd_pos)))])
        )
+
+        # Test 999-of-999
+        self.do_test_k_of_n(999, 999)


Unfortunately this leads to extreme CPU usage and a sendtoaddress RPC timeout even on my 2019 MacBook Pro.

I suggest sticking to a lower maximum for now, so we at least get this useful helper function, and leaving a note like:

// TODO: fix or work around performance bottleneck in `sendtoaddress` // in order to test the maximum Taproot allowed 999-of-999.

My Macbook can easily handle 1-of-100. The 30 second timeout is hit somewhere between 1-of-750 and 1-of-1000. The CI machines may be slower.

In term of fixing the bottleneck, you could try the more modern send RPC though I doubt that makes a difference. Maybe @achow101 has an idea?

I'll lower it to 100 unless @achow101 thinks the send RPC approach would work efficiently here.

@Sjors just to make sure, did you mean 100-of-100? Or should I leave this out completely and just keep the 1-of-0 and 1-of-1000 tests further down that utilises the same new helper method?

# Test 100-of-100 self.do_test_k_of_n(100, 100)

The CI machines seem to make it through without timeout:
https://github.com/bitcoin/bitcoin/actions/runs/7221218374/job/19675746505?pr=28212#step:7:3594
https://github.com/bitcoin/bitcoin/actions/runs/7221218374/job/19675746796?pr=28212#step:27:330

However, the test runs for 15 minutes on the Win64 job and for 20 minutes on the macOS job, making it the longest running functional test by far.

If it still takes that long you probably want to pick a smaller K and/or N.

did you mean 100-of-100? Or should I leave this out completely and just keep the 1-of-0 and 1-of-1000 tests further down that utilises the same new helper method?

The other two tests only check the failure case. A 1-of-100 would check the success path. 100-of-100 might be fine too, but maybe slower?

100-of-100 seems to run okay, less than 16 sec to complete the whole test. However, after the rebase I see the same behavior on my Mac M1 as you're seeing on your 2019 MacBook Pro. Even the 1-of-999 test doesn't pass without an RPC timeout, which it did before! So perhaps there's been some change while the PR has been open that now causes the test to perform slower? 🤔

1-of-999 gives me:
'sendall' RPC took longer than 30.000000 seconds. Consider using larger timeout for calls that take longer to return. (-344)

999-of-999 gives me
'sendtoaddress' RPC took longer than 30.000000 seconds. Consider using larger timeout for calls that take longer to return. (-344)

Interesting. Try cleaning your working directory (e.g. git clean -dfx) and building again. To try on an older version of master, you can you check out any old commit and then do a git cherry-pick ....

Interesting. Try cleaning your working directory (e.g. git clean -dfx) and building again. To try on an older version of master, you can you check out any old commit and then do a git cherry-pick ....

Git bisecting this I found that the tests start timing out with commit 4f473ea

Are you able to verify this on your setup?

@darosior would you be able to check as well? 🙏

This is how I did it:

git checkout 4f473ea515bc77b9138323dab8a741c063d32e8f git cherry-pick 387c12e0813a863bc9728777719acbafe7b12a34 make python3 test/functional/wallet_taproot.py ... test_framework.authproxy.JSONRPCException: 'sendtoaddress' RPC took longer than 30.000000 seconds. Consider using larger timeout for calls that take longer to return. (-344)

Test previous commit:

git checkout HEAD~1 git cherry-pick 387c12e0813a863bc9728777719acbafe7b12a34 make python3 test/functional/wallet_taproot.py ... 2023-12-15T20:46:16.689000Z TestFramework (INFO): Tests successful

Great find! Let's track this in a separate issue: #29098

Sjors · 2023-12-14T11:47:29Z

test/functional/wallet_taproot.py

@@ -387,6 +390,14 @@ def do_test_psbt(self, comment, pattern, privmap, treefn, keys_pay, keys_change)
        psbt_online.unloadwallet()
        psbt_offline.unloadwallet()

+    def do_test_k_of_n(self, k, n):
+        self.do_test(
+            f"tr(XPUB,multi_a({k},XPRV_1,...,XPRV_N)",


...,XPRIV_{n}

…d can spend from it, and that neither a 1-of-0 nor 1-of-1000 is allowed

darosior · 2024-04-09T15:04:11Z

This has not seen any activity for a while. It's depending on #29098 (comment), which i unfortunately do not plan on implementing right now. If you do i'm happy to review, but in the meantime i think this PR can be closed and potentially re-opened once the performance of the satisfier is reasonable.

tnndbtc · 2025-01-23T06:00:33Z

@darosior and @eriknylund #31719 is submitted to fix the performance issue by falling back to original algorithm that use the first k available signatures for satisfying a k-of-n multisig.

Please take a look and see whether this could be a proper candidate. Thanks

Implements test coverage for Bitcoin Core issue bitcoin#28179, proving that 999-of-999 Taproot multisig descriptors work correctly while maintaining performance requirements. Changes: - Add C++ unit tests in descriptor_tests.cpp for 999-key multi_a validation - Add Python functional test for optimized 999-of-999 multisig operations - Test edge cases: 998-key (valid), 1000-key (invalid), threshold validation - Verify descriptor parsing, address generation, and spending patterns - Performance optimized: completes in ~400ms vs 40+ seconds in failed attempts This addresses the core requirements from issue bitcoin#28179: - Prove 999-of-999 multisig works (structure and spendability) - Reject 1000+ key multisigs with proper error messages - Test both 1-of-1000 and 999-of-1000 rejection cases - Maintain performance for production use Resolves previous failed attempts in PR bitcoin#28212 and bitcoin#31719 through optimized testing approach that validates functionality without triggering signature satisfier performance bottlenecks. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>

DrahtBot added the Tests label Aug 3, 2023

eriknylund force-pushed the musig-test-improvements branch from 89fbc68 to 1bc3a89 Compare August 3, 2023 16:33

DrahtBot added the CI failed label Aug 3, 2023

l0rinc reviewed Aug 3, 2023

View reviewed changes

fanquake requested a review from Sjors August 3, 2023 16:50

DrahtBot removed the CI failed label Aug 3, 2023

eriknylund commented Aug 4, 2023

View reviewed changes

l0rinc reviewed Aug 4, 2023

View reviewed changes

DrahtBot added the CI failed label Aug 4, 2023

Sjors reviewed Aug 4, 2023

View reviewed changes

DrahtBot mentioned this pull request Aug 4, 2023

Wallet: don't underestimate the fees when spending a Taproot output #26573

Closed

eriknylund force-pushed the musig-test-improvements branch from 8cfd270 to 7b75803 Compare August 4, 2023 12:04

l0rinc reviewed Aug 4, 2023

View reviewed changes

DrahtBot mentioned this pull request Aug 4, 2023

wallet: Avoid underpaying transaction fees when signing taproot spends #23502

Closed

DrahtBot removed the CI failed label Aug 4, 2023

eriknylund force-pushed the musig-test-improvements branch from 7b75803 to 274381e Compare August 7, 2023 05:40

eriknylund mentioned this pull request Aug 9, 2023

test: 999 of 999 multisig #28179

Open

SambhavsCreation approved these changes Aug 10, 2023

View reviewed changes

eriknylund force-pushed the musig-test-improvements branch from 274381e to 2dcfce1 Compare August 16, 2023 10:42

DrahtBot added the CI failed label Aug 16, 2023

maflcko reviewed Aug 17, 2023

View reviewed changes

eriknylund force-pushed the musig-test-improvements branch from 2dcfce1 to 4e7b33e Compare August 17, 2023 09:30

DrahtBot added the Needs rebase label Aug 17, 2023

eriknylund force-pushed the musig-test-improvements branch from 4e7b33e to c1fa172 Compare August 17, 2023 10:03

DrahtBot removed Needs rebase CI failed labels Aug 17, 2023

eriknylund force-pushed the musig-test-improvements branch from 45ae5d2 to 0034ffe Compare September 15, 2023 06:31

DrahtBot mentioned this pull request Sep 28, 2023

Deprecate SubtractFeeFromOutputs #24142

Closed

eriknylund force-pushed the musig-test-improvements branch from 0034ffe to 030ee8a Compare November 14, 2023 17:19

DrahtBot added the CI failed label Nov 14, 2023

Sjors reviewed Dec 14, 2023

View reviewed changes

test: verify that a 999-of-999 taproot multisig wallet is possible an…

387c12e

…d can spend from it, and that neither a 1-of-0 nor 1-of-1000 is allowed

eriknylund force-pushed the musig-test-improvements branch from 030ee8a to 387c12e Compare December 15, 2023 11:02

Sjors mentioned this pull request Dec 16, 2023

Performance decrease after tapscript miniscript #29098

Open

1 task

achow101 closed this Apr 9, 2024

tnndbtc mentioned this pull request Jan 23, 2025

miniscript: fixes #29098 by only use first k valid signatures #31719

Open

gianlucamazza mentioned this pull request Aug 22, 2025

test: Add comprehensive tests for 999-of-999 Taproot multisig #33239

Closed

4 tasks

test: verify spend from 999-of-999 taproot multisig wallet #28212

test: verify spend from 999-of-999 taproot multisig wallet #28212

Uh oh!

Conversation

eriknylund commented Aug 3, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DrahtBot commented Aug 3, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Coverage

Reviews

Conflicts

Uh oh!

l0rinc Aug 3, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

l0rinc Aug 3, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

eriknylund Aug 4, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

l0rinc Aug 4, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Sjors Aug 4, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

maflcko commented Aug 4, 2023

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Sjors Aug 4, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

SambhavsCreation left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

eriknylund commented Aug 3, 2023 •

edited

Loading

DrahtBot commented Aug 3, 2023 •

edited

Loading

l0rinc Aug 3, 2023 •

edited

Loading

l0rinc Aug 3, 2023 •

edited

Loading

eriknylund Aug 4, 2023 •

edited

Loading

l0rinc Aug 4, 2023 •

edited

Loading

Sjors Aug 4, 2023 •

edited

Loading

Sjors Aug 4, 2023 •

edited

Loading

Sjors left a comment •

edited

Loading

Sjors Dec 14, 2023 •

edited

Loading