-
Notifications
You must be signed in to change notification settings - Fork 37.7k
descriptors: inference process, do not return unparsable multisig descriptors #31404
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code Coverage & BenchmarksFor details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31404. ReviewsSee the guideline for information on the review process. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
4e27172
to
50ebd73
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make, functional tests successful.
# Verify it is valid only under segwit rules, not for bare multisig nor p2sh. | ||
assert 'desc' not in rpc_result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert_equal('multisig', rpc_result['type'])
Nit: This assertion can also be added. I modified the test case to have 21 keys, and the type
changed to nonstandard
, which is correct. p2sh
and desc
values are present in that output.
Do you think it's worth adding the test case with 21 pubkeys as well to cover the nonstandard
scenario?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert_equal('multisig', rpc_result['type'])
Nit: This assertion can also be added. I modified the test case to have 21 keys, and the
type
changed tononstandard
, which is correct.p2sh
anddesc
values are present in that output.Do you think it's worth adding the test case with 21 pubkeys as well to cover the
nonstandard
scenario?
Hmm, I find the type
return field misleading. Based on the code, it relates to the output script type, which in this case can only be represented as p2wsh and not as a bare multisig. This is because (1) bare multisig scripts with +3 pubkeys are non-standard, and (2) bare multisig scripts with +15 pubkeys are consensus-invalid.
So, I'm not entirely convinced about the type assertion at this point. It seems we should improve the type field RPC description, correct the RPC to actually return "non-standard" for bare multisigs with +3 pubkeys, and perhaps introduce the "consensus-invalid" (or simply "invalid") type
too.
@@ -87,7 +87,7 @@ bool IsStandard(const CScript& scriptPubKey, const std::optional<unsigned>& max_ | |||
unsigned char m = vSolutions.front()[0]; | |||
unsigned char n = vSolutions.back()[0]; | |||
// Support up to x-of-3 multisig txns as standard |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
088bbb8: Should probably replace // Support up to x-of-3 multisig txns as standard
with
// Support up to x-of-MAX_BARE_MULTISIG_PUBKEYS_NUM multisig txns as standard
The internal script-to-descriptor inference procedure shouldn't return invalid descriptors or ones that fail the string-to-descriptor parsing process. These two procedures should always support seamless roundtrips. This inlines `InferScript`/`InferDescriptor` to the actual descriptors `ParseScript` logic for the multisig path, ensuring only valid descriptors are produced. Examples where this presents an issue: 1) Because the legacy wallet allowed the import of invalid scripts, the process could end up inferring and storing an unparsable descriptor, which crashes the software during any subsequent wallet load 2) The `decodescript()` RPC command currently return unusable multisig descriptors. For example, providing a P2SH multisig redeem script with more than 520 bytes results in a descriptor when it shouldn't due to its large size.
50ebd73
to
e1d6179
Compare
} | ||
|
||
// Verify we don't exceed consensus limits | ||
if (ctx == ParseScriptContext::P2SH && script.size() > MAX_SCRIPT_ELEMENT_SIZE) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e1d6179: IIUC trying to infer a multisig redeem script that is too large under a p2sh context will return addr(...)#...
instead of nullptr. Is this what you were going for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC trying to infer a multisig redeem script that is too large under a p2sh context will return
addr(...)#...
instead of nullptr. Is this what you were going for?
This diff ensures that nullptr
is returned under p2sh context (if that's what you were going for) and adds a test case that covers this if block.
diff --git a/src/script/descriptor.cpp b/src/script/descriptor.cpp
index a2eb764706..7bc2f8cc88 100644
--- a/src/script/descriptor.cpp
+++ b/src/script/descriptor.cpp
@@ -2259,6 +2259,7 @@ std::unique_ptr<DescriptorImpl> InferScript(const CScript& script, ParseScriptCo
if (provider.GetCScript(scriptid, subscript)) {
auto sub = InferScript(subscript, ParseScriptContext::P2SH, provider);
if (sub) return std::make_unique<SHDescriptor>(std::move(sub));
+ return nullptr;
}
}
if (txntype == TxoutType::WITNESS_V0_SCRIPTHASH && (ctx == ParseScriptContext::TOP || ctx == ParseScriptContext::P2SH)) {
diff --git a/src/test/descriptor_tests.cpp b/src/test/descriptor_tests.cpp
index 24b8f2f793..4c0c0bc273 100644
--- a/src/test/descriptor_tests.cpp
+++ b/src/test/descriptor_tests.cpp
@@ -431,17 +431,19 @@ void CheckMultipath(const std::string& prv,
}
}
-void CheckInferDescriptor(const std::string& script_hex, const std::string& expected_desc, const std::vector<std::string>& hex_scripts, const std::vector<std::pair<std::string, std::string>>& origin_pubkeys)
+// Helper function to set up a FlatSigningProvider with scripts and public keys for descriptor inference testing
+FlatSigningProvider CreateInferSigningProvider(const std::vector<std::string>& hex_scripts, const std::vector<std::pair<std::string, std::string>>& origin_pubkeys)
{
- std::vector<unsigned char> script_bytes{ParseHex(script_hex)};
- const CScript& script{script_bytes.begin(), script_bytes.end()};
-
FlatSigningProvider provider;
+
+ // Add scripts to provider
for (const std::string& prov_script_hex : hex_scripts) {
std::vector<unsigned char> prov_script_bytes{ParseHex(prov_script_hex)};
const CScript& prov_script{prov_script_bytes.begin(), prov_script_bytes.end()};
provider.scripts.emplace(CScriptID(prov_script), prov_script);
}
+
+ // Add public keys and their origins to provider
for (const auto& [pubkey_hex, origin_str] : origin_pubkeys) {
CPubKey origin_pubkey{ParseHex(pubkey_hex)};
provider.pubkeys.emplace(origin_pubkey.GetID(), origin_pubkey);
@@ -472,12 +474,32 @@ void CheckInferDescriptor(const std::string& script_hex, const std::string& expe
}
}
+ return provider;
+}
+
+void CheckInferDescriptor(const std::string& script_hex, const std::string& expected_desc, const std::vector<std::string>& hex_scripts, const std::vector<std::pair<std::string, std::string>>& origin_pubkeys)
+{
+ std::vector<unsigned char> script_bytes{ParseHex(script_hex)};
+ const CScript& script{script_bytes.begin(), script_bytes.end()};
+
+ FlatSigningProvider provider = CreateInferSigningProvider(hex_scripts, origin_pubkeys);
std::string checksum{GetDescriptorChecksum(expected_desc)};
std::unique_ptr<Descriptor> desc = InferDescriptor(script, provider);
BOOST_CHECK_EQUAL(desc->ToString(), expected_desc + "#" + checksum);
}
+void CheckNoInferDescriptor(const std::string& script_hex, const std::vector<std::string>& hex_scripts, const std::vector<std::pair<std::string, std::string>>& origin_pubkeys)
+{
+ std::vector<unsigned char> script_bytes{ParseHex(script_hex)};
+ const CScript& script{script_bytes.begin(), script_bytes.end()};
+
+ FlatSigningProvider provider = CreateInferSigningProvider(hex_scripts, origin_pubkeys);
+
+ std::unique_ptr<Descriptor> desc = InferDescriptor(script, provider);
+ BOOST_CHECK(!desc);
+}
+
}
BOOST_FIXTURE_TEST_SUITE(descriptor_tests, BasicTestingSetup)
@@ -988,6 +1010,9 @@ BOOST_AUTO_TEST_CASE(descriptor_test)
CheckInferDescriptor("76a914a31725c74421fadc50d35520ab8751ed120af80588ac", "pkh(04c56fe4a92d401bcbf1b3dfbe4ac3dac5602ca155a3681497f02c1b9a733b92d704e2da6ec4162e4846af9236ef4171069ac8b7f8234a8405b6cadd96f34f5a31)", {}, {{"04c56fe4a92d401bcbf1b3dfbe4ac3dac5602ca155a3681497f02c1b9a733b92d704e2da6ec4162e4846af9236ef4171069ac8b7f8234a8405b6cadd96f34f5a31", ""}});
// Infer pk() from p2pk with uncompressed key
CheckInferDescriptor("4104032540df1d3c7070a8ab3a9cdd304dfc7fd1e6541369c53c4c3310b2537d91059afc8b8e7673eb812a32978dabb78c40f2e423f7757dca61d11838c7aeeb5220ac", "pk(04032540df1d3c7070a8ab3a9cdd304dfc7fd1e6541369c53c4c3310b2537d91059afc8b8e7673eb812a32978dabb78c40f2e423f7757dca61d11838c7aeeb5220)", {}, {{"04032540df1d3c7070a8ab3a9cdd304dfc7fd1e6541369c53c4c3310b2537d91059afc8b8e7673eb812a32978dabb78c40f2e423f7757dca61d11838c7aeeb5220", ""}});
+
+ // Infer sh(very_large_multisig) from p2sh fails
+ CheckNoInferDescriptor("a914b0d5c60fe453cd72cd4942db2cfe01b2d084b0ee87", {"01142103b0da749730dc9b4b1f4a14d6902877a92541f5368778853d9c4a0cb7802dcfb22103b0da749730dc9b4b1f4a14d6902877a92541f5368778853d9c4a0cb7802dcfb22103b0da749730dc9b4b1f4a14d6902877a92541f5368778853d9c4a0cb7802dcfb22103b0da749730dc9b4b1f4a14d6902877a92541f5368778853d9c4a0cb7802dcfb22103b0da749730dc9b4b1f4a14d6902877a92541f5368778853d9c4a0cb7802dcfb22103b0da749730dc9b4b1f4a14d6902877a92541f5368778853d9c4a0cb7802dcfb22103b0da749730dc9b4b1f4a14d6902877a92541f5368778853d9c4a0cb7802dcfb22103b0da749730dc9b4b1f4a14d6902877a92541f5368778853d9c4a0cb7802dcfb22103b0da749730dc9b4b1f4a14d6902877a92541f5368778853d9c4a0cb7802dcfb22103b0da749730dc9b4b1f4a14d6902877a92541f5368778853d9c4a0cb7802dcfb22103b0da749730dc9b4b1f4a14d6902877a92541f5368778853d9c4a0cb7802dcfb22103b0da749730dc9b4b1f4a14d6902877a92541f5368778853d9c4a0cb7802dcfb22103b0da749730dc9b4b1f4a14d6902877a92541f5368778853d9c4a0cb7802dcfb22103b0da749730dc9b4b1f4a14d6902877a92541f5368778853d9c4a0cb7802dcfb22103b0da749730dc9b4b1f4a14d6902877a92541f5368778853d9c4a0cb7802dcfb22103b0da749730dc9b4b1f4a14d6902877a92541f5368778853d9c4a0cb7802dcfb22103b0da749730dc9b4b1f4a14d6902877a92541f5368778853d9c4a0cb7802dcfb22103b0da749730dc9b4b1f4a14d6902877a92541f5368778853d9c4a0cb7802dcfb22103b0da749730dc9b4b1f4a14d6902877a92541f5368778853d9c4a0cb7802dcfb22103b0da749730dc9b4b1f4a14d6902877a92541f5368778853d9c4a0cb7802dcfb20114ae"}, {});
}
BOOST_AUTO_TEST_SUITE_END()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a more delicate topic. InferScript
can return nullptr
in valid scenarios too. It shouldn't just fail in every case as you're doing there.
For example, if the script is a P2SH output and the provider lacks the redeem script (watch-only scenario), the expected behavior is to return a addr()
. But with your changes, it would return `nullptr instead.
We need a different way to signal errors from inner InferScript()
calls. Could be a boolean or.. a different return object like a util::Result
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, if the script is a P2SH output and the provider lacks the redeem script (watch-only scenario), the expected behavior is to return a
addr()
. But with your changes, it would return `nullptr instead.
Are you certain? I checked this and I couldn't verify this claim. If the redeem script is not available in the provider, provider.GetCScript(scriptid, subscript)
should evaluate to false
and the expected addr()
desc will be returned. I tested this by adding the following test case for the example you gave CheckInferDescriptor("a914b0d5c60fe453cd72cd4942db2cfe01b2d084b0ee87", "addr(3Hp2wZXSKK7Zw2hJEE2pU2Qcs6pPt9t5cG)", {}, {});
and it passed.
Have I missed something?
@@ -1105,7 +1105,7 @@ static RPCHelpMan gettxout() | |||
{RPCResult::Type::STR_AMOUNT, "value", "The transaction value in " + CURRENCY_UNIT}, | |||
{RPCResult::Type::OBJ, "scriptPubKey", "", { | |||
{RPCResult::Type::STR, "asm", "Disassembly of the output script"}, | |||
{RPCResult::Type::STR, "desc", "Inferred descriptor for the output"}, | |||
{RPCResult::Type::STR, "desc", /*optional=*/true, "Inferred top level descriptor for the script (if any)"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than making the RPC output field optional, can we not fallback to always returning a raw
/ addr
descriptor if inferring returns something that would be invalid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than making the RPC output field optional, can we not fallback to always returning a
raw
/addr
descriptor if inferring returns something that would be invalid?
Hmm, I would prefer to avoid returning addr()
descriptors when the script cannot be redeemed due to consensus rules. Mainly because it opens up the remote possibility of users sending money to it (since the address is there).
This isn't an issue in this gettxout()
case because the script comes from an unspent output in the UTXO or mempool, meaning the worst case scenario is a non-standard script. However, it could be problematic in other places that uses InferDescriptor()
like createmultisig()
(if we ever feed the internal InferDescriptor
call with invalid data) or the current migration process (which will no longer be the case after #31495).
We can return raw
/ addr
for non-standard scripts. This would let us assert that the descriptor always exist at gettxout()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems InferDescriptor
can still produce invalid descriptors due to invalid keys. e.g.:
Hex script: 5141046161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161610051ae
descriptor (invalid): multi(1,0461616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616100)#jqqp7208
.
|
Could turn into draft while CI is failing and feedback is waiting to be addressed? |
🐙 This pull request conflicts with the target branch and needs rebase. |
⌛ There hasn't been much activity lately and the patch still needs rebase. What is the status here?
|
The internal script-to-descriptor inference procedure shouldn't return invalid
descriptors or ones that fail the string-to-descriptor parsing process. These
two procedures should always support seamless roundtrips.
This inlines
InferScript
/InferDescriptor
to the actual descriptorsParseScript
logic for the multisig path, ensuring only valid descriptors areproduced.
Examples where this presents an issue:
Because the legacy wallet allowed the import of invalid scripts, the process
could end up inferring and storing an unparsable descriptor, which crashes
the software during any subsequent wallet load.
Note:
This issue will no longer exist after #31378, because we will discard multisig
scripts that are consensus-invalid or descriptors-incompatible prior to calling
the descriptors inference procedure but fixing this also at the descriptors level
prevents future misuses.
The
decodescript()
RPC command currently return unusable multisig descriptors.For example, providing a P2SH multisig redeem script with more than 520 bytes results
in a descriptor when it shouldn't due to its large size.