Skip to content

Conversation

zaidmstrr
Copy link
Contributor

@zaidmstrr zaidmstrr commented Jun 27, 2025

Addresses comment and this.

The PR #31375 got merged and enables -named by default in the bitcoin rpc interface; bitcoin rpc corresponds to bitcoin-cli -named as it's just a wrapper. Now, the problem arises when we try to parse the positional paramater which might contain "=" character. This splits the parameter into two parts first, before the "=" character, which treats this as the parameter name, but the other half is mostly passed as an empty string. Here, the first part of the string is an unknown parameter name; thus, an error is thrown. These types of errors are only applicable to those RPCs which might contain the = character as a parameter. Some examples are finalizepsbt, decodepsbt, verifymessage etc.

This is the one example of the error in finalizepsbt RPC:

./bitcoin-cli -named -regtest finalizepsbt cHNidP8BAJoCAAAAAqvNEjSrzRI0q80SNKvNEjSrzRI0q80SNKvNEjSrzRI0AAAAAAD9////NBLNqzQSzas0Es2rNBLNqzQSzas0Es2rNBLNqzQSzasBAAAAAP3///8CoIYBAAAAAAAWABQVQBGVs/sqFAmC8HZ8O+g1htqivkANAwAAAAAAFgAUir7MzgyzDnRMjdkVa7d+Dwr07jsAAAAAAAAAAAA=
error code: -8
error message:
Unknown named parameter cHNidP8BAJoCAAAAAqvNEjSrzRI0q80SNKvNEjSrzRI0q80SNKvNEjSrzRI0AAAAAAD9////NBLNqzQSzas0Es2rNBLNqzQSzas0Es2rNBLNqzQSzasBAAAAAP3///8CoIYBAAAAAAAWABQVQBGVs/sqFAmC8HZ8O+g1htqivkANAwAAAAAAFgAUir7MzgyzDnRMjdkVa7d+Dwr07jsAAAAAAAAAAAA

This PR fixes this by adding a vRPCStringParams table that identifies parameters that need special handling in -named parameter mode. The parser now recognises these parameters and handles strings with "=" char correctly, preventing them from being incorrectly split as parameter assignments.

@DrahtBot
Copy link
Contributor

DrahtBot commented Jun 27, 2025

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32821.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK ryanofsky
Concept ACK yuvicc
Stale ACK BrandonOdiwuor

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #33230 (cli: Handle arguments that can be either JSON or string by achow101)
  • #32784 (wallet: derivehdkey RPC to get xpub at arbitrary path by Sjors)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@Sjors
Copy link
Member

Sjors commented Jun 27, 2025

I ran into this issue in #32784 while editing the multisig tutorial. 14e2125 fixes it.

@ryanofsky
Copy link
Contributor

Concept ACK 14e2125. Thanks for the fix!

If possible it would be good to add python tests to check the new behavior and I also agree with comment that this change doesn't need to be limited to base64 parameters and could probably be extended to other string parameters. But maybe the "If parameter is unknown for a given known string method, then treat the whole string as positional" behavior might have to be dropped in that case.

I think this fix is little different from my suggestion in #31375 (comment) because I was thinking of continuing to treat unknown parameters as named not positional, but maybe this approach is better. Would have to think about it. The implementation is also a little different than what I had in mind because I was thinking of extending the vRPCConvertParams table with type information instead of adding a separate table, but your approach might be simpler.

@zaidmstrr
Copy link
Contributor Author

If possible it would be good to add python tests to check the new behavior

Sure, I'm also thinking of adding a python test, but the main problem is, this issue is only reproducible when there exists = padding or in between, so while creating a PSBT, I need to look at how I can achieve this. Or maybe by calling RPCs with a label.

I also agree with #32821 (comment) that this change doesn't need to be limited to base64 parameters and could probably be extended to other string parameters. But maybe the "If parameter is unknown for a given known string method, then treat the whole string as positional" behavior might have to be dropped in that case.

To extend this feature to any string parameters, we simply need to add an entry in the vRPCStringParams table. I tested with getnewaddress which requires a label (optionally) and it's working fine. Without adding the entry in vRPCStringParams it doesn't work if we provide a label that consists of the = char in it.

And the line you mentioned to remove is problematic if we remove it, as it's only got invoked if the RPC method is correct and matches with our table. And if the parameter name is not known then it's invoked; otherwise not. This is safe to do because we already make sure that the calling method is present in the table. Otherwise, if the user passed the correct method name but an incorrect parameter, then the targeted RPC will fail with an error.

@Sjors
Copy link
Member

Sjors commented Jun 27, 2025

If possible it would be good to add python tests to check the new behavior

Sure, I'm also thinking of adding a python test, but the main problem is, this issue is only reproducible when there exists = padding or in between, so while creating a PSBT, I need to look at how I can achieve this.

You could just manually generate a random PSBT on regtest, hardcode it into a test and call analyzepsbt. It doesn't matter if it's unspendable.

Or you could even just call these methods with abc=efg since it doesn't even need to be a validly encoded psbt.

@zaidmstrr
Copy link
Contributor Author

I generalised this change to any string parameter which might contain the = char in it. I also added the functional test for most of the RPCs I included in the new table. For any new RPC method which might need to be parsed this way with -named enabled, one can add the new entry of that RPC in the vRPCStringParams table.

@zaidmstrr zaidmstrr changed the title rpc: Handle -named argument parsing where Base64 encoding is used rpc: Handle -named argument parsing where '=' character is used Jun 29, 2025
@yuvicc
Copy link
Contributor

yuvicc commented Jun 29, 2025

Concept ACK

./build/bin/bitcoin-cli -named  finalizepsbt cHNidP8BAJoCAAAAAqvNEjSrzRI0q80SNKvNEjSrzRI0q80SNKvNEjSrzRI0AAAAAAD9////NBLNqzQSzas0Es2rNBLNqzQSzas0Es2rNBLNqzQSzasBAAAAAP3///8CoIYBAAAAAAAWABQVQBGVs/sqFAmC8HZ8O=g1htqivkANAwAAAAAAFgAUir7MzgyzDnRMjdkVa7d+Dwr07jsAAAAAAAAAAAA=

error code: -8
error message:
Unknown named parameter cHNidP8BAJoCAAAAAqvNEjSrzRI0q80SNKvNEjSrzRI0q80SNKvNEjSrzRI0AAAAAAD9////NBLNqzQSzas0Es2rNBLNqzQSzas0Es2rNBLNqzQSzasBAAAAAP3///8CoIYBAAAAAAAWABQVQBGVs/sqFAmC8HZ8O

Commit 59fcefc

./build/bin//bitcoin-cli -named  finalizepsbt cHNidP8BAJoCAAAAAqvNEjSrzRI0q80SNKvNEjSrzRI0q80SNKvNEjSrzRI0AAAAAAD9////NBLNqzQSzas0Es2rNBLNqzQSzas0Es2rNBLNqzQSzasBAAAAAP3///8CoIYBAAAAAAAWABQVQBGVs/sqFAmC8HZ8O+g1htqivkANAwAAAAAAFgAUir7MzgyzDnRMjdkVa7d+Dwr07jsAAAAAAAAAAAA=
{
  "psbt": "cHNidP8BAJoCAAAAAqvNEjSrzRI0q80SNKvNEjSrzRI0q80SNKvNEjSrzRI0AAAAAAD9////NBLNqzQSzas0Es2rNBLNqzQSzas0Es2rNBLNqzQSzasBAAAAAP3///8CoIYBAAAAAAAWABQVQBGVs/sqFAmC8HZ8O+g1htqivkANAwAAAAAAFgAUir7MzgyzDnRMjdkVa7d+Dwr07jsAAAAAAAAAAAA=",
  "complete": false
}

Will review the code soon!

@zaidmstrr zaidmstrr requested a review from maflcko June 29, 2025 14:24
Copy link
Contributor

@BrandonOdiwuor BrandonOdiwuor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested ACK e246741

Confirmed that commit 59fcefc correctly handles named argument parsing for parameters containing = characters—an issue currently reproducible on master.

<commit: 59fcefc>
Screenshot 2025-06-29 at 19 49 25

<branch: master>
Screenshot 2025-06-29 at 19 50 42

@DrahtBot DrahtBot requested a review from ryanofsky June 29, 2025 17:07
@Sjors
Copy link
Member

Sjors commented Jul 4, 2025

From CI:

assert not "non_witness_utxo" in mining_node.decodepsbt(signed_psbt["psbt"])["inputs"][0]

I didn't check the code, but that's usually a cryptic way of saying that signing the psbt failed.

@zaidmstrr
Copy link
Contributor Author

Signing the PSBT works fine while running locally, and removing non-witness UTXO from the PSBT.

@maflcko
Copy link
Member

maflcko commented Jul 4, 2025

Signing the PSBT works fine while running locally, and removing non-witness UTXO from the PSBT.

What command did you use to run locally?

@zaidmstrr
Copy link
Contributor Author

What command did you use to run locally?

I first checked by adding two log statements in rpc_psbt.py the first log to check the psbt before signing and the second log after walletprocesspsbt where CI is reporting an error. And then manually parsing the result with the help of decodepsbt to check whether the PSBT afterwalletprocesspsbt contains non-witness UTXO, but it doesn't in my case, so the test passed successfully.

I also ran all the functional test cases with —extended and all tests passed.

@maflcko
Copy link
Member

maflcko commented Jul 4, 2025

you'll have to use --usecli to run the tests, otherwise the cli is not used

@Prabhat1308
Copy link
Contributor

Trying to debug the issue I see some irregularities in the how the test behaves in rpc mode and cli mode.

diff
diff --git a/test/functional/rpc_psbt.py b/test/functional/rpc_psbt.py
index a241978b5f..f92e8d6e88 100755
--- a/test/functional/rpc_psbt.py
+++ b/test/functional/rpc_psbt.py
@@ -135,8 +135,64 @@ class PSBTTest(BitcoinTestFramework):
         assert "non_witness_utxo" in mining_node.decodepsbt(psbt_new.to_base64())["inputs"][0]

         # Have the offline node sign the PSBT (which will remove the non-witness UTXO)
+
+        decoded_before = mining_node.decodepsbt(psbt_new.to_base64())
+
+        self.log.info(f"=== DEBUG: CLI Mode = {self.options.usecli} ===")
+
+        # Log address and input details
+        offline_addr_info = offline_node.getaddressinfo(offline_addr)
+        self.log.info(f"Offline address: {offline_addr}")
+        self.log.info(f"Address type: {offline_addr_info.get('type', 'unknown')}")
+
+        decoded_before = mining_node.decodepsbt(psbt_new.to_base64())
+
+        self.log.info(f"=== DEBUG: CLI Mode = {self.options.usecli} ===")
+
+        # Log address and input details
+        offline_addr_info = offline_node.getaddressinfo(offline_addr)
+        self.log.info(f"Offline address: {offline_addr}")
+        self.log.info(f"Address type: {offline_addr_info.get('type', 'unknown')}")
+        self.log.info(f"Script type: {offline_addr_info.get('script', 'unknown')}")
+        self.log.info(f"Is witness: {offline_addr_info.get('iswitness', False)}")
+        self.log.info(f"Witness version: {offline_addr_info.get('witness_version', 'none')}")
+        self.log.info(f"Witness program: {offline_addr_info.get('witness_program', 'none')}")
+
+        # Log PSBT input details before signing
+        self.log.info(f"BEFORE signing - input[0] keys: {list(decoded_before['inputs'][0].keys())}")
+        self.log.info(f"BEFORE signing - has non_witness_utxo: {'non_witness_utxo' in decoded_before['inputs'][0]}")
+        self.log.info(f"BEFORE signing - has witness_utxo: {'witness_utxo' in decoded_before['inputs'][0]}")
+
+        # Check witness_utxo script details
+        if 'witness_utxo' in decoded_before['inputs'][0]:
+            witness_utxo = decoded_before['inputs'][0]['witness_utxo']
+            self.log.info(f"BEFORE signing - witness_utxo script type: {witness_utxo['scriptPubKey'].get('type', 'unknown')}")
+            self.log.info(f"BEFORE signing - witness_utxo hex: {witness_utxo['scriptPubKey'].get('hex', 'none')}")
+
+        # Log the exact walletprocesspsbt call being made
+        psbt_b64 = psbt_new.to_base64()
+        self.log.info(f"walletprocesspsbt call: offline_node.walletprocesspsbt('{psbt_b64[:50]}...', True, 'ALL', True, True)")
+
         signed_psbt = offline_node.walletprocesspsbt(psbt_new.to_base64())
-        assert not "non_witness_utxo" in mining_node.decodepsbt(signed_psbt["psbt"])["inputs"][0]
+        decoded_after = mining_node.decodepsbt(signed_psbt["psbt"])
+
+        # Log PSBT input details after signing
+        self.log.info(f"AFTER signing - input[0] keys: {list(decoded_after['inputs'][0].keys())}")
+        self.log.info(f"AFTER signing - has non_witness_utxo: {'non_witness_utxo' in decoded_after['inputs'][0]}")
+        self.log.info(f"AFTER signing - has witness_utxo: {'witness_utxo' in decoded_after['inputs'][0]}")
+        self.log.info(f"walletprocesspsbt result complete: {signed_psbt.get('complete', 'not present')}")
+
+        # Check for Taproot vs regular segwit indicators
+        has_taproot_derivs = 'taproot_bip32_derivs' in decoded_before['inputs'][0] or 'taproot_bip32_derivs' in decoded_after['inputs'][0]
+        has_taproot_key = 'taproot_internal_key' in decoded_before['inputs'][0] or 'taproot_internal_key' in decoded_after['inputs'][0]
+        has_regular_derivs = 'bip32_derivs' in decoded_before['inputs'][0] or 'bip32_derivs' in decoded_after['inputs'][0]
+
+        self.log.info(f"Input type detection - Taproot derivs: {has_taproot_derivs}, Taproot key: {has_taproot_key}, Regular derivs: {has_regular_derivs}")
+
+        # Log the actual walletprocesspsbt call details for comparison
+        if self.options.usecli:
+            self.log.info("Using CLI mode - parameters passed as separate arguments")
+        else:
+            self.log.info("Using RPC mode - parameters passed as JSON")
+
+        # More detailed assertion with context
+        if "non_witness_utxo" in decoded_after["inputs"][0]:
+            self.log.error("FAILURE: non_witness_utxo should be removed for segwit v1+ inputs!")
+            self.log.error(f"This suggests the input is being treated as segwit v0 instead of v1+ in CLI mode")
+
+
with cli
2025-07-07T15:20:33.852283Z TestFramework (INFO): Check that non-witness UTXOs are removed for segwit v1+ inputs
2025-07-07T15:20:34.004885Z TestFramework (INFO): === DEBUG: CLI Mode = True ===
2025-07-07T15:20:34.008380Z TestFramework (INFO): Offline address: bcrt1q2skwujcxhaawtf0hlqawfmk3gclpdz24a84znw
2025-07-07T15:20:34.008437Z TestFramework (INFO): Address type: unknown
2025-07-07T15:20:34.008459Z TestFramework (INFO): Script type: unknown
2025-07-07T15:20:34.008479Z TestFramework (INFO): Is witness: True
2025-07-07T15:20:34.008497Z TestFramework (INFO): Witness version: 0
2025-07-07T15:20:34.008513Z TestFramework (INFO): Witness program: 542cee4b06bf7ae5a5f7f83ae4eed1463e168955
2025-07-07T15:20:34.008532Z TestFramework (INFO): BEFORE signing - input[0] keys: ['witness_utxo', 'non_witness_utxo', 'bip32_derivs']
2025-07-07T15:20:34.008550Z TestFramework (INFO): BEFORE signing - has non_witness_utxo: True
2025-07-07T15:20:34.008565Z TestFramework (INFO): BEFORE signing - has witness_utxo: True
2025-07-07T15:20:34.008579Z TestFramework (INFO): BEFORE signing - witness_utxo script type: witness_v0_keyhash
2025-07-07T15:20:34.008595Z TestFramework (INFO): BEFORE signing - witness_utxo hex: 0014542cee4b06bf7ae5a5f7f83ae4eed1463e168955
2025-07-07T15:20:34.008688Z TestFramework (INFO): walletprocesspsbt call: offline_node.walletprocesspsbt('cHNidP8BAFICAAAAAQVChcz/DF+d62weFt4XJAYjd8cLM81FZV...', True, 'ALL', True, True)
2025-07-07T15:20:34.016020Z TestFramework (INFO): AFTER signing - input[0] keys: ['witness_utxo', 'non_witness_utxo', 'final_scriptwitness']
2025-07-07T15:20:34.016093Z TestFramework (INFO): AFTER signing - has non_witness_utxo: True
2025-07-07T15:20:34.016117Z TestFramework (INFO): AFTER signing - has witness_utxo: True
2025-07-07T15:20:34.016135Z TestFramework (INFO): walletprocesspsbt result complete: True
2025-07-07T15:20:34.016155Z TestFramework (INFO): Input type detection - Taproot derivs: False, Taproot key: False, Regular derivs: True
2025-07-07T15:20:34.016174Z TestFramework (INFO): Using CLI mode - parameters passed as separate arguments
2025-07-07T15:20:34.016193Z TestFramework (ERROR): FAILURE: non_witness_utxo should be removed for segwit v1+ inputs!
2025-07-07T15:20:34.016209Z TestFramework (ERROR): This suggests the input is being treated as segwit v0 instead of v1+ in CLI mode
2025-07-07T15:20:34.016234Z TestFramework (ERROR): Assertion failed
without cli
2025-07-07T15:19:10.430768Z TestFramework (INFO): === DEBUG: CLI Mode = False ===
2025-07-07T15:19:10.431566Z TestFramework (INFO): Offline address: bcrt1pxmju835wnuw3z5thftuz84q8rt96vqd97jaxwdn2uqlxwwka923q8fcsp8
2025-07-07T15:19:10.431613Z TestFramework (INFO): Address type: unknown
2025-07-07T15:19:10.431645Z TestFramework (INFO): Script type: unknown
2025-07-07T15:19:10.431672Z TestFramework (INFO): Is witness: True
2025-07-07T15:19:10.431699Z TestFramework (INFO): Witness version: 1
2025-07-07T15:19:10.431729Z TestFramework (INFO): Witness program: 36e5c3c68e9f1d1151774af823d4071acba601a5f4ba67366ae03e673add2aa2
2025-07-07T15:19:10.431760Z TestFramework (INFO): BEFORE signing - input[0] keys: ['witness_utxo', 'non_witness_utxo', 'taproot_bip32_derivs', 'taproot_internal_key']
2025-07-07T15:19:10.431789Z TestFramework (INFO): BEFORE signing - has non_witness_utxo: True
2025-07-07T15:19:10.431831Z TestFramework (INFO): BEFORE signing - has witness_utxo: True
2025-07-07T15:19:10.431857Z TestFramework (INFO): BEFORE signing - witness_utxo script type: witness_v1_taproot
2025-07-07T15:19:10.431881Z TestFramework (INFO): BEFORE signing - witness_utxo hex: 512036e5c3c68e9f1d1151774af823d4071acba601a5f4ba67366ae03e673add2aa2
2025-07-07T15:19:10.431929Z TestFramework (INFO): walletprocesspsbt call: offline_node.walletprocesspsbt('cHNidP8BAF4CAAAAAVteg2JVDVvQHLSH4zmmgu7l4bYJdyIoFq...', True, 'ALL', True, True)
2025-07-07T15:19:10.433261Z TestFramework (INFO): AFTER signing - input[0] keys: ['witness_utxo', 'final_scriptwitness']
2025-07-07T15:19:10.433305Z TestFramework (INFO): AFTER signing - has non_witness_utxo: False
2025-07-07T15:19:10.433336Z TestFramework (INFO): AFTER signing - has witness_utxo: True
2025-07-07T15:19:10.433364Z TestFramework (INFO): walletprocesspsbt result complete: True
2025-07-07T15:19:10.433399Z TestFramework (INFO): Input type detection - Taproot derivs: True, Taproot key: True, Regular derivs: False
2025-07-07T15:19:10.433426Z TestFramework (INFO): Using RPC mode - parameters passed as JSON
  1. inputs are being treated differently in the cli and non-cli mode
  2. even the address being generated in rpc mode is different (larger as compared to cli mode)
  3. non-witness utxo remains in cli mode.

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 56b1da7. Thanks for the update! The fix here seems right and should be practically useful. And the new tests seem to cover different scenarios where positional string parameters contain '='.

Two comments:

  • I think it would good to add a test to cover cases where positional non-string parameters contain '=', i.e. the bitcoin-cli -named echojson '["fail=yes"]' case which previously failed (because it treated everything before the '=' character as a parameter name) and now succeeds.

  • I think the code change suggested in df4f391 (tag) to combine vRPCConvertParams and vRPCStringParams tables into a single table, and document it better, and remove some layers of indirection could still be a good change to make. But it could be a followup and the current approach seems reasonable to me. Unless I missed something, the approach implemented there is the same approach implemented here currently, so this would just be a refactoring.

@DrahtBot DrahtBot requested a review from BrandonOdiwuor July 21, 2025 19:44
@zaidmstrr
Copy link
Contributor Author

  • I think it would good to add a test to cover cases where positional non-string parameters contain '=', i.e. the bitcoin-cli -named echojson '["fail=yes"]' case which previously failed (because it treated everything before the '=' character as a parameter name) and now succeeds.

You are right here, I will add a test for non-string params like echojson.

  • I think the code change suggested in df4f391 (tag) to combine vRPCConvertParams and vRPCStringParams tables into a single table, and document it better, and remove some layers of indirection could still be a good change to make.

I intentionally choose this approach to use both the vRPCStringParams and vRPCConvertParams tables because the string table has some additional cases, like adding all the string params for a given method in the table for consistency and valid lookups. So with all these things, separating the tables is a good and less complex way between both logics.

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 8fa1a0e. Thanks for the updates! Only change since last review was clarifying a comment and some variable names and adding new tests to cover cases where non-string positional parameters containing '=' are passed.

@achow101
Copy link
Member

I'm wondering if there's a different way we can go about this without adding another table of arguments that need special treatment. Perhaps we could move named argument handling and string to json conversion server side? That's a much more significant refactor, but the RPC server already knows every parameter type and name, so it could do the conversion and named argument check as well.

@ryanofsky
Copy link
Contributor

ryanofsky commented Jul 31, 2025

I'm wondering if there's a different way we can go about this without adding another table of arguments that need special treatment.

I'm also not a fan of separate tables and suggested the following change to unify them earlier: b998cc5 (tag). This change is just a refactoring and could be a followup.

Perhaps we could move named argument handling and string to json conversion server side?

I think moving logic server side would avoid need for duplicate tables, but not actually make the code or logic simpler because the current syntax for distinguishing named parameters is inherently ambiguous. (It probably would have been better to require named parameters to begin with - to avoid ambiguity. It could also be better to try to parse every argument as JSON and just fall back to passing strings if they are not valid JSON to avoid the need for the conversion table.) I feel like current PR just makes some small tweaks to parsing to make the current syntax work a little better, and it adds good test coverage. If we follow up this PR with b998cc5 or incorporate those changes here, the client code should be better documented and more maintainable too.

@achow101
Copy link
Member

achow101 commented Jul 31, 2025

because the current syntax for distinguishing named parameters is inherently ambiguous.

But since the server knows the names of all parameters, it can also check whether the incoming string starts with the name of a parameter.

Alternatively, maybe we can make the rpc client aware of all of the rpcs so it can do that conversion? Although that would likely have the effect of reducing the utility of bitcoin-cli across versions.

@ryanofsky
Copy link
Contributor

Alternatively, maybe we can make the rpc client aware of all of the rpcs so it can do that conversion?

That's basically what the PR does implicitly and what my refactoring of the PR does more explicitly with string/json formats in
rpc/client.cpp.

I think if you look at that file you will see that the the parsing logic is not that complicated and can be well explained. The code there should actually be simpler than current client code in master because it gets rid of intermediate lookup tables and the overloaded overloaded ArgToUniValue method.

Although that would likely have the effect of reducing the utility of bitcoin-cli across versions.

I think that's just the current reality. The fact that client needs to know about parameter names and types makes it more fragile, but shouldn't be made worse by the changes here, as far as I can tell.

@achow101
Copy link
Member

achow101 commented Jul 31, 2025

That's basically what the PR does implicitly and what my refactoring of the PR does more explicitly with string/json formats

Yes, but I meant more so in having access to RPCHelpMan which already has the names and types of all parameters.

Or if we had a machine parseable schema thing (see #29912) that bitcon-cli refers to.

@zaidmstrr
Copy link
Contributor Author

Rebased on top latest master.

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 780bec6, just rebased since last review to avoid minor conflict with #31886

I still think b998cc5 ('rpc/client.cpp') would be an nice simplification, but it's just a refactoring so it could be a followup.

@zaidmstrr
Copy link
Contributor Author

I still think b998cc5 ('rpc/client.cpp') would be an nice simplification, but it's just a refactoring so it could be a followup.

Yes, that could be the next possible followup.

Comment on lines +314 to +319
class RpcStringParam {
public:
std::string method_name; //!< method name this parameter belongs to
int param_idx; //!< 0-based idx of param
std::string param_name; //!< parameter name
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a4f2144 "rpc: Handle -named argument parsing where '=' character is used"

This seems unnecessary, it's the same thing CRPCConvertParam.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The definition might be the same, but the purpose is entirely different. This is specifically for string RPC methods.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the meaning is different enough to warrant a whole new class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The meaning is totally different, and it's easier and much more maintainable right now. This class doesn't depend on any other class/object. It's an independent representation.

Comment on lines +394 to +395
std::set<std::pair<std::string, std::string>> stringParams;
std::set<std::pair<std::string, int>> stringParamsByIndex;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a4f2144 "rpc: Handle -named argument parsing where '=' character is used"

nit: these should use the new naming style.

Comment on lines +492 to +493
bool next_positional_arg_is_string = rpcCvtTable.IsStringParamByIndex(strMethod, positional_args.size());
bool next_positional_arg_is_json = rpcCvtTable.IsConvertParamByIndex(strMethod, positional_args.size());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a4f2144 "rpc: Handle -named argument parsing where '=' character is used"

I find the name 'next_...` to be confusing since this is really referring to the position of the current argument, not the next argument.

Copy link
Contributor

@ryanofsky ryanofsky Aug 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #32821 (comment)

I find the name 'next_...` to be confusing since this is really referring to the position of the current argument, not the next argument.

Technically these aren't really referring to the current argument s. At this point in the loop, all we know about s is that it contains an = sign and that the beginning part of s before = is not a known parameter name. At this point s might be passed by-position if it is compatible with what we know about the next position in the args list (i.e. the positional_args vector). These variables are describing attributes of the next positional parameter. If s is compatible with them, it will be be passed by position, otherwise it will be passed by name.

I can see how next_positional_arg_* might be confusing though and maybe one solution is to just drop these variables and use the expressions directly below. My rewrite of this code in b998cc5 does not include these variables and includes a fuller explanation of what this code is doing:

bitcoin/src/rpc/client.cpp

Lines 422 to 439 in b998cc5

/**
* Convert command line arguments to params object when -named is enabled.
*
* Arguments that do not contain '=' are treated as positional parameters.
*
* Arguments that do contain '=' are assumed to be named parameters in
* NAME=VALUE format except for two special cases where they are treated as
* positional parameters instead:
*
* 1. The case where NAME is not a known parameter name and the next positional
* parameter requires a JSON value and the argument parses as JSON. The
* argument is presumed to be a positional JSON value that happens to contain
* a '='.
*
* 2. The case where NAME is not a known parameter name and the next positional
* parameter requires a string value. This argument is presumed to be a
* positional string value that happens to contain a '='.
*/

bool next_positional_arg_is_string = rpcCvtTable.IsStringParamByIndex(strMethod, positional_args.size());
bool next_positional_arg_is_json = rpcCvtTable.IsConvertParamByIndex(strMethod, positional_args.size());
UniValue value;
if(next_positional_arg_is_json && value.read(s)){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a4f2144 "rpc: Handle -named argument parsing where '=' character is used"

Instead of parsing the argument directly, we should be using either Parse or ArgToUniValue

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a4f2144 "rpc: Handle -named argument parsing where '=' character is used"

Instead of parsing the argument directly, we should be using either Parse or ArgToUniValue

I don't think you can use Parse or ArgToUniValue unless you catch the exception they throw. No strong opinion, but I think it is probably simpler to use read.

Comment on lines +324 to +325
* converted from JSON. This is used for string parameters that might
* contain '=' character which could interfere with named parameter parsing.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a4f2144 "rpc: Handle -named argument parsing where '=' character is used"

What is the criteria for including a parameter in this table? This text says "parameters that might contain '=' character", but this includes things that should never have = in them, like sighashtype, estimate_mode, or blockhash. But if the inclusion criteria is basically any string a user can provide, then there are several string parameters that should be included, like all of the other blockhash parameters.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #32821 (comment)

What is the criteria for including a parameter in this table?

Basically if any string parameter for a method can contain =, all the other string parameters for the method should be listed too. My cleanup of this code in b998cc5 adds a comment explaining this:

* String parameters need to be listed here for methods accepting string
* arguments with '=' characters, to make bitcoin-cli treat these command line
* arguments as positional parameters instead of named parameters when -named is
* used. And if one string parameter is listed for a method, other string
* parameters for that method need to be listed as well so bitcoin-cli does not
* make the opposite mistake and pass other arguments by position instead of
* name because it does not recognize their names.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a @note at around lines 328-330 explaining this behaviour.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point was more that the criteria appears to be applied inconsistently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to my analysis, there is nothing more than specified criteria applied inconsistently in the codebase. Can you point out something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so to verify my claim, please remove these params from the table and recompile the code, and then run the functional tests with CLI. You will see some errors because of inconsistent behaviour of the class. Meaning the code needs to know about all the string parameters for the given method for correctness. And that's only needed for RPCs which might contain an = character; the rest can be part of the JSON table without any further rules.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these arguments are there, but not all RPCs that have these arguments, is why I ask about the criteria. Either we are including all string parameters because they may contain an =, or we are only including parameters that might reasonably contain a =

Any individual string that is allowed to contain an '=' should be to be listed in the string table so the new code in this PR can function and detect that these string arguments should be passed as positional parameters, not named parameters.

But this is not sufficient, because methods can accept multiple string parameters. If a method accepts two string parameters and one of them may contain = and the other is not allowed to contain =, both parameters need to be listed in the table so the code does not make the opposite mistake and treat a named parameter as a positional parameter.

For example, say there is a wallet RPC called with ==mylabel== and type=bech32 as arguments. These are both string parameters and the label parameter can contain = and the type parameter can't. Both parameters need to be listed in the table. The label one needs to be listed so ==mylabel== is passed by position instead of by name, and the type one needs to be listed so type=bech32 is passed by name instead of by position.

Either all of a method's string parameters need to be listed in the table or none of them need to be listed for things to work properly.

Copy link
Member

@achow101 achow101 Aug 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You will see some errors

Then that is not just "for consistency" and that shouldn't be listed as the reason for including the other parameters.

Either all of a method's string parameters need to be listed in the table or none of them need to be listed for things to work properly.

I see. The comment does not explain this clearly.

It seems like this might also make it more difficult to use bitcoin-cli across versions. If bitcoind is a newer version with some RPC that has a new string parameter, it may not be possible to pass that new argument by name with a bitcoin-cli from an older version if the RPC has an entry in this table. (This is more likely to be an issue for developers, at least in my workflow, I typically always use the bitcoin-cli from master even when working on a branch.)

Copy link
Contributor

@ryanofsky ryanofsky Aug 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. The comment does not explain this clearly.

If it helps, as mentioned earlier I wrote a comment which does try to explain this:

* String parameters need to be listed here for methods accepting string
* arguments with '=' characters, to make bitcoin-cli treat these command line
* arguments as positional parameters instead of named parameters when -named is
* used. And if one string parameter is listed for a method, other string
* parameters for that method need to be listed as well so bitcoin-cli does not
* make the opposite mistake and pass other arguments by position instead of
* name because it does not recognize their names.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this might also make it more difficult to use bitcoin-cli across versions. If bitcoind is a newer version with some RPC that has a new string parameter, it may not be possible to pass that new argument by name with a bitcoin-cli from an older version if the RPC has an entry in this table.

This is true, and we know a similar issue already happens if a new JSON parameter is added: old clients will incorrectly pass that parameter as a string instead of a JSON value. This PR is adding to the problem by not only using the table to determine whether arguments are JSON vs. string, but whether they are positional vs named.

I still think this PR is an improvement though because the positional vs named table lookups only need to be used as a heuristic to disambiguate genuinely ambiguous bitcoin-cli calls. If you want to pass a string argument that contains = by name, you do that 100% reliably by using -named mode and specifying parameter names. If you want to pass the argument by position, you can reliably do that by using -nonamed mode and not specifying parameter names. However, when you make -named calls without specifying parameter names, those calls are genuinely ambiguous, so bitcoin-cli needs to use a heuristic to decide what to do with them. This PR just improves that heuristic.

I do think we may want to think about supporting another syntax for passing named and positional parameters together that is not ambiguous. One syntax could be accepting --name=value arguments and accepting -- as a separator for specifying position arguments that begin with --. Another syntax which I always liked was jonasschnelli's python-like syntax from #20273

@ryanofsky
Copy link
Contributor

Code review ACK 6de6c78. Just style and naming changes since last review.

Just to restate my opinion of this PR from earlier, I think it is not good to have separate tables for string and json parameters. I think it will make this code harder to maintain and it would be better in long run to have a single table with clearly spelled out rules for what parameters need to be included in the table, as implemented in my suggested code change df4f391 (tag) . But I'd be happy to see this PR merged as is and cleanup done as a followup if both things can't be done in the same PR.

@achow101
Copy link
Member

I think it is not good to have separate tables for string and json parameters.

I would also prefer that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants