Skip to content

Conversation

tcharding
Copy link
Member

@tcharding tcharding commented Mar 16, 2025

We have the hex_lit dependency for converting a hex string literal to an array.

Currently we have a test_hex_unwrap macro in the hex v0.3.0 release but not on either master
or the upcoming v1.0.0-alpha.0 release. This is making PRs around releasing and depending on the
release more noisy than required.

Introduce a test_hex_unwrap macro in internals for usage when the input is not a string literal.

Use hex_lit::hex where possible (often needing an additional call to

@github-actions github-actions bot added C-bitcoin PRs modifying the bitcoin crate C-internals PRs modifying the internals crate C-base58 labels Mar 16, 2025
@tcharding
Copy link
Member Author

This will make #4249 less noisy.

@tcharding tcharding force-pushed the 03-17-hex-macro branch 3 times, most recently from 813a170 to c3b0e79 Compare March 17, 2025 00:16
Copy link
Collaborator

@Kixunil Kixunil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I'm missing something, all of these can, and probably should, use hex_lit.

@apoelstra
Copy link
Member

Yeah, would prefer to use hex_lit for uniformity.

@tcharding
Copy link
Member Author

Not all call sites use string literals. Now uses hex_lit::hex where possible. I've updated the PR and description.

@tcharding
Copy link
Member Author

What is going on here, the dev-dependency seems not to be available in bench code?

@@ -23,6 +23,7 @@ internals = { package = "bitcoin-internals", version = "0.4.0" }

[dev-dependencies]
hex = { package = "hex-conservative", version = "0.3.0", default-features = false, features = ["alloc"] }
internals = { package = "bitcoin-internals", version = "0.4.0", features = ["hex"] }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this needs hex_lit dependency instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woops, I tried that before I worked out to throw .to_vec() everywhere. Done as suggested.

// annex starting with 0x50 causes the branching logic.
let annex = hex!("50");
let annex = hex!("50").to_vec();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these needed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the root cause is that we don't implement Encodable for slices.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I looked below, it only needs slices.

#[cfg(feature = "hex")]
mod tests {
#[test]
fn parse_hex_into_vector() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still needed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test you mean or the macro itself? The macro is used and I'd say if we use the macro then we keep the test as well.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant the macro, yes. Interesting that it's used - we have dynamic test data somewhere.

@apoelstra
Copy link
Member

CI failure is real.

@apoelstra
Copy link
Member

In fc334a1:

:/ there are a few points of messiness here:

  • One is the need for .to_vec for things that we're Decodeing. Ok, I think we can live with that since we decided elsewhere not to support Decodeing from byteslices directly.
  • Another is the need for .to_vec in Witness::from_slice. This maybe should be promoted to an API issue on Witness.
  • In one place you have use internals::test_hex_unwrap as hex;, and to support this you modify internals and also add an internals dep. Since it's only one location can you just write let hex = |b| <alloc::vec::Vec<u8> as hex::FromHex>::from_hex(b).unwrap());?

Only the third point is something that I'd like changed about this PR.

@Kixunil
Copy link
Collaborator

Kixunil commented Mar 18, 2025

  • One is the need for .to_vec for things that we're Decodeing. Ok, I think we can live with that since we decided elsewhere not to support Decodeing from byteslices directly.

Do you mean Encodableing? ( 😉 )

Honestly, I think Encodable for Vec and no equivalent thing for slices is beyond weird. We already have a case that's not encoded with length (TaprootMerkleBranch). Instead it should be something like SliceExt with encode_len_prefixed method. But please, nobody bother with it now.

  • Another is the need for .to_vec in Witness::from_slice. This maybe should be promoted to an API issue on Witness.

That's not Witness issue but "nobody thought to make compiler suggest array of slices rather than array of differently-sized arrays" so Tobin didn't notice he can just cast the first array to a slice. :) And since this is easy to change and the PR is going to change anyway I'd like this to be changed.

@github-actions github-actions bot removed the C-internals PRs modifying the internals crate label Mar 19, 2025
@tcharding tcharding force-pushed the 03-17-hex-macro branch 3 times, most recently from 89a6596 to fa8e588 Compare March 19, 2025 03:59
@tcharding
Copy link
Member Author

Only the third point is something that I'd like changed about this PR.

Mainly I just called Vec::from_hex and imported the trait.

Copy link
Collaborator

@Kixunil Kixunil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDK, so many call to to_vec where slices can be used is somewhat confusing.


const SOME_TX: &str = "0100000001a15d57094aa7a21a28cb20b59aab8fc7d1149a3bdbcddba9c622e4f5f6a99ece010000006c493046022100f93bb0e7d8db7bd46e40132d1f8242026e045f03a0efe71bbb8e3f475e970d790221009337cd7f1f929f00cc6ff01f03729b069a7c21b59b1736ddfee5db5946c5da8c0121033b9b137ee87d5a812d6f506efdd37f0affa7ffc310711c06c7f3e097c9447c52ffffffff0100e1f505000000001976a9140389035a9225b3839e2bbf32d826a1e222031fd888ac00000000";

#[bench]
pub fn bench_transaction_size(bh: &mut Bencher) {
let raw_tx = hex!(SOME_TX);
let raw_tx = Vec::from_hex(SOME_TX).unwrap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also shouldn't need Vec, might need cast to &[u8] below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a patch up front that cleans up the bench code in transaction.

let el_0 = hex!("03d2e15674941bad4a996372cb87e1856d3652606d98562fe39c5e9e7e413f2105");
let el_1 = hex!("000000");
let el_0 = hex!("03d2e15674941bad4a996372cb87e1856d3652606d98562fe39c5e9e7e413f2105").to_vec();
let el_1 = hex!("000000").to_vec();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These could've been just slices.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now see that for some reason this tests that Vec<Vec<u8>> serialization is the same as Witness. Do we actually need this? @apoelstra I think since we added Witness Vec<Vec<u8>> serialization is no longer relevant.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now see that for some reason this tests that Vec<Vec<u8>> serialization is the same as Witness.

Well, because that's how the serialization of Witness is defined.

Maybe we can drop the encoding impls for Vec<Vec<u8>>, though it's a single line of code and isn't hurting anything and makes this test easy to write -- although it should probably be an arbitrary/kani test rather than a fixed-vector one.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, because that's how the serialization of Witness is defined.

My idea is to move the entire thing to Witness - indeed, removing Vec<Vec<u8>>.

isn't hurting anything

If any other future soft fork adds a structure that's similar to witness but encoded differently it'll be a mess. I would even argue it already has something similar - taproot merkle branch. Though we have a type that's so much more convenient nobody will ever convert it so it's not a problem now. Note that I don't object to a function capable of serializing Vec<Vec<u8>> existing, just to it being the trait impl.

And yes, I think we should fuzz it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that taproot merkle branches are similar at all to a Vec<Vec>. They are similar to a Vec<[u8; 32]>, and without looking I can't say whether they are serialized like that or like a Vec.

But regardless, it's a distinct object which has its own distinct type, and the "raw type" that shares its serialization means little more than a mnemonic for remembering/implementing/testing the serialization.

I don't really mind removing Vec<Vec>, but I really don't think it hurts anything.

// annex starting with 0x50 causes the branching logic.
let annex = hex!("50");
let annex = hex!("50").to_vec();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, could've been slices.

// annex starting with 0x50 causes the branching logic.
let annex = hex!("50");
let annex = hex!("50").to_vec();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I looked below, it only needs slices.

// annex starting with 0x50 causes the branching logic.
let annex = hex!("50");
let annex = hex!("50").to_vec();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slices again.

hex!("3045022100d12b852d85dcd961d2f5f4ab660654df6eedcc794c0c33ce5cc309ffb5fce58d022067338a8e0e1725c197fb1a88af59f51e44e4255b20167c8684031c05d1f2592a01"),
hex!("0223b72beef0965d10be0778efecd61fcac6f79a4ea169393380734464f84f2ab3"),
hex!("3045022100d12b852d85dcd961d2f5f4ab660654df6eedcc794c0c33ce5cc309ffb5fce58d022067338a8e0e1725c197fb1a88af59f51e44e4255b20167c8684031c05d1f2592a01").to_vec(),
hex!("0223b72beef0965d10be0778efecd61fcac6f79a4ea169393380734464f84f2ab3").to_vec(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, slices on all Witness constructions above.

@tcharding
Copy link
Member Author

I'm leaving all the witness stuff as is. The macro being replaced produced a Vec, I don't see any reason why this PR should do otherwise. Feel free to follow up and remove the calls to .to_vec().

@Kixunil
Copy link
Collaborator

Kixunil commented Mar 21, 2025

The reason is that the test is not supposed to test if vec works in particular (quite the opposite, a test should make sure non-vecs are usable). I know that historically we had Vec<Vec<u8>> but given the current code &[impl AsRef<[u8]>] better models the input to the Witness.

I feel like the cleanup should go before this PR, not after. I can send it if you're fine with waiting.

@tcharding
Copy link
Member Author

Thanks, go for it.

@tcharding tcharding marked this pull request as draft March 21, 2025 20:30
Copy link
Collaborator

@Kixunil Kixunil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks much better now.

@@ -2013,10 +2007,18 @@ mod benches {

#[bench]
pub fn bench_transaction_deserialize(bh: &mut Bencher) {
let raw_tx = hex!(SOME_TX);
let raw_tx = <Vec<u8> as hex::FromHex>::from_hex(SOME_TX).unwrap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this change. I think it's equivalent to the macro and should work equally well with hex_lit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hex_lit doesn't work in bench code for some reason. Seems the dev-dependecies aren't available.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, whaat?! OK then, let's keep it this way. A comment if you can be bothered would be nice.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect this would be fixed with criterion. We should probably have a "switch to criterion" tracking issue so that we notice when we hit a critical mass of pain with the built-in benchmarks and feel like doing the work to switch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect this would be fixed with criterion. We should probably have a "switch to criterion" tracking issue so that we notice when we hit a critical mass of pain with the built-in benchmarks and feel like doing the work to switch.

Added to #3333

apoelstra added a commit that referenced this pull request Mar 26, 2025
84bee2f Simplify `Witness` construction in tests (Martin Habovstiak)
3551ec2 Don't access internalls of `Witness` in tests (Martin Habovstiak)
c807836 Impl `PartialEq` between `Witness` and containers (Martin Habovstiak)
587a66d Add a bunch of missing conversions for `Witness` (Martin Habovstiak)

Pull request description:

  This is supposed to go in front of #4250

  `Witness` lacked a bunch of APIs that were making it harder to use and test, so this also adds them in addition to cleaning up tests. (I only realized they are missing when I tried to clean up tests and got a bunch of errors.)

ACKs for top commit:
  tcharding:
    ACK 84bee2f
  apoelstra:
    ACK 84bee2f; successfully ran local tests

Tree-SHA512: 7973f2a56b070babba7b4c632f45858154ccd00f8e77956ad2d28cb66e1fd18ff60d92c031ba3b76d0958e4acd34adfca10607fa26ec569dfd52ba1c1e2c79eb
We can use `deserialize_hex` when outside of the actual benchmark code
to simplify the functions.

Also add an additional test that benchmarks `deserialize_hex`.
We have the `hex_lit` dependency for converting a hex string literal
to an array.

Currently we have a `test_hex_unwrap` macro in the `hex v0.3.0` release
but not on either `master` or the upcoming `v1.0.0-alpha.0` release.
This is making PRs around releasing and depending on the release more
noisy than required.

Use `hex_lit::hex` where possible (often needing an additional call to
`to_vec()`) and where not possible use `Vec::from_hex`.
@tcharding
Copy link
Member Author

Rebased and added a comment to the bench usage of FromHex.

@tcharding tcharding marked this pull request as ready for review April 10, 2025 20:49
@github-actions github-actions bot removed C-internals PRs modifying the internals crate C-primitives labels Apr 10, 2025
Copy link
Member

@apoelstra apoelstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK d6296cd; successfully ran local tests


let mut want_witness = Witness::default();
want_witness.push(&el_0);
want_witness.push(&el_1);

let vec = vec![el_0.clone(), el_1.clone()];
let vec = vec![el_0, el_1];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is later used in a function let got_witness = Witness::from_slice(&vec);. Can't this just be a slice the begin with?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we need to rebase this PR we could fix this. I'm a little surprised the clippy "unnecesary vec" lint didn't notice it.

Copy link
Collaborator

@Kixunil Kixunil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK d6296cd

@apoelstra apoelstra merged commit 5f4075a into rust-bitcoin:master Apr 24, 2025
24 checks passed
@tcharding tcharding deleted the 03-17-hex-macro branch April 28, 2025 03:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-base58 C-bitcoin PRs modifying the bitcoin crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants