
Conversation

@tomasvdw tomasvdw commented Sep 19, 2017

This adds support for Elliptic Curve Multiset operations

Currently benches at about 14us/hash
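For readers new to the construction: an elliptic-curve multiset hash maps each set element to a curve point and sums the points, so the digest depends only on the multiset, not on insertion order. A minimal sketch on a toy curve of the same shape as secp256k1 (the small prime, helper names, and trial-and-rehash loop are illustrative stand-ins, not this PR's actual code):

```python
import hashlib

P = 10007          # small prime, P % 4 == 3 so sqrt(a) = a^((P+1)/4) mod P
INF = None         # point at infinity doubles as the empty multiset

def sqrt_mod(a):
    """Square root mod P, or None if a is not a quadratic residue."""
    r = pow(a, (P + 1) // 4, P)
    return r if r * r % P == a % P else None

def add(p, q):
    """Affine point addition on y^2 = x^3 + 7 mod P."""
    if p is None: return q
    if q is None: return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P == 0:
        return None                         # p + (-p) = infinity
    if p == q:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def hash_to_point(data):
    """Trial-and-rehash: rehash until the candidate x lands on the curve."""
    h = hashlib.sha256(data).digest()
    while True:
        x = int.from_bytes(h, "big") % P
        y = sqrt_mod((x * x * x + 7) % P)
        if y is not None:
            return (x, y)
        h = hashlib.sha256(h).digest()

def multiset_hash(elements):
    acc = INF
    for e in elements:
        acc = add(acc, hash_to_point(e))
    return acc

# Order independence is the whole point of the construction.
assert multiset_hash([b"a", b"b", b"c"]) == multiset_hash([b"c", b"a", b"b"])
```

Removing an element is adding the negated point (x, P-y), which is what lets the hash track a changing UTXO set incrementally.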

apoelstra commented Sep 19, 2017

  1. Can you change the copyright headers on all the new files to use your name(s) rather than those on the files you templated from?
  2. Can you write a markdown file like this one or this one that explains what you're doing and a bit of motivation?
  3. I have no idea what our policy on accepting upstreamed modules is, other than that I'm pretty sure we have none because this hasn't happened before.

@tomasvdw
Author

1 & 2 will do.

Note that this implements what Pieter Wuille proposed on the Bitcoin mailing list. It isn't really upstream at the moment.

apoelstra commented Sep 19, 2017

Ah, that's right, I knew this looked familiar. Even a .md in modules/multiset that was nothing but a link to the mailing list post would be great.

Since this is intended eventually for Bitcoin, my guess is that this will be treated something like the aggsig module, where we'll review the code but nothing can actually be merged until an actual BIP is produced and finalized. We don't want to give the impression that the secp maintainers have any undue control over what proposals eventually are accepted.

@tomasvdw
Author

I've changed the attribution and added a readme.

I understand if this isn't merged before a BIP is embraced, although you could argue that optional modules can be experimental.

Notes for review:

  • I believe I may be too eagerly normalizing field elements in secp256k1_multiset_create(). Someone with more knowledge of the normalization behaviour of the field operations may want to optimize some away.
  • I am not using context, but added it to the API calls for consistency. I am not sure whether this is preferred.

@gmaxwell
Contributor

  • We use trial-and-increment which is fast but non-constant time.

I believe that this is slower than Shallue-van de Woestijne, at least in batches when a fast Legendre symbol is available. It also results in non-uniform points, which would probably kill any hope of creating a strong security proof (though without creating a practical weakness).

Currently benches at about 800k/sec

In Bitcoin I am a little doubtful this approach is acceptable, because it would add CPU hours to synchronization; the multiplicative approach (bitcoin/bitcoin#10434) has much better performance and a more robust security argument. But it is less cacheable, so it has worse performance at the tip. It's a difficult trade-off.

Inits the multiset with the constant for empty data

This initialization is non-portable; data encoded in a ge_storage_t (or gej) is potentially platform- and/or compile-option-specific. What is that value in any case? As is, it looks like a backdoor: if there is any computable collection of set inputs that sums to that value, then the initial state is prefilled with that entry as a member, so the initial state should be a value which is unreachable by any known input. The norm in the codebase is to include Sage code to generate and verify all non-trivial constants. Please explain precisely where this constant originates.

I'm somewhat dubious about the point at infinity handling. I guess the idea behind the unexplained 'random' constant is to make it unreachable?

  • secp256k1_ge_set_gej(&ge_result, &gej_result);

Reprojecting to affine is devastating for performance and is entirely unneeded with the API setup here.

sipa commented Sep 24, 2017

We use trial-and-increment which is fast but non-constant time.

I believe that this is slower than shallue van de woestijne at least in batches when a fast legendre symbol is available.

I think that's only true for characteristic-2 curves. Constant-time Shallue-Van de Woestijne here needs 1 inverse and 3 square roots. Even variable-time SVdW would need 1 inverse, 1.5 Jacobi symbols, and 1 square root.

The increment-and-test approach is indeed biased, and I'd prefer to avoid it, but using a hash operation rather than an increment would be fine (variable time doesn't matter here), and should be pretty fast when a fast Jacobi symbol is available (almost 2x the speed of the current PR, which relies on sqrt to test quadratic residues).
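The cost argument here can be made concrete: whether a candidate x lies on the curve is a quadratic-residue question, and a Jacobi symbol answers it without ever computing the root. A hedged sketch with toy parameters (the small prime and curve shape stand in for secp256k1's field; none of this is the PR's code):

```python
P = 10007  # small prime with P % 4 == 3, standing in for secp256k1's field

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0, by binary quadratic reciprocity."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def is_on_curve_sqrt(x):
    """Residue test by actually computing a square root: the slower way."""
    rhs = (x * x * x + 7) % P
    return pow(rhs, (P + 1) // 4, P) ** 2 % P == rhs

def is_on_curve_jacobi(x):
    """Residue test via the Jacobi symbol: no root needed."""
    rhs = (x * x * x + 7) % P
    return rhs == 0 or jacobi(rhs, P) == 1

# The two tests agree for every x in the field.
assert all(is_on_curve_sqrt(x) == is_on_curve_jacobi(x) for x in range(P))
```

On the real 256-bit field the sqrt costs a full exponentiation, while a binary Jacobi computation is closer to a gcd, which is where the claimed near-2x speedup comes from.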

* Maintain multiset as Jacobian to prevent unneeded Jacobian->affine
* Change API to directly add/remove data
* Use infinity as empty element
* Use trial-and-rehash instead of trial-and-increment
@tomasvdw
Author

Thanks for the input! I've made some improvements.

In Bitcoin I am a little doubtful this approach is acceptable, because it would add CPU hours to synchronization; the multiplicative approach (bitcoin/bitcoin#10434) has much better performance and a more robust security argument. But it is less cacheable, so it has worse performance at the tip. It's a difficult trade-off.

It is currently running at ~14us/hash, which won't add CPU hours, but only a few minutes. Also, I am writing this as I am experimenting with hybrid prefix tree/rolling hash approaches for which 768 bytes/rolling hash isn't really an option.

This initialization is non-portable; data encoded in a ge_storage_t (or gej) is potentially platform- and/or compile-option-specific. What is that value in any case? As is, it looks like a backdoor: if there is any computable collection of set inputs that sums to that value, then the initial state is prefilled with that entry as a member, so the initial state should be a value which is unreachable by any known input. The norm in the codebase is to include Sage code to generate and verify all non-trivial constants. Please explain precisely where this constant originates.

Good point. Using ge_storage_t as multiset was a bit hacky.

I was indeed using the constant to avoid infinity, but this was a rather silly idea: it breaks empty+empty=empty and, more importantly, derived properties. I have now switched to a portable multiset type which handles infinity, properly using infinity as the empty set.

Reprojecting to affine is devastating for performance and is entirely unneeded with the API setup here.

Good point. My reasoning was to use a 64-byte multiset, but the performance effect is indeed devastating (~25%). I have now switched to a 96-byte (Jacobian) multiset to avoid this. We could always add a 64-byte multiset_compact format if needed.

using a hash operation rather than an increment would be fine

Using a hash seems better indeed. I've switched to trial-and-hash instead of trial-and-increment.

sipa commented Sep 25, 2017

because it would add CPU hours to synchronization

It is currently running at ~14us/hash, which won't add CPU hours, but only a few minutes.

At 14us per operation, synchronization from scratch would cost 3.5 hours of CPU time (there are about 900M inputs+outputs in the chain).

You could argue that in a post-UTXO-sync world that isn't relevant, but we're not there, and I don't think we should disregard the cost for doing a full validation.

I think less than 14us/op is doable though; I'll have a look at the code soon.

@tomasvdw
Author

You could argue that in a post-UTXO-sync world that isn't relevant, but we're not there, and I don't think we should disregard the cost for doing a full validation.

But even for full validation, why would you hash spend TXOs? Surely, you only need to hash the current UTXO set, at ~50 million or ~12 min (with a single thread). And only once.

I think less than 14us/op is doable though; I'll have a look at the code soon.

I would very much appreciate that.

sipa commented Sep 25, 2017

But even for full validation, why would you hash spend TXOs? Surely, you only need to hash the current UTXO set, at ~50 million or ~12 min (with a single thread).

Perhaps you're thinking of a different use case than I am. My hope is that every full node can at every point in time know what the hash of their UTXO set is. It could be used to compare the state of nodes, do sanity checking, or as part of a future UTXO commitment in blocks.

All of those things are not possible if you don't know the UTXO set hash during (even initial) synchronization.

@tomasvdw
Author

I have similar use cases in mind: in the future, adding UTXO commitments to blocks, fast syncing, and possibly (with a hybrid prefix tree/rolling hash approach) UTXO proofs.

But I am afraid I do not understand the benefit of processing each historic UTXO change that way. What is the difference between the initial sync implementation:

A. Download all blocks, and for each block update the UTXO set and the commitment.
B. Download all blocks, and for each block update the UTXO set. Then update the commitment from the UTXO set.

I thought the primary advantage of a commutative hash is that A and B yield the same result. So what would be the drawback of B? Regardless of further optimizations, A seems wasteful.
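The A/B equivalence rests on the hash being a commutative, invertible accumulator. A toy sketch using addition of element hashes modulo 2^256 in place of EC point addition; note that a plain additive hash like this is breakable (Wagner's generalized birthday attack), which is exactly why the module sums curve points instead, but the algebra is the same:

```python
import hashlib

M = 2 ** 256

def h(e):
    """Hash a serialized UTXO to an integer 'element'."""
    return int.from_bytes(hashlib.sha256(e).digest(), "big")

# Two hypothetical blocks of UTXO-set changes (names are illustrative).
blocks = [
    {"create": [b"tx0:0", b"tx0:1"], "spend": []},
    {"create": [b"tx1:0"],           "spend": [b"tx0:1"]},
]

# A: update the commitment with every historical change.
acc = 0
for blk in blocks:
    for out in blk["create"]:
        acc = (acc + h(out)) % M       # insertion: add the element
    for out in blk["spend"]:
        acc = (acc - h(out)) % M       # removal: subtract the element

# B: compute the commitment once from the final UTXO set.
utxos = [b"tx0:0", b"tx1:0"]
final = sum(h(u) for u in utxos) % M

assert acc == final  # history and order don't matter, only the final set
```

sipa's point stands regardless: if historical blocks carry commitments as a consensus rule, route A is the only way to validate them.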

sipa commented Sep 25, 2017

If we'd have UTXO commitments (as a consensus rule) in blocks, with (B) you wouldn't be able to validate the correctness of those commitments in the historical chain.

My view is that with commutative hashing the advantage is that you can just start off with a UTXO set and compare it to chain commitments, if that's the security tradeoff you want to make. But I think it's wrong to think that everyone will make it, and disregard the validation cost it entails for those who want full historical validation.

To be clear, I'm not using this as an argument that using secp256k1 multiset hashing is the wrong approach. I'm just pointing out that the performance does matter for some use cases, but perhaps the tradeoffs are still worth it.

@tomasvdw
Author

Agreed. I don't think the 900M existing outputs are relevant in this respect (as they don't have a commitment), but certainly the per block update time does for future full validation.


ctx = secp256k1_context_create(SECP256K1_CONTEXT_VERIFY);

run_benchmark("ecdsa_multiset", bench_multiset, bench_multiset_setup, NULL, NULL, 10, 20);

Contributor

This will report wrong results. The last parameter is supposed to account for the number of iterations in the loop above, so it should be 10,000; but even then it won't properly account for multiset_finalize.

My suggestion is remove multiset_finalize above and change the line to

run_benchmark("ecdsa_multiset_add", bench_multiset, bench_multiset_setup, NULL, NULL, 10, 100000);

which is 12.5us on my machine, whereas the minimum should be about 6us: two hashes (0.46us), three mul and sqrt (5.2us), and gej_add_ge_var (0.228us).
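The accounting problem in miniature (run_benchmark's real signature lives in the library's bench harness; this sketch only reproduces the division it performs):

```python
# A benchmark harness that reports total_time / (count * iters) is only
# correct when `iters` matches the number of operations the benchmark body
# actually performs per call. Numbers below are made up for illustration.

def per_op_us(total_us, count, iters):
    """Per-operation time as the harness computes it."""
    return total_us / (count * iters)

# Body performs 10,000 multiset additions per call, harness calls it 10
# times, total elapsed 1.25 seconds:
assert per_op_us(1_250_000, 10, 10_000) == 12.5    # correct: 12.5us/op
assert per_op_us(1_250_000, 10, 20) == 6250.0      # iters=20: off by 500x
```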

elichai commented Dec 10, 2019

For future reference, I also think this lacks multiset serialization/parse methods for persistent storage (unless the author expected people to write the 96-byte struct to persistent storage, which would be really inconsistent with this library's API).
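For context, the downstream work quoted below added exactly this as a 33-byte compressed point. A sketch of compressed-point round-tripping on a toy curve of secp256k1's shape (the small prime and helper names are illustrative, not the library's API):

```python
P = 10007  # toy prime with P % 4 == 3, standing in for secp256k1's field

def serialize(point):
    """(x, y) -> 33 bytes: 0x02/0x03 parity prefix plus big-endian x."""
    x, y = point
    return bytes([2 + (y & 1)]) + x.to_bytes(32, "big")

def parse(data):
    """33 bytes -> (x, y), recovering y from x via a square root."""
    x = int.from_bytes(data[1:], "big")
    rhs = (x * x * x + 7) % P
    y = pow(rhs, (P + 1) // 4, P)
    if y * y % P != rhs:
        raise ValueError("x is not on the curve")
    if (y & 1) != (data[0] & 1):
        y = P - y                  # choose the root with the stored parity
    return (x, y)

# Pick some point on y^2 = x^3 + 7 mod P and round-trip it.
x = next(x for x in range(1, P) if pow((x**3 + 7) % P, (P - 1) // 2, P) == 1)
pt = (x, pow((x**3 + 7) % P, (P + 1) // 4, P))
assert parse(serialize(pt)) == pt
```

Storing only x plus a parity bit is what brings the 96-byte Jacobian state down to 33 bytes at rest.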

ftrader pushed a commit to bitcoin-cash-node/bitcoin-cash-node that referenced this pull request Jan 8, 2023
…ltiset

Summary
-------

This is part of initial work for the UTXO Fast Sync project.

As per the spec, we will be using the multiset pubkey (rather than the
pubkey hash) to identify a UTXO set (or UTXO commitment).
Using the pubkey has the advantage that we can always reconstruct
the multiset object from a serialized pubkey.

The code for ec-multiset we have in secp256k1 could only produce a hash
of the ec-multiset, and not the pubkey itself.  However, the pubkey is kept
in memory internally.  This MR adds new functions to the secp256k1 lib to
retrieve the pubkey:

- Serialize an empty or non-empty multiset to a 33 byte compressed pubkey.
- Unserialize a 33-byte compressed pubkey to an ec multiset.
- Test if an ec-multiset is empty.

This MR also adds a bunch of tests for the above.

This MR also enables the EC multiset module in secp256k1 because it will
be needed for subsequent MRs that will depend on this one.
In case you are wondering, it's safe to enable it, since if you don't use
it, it does no harm and is inert in the codebase.

I checked with Core and ABC and no subsequent work or commits exist for
the ec-multiset module, which was originally contributed to ABC here:

https://reviews.bitcoinabc.org/D1072

Core has an extant PR for similar work here which was never merged:

bitcoin-core/secp256k1#477.

With this MR we will have the most feature-rich implementation of the module.

Test Plan
---------

- `ninja check-secp256k1`

(you may have to clear your cmake cache and/or start from a clean build/
directory because of bugs in our cmake building for secp256k1)
@real-or-random
Contributor

Our parent project Bitcoin Core has MuHash3072 now (https://doxygen.bitcoincore.org/class_mu_hash3072.html). I feel that obsoletes this PR. (But please don't hesitate to disagree.)
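For comparison, MuHash3072 replaces EC point addition with multiplication modulo a 3072-bit prime, so removing an element means multiplying by a modular inverse. A simplified sketch (Bitcoin Core expands elements with ChaCha20; the SHA-512 expansion and the class below are illustrative stand-ins, not Core's API):

```python
import hashlib

MODULUS = 2 ** 3072 - 1103717  # the 3072-bit prime used by MuHash3072

def element(data):
    """Expand data to a 3072-bit group element (stand-in expansion scheme)."""
    out = b""
    for i in range(6):  # 6 * 64 bytes = 3072 bits
        out += hashlib.sha512(bytes([i]) + data).digest()
    return int.from_bytes(out, "big") % MODULUS

class MuHash:
    def __init__(self):
        self.acc = 1                      # multiplicative identity = empty set
    def insert(self, data):
        self.acc = self.acc * element(data) % MODULUS
    def remove(self, data):
        self.acc = self.acc * pow(element(data), -1, MODULUS) % MODULUS

m = MuHash()
m.insert(b"utxo-a"); m.insert(b"utxo-b"); m.remove(b"utxo-a")
n = MuHash()
n.insert(b"utxo-b")
assert m.acc == n.acc  # only the surviving elements determine the digest
```

The design trade-off gmaxwell raised earlier in this thread is visible here: the multiplicative state is large and inversions are costly, but batch performance and the security argument are stronger.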


7 participants