Skip to content

Conversation

TheBlueMatt
Copy link
Contributor

This does the minimal effort to generate a libbitcoinscript.so which can VerifyScript, though nothing else is exposed.

Build with ./configure --enable-shared=yes and you'll find a libbitcoinscript.so installed (or in src/.libs).

It weighs quite a bit, but it works, so I think its worth merging and then trimming down (and adding features) as we go (Ive already got one or two others ready, but first things first).

Please do not ACK this unless you've verified there are no changes before/after the copy/paste.




extern "C" bool VerifyScript(const unsigned char *scriptSig, const unsigned int scriptSigLen,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is necessary to pass scriptSig? It could be taken from txTo.vin[nIn].scriptSig?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more flexible; maybe the scriptSig being passed is some new form of signed message, for example. But it would make sense to accept NULL to do the extraction from txTo...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK, fixed.

@jgarzik
Copy link
Contributor

jgarzik commented Aug 13, 2014

It does not make much sense to put CScript class to scriptutils.h. CScript is not a utility.

@sipa
Copy link
Member

sipa commented Aug 13, 2014

@jgarzik CScript is defined in script/script.h, scriptutils.h only contains whatever isn't necessary for validation.

@jgarzik
Copy link
Contributor

jgarzik commented Aug 13, 2014

@sipa Actually it was just the first commit. CScript is in scriptutils.h... until it is moved again in a further commit.

@theuni
Copy link
Member

theuni commented Aug 13, 2014

Great work! I really like this idea, it's something I've been wanting to do for a long time.

However, I would really prefer that this be broken up into 2 steps: splitting out libbitcoinscript as a convenience library, THEN exposing it as a shared lib once that's stabilized. There are just too many issues to work wrt packaging/distribution/api/abi/etc to introduce a shared lib haphazardly. Once that's exposed, it can't be taken back.

@jgarzik
Copy link
Contributor

jgarzik commented Aug 13, 2014

Generally agreed. I'm curious about the C API, versus C++. Any comment on that design decision? Do we want to start providing a C API in additional to C++? If yes, to what extent? All core datastructures?

Data point, may or may not be relevant: The few C++-native projects I know of that export a C API typically have a One Big C Wrapper approach, that sits on top of a first class C++ API. ie. you #include "bitcoin/c.h"

@theuni
Copy link
Member

theuni commented Aug 13, 2014

@jgarzik I think the c api approach is basically the only reasonable one until we have a set of cleaned up c++ structures that can be exported. We don't have any kind of c++ api, only internal data structures.

I think this is a great first step towards that though, and I would be very willing to work on it if others agree it's the way forward.

@sipa
Copy link
Member

sipa commented Aug 13, 2014

I like this approach in general. I don't think we're ready to start exposing a shared library with a stable API for core data structures, but the sufficiently encapsulated approach of "Take a bunch of serialized structures, and compute something with it" should be very safe. Libsecp256k1 uses the same approach by the way - none of the internal data structures are accessible through the public API. As the exposed functions are so simple, there's little reason to need a C++ api there.

@jtimon
Copy link
Contributor

jtimon commented Aug 13, 2014

Ack on the general intent. I'm precisely working on separating things on script and I think I'm close to find something that's ready for a PR, but so far I've been doing incremental little improvements first like #4617 #4680 #4681 #4555 #4694 and #4695
Right now it's pretty dirty with ugly commits and half-backed ideas, but you can get a general sense of where I'm headed here: https://github.com/jtimon/bitcoin/tree/script2
I'll clean it up soon, I just wasn't ready for a PR yet.

I'm making some things templated so that:

  1. The interpreter file is completely independent from core.h
  2. Maybe some people want to use the scripting language to sign other things beside bitcoin transactions?
    In particular I would like very few things in the same file as the interpreter (in my case checksig the sig cache, and verifyscript) to make it easier to read.

So I'm not sure what I should do now...we should definitely coordinate.
My approach was improving script first, separating it later, then move to its own module (this PR).
Maybe its own module first then imrpoving things like you're doing it's better.

@TheBlueMatt
Copy link
Contributor Author

@jtimon Well it looks like you're trying to make this library expose more and weight less, which is of course the goal, but I'm just trying to push this step-by-step, so I think they're very much compatible.

@TheBlueMatt
Copy link
Contributor Author

@jgarzik re: C-vs-C++: It was originally C++, but since its using literally nothing C++ in the API, I figured its easier to expose it as a C API and that way its easy to include from C++ or C code, whereas using vectors would make it hard from C...

@laanwj
Copy link
Member

laanwj commented Aug 14, 2014

+1 for this C API that doesn't expose any of the internal data structures and is purely data in/out. That's much easier to wrap in Python (ctypes) and a bunch and other languages.
Dumb interoperability is what we need here, not a fancy interface.
(though that can, of course, be done later! But my point is that it shouldn't hold up this very important work)

@TheBlueMatt
Copy link
Contributor Author

The goal is to continue to use this to expose only the minimum required APIs for someone to use much of Bitcoin Core's validation engine to do the consensus-critical stuff and can still feel special for having re-implemented things.

@TheBlueMatt
Copy link
Contributor Author

Note that the library is very deliberately not built by default (yet), also script/libbitcoinscript.h should move to an include/ directory, I think, but, again, I'll leave that for @theuni after this gets merged.

@theuni
Copy link
Member

theuni commented Aug 14, 2014

@TheBlueMatt for this PR, would you mind removing the libtool lib so that the code changes can be reviewed independently of the build discussion? There are several build-side things that I'd like to address before actually introducing the lib, but that shouldn't slow this down.

I'll verify the code movement is 1:1 with the originals tomorrow if no one beats me to it.

@TheBlueMatt
Copy link
Contributor Author

@theuni hmm? its all disabled-by-default, whats the harm? Also, having more people verify 1:1 is always welcome :).

@TheBlueMatt
Copy link
Contributor Author

(and not in configure --help yet, either, as its not supported)

@jtimon
Copy link
Contributor

jtimon commented Aug 14, 2014

So I have manually reproduced all the code movements until I've got to a 0-diff branch here:
https://github.com/jtimon/bitcoin/tree/libscript2

In the last commit you can see only pure changes instead of code movements (plus some include fixes). Nothing bad detected in the process, but yeah, the more people that independently verify is 1:1 the better, I'll leave that branch there in case is useful for somebody else verifying 1:1.

ACK

@theuni
Copy link
Member

theuni commented Aug 14, 2014

I went through and manually c/p from the old files to the new. The diffs lined up perfectly other than a few expected formatting changes.

I really don't like the fact that CheckSig is defined differently based on which object you're building (standalone vs unified). Among other things, it’s confusing to read and means that they can’t both be used at the same time (for ex, a test to compare their results).

It also means that the convenience lib can’t be used by the share libs, so several objects have to be built twice to cope.

Can this not be handled with a delegate function or extra param (cache flag or function pointer) to VerifyScript?

@jtimon
Copy link
Contributor

jtimon commented Aug 15, 2014

I would prefer to merge this as is and then change the header to add a
cache bool param and remove the redundancy. I'm biased since this blocks
further refactor work in script and I just prefer to have it merged soon
than it being perfect. It's good enough for first step I wouldn't like to
delay.

@sipa
Copy link
Member

sipa commented Aug 15, 2014

I would prefer CheckSig to remain where it was, but take the
CSignatureCache object as an optional parameter, which is queried and
updated when provided. The usual calls inside script could use the default
cache, the call in the C wrapper interface could pass NULL.

This would also be compatible with potentially later extending the
interface to have context objects, which could hold such a cache.

Making things leaner, like by moving the cache functionality out can be
done later.

Pieter

On Aug 15, 2014 12:33 PM, "Jorge Timón" notifications@github.com wrote:

I would prefer to merge this as is and then change the header to add a
cache bool param and remove the redundancy. I'm biased since this blocks
further refactor work in script and I just prefer to have it merged soon
than it being perfect. It's good enough for first step I wouldn't like to
delay.
On Aug 15, 2014 12:07 AM, "Cory Fields" notifications@github.com wrote:

I went through and manually c/p from the old files to the new. The diffs
lined up perfectly other than a few expected formatting changes.

I really don't like the fact that CheckSig is defined differently based
on
which object you're building (standalone vs unified). Among other
things,
it's confusing to read and means that they can't both be used at the
same
time (for ex, a test to compare their results).

It also means that the convenience lib can't be used by the share libs,
so
several objects have to be built twice to cope.

Can this not be handled with a delegate function or extra param (cache
flag or function pointer) to VerifyScript?

Reply to this email directly or view it on GitHub
#4692 (comment).

Reply to this email directly or view it on GitHub
#4692 (comment).

@theuni
Copy link
Member

theuni commented Aug 15, 2014

I still contend that this should be split up so that the code movement and lib issues can be addressed separately. I really don't see any disadvantage in that.

That said, I'd go along with a merge if the issues above will be worked out afterwards.

@jtimon
Copy link
Contributor

jtimon commented Aug 16, 2014

yeah I would do the code movement from script to scriptutils and script/script first, that would be very simple to merge and the following standalone PR would also be more readable later, but please, let's merge something soon.

@petertodd
Copy link
Contributor

We should put the word "consensus" or at least "validation" into the name of this library somewhere. I'd like us to send a clear message that you can rely on it to accurately validate scripts, and that this library does so significantly more reliably than other alternatives.

Also, ACK on concept of a C-level interface "black box". I'd rather keep what it does and how it exposes functionality as stripped down as possible. For instance I'd add this to python-bitcoinlib by making VerifyScript call this library behind the scenes, retaining the Python code for non-consensus-critical work.

@sipa
Copy link
Member

sipa commented Aug 19, 2014

It's two different APIs, with different goals and (partially) different
features. I don't care about duplicating a few flags.

@sipa
Copy link
Member

sipa commented Aug 19, 2014

The public API flags could even be a simple enum (not bit flags), with
values like verify_prebip16, verify_bip16, verify_strict, ...

@jgarzik
Copy link
Contributor

jgarzik commented Aug 19, 2014

+1 for keeping the bit flags, if you can

Though in general agree w/ @sipa When defining APIs, sometimes you wind up with a bit of duplication. Not a big issue in the bigger picture.

@@ -214,7 +214,9 @@ class CTransaction
private:
/** Memory only. */
const uint256 hash;
void UpdateHash() const;
void UpdateHash() const {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the rationale for moving this implementation to the header?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its needed to not have to build all of core.cpp, which pulls in quite a bit more weight.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's broken. If there is logic in core (whether it's in h or cpp) that you don't want, it should be split up as a whole.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh? it needs two functions, moving them to a separate file makes no sense, its two lines of code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind moving these functions. But not including the .cpp source in the library while you are relying on its header file is fragile - just a refactor may break the library. IMHO, the .h and .cpp should be considered as a unit, and if it it turns out that we want the library to be leaner, then that means that the source code needs to be changed to support that.

I'd prefer to not do any such things, and just get the library in a functioning state with minimal changes. Making it leaner can always be done later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have several functions implemented in core.h, including core.cpp doesnt just pull in core.cpp, but quite a few others quite quickly. The library weighs a lot now, but including core.cpp would end up with half of bitcoind...
re: modifying the header file and breaking build? That's what we have pull-tester (or Travis) for, they tell you if you break something dumb.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of core.h without core.cpp simply signifies "I'm using the definitions and maybe a few of the serialization/deserialization routines, but I dont care about most of the object"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And, the first step in making the library lighter later, would be exactly this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @sipa here. Include files are the interface for the implementation file. Including one but not linking the other is confusing. Please avoid that. Moving functions from the .cpp to the .h should not result in build errors.

I'm sure what you want can be solved some other way. I wouldn't worry about size of the resulting library at all in this initial pull.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If VerifyScript becomes a templated function as shown in jtimon@8862601 (well, something simpler and less dirty, that's just a thinking branch for now), the standalone can instantiate the functions with its own minimal CTransaction type without any need to link core.o or include core.h
Maybe other C++ implementations like libcoin or libbitcoin decide to consume this templated version directly instead of the C API, also with their own Transaction classes.

@TheBlueMatt TheBlueMatt force-pushed the libscript branch 2 times, most recently from 958e47f to f12a0de Compare August 20, 2014 03:03
The goal here is to move towards a C-callable libbitcoinscript.so
that any reimplementation can use for its script exectuion instead
of reimplementing it from scratch, which has been shown to be
incredibly error-prone.

This library will implement basic script parsing and execution, but
it is not the goal to expose things like transaction signing or
script analysis.

The end-goal is to move the things which will be built as a part of
this library to the script/ subfolder, with the exposed API in
script/bitcoinscript.h. Things which are used only by Bitcoin Core
will be in scriptutils.{h,cpp}.
This moves a bunch of functions around and, with minimal changes
generates a linkable (though not useful) libbitcoinscript.so.
@BitcoinPullTester
Copy link

Automatic sanity-testing: FAILED BUILD/TEST, see http://jenkins.bluematt.me/pull-tester/p4692_176ec02a6c567d84363ea2394f0508a4f67a82fc/ for binaries and test log.

This could happen for one of several reasons:

  1. It chanages paths in makefile.linux-mingw or otherwise changes build scripts in a way that made them incompatible with the automated testing scripts (please tweak those patches in qa/pull-tester)
  2. It adds/modifies tests which test network rules (thanks for doing that), which conflicts with a patch applied at test time
  3. It does not build on either Linux i386 or Win32 (via MinGW cross compile)
  4. The test suite fails on either Linux i386 or Win32
  5. The block test-cases failed (lookup the first bNN identifier which failed in https://github.com/TheBlueMatt/test-scripts/blob/master/FullBlockTestGenerator.java)

If you believe this to be in error, please ping BlueMatt on freenode or TheBlueMatt here.

This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at http://jenkins.bluematt.me/pull-tester/current/
Contact BlueMatt on freenode if something looks broken.

@TheBlueMatt
Copy link
Contributor Author

@theuni OK, I added the link to fix OSX. Right now this pull works on linux/osx (albeit with too many symbols exposed), and builds a static library on windows (no dll yet, needs the symbol exposure stuff worked out). So, either this pull needs to wait on the symbol exposure stuff and then just add the macros, or the symbol exposure pull needs to throw in the one macro, I dont see much of a difference there.

@theuni
Copy link
Member

theuni commented Aug 20, 2014

There's much more to it for building dll's. For starters, we'll need 2 sets of dependencies and a way to cope with them (dllimport vs dllexport). We'll have to rewrite a good chunk of configure to quit assuming static linkage for mingw. After those are done, we can teach our headers how to cope with the 2 types of builds. Then we'll need some tests to verify that the dll actually does something useful.

I have all of the above hacked together enough to test. They're non-trivial changes. In reality, I think time would be better spent trying to drop those dependencies for lib usage, since they're quite minimal already.

@TheBlueMatt
Copy link
Contributor Author

Yea, you need to have a define for building (dllexport) vs installing (dllimport), but really these aren't that hard. Anyway, its up to you, either put the the ability to build a dll on windows in a separate pull or this one can wait until the infrastructure for dllimport/dllexport/gcc symbol export is in place and I'll get it in here.

@petertodd
Copy link
Contributor

@TheBlueMatt re: naming, long term I think we need a "libbtcconsensus" that holds all the consensus-critical code, so why not just name the library that? Obviously it'll be incomplete, but we can move functionality over in pieces.

@TheBlueMatt
Copy link
Contributor Author

@petertodd yea, but I think we're so far away from that...I dont know if in the end we'll get a libbtcconsensus which uses a libbtcscript or just force everyone onto a libbtcconsensus (I might be in favor of the second, but I dont know what people will want/demand).

@TheBlueMatt
Copy link
Contributor Author

In any case, if we do get a single library, it would be to force people to let libbtc* handle all block thinking, and then just deal with its results, and would probably want to avoid exposing script stuff directly.

@jtimon jtimon mentioned this pull request Aug 23, 2014
@petertodd
Copy link
Contributor

@TheBlueMatt re: exposing script stuff directly, I'm not opposed to that even with a libbtcconsensus library, because often it can still be useful to be able to evaluate script execution independent of chain validity, for instance to determine if a transaction could be mined in the future.

Also a libbtcconsensus library will likely have more than just the consensus code strictly needed by a full node; we probably want a reference SPV implementation in there as well.

@laanwj
Copy link
Member

laanwj commented Aug 23, 2014

@petertodd Disagree about including a SPV implementation - a consensus library should be just the consensus, and the absolute minimum amount of code to implement that, nothing more. It's not meant as a reference anything, that's what Bitcoin Core is already. Well it could be included in the repository if it is clearly marked as an example of using the library.

@TheBlueMatt
Copy link
Contributor Author

I agree with @laanwj, my goal here and in the future is to provide Bitcoin Core's consensus in an easily-readable form to any clients who want it in a shared library. SPV...meh, its easy enough to implement SPV, the bugs creep in in the wallet, and I'm not sure we want to do that too.

@petertodd
Copy link
Contributor

@laanwj With headers-first my understanding is we will have essentially a SPV-consensus layer, which develops into full consensus as blocks are downloaded. Since a consensus library won't include networking code in it, I see no harm in ensuring that the SPV-consensus part of that library can be used as-is; most likely the natural path of development will make it so anyway.

@jtimon
Copy link
Contributor

jtimon commented Aug 28, 2014

Can someone help verify that the changes in #4754 are moving code only?
It will make this PR easier to read when reopened and the sooner it is merged, the sooner other changes in script will stop conflicting with these simple code movements.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants