Remove core dependencies from CScript #4981

theuni · 2014-09-25T07:29:52Z

As an ongoing effort to break out script verification. I hope this doesn't conflict with @sipa's or @jtimon's next steps.

This work moves the logic for writing custom core classes to scripts out of CScript. Those classes can describe how their data should be written to a script, and CScript does so without further knowledge.

Please ignore the entire PR's diffstat while reviewing as it won't be at all helpful. Individual commits will make much more sense.

Also, CScriptBinaryData is a pretty terrible name but I couldn't come up with anything better. Suggestions welcome.

Diapolo · 2014-09-25T11:28:21Z

src/script/binarydata.cpp

+// Copyright (c) 2009-2014 The Bitcoin developers
+// Distributed under the MIT software license, see the accompanying
+// file COPYING or http://www.opensource.org/licenses/mit-license.php.
+#include "script/binarydata.h"


Nit: Can you add a newline above.

Diapolo · 2014-09-25T11:38:26Z

@theuni Sorry for hitting you with that bunch of nits, but they should be fixed IMHO.

laanwj · 2014-09-25T11:53:19Z

src/script/script.h

+    if (!vch.size())
+      return "";
+    std::string rv;
+    static const char hexmap[16] = { '0', '1', '2', '3', '4', '5', '6', '7',


Please don't implement hexadecimal conversion code here. We've spent a lot of time going from duplicated code like this left around all over the place to the current state.
My point of moving the string encoding/decoding to a separate utility file without dependencies (utilstrencodings) was that we can use those in places like here.

Diapolo · 2014-09-25T20:52:20Z

@theuni After the last few of my style nits are fixed you've got my style ACK. No, really thanks for updating this pull and taking care of the style stuff :)!

theuni · 2014-09-25T21:13:59Z

@Diapolo I'd put a rollseyes emoticon in the commit message if I could ;)
Keeping things uniform is nice though, so thanks for reviewing.

@laanwj What do you suggest? It'd defeat the purpose to depend on Core for the parsing. This seems like a reasonable place to make an exception, but I understand the case against as well.

utilstrencodings.h itself isn't a terrible dependency, as it only depends on tinyformat. I suppose we could call that part of our "shipped" script lib, but it'd be a shame for such a small thing.

The alternative would be using HexStr(script.begin(),script.end()) everywhere in Core that currently uses script.ToString(), but that would be really ugly.

laanwj · 2014-09-26T07:19:39Z

@theuni Ok if a hexstr is really, truly, the only thing you need in the script lib, I'm fine with making an exception. This is trivially simple code.

One other remark though. The old implementation of that function is:

if (vch.size() <= 4)
    return strprintf("%d", CScriptNum(vch).getint());
else
    return HexStr(vch);

This makes sure that normal scriptnums are printed as numbers instead of hex strings, which people have come to expect. The new one always shows a hex string.

At least move the implementation to the implementation file instead of the header (then the header doesn't have to depend on tinyformat/strencodings/... even if you use it, and the library doesn't have to export those headers).

theuni · 2014-09-26T17:12:46Z

@laanwj That all sounds perfectly reasonable. How about this, then: Since there's still a good bit of work left before we can break out verification entirely, I'll leave the include for CScript for now. If, at the end, it's still the only thing needed from core, we'll dupe it here as you've specified.

Sound good?

sipa · 2014-09-26T17:39:04Z

I disagree with including script.h in key; that makes no sense - script is higher level.

I see why you're doing that - the problem is CScriptID being declared in key.h in the first place. I think it should move to script/standard.h instead.

theuni · 2014-09-26T17:55:00Z

@sipa: I was also unhappy with that include and tried moving CScriptID to a few places, but never found a better home. I'll give that a shot.

theuni · 2014-09-26T19:31:53Z

Moved CScriptID to standard.h as @sipa requested.

Re-added the string processing deps as described above. When the library can stand on its own, we can evaluate whether it's worth it to dupe the core functions there.

sipa · 2014-09-26T20:18:09Z

src/script/script.h


-#include <boost/variant.hpp>
+   Allows arbitrary structures to be written to a script without CScript


Is this necessary to be virtual and overridable? Just having a templated method/constructor that takes any type, and copied the vector from x.begin() to x.end() seems like it would suffice. This is really just convenience - the user of CScript can always use his own code to first convert to a vector.

Yea, it was done that way so that a user could make a distinction between input types and define his own serializers. If you'd rather avoid the overhead, I'm fine with a templated ctor.

Yeah, I understand, I just don't think it's worth it for the cases where we need it.

Agreed with @sipa. Let's avoid virtuals here. There is no need for external parties to add anything here.

laanwj · 2014-10-14T09:13:52Z

src/script/script.h

+        if (size)
+        {
+            assert(size <= UCHAR_MAX);
+            m_vch.resize(size+1);


From a readability point I'd greatly prefer:

m_vch.reserve(size + 1); m_vch.push_back(size); m_vch.insert(m_vch.end(), in.begin(), in.end());

(edit: oeps)

theuni · 2014-10-14T20:43:42Z

Nit addressed. Will squash when ready.

sipa · 2014-10-14T22:17:40Z

src/script/script.h

+        size_t size = in.size();
+        if (size)
+        {
+            assert(size <= UCHAR_MAX);


Hmm, this is not strict enough; the size needs to be at most 75 for this to work.

An alternative is not serializing the size in here, and doing that inside CScript::operator<<, which selects based on the size.

And when combined with BIP62, the data pushed should be at least 2 bytes too.

sipa · 2014-10-15T00:05:37Z

utACK. I'm fine with squashing now.

theuni · 2014-10-15T00:26:22Z

squashed and rebased.

theuni · 2014-10-15T03:30:26Z

made the const ref change as requested by @sipa and squashed it down.

TheBlueMatt · 2014-10-15T05:51:08Z

src/script/script.h

-    {
-        insert(end(), sizeof(b));
-        insert(end(), (unsigned char*)&b, (unsigned char*)&b + sizeof(b));
+        assert(data.size() >= 2 && data.size() <= MAX_SCRIPT_ELEMENT_SIZE);


It seems the only point of using a CScriptBinaryData over just using vectors is to assert that they are at least 2 bytes in size?

...and that they're prefixed with the size?

Edit: right? I don't see the size being added here, but it used to be the case in the last implementation of CScriptBinaryData

There should be no size added here. The next line just calls this << std::vector, which will prefix the size properly...I dont see why we need CScriptBinaryData?

Right, the size is added somewhere else.

AFAIK CScriptBinaryData is necessary to convert the argument (which is usually a uint160 or derived type) into a std::vector. It could be a function instead of a class.

Well, this has been through many iterations now. Originally it was polymorphic so that eventual users could add their own serialization functionality. Then that was dropped and serialization was done on a per-type basis. Then that was dropped, and it was templated so that it's all done the same way.

So I suppose this should go one of two ways, rather than the hybrid that it is now

keep the class, which allows for implicit conversion from anything with begin/end iterators. That would mean that we could drop nearly all of b25d5f0dd531aaf5618f077c646afa556f6277ad, which should've been done when it was templatized.

Make it a function, and use as in b25d5f0dd531aaf5618f077c646afa556f6277ad.

I really don't care either way. Opinions?

See-also comment thread on #5091 (comment)

So #5091 is a change I've wanted to do for a while - the part about replacing the pushers for uint160/CPubKey isn't important there. As this PR already has more work towards have some wrapping, it's probably easier to use something like the suggested ToByteVector here; I can rebase #5091.

I still really don't mind either way. I see the merit in being explicit, and I see the merit in letting CScript handle whatever comes its way.

Sounds like everyone's ok with making the change to ToByteVector here, then discussing #5091 as a follow-up. It's fine by me if ToByteVector is essentially reverted there, as long as progress is being made.

I'll push that change.

I'm not planning to revert ToByteVector in #5091 - I was just trying to avoid touching too much code there, but as it's exactly code you're already touching, that's fine.

I still really don't mind either way. I see the merit in being explicit, and I see the merit in letting CScript handle whatever comes its way.

I see merit in it, and I wouldn't care either way if this was, for example, GUI code. But for consensus-critical code it's too dangerous to just "handle everything that comes its way in some way". I like to be explicit and clear in what gets handled how and what the representation is. And to have compile-time errors when something is fishy. Remember the fun a while ago with std::vector<unsigned char>(a.begin,e.end()) with a being a std::vector<int>() "just working" by truncating each value?

sipa · 2014-10-15T19:46:10Z

We could of course just have a CScript::operator<<(T begin, T end) { std::vector vch(begin, end); *this << vch; } ?

theuni · 2014-10-16T17:53:44Z

Updated to use ToByteVector, squashed, and rebased.

ToByteVector() is currently in script.h, which is not a good place for it, but I don't see a better one that wouldn't drag in the deps we're working to remove here. I'm hoping we can at least agree to leave it there long enough to get this merged in, then move it around later if desired.

Also, I didn't add any extra checks or conditions to the operator<<(vector), since #5091 reworks that.

sipa · 2014-10-16T18:05:15Z

utACK

sipa · 2014-10-16T23:33:25Z

Agree with @jtimon; the first group of 3 commits and the second 3 commits make much more sense when squashed together, for example.

TheBlueMatt · 2014-10-16T23:57:56Z

theuni · 2014-10-17T00:20:14Z

Ok. I find it easier to review/digest that way, but I'll keep em more grouped together.
Will squash.

… from CScripts This allows for a reversal of the current behavior. This: CScript foo; CScriptID bar(foo.GetID()); Becomes: CScript foo; CScriptID bar(foo); This way, CScript is no longer dependent on CScriptID or Hash();

This should move to a util header once their dependencies are cleaned up.

…pt.h Lots of files ended up with indirect includes from script.h.

theuni · 2014-10-17T17:46:30Z

Squished down deeper. The header removal commit remains on top though, that was the purpose of this series. Not re-rebased against master, so this is identical code-wise to previously ACKed changes.

gavinandresen · 2014-10-17T19:11:01Z

utACK

85c579e script: add a slew of includes all around and drop includes from script.h (Cory Fields) db8eb54 script: move ToString and ValueString out of the header (Cory Fields) e9ca428 script: add ToByteVector() for converting anything with begin/end (Cory Fields) 066e2a1 script: move CScriptID to standard.h and add a ctor for creating them from CScripts (Cory Fields)

Diapolo reviewed Sep 25, 2014
View reviewed changes

laanwj reviewed Sep 25, 2014
View reviewed changes

sipa reviewed Sep 26, 2014
View reviewed changes

jtimon mentioned this pull request Sep 27, 2014

Replace SignatureHash() with class TxSignatureHasher #4989

Closed

theuni force-pushed the reducedeps branch from 3532a12 to ea9d733 Compare October 1, 2014 01:37

laanwj reviewed Oct 14, 2014
View reviewed changes

sipa reviewed Oct 14, 2014
View reviewed changes

theuni force-pushed the reducedeps branch from 6c28200 to 7fc8d4c Compare October 15, 2014 00:25

theuni force-pushed the reducedeps branch from 7fc8d4c to 0ed725f Compare October 15, 2014 00:39

TheBlueMatt reviewed Oct 15, 2014
View reviewed changes

laanwj mentioned this pull request Oct 16, 2014

Unify and simplify CScript push methods #5091

Closed

theuni force-pushed the reducedeps branch from 0ed725f to c558f43 Compare October 16, 2014 17:48

theuni added 3 commits October 17, 2014 13:44

script: add ToByteVector() for converting anything with begin/end

e9ca428

This should move to a util header once their dependencies are cleaned up.

script: move ToString and ValueString out of the header

db8eb54

theuni force-pushed the reducedeps branch from c558f43 to 85c579e Compare October 17, 2014 17:44

script: add a slew of includes all around and drop includes from scri…

85c579e

…pt.h Lots of files ended up with indirect includes from script.h.

jtimon mentioned this pull request Oct 18, 2014

MOVEONLY: Split Transaction and Block from core #5100

Merged

laanwj merged commit 85c579e into bitcoin:master Oct 22, 2014

bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021


		#include <boost/variant.hpp>
		Allows arbitrary structures to be written to a script without CScript

Remove core dependencies from CScript #4981

Remove core dependencies from CScript #4981

Uh oh!

Conversation

theuni commented Sep 25, 2014

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Diapolo commented Sep 25, 2014

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Diapolo commented Sep 25, 2014

Uh oh!

theuni commented Sep 25, 2014

Uh oh!

laanwj commented Sep 26, 2014

Uh oh!

theuni commented Sep 26, 2014

Uh oh!

sipa commented Sep 26, 2014

Uh oh!

theuni commented Sep 26, 2014

Uh oh!

theuni commented Sep 26, 2014

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

theuni commented Oct 14, 2014

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

sipa commented Oct 15, 2014

Uh oh!

theuni commented Oct 15, 2014

Uh oh!

theuni commented Oct 15, 2014

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

sipa commented Oct 15, 2014

Uh oh!

theuni commented Oct 16, 2014

Uh oh!

sipa commented Oct 16, 2014

Uh oh!

sipa commented Oct 16, 2014

Uh oh!

TheBlueMatt commented Oct 16, 2014

Uh oh!