-
Notifications
You must be signed in to change notification settings - Fork 37.8k
Remove core dependencies from CScript #4981
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
// Copyright (c) 2009-2014 The Bitcoin developers | ||
// Distributed under the MIT software license, see the accompanying | ||
// file COPYING or http://www.opensource.org/licenses/mit-license.php. | ||
#include "script/binarydata.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Can you add a newline above.
@theuni Sorry for hitting you with that bunch of nits, but they should be fixed IMHO. |
if (!vch.size()) | ||
return ""; | ||
std::string rv; | ||
static const char hexmap[16] = { '0', '1', '2', '3', '4', '5', '6', '7', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't implement hexadecimal conversion code here. We've spent a lot of time going from duplicated code like this left around all over the place to the current state.
My point of moving the string encoding/decoding to a separate utility file without dependencies (utilstrencodings) was that we can use those in places like here.
@theuni After the last few of my style nits are fixed you've got my style ACK. No, really thanks for updating this pull and taking care of the style stuff :)! |
@Diapolo I'd put a rollseyes emoticon in the commit message if I could ;) @laanwj What do you suggest? It'd defeat the purpose to depend on Core for the parsing. This seems like a reasonable place to make an exception, but I understand the case against as well. utilstrencodings.h itself isn't a terrible dependency, as it only depends on tinyformat. I suppose we could call that part of our "shipped" script lib, but it'd be a shame for such a small thing. The alternative would be using HexStr(script.begin(),script.end()) everywhere in Core that currently uses script.ToString(), but that would be really ugly. |
@theuni Ok if a hexstr is really, truly, the only thing you need in the script lib, I'm fine with making an exception. This is trivially simple code. One other remark though. The old implementation of that function is:
This makes sure that normal scriptnums are printed as numbers instead of hex strings, which people have come to expect. The new one always shows a hex string. At least move the implementation to the implementation file instead of the header (then the header doesn't have to depend on tinyformat/strencodings/... even if you use it, and the library doesn't have to export those headers). |
@laanwj That all sounds perfectly reasonable. How about this, then: Since there's still a good bit of work left before we can break out verification entirely, I'll leave the include for CScript for now. If, at the end, it's still the only thing needed from core, we'll dupe it here as you've specified. Sound good? |
I disagree with including script.h in key; that makes no sense - script is higher level. I see why you're doing that - the problem is CScriptID being declared in key.h in the first place. I think it should move to script/standard.h instead. |
@sipa: I was also unhappy with that include and tried moving CScriptID to a few places, but never found a better home. I'll give that a shot. |
Moved CScriptID to standard.h as @sipa requested. Re-added the string processing deps as described above. When the library can stand on its own, we can evaluate whether it's worth it to dupe the core functions there. |
|
||
#include <boost/variant.hpp> | ||
Allows arbitrary structures to be written to a script without CScript |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this necessary to be virtual and overridable? Just having a templated method/constructor that takes any type, and copied the vector from x.begin() to x.end() seems like it would suffice. This is really just convenience - the user of CScript can always use his own code to first convert to a vector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, it was done that way so that a user could make a distinction between input types and define his own serializers. If you'd rather avoid the overhead, I'm fine with a templated ctor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I understand, I just don't think it's worth it for the cases where we need it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed with @sipa. Let's avoid virtuals here. There is no need for external parties to add anything here.
if (size) | ||
{ | ||
assert(size <= UCHAR_MAX); | ||
m_vch.resize(size+1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a readability point I'd greatly prefer:
m_vch.reserve(size + 1);
m_vch.push_back(size);
m_vch.insert(m_vch.end(), in.begin(), in.end());
(edit: oeps)
Nit addressed. Will squash when ready. |
size_t size = in.size(); | ||
if (size) | ||
{ | ||
assert(size <= UCHAR_MAX); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, this is not strict enough; the size needs to be at most 75 for this to work.
An alternative is not serializing the size in here, and doing that inside CScript::operator<<, which selects based on the size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And when combined with BIP62, the data pushed should be at least 2 bytes too.
utACK. I'm fine with squashing now. |
squashed and rebased. |
made the const ref change as requested by @sipa and squashed it down. |
{ | ||
insert(end(), sizeof(b)); | ||
insert(end(), (unsigned char*)&b, (unsigned char*)&b + sizeof(b)); | ||
assert(data.size() >= 2 && data.size() <= MAX_SCRIPT_ELEMENT_SIZE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems the only point of using a CScriptBinaryData over just using vectors is to assert that they are at least 2 bytes in size?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...and that they're prefixed with the size?
Edit: right? I don't see the size being added here, but it used to be the case in the last implementation of CScriptBinaryData
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should be no size added here. The next line just calls this << std::vector, which will prefix the size properly...I dont see why we need CScriptBinaryData?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, the size is added somewhere else.
AFAIK CScriptBinaryData is necessary to convert the argument (which is usually a uint160 or derived type) into a std::vector. It could be a function instead of a class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, this has been through many iterations now. Originally it was polymorphic so that eventual users could add their own serialization functionality. Then that was dropped and serialization was done on a per-type basis. Then that was dropped, and it was templated so that it's all done the same way.
So I suppose this should go one of two ways, rather than the hybrid that it is now
- keep the class, which allows for implicit conversion from anything with begin/end iterators. That would mean that we could drop nearly all of b25d5f0dd531aaf5618f077c646afa556f6277ad, which should've been done when it was templatized.
- Make it a function, and use as in b25d5f0dd531aaf5618f077c646afa556f6277ad.
I really don't care either way. Opinions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See-also comment thread on #5091 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still really don't mind either way. I see the merit in being explicit, and I see the merit in letting CScript handle whatever comes its way.
Sounds like everyone's ok with making the change to ToByteVector here, then discussing #5091 as a follow-up. It's fine by me if ToByteVector is essentially reverted there, as long as progress is being made.
I'll push that change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not planning to revert ToByteVector in #5091 - I was just trying to avoid touching too much code there, but as it's exactly code you're already touching, that's fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still really don't mind either way. I see the merit in being explicit, and I see the merit in letting CScript handle whatever comes its way.
I see merit in it, and I wouldn't care either way if this was, for example, GUI code. But for consensus-critical code it's too dangerous to just "handle everything that comes its way in some way". I like to be explicit and clear in what gets handled how and what the representation is. And to have compile-time errors when something is fishy. Remember the fun a while ago with std::vector<unsigned char>(a.begin,e.end())
with a
being a std::vector<int>()
"just working" by truncating each value?
We could of course just have a CScript::operator<<(T begin, T end) { std::vector vch(begin, end); *this << vch; } ? |
Updated to use ToByteVector, squashed, and rebased. ToByteVector() is currently in script.h, which is not a good place for it, but I don't see a better one that wouldn't drag in the deps we're working to remove here. I'm hoping we can at least agree to leave it there long enough to get this merged in, then move it around later if desired. Also, I didn't add any extra checks or conditions to the operator<<(vector), since #5091 reworks that. |
utACK |
Agree with @jtimon; the first group of 3 commits and the second 3 commits make much more sense when squashed together, for example. |
|
Ok. I find it easier to review/digest that way, but I'll keep em more grouped together. |
… from CScripts This allows for a reversal of the current behavior. This: CScript foo; CScriptID bar(foo.GetID()); Becomes: CScript foo; CScriptID bar(foo); This way, CScript is no longer dependent on CScriptID or Hash();
This should move to a util header once their dependencies are cleaned up.
…pt.h Lots of files ended up with indirect includes from script.h.
Squished down deeper. The header removal commit remains on top though, that was the purpose of this series. Not re-rebased against master, so this is identical code-wise to previously ACKed changes. |
utACK |
85c579e script: add a slew of includes all around and drop includes from script.h (Cory Fields) db8eb54 script: move ToString and ValueString out of the header (Cory Fields) e9ca428 script: add ToByteVector() for converting anything with begin/end (Cory Fields) 066e2a1 script: move CScriptID to standard.h and add a ctor for creating them from CScripts (Cory Fields)
As an ongoing effort to break out script verification. I hope this doesn't conflict with @sipa's or @jtimon's next steps.
This work moves the logic for writing custom core classes to scripts out of CScript. Those classes can describe how their data should be written to a script, and CScript does so without further knowledge.
Please ignore the entire PR's diffstat while reviewing as it won't be at all helpful. Individual commits will make much more sense.
Also, CScriptBinaryData is a pretty terrible name but I couldn't come up with anything better. Suggestions welcome.