-
Notifications
You must be signed in to change notification settings - Fork 295
[test] Reclassify some tests between "invalid" and "malformed" #2130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
(global (shared i32) (i32.const 0)) | ||
(assert_malformed | ||
(module quote | ||
"(global (shared i32) (i32.const 0))" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How come this, and some tests below, switched to module quote
?
Or possible answering my own question I was looking at some of the future patches and I saw more handling of features in BinaryReader
, so is that what's happening here? Personally I've always tried to minimize the interaction of features with parsing and put everything in validation instead. I've felt historically that while, technically yes, this module produces an invalid binary with respect to the current spec it's just way easier to develop a proposal and evaluate it if the validator is the only piece of tooling that needs configuration (as opposed to every layer of tooling needing feature configuration)
In that sense I'd prefer to keep this as-is with assert_invalid
and the error only coming up with the validator, but I'm also want to get your take on this as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re: assert_invalid
vs. assert_malformed
, I think the issue is that this error is currently detected in the binary reader (https://github.com/bytecodealliance/wasm-tools/blob/main/crates/wasmparser/src/readers/core/globals.rs#L42-L54). This seems like the right behavior because of the upstream tests that make this malformed (https://github.com/WebAssembly/spec/blob/main/test/core/global.wast#L414).
If it were accept_invalid
, it should be parseable and printable and then parseable again to produce the same binary; accept_malformed
means we're not expecting to be able to parse or print it which reflects the status quo in this case.
Re: module
vs. module quote
, at least for wasm-tools this doesn't need to be a module quote
, because the error is only detected by the binary parser (not the text parser). The method I'm proposing for the assert_malformed
tests is basically not to change the text parsers at all, and instead run each test by first (1) using the wast
crate to convert WAT to bytes, and then (2) running wasmparser on those bytes to see if they parse as binary Wasm. That way all the safety checks are done in the binary reader/parser and it doesn't matter where the module originally came from (text or binary). But in general I think it's good practice for an assert_malformed
test to use module quote
or module binary
, because if the text parser is what catches the malformedness (e.g. for something like mismatched parentheses or quotation marks), then unless the module is quoted, it would doom the whole .wast file. If we want these tests to pass in the reference interpreter (once the shared-everything-threads proposal has a reference interpreter) or another engine, I think they would need module quote
or module binary
.
(func end return_call 0)) | ||
(assert_malformed | ||
(module quote | ||
"(func end return_call 0)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC we've had bugs in both the text and binary format in the past about handling operators-after-end, notably panics. Would you be up for adding module binary
versions of these tests in addition to module quote
so we can be guaranteed at least one of them hits the binary validator as opposed to having them all fail in the text parser?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to do it, but fwiw I'm basically not anything to the text parser and the text parser won't catch this by itself -- it's only the binary parser that will keep track of the blocks and make sure the end
s are well-matched. But I don't mind extra safety -- will add. :-)
(Per above: the method I'm proposing for the assert_malformed tests is basically not to change the text parsers at all, and instead run each test by first (1) using the wast crate to convert WAT to bytes, and then (2) running wasmparser on those bytes to see if they parse as binary Wasm. That way all the safety checks are done in the binary reader/parser and it doesn't matter where the module originally came from (text or binary).)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
(func end br_table 0)) | ||
(assert_malformed | ||
(module quote | ||
"(func end br_table 0)") | ||
"operators remaining after end of function") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment on these tests as to the ones above, would you be up for adding a module binary
dual for these malformed tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will add. But same note as above -- only the binary parser is going to be catching this kind of error anyway so I think the chances of a text/binary mismatch (in the current codebase) are very low.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I'll continue some discussion of malformed/invalid over on #2123, and in the meantime I'll flag this for merge
This PR is preparation for an invalid/malformed separation in the wast subcommand and for wasm-tools to be able to roundtrip any non-malformed module or component (the overall patchset is at https://github.com/keithw/wasm-tools/tree/malformed-invalid-separation).
It reclassifies some of the local tests between "malformed" (a violation of the binary-format or text-format syntaxes) vs. "invalid". I think these tests now match the spec more closely, although I haven't checked them all in the reference interpreter because the error messages aren't exactly the same and the reference interpreter doesn't make it easy to ignore them.
This doesn't have any effect yet because
wasm-tools wast
currently handlesassert_malformed
andassert_invalid
the same way, but it will make a difference once they are separated.