Skip to content

[test] Reclassify some tests between "invalid" and "malformed" #2130

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 9, 2025

Conversation

keithw
Copy link
Contributor

@keithw keithw commented Apr 9, 2025

This PR is preparation for an invalid/malformed separation in the wast subcommand and for wasm-tools to be able to roundtrip any non-malformed module or component (the overall patchset is at https://github.com/keithw/wasm-tools/tree/malformed-invalid-separation).

It reclassifies some of the local tests between "malformed" (a violation of the binary-format or text-format syntaxes) vs. "invalid". I think these tests now match the spec more closely, although I haven't checked them all in the reference interpreter because the error messages aren't exactly the same and the reference interpreter doesn't make it easy to ignore them.

This doesn't have any effect yet because wasm-tools wast currently handles assert_malformed and assert_invalid the same way, but it will make a difference once they are separated.

(global (shared i32) (i32.const 0))
(assert_malformed
(module quote
"(global (shared i32) (i32.const 0))"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come this, and some tests below, switched to module quote?

Or possible answering my own question I was looking at some of the future patches and I saw more handling of features in BinaryReader, so is that what's happening here? Personally I've always tried to minimize the interaction of features with parsing and put everything in validation instead. I've felt historically that while, technically yes, this module produces an invalid binary with respect to the current spec it's just way easier to develop a proposal and evaluate it if the validator is the only piece of tooling that needs configuration (as opposed to every layer of tooling needing feature configuration)

In that sense I'd prefer to keep this as-is with assert_invalid and the error only coming up with the validator, but I'm also want to get your take on this as well

Copy link
Contributor Author

@keithw keithw Apr 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: assert_invalid vs. assert_malformed, I think the issue is that this error is currently detected in the binary reader (https://github.com/bytecodealliance/wasm-tools/blob/main/crates/wasmparser/src/readers/core/globals.rs#L42-L54). This seems like the right behavior because of the upstream tests that make this malformed (https://github.com/WebAssembly/spec/blob/main/test/core/global.wast#L414).

If it were accept_invalid, it should be parseable and printable and then parseable again to produce the same binary; accept_malformed means we're not expecting to be able to parse or print it which reflects the status quo in this case.

Re: module vs. module quote, at least for wasm-tools this doesn't need to be a module quote, because the error is only detected by the binary parser (not the text parser). The method I'm proposing for the assert_malformed tests is basically not to change the text parsers at all, and instead run each test by first (1) using the wast crate to convert WAT to bytes, and then (2) running wasmparser on those bytes to see if they parse as binary Wasm. That way all the safety checks are done in the binary reader/parser and it doesn't matter where the module originally came from (text or binary). But in general I think it's good practice for an assert_malformed test to use module quote or module binary, because if the text parser is what catches the malformedness (e.g. for something like mismatched parentheses or quotation marks), then unless the module is quoted, it would doom the whole .wast file. If we want these tests to pass in the reference interpreter (once the shared-everything-threads proposal has a reference interpreter) or another engine, I think they would need module quote or module binary.

(func end return_call 0))
(assert_malformed
(module quote
"(func end return_call 0)")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC we've had bugs in both the text and binary format in the past about handling operators-after-end, notably panics. Would you be up for adding module binary versions of these tests in addition to module quote so we can be guaranteed at least one of them hits the binary validator as opposed to having them all fail in the text parser?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to do it, but fwiw I'm basically not anything to the text parser and the text parser won't catch this by itself -- it's only the binary parser that will keep track of the blocks and make sure the ends are well-matched. But I don't mind extra safety -- will add. :-)

(Per above: the method I'm proposing for the assert_malformed tests is basically not to change the text parsers at all, and instead run each test by first (1) using the wast crate to convert WAT to bytes, and then (2) running wasmparser on those bytes to see if they parse as binary Wasm. That way all the safety checks are done in the binary reader/parser and it doesn't matter where the module originally came from (text or binary).)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

(func end br_table 0))
(assert_malformed
(module quote
"(func end br_table 0)")
"operators remaining after end of function")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar comment on these tests as to the ones above, would you be up for adding a module binary dual for these malformed tests?

Copy link
Contributor Author

@keithw keithw Apr 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will add. But same note as above -- only the binary parser is going to be catching this kind of error anyway so I think the chances of a text/binary mismatch (in the current codebase) are very low.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I'll continue some discussion of malformed/invalid over on #2123, and in the meantime I'll flag this for merge

@alexcrichton alexcrichton added this pull request to the merge queue Apr 9, 2025
Merged via the queue into bytecodealliance:main with commit 26dedbb Apr 9, 2025
32 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants