-
Notifications
You must be signed in to change notification settings - Fork 296
Roundtrip non-malformed test cases through wasmparser / wasmprinter #2148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Roundtrip non-malformed test cases through wasmparser / wasmprinter #2148
Conversation
This is the last PR in my series. I think it is something of an extension of #2146 so no objections to landing that first. |
crates/wasmparser/src/lib.rs
Outdated
@@ -330,7 +330,7 @@ macro_rules! _for_each_operator_group { | |||
// reference-types | |||
// https://github.com/WebAssembly/reference-types | |||
@reference_types { | |||
TypedSelect { ty: $crate::ValType } => visit_typed_select (arity 3 -> 1) | |||
TypedSelect { tys: Vec<$crate::ValType> } => visit_typed_select (arity select -> select) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me this is another example which is sort of a step-too-far in following the spec exactly as-is. Primarily from a performance perspective this means that any time this instruction is parsed it ends up doing a heap allocation where previously it didn't, and the only purpose is to check the length of the vector and ensure that it's one.
This is I think another example of where the spec is written in a way that doesn't really consider the consequence whether something is malformed or invalid. To me this would ideally result in a change to the binary grammar of the spec to say that a non-1-length typed select is a binary syntax error rather than a validation error.
For now could the select.wast
test be excluded from the round-trip of assert_invalid
directives?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough... what do you think of the just-pushed approach (where visit_typed_select_multi
is another visitor function off to the side)?
I really would love to keep the invariant that every parseable module can be roundtripped (especially since we're so close, and my group is trying to write a Wasm text editor that enforces the syntax), but I agree it shouldn't affect performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clever solution! I'm down for keeping that. Mind adding a comment to TypedSelectMulti
as to its purpose though? (e.g. it's always an invalid instruction but it's there to round-trip the binary format)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure! Done.
@@ -156,7 +157,7 @@ struct State { | |||
core: CoreState, | |||
#[cfg(feature = "component-model")] | |||
component: ComponentState, | |||
custom_section_place: Option<&'static str>, | |||
custom_section_place: Option<(&'static str, usize)>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This new usize
variable is to me pretty subtle and is something I'd prefer to avoid. Locally I see that this test is what motivated this change. Would it make sense to exclude module binary
from both binary roundtripping and text roundtripping? (for similar reasons effectively). That would result in there being no more need for this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the issue is that if a section is empty, you don't want to update the "after xxx" custom section place because nothing got printed (and so the module won't roundtrip if you do this twice).
Given that this gets us to 100% binary-to-binary roundtrippability of all modules that originated in text, and 100% text-to-text roundtrippability of all modules that originated in binary, I would love to find a way to keep that invariant -- it feels like a really nice property to have and ensure. E.g. it did successfully catch a bug in that the wasmprinter was printing after element
instead of after elem
.
Are you open to some alternative implementation strategy here? We basically just need to make sure that the custom section place doesn't get updated if the section was empty -- I don't have any strong feelings about how we do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok that all makes sense, and I agree it's nice to catch the after element
bug as well. Given all that let's leave this as-is, but can you add a comment here indicating what the two fields of the Option
are and how the line number is being used to detect empty sections?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
58969d7
to
0d74fd0
Compare
Handle multi-value typed select, even though this is currently invalid (but not malformed). Not handled: invalid components that use forward-referring name references (which the component parser will reject)
0d74fd0
to
5c973b2
Compare
if tys.len() != 1 { | ||
bail!(self.offset, "invalid result arity"); | ||
} | ||
self.visit_typed_select(tys.pop().unwrap()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this debug_assert!
that tys.len() != 1
and then always return with bail!
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure! Done.
src/bin/wasm-tools/dump.rs
Outdated
match stringify!($visit).strip_prefix("visit_").unwrap() { | ||
"typed_select_multi" => "typed_select", | ||
other => other, | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is just a debugging utility I think it's reasonable to print typed_select_multi
in this case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -156,7 +157,7 @@ struct State { | |||
core: CoreState, | |||
#[cfg(feature = "component-model")] | |||
component: ComponentState, | |||
custom_section_place: Option<&'static str>, | |||
custom_section_place: Option<(&'static str, usize)>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok that all makes sense, and I agree it's nice to catch the after element
bug as well. Given all that let's leave this as-is, but can you add a comment here indicating what the two fields of the Option
are and how the line number is being used to detect empty sections?
crates/wasmparser/src/lib.rs
Outdated
@@ -330,7 +330,7 @@ macro_rules! _for_each_operator_group { | |||
// reference-types | |||
// https://github.com/WebAssembly/reference-types | |||
@reference_types { | |||
TypedSelect { ty: $crate::ValType } => visit_typed_select (arity 3 -> 1) | |||
TypedSelect { tys: Vec<$crate::ValType> } => visit_typed_select (arity select -> select) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clever solution! I'm down for keeping that. Mind adding a comment to TypedSelectMulti
as to its purpose though? (e.g. it's always an invalid instruction but it's there to round-trip the binary format)
🙏 🎆 Thank you! |
And of course thank you for working on this (and working with me on the review)! |
This PR modifies wasmprinter and wasm-encoder to support roundtripping all non-malformed examples in the testsuite. The main changes were to
ensure that component printer doesn't use forward-referring name references in the text format (these are rejected by the component parser)It then modifies the
wast
subcommand to run eachassert_invalid
test roundtrip through the text format and back.