Skip to content

Conversation

keithw
Copy link
Contributor

@keithw keithw commented Apr 15, 2025

This PR modifies wasmprinter and wasm-encoder to support roundtripping all non-malformed examples in the testsuite. The main changes were to

  • handle multi-value typed select (currently invalid, but not malformed), and
  • ensure that component printer doesn't use forward-referring name references in the text format (these are rejected by the component parser)

It then modifies the wast subcommand to run each assert_invalid test roundtrip through the text format and back.

@keithw
Copy link
Contributor Author

keithw commented Apr 15, 2025

This is the last PR in my series. I think it is something of an extension of #2146 so no objections to landing that first.

@@ -330,7 +330,7 @@ macro_rules! _for_each_operator_group {
// reference-types
// https://github.com/WebAssembly/reference-types
@reference_types {
TypedSelect { ty: $crate::ValType } => visit_typed_select (arity 3 -> 1)
TypedSelect { tys: Vec<$crate::ValType> } => visit_typed_select (arity select -> select)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me this is another example which is sort of a step-too-far in following the spec exactly as-is. Primarily from a performance perspective this means that any time this instruction is parsed it ends up doing a heap allocation where previously it didn't, and the only purpose is to check the length of the vector and ensure that it's one.

This is I think another example of where the spec is written in a way that doesn't really consider the consequence whether something is malformed or invalid. To me this would ideally result in a change to the binary grammar of the spec to say that a non-1-length typed select is a binary syntax error rather than a validation error.

For now could the select.wast test be excluded from the round-trip of assert_invalid directives?

Copy link
Contributor Author

@keithw keithw Apr 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough... what do you think of the just-pushed approach (where visit_typed_select_multi is another visitor function off to the side)?

I really would love to keep the invariant that every parseable module can be roundtripped (especially since we're so close, and my group is trying to write a Wasm text editor that enforces the syntax), but I agree it shouldn't affect performance.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clever solution! I'm down for keeping that. Mind adding a comment to TypedSelectMulti as to its purpose though? (e.g. it's always an invalid instruction but it's there to round-trip the binary format)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! Done.

@@ -156,7 +157,7 @@ struct State {
core: CoreState,
#[cfg(feature = "component-model")]
component: ComponentState,
custom_section_place: Option<&'static str>,
custom_section_place: Option<(&'static str, usize)>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new usize variable is to me pretty subtle and is something I'd prefer to avoid. Locally I see that this test is what motivated this change. Would it make sense to exclude module binary from both binary roundtripping and text roundtripping? (for similar reasons effectively). That would result in there being no more need for this change?

Copy link
Contributor Author

@keithw keithw Apr 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the issue is that if a section is empty, you don't want to update the "after xxx" custom section place because nothing got printed (and so the module won't roundtrip if you do this twice).

Given that this gets us to 100% binary-to-binary roundtrippability of all modules that originated in text, and 100% text-to-text roundtrippability of all modules that originated in binary, I would love to find a way to keep that invariant -- it feels like a really nice property to have and ensure. E.g. it did successfully catch a bug in that the wasmprinter was printing after element instead of after elem.

Are you open to some alternative implementation strategy here? We basically just need to make sure that the custom section place doesn't get updated if the section was empty -- I don't have any strong feelings about how we do it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok that all makes sense, and I agree it's nice to catch the after element bug as well. Given all that let's leave this as-is, but can you add a comment here indicating what the two fields of the Option are and how the line number is being used to detect empty sections?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@keithw keithw force-pushed the roundtrip-invalid-tests branch 3 times, most recently from 58969d7 to 0d74fd0 Compare April 17, 2025 16:52
keithw added 2 commits April 17, 2025 10:38
Handle multi-value typed select, even though this is
currently invalid (but not malformed).

Not handled: invalid components that use forward-referring name
references (which the component parser will reject)
@keithw keithw force-pushed the roundtrip-invalid-tests branch from 0d74fd0 to 5c973b2 Compare April 17, 2025 17:38
Comment on lines 2052 to 2055
if tys.len() != 1 {
bail!(self.offset, "invalid result arity");
}
self.visit_typed_select(tys.pop().unwrap())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this debug_assert! that tys.len() != 1 and then always return with bail!?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! Done.

Comment on lines 856 to 859
match stringify!($visit).strip_prefix("visit_").unwrap() {
"typed_select_multi" => "typed_select",
other => other,
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is just a debugging utility I think it's reasonable to print typed_select_multi in this case

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -156,7 +157,7 @@ struct State {
core: CoreState,
#[cfg(feature = "component-model")]
component: ComponentState,
custom_section_place: Option<&'static str>,
custom_section_place: Option<(&'static str, usize)>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok that all makes sense, and I agree it's nice to catch the after element bug as well. Given all that let's leave this as-is, but can you add a comment here indicating what the two fields of the Option are and how the line number is being used to detect empty sections?

@@ -330,7 +330,7 @@ macro_rules! _for_each_operator_group {
// reference-types
// https://github.com/WebAssembly/reference-types
@reference_types {
TypedSelect { ty: $crate::ValType } => visit_typed_select (arity 3 -> 1)
TypedSelect { tys: Vec<$crate::ValType> } => visit_typed_select (arity select -> select)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clever solution! I'm down for keeping that. Mind adding a comment to TypedSelectMulti as to its purpose though? (e.g. it's always an invalid instruction but it's there to round-trip the binary format)

@alexcrichton alexcrichton added this pull request to the merge queue Apr 17, 2025
Merged via the queue into bytecodealliance:main with commit 00c7efd Apr 17, 2025
32 checks passed
@keithw
Copy link
Contributor Author

keithw commented Apr 17, 2025

🙏 🎆 Thank you!

@keithw keithw deleted the roundtrip-invalid-tests branch April 17, 2025 20:31
@alexcrichton
Copy link
Member

alexcrichton commented Apr 17, 2025

And of course thank you for working on this (and working with me on the review)!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants