Skip to content

wasm-smith: add support for --exports and --available-imports option #2160

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Apr 30, 2025
Merged

Conversation

mobsceneZ
Copy link
Contributor

This PR fixes issue #2110 (sorry for the lateness):

  1. It allows wasm-smith to generate the corresponding tag, table, and memory exports as specified by the --exports option. Note that I have also modified the function signatures of arbitrary_const_expr and globals_for_const_expr and added a new parameter called allow_defined_globals. This is because the required_exports is called after arbitrary_globals in the build function, which results in wasm-smith potentially picking up defined globals for table initialization expressions and that is not allowed in the current Wasm standard;
  2. It allows Wasm modules with GC types to appear in the --available-imports option. To achieve this, the current solution is to copy all the recursive groups in the provided module into the module being constructed. And the necessary trait implementation is also added (i.e., From trait);

Please feel free to share any suggestions/concerns you might have :).

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work on this!

}

#[cfg(feature = "wasmparser")]
impl From<wasmparser::SubType> for SubType {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could these impls be modeled as TryFrom to avoid the internal .unwrap()s?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could these impls be modeled as TryFrom to avoid the internal .unwrap()s?

The TryFrom trait requires defining an associated type Error, do you think it would be acceptable to reuse wasm_encoder::reencode::Error in wasm-smith for this purpose?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these are mostly internal I think it's fine to use something like () for the error, my hope would be to basically push the .unwrap() to the location where all the other .unwrap()s are happening to avoid future code relying on From::from here without realizing there's unwraps internally

Comment on lines 1993 to 2042
wasmparser::types::EntityType::Tag(id) => {
let subtype = exports_types.get(id).unwrap_or_else(|| {
panic!("Unable to get subtype for tag {id:?} in `exports` Wasm")
});
match &subtype.composite_type.inner {
wasmparser::CompositeInnerType::Func(func_type) => {
assert!(
subtype.is_final,
"Subtype {subtype:?} from `exports` Wasm is not final"
);
assert!(
subtype.supertype_idx.is_none(),
"Subtype {subtype:?} from `exports` Wasm has non-empty supertype"
);
assert!(
func_type.results().is_empty(),
"Subtype {subtype:?} for tag {id:?} from `exports` Wasm has non-empty results"
);
let new_type = Rc::new(FuncType {
params: func_type
.params()
.iter()
.copied()
.map(|t| t.try_into().unwrap())
.collect(),
results: vec![],
});
self.rec_groups.push(self.types.len()..self.types.len() + 1);
let type_index = self.add_type(SubType {
is_final: true,
supertype: None,
depth: 1,
composite_type: CompositeType::new_func(
Rc::clone(&new_type),
subtype.composite_type.shared,
),
});
let tag_index = self.tags.len() as u32;
self.tags.push(TagType {
func_type_idx: type_index,
func_type: new_type,
});
self.num_defined_tags += 1;
tag_index
}
_ => panic!(
"Unable to handle type {:?} from `exports` Wasm",
subtype.composite_type
),
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a large copy/paste of what functions are doing above, so could this be refactored to not duplicate the logic?

Comment on lines +1663 to +1665
// Only imported globals are allowed in the constant initialization
// expressions for tables.
let expr = self.arbitrary_const_expr(ValType::Ref(ty), u, false)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect this bool argument to arbitrary_const_expr was added due to tests failing at some point? I'm curious why though because arbitrary_tables, which calls this function, is called before globals are generated so I'm not sure how this would otherwise refer to a defined global instead of an imported global.

Could you expand on what led you to adding this bool argument?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My hunch is that this was added due to required_exports being called after all items, and in that case I wonder if it might make sense to split apart the required_exports function to being called at the start of section generation? For example when generating tables the first thing that would be done is that any required tables, because of exports, would be generated. After that though more arbitrary tables might be generated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My hunch is that this was added due to required_exports being called after all items, and in that case I wonder if it might make sense to split apart the required_exports function to being called at the start of section generation? For example when generating tables the first thing that would be done is that any required tables, because of exports, would be generated. After that though more arbitrary tables might be generated.

Yes, that's exactly the reason why I'm adding the bool argument, otherwise the required_exports function may pick up defined globals for table initialization expressions. Spliting apart the required_exports function to being called in each section generation function might be feasible, but intuitively, this would require:

  1. Either duplicating the module parsing logic within each section generation function, which definitely introduces unnecessary time cost;
  2. Or parsing the module in advance in the build function and storing relevant information in the Module structure for subsequent use, still, this demands additional bookkeeping cost;

Thus, I personally believe that introducing a boolean argument is the simplest solution, as the resulting code changes and side effects are both clear and straightforward to manage. Of course, if you insist, I think the spliting approach could be worth a try.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok that seems reasonable yeah, let's leave it as-is. I'll look into possibly refactoring this as a follow-up since I'm interested but I agree that amount of work is larger-than-expected so doesn't need to happen here

@mobsceneZ
Copy link
Contributor Author

Hey @alexcrichton, I have made the necessary code changes in a43b982, and you can check it out to see if any further changes are required :).

@alexcrichton alexcrichton added this pull request to the merge queue Apr 30, 2025
Merged via the queue into bytecodealliance:main with commit 7b08b93 Apr 30, 2025
32 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants