-
Notifications
You must be signed in to change notification settings - Fork 295
wasm-smith: add support for --exports
and --available-imports
option
#2160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your work on this!
crates/wasm-smith/src/core.rs
Outdated
} | ||
|
||
#[cfg(feature = "wasmparser")] | ||
impl From<wasmparser::SubType> for SubType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could these impls be modeled as TryFrom
to avoid the internal .unwrap()
s?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could these impls be modeled as
TryFrom
to avoid the internal.unwrap()
s?
The TryFrom
trait requires defining an associated type Error
, do you think it would be acceptable to reuse wasm_encoder::reencode::Error
in wasm-smith for this purpose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since these are mostly internal I think it's fine to use something like ()
for the error, my hope would be to basically push the .unwrap()
to the location where all the other .unwrap()
s are happening to avoid future code relying on From::from
here without realizing there's unwraps internally
crates/wasm-smith/src/core.rs
Outdated
wasmparser::types::EntityType::Tag(id) => { | ||
let subtype = exports_types.get(id).unwrap_or_else(|| { | ||
panic!("Unable to get subtype for tag {id:?} in `exports` Wasm") | ||
}); | ||
match &subtype.composite_type.inner { | ||
wasmparser::CompositeInnerType::Func(func_type) => { | ||
assert!( | ||
subtype.is_final, | ||
"Subtype {subtype:?} from `exports` Wasm is not final" | ||
); | ||
assert!( | ||
subtype.supertype_idx.is_none(), | ||
"Subtype {subtype:?} from `exports` Wasm has non-empty supertype" | ||
); | ||
assert!( | ||
func_type.results().is_empty(), | ||
"Subtype {subtype:?} for tag {id:?} from `exports` Wasm has non-empty results" | ||
); | ||
let new_type = Rc::new(FuncType { | ||
params: func_type | ||
.params() | ||
.iter() | ||
.copied() | ||
.map(|t| t.try_into().unwrap()) | ||
.collect(), | ||
results: vec![], | ||
}); | ||
self.rec_groups.push(self.types.len()..self.types.len() + 1); | ||
let type_index = self.add_type(SubType { | ||
is_final: true, | ||
supertype: None, | ||
depth: 1, | ||
composite_type: CompositeType::new_func( | ||
Rc::clone(&new_type), | ||
subtype.composite_type.shared, | ||
), | ||
}); | ||
let tag_index = self.tags.len() as u32; | ||
self.tags.push(TagType { | ||
func_type_idx: type_index, | ||
func_type: new_type, | ||
}); | ||
self.num_defined_tags += 1; | ||
tag_index | ||
} | ||
_ => panic!( | ||
"Unable to handle type {:?} from `exports` Wasm", | ||
subtype.composite_type | ||
), | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a large copy/paste of what functions are doing above, so could this be refactored to not duplicate the logic?
// Only imported globals are allowed in the constant initialization | ||
// expressions for tables. | ||
let expr = self.arbitrary_const_expr(ValType::Ref(ty), u, false)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect this bool
argument to arbitrary_const_expr
was added due to tests failing at some point? I'm curious why though because arbitrary_tables
, which calls this function, is called before globals are generated so I'm not sure how this would otherwise refer to a defined global instead of an imported global.
Could you expand on what led you to adding this bool
argument?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My hunch is that this was added due to required_exports
being called after all items, and in that case I wonder if it might make sense to split apart the required_exports
function to being called at the start of section generation? For example when generating tables the first thing that would be done is that any required tables, because of exports, would be generated. After that though more arbitrary tables might be generated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My hunch is that this was added due to
required_exports
being called after all items, and in that case I wonder if it might make sense to split apart therequired_exports
function to being called at the start of section generation? For example when generating tables the first thing that would be done is that any required tables, because of exports, would be generated. After that though more arbitrary tables might be generated.
Yes, that's exactly the reason why I'm adding the bool
argument, otherwise the required_exports
function may pick up defined globals for table initialization expressions. Spliting apart the required_exports
function to being called in each section generation function might be feasible, but intuitively, this would require:
- Either duplicating the module parsing logic within each section generation function, which definitely introduces unnecessary time cost;
- Or parsing the module in advance in the
build
function and storing relevant information in theModule
structure for subsequent use, still, this demands additional bookkeeping cost;
Thus, I personally believe that introducing a boolean argument is the simplest solution, as the resulting code changes and side effects are both clear and straightforward to manage. Of course, if you insist, I think the spliting approach could be worth a try.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok that seems reasonable yeah, let's leave it as-is. I'll look into possibly refactoring this as a follow-up since I'm interested but I agree that amount of work is larger-than-expected so doesn't need to happen here
…refactor the code logic.
Hey @alexcrichton, I have made the necessary code changes in a43b982, and you can check it out to see if any further changes are required :). |
This PR fixes issue #2110 (sorry for the lateness):
--exports
option. Note that I have also modified the function signatures of arbitrary_const_expr and globals_for_const_expr and added a new parameter called allow_defined_globals. This is because the required_exports is called after arbitrary_globals in the build function, which results in wasm-smith potentially picking up defined globals for table initialization expressions and that is not allowed in the current Wasm standard;--available-imports
option. To achieve this, the current solution is to copy all the recursive groups in the provided module into the module being constructed. And the necessary trait implementation is also added (i.e.,From
trait);Please feel free to share any suggestions/concerns you might have :).