WebAssembly Updates #5097
Conversation
This has been split off from #5097 to make reviewing simpler; it modifies a bunch of tests for better use with WASM:

- Everything in auto_schedule is skipped, since the Mullapudi2016 autoscheduler assert-fails for most non-real-CPU backends.
- Everything in performance is skipped, because we'll be doing wasm "jit" testing with an interpreter, so performance results would be meaningless.
- Added/updated the skip message for tests that exercise things that wasm doesn't support yet (e.g. atomics, threads).
- For a handful of tests that are unreasonably slow under the wasm "jit" interpreter, dialed down some of the test parameters to avoid test timeouts.
- Removed some wasm-related skips that won't be necessary any longer.
(force-pushed from 2ffa61c to 6785c98)
Cherry-picking some CMake-related stuff from #5097 to make it easier to review:

- Add an optional COMMAND argument to add_halide_test(); this allows us to customize the command for execution (e.g. to run generated .wasm with a shell tool).
- Add some missing DEPENDS in a few places.
- Add an optional EXTRA_LIBS argument to halide_define_aot_test(); this allows us to pass extra dependencies rather than requiring separate calls to target_link_libraries().

That last one is a little odd, so let me expand: the intent here is that (when the wasm changes land) some of the tests won't be usable under wasm (e.g. matlab), and halide_define_aot_test() will handle these by just skipping those targets entirely. This means that we can effectively centralize the blocklist in one place, and then the callers of halide_define_aot_test() can just do something like

```cmake
halide_define_aot_test(matlab)
if (TARGET generator_aot_matlab)
  set_target_properties(generator_aot_matlab PROPERTIES ENABLE_EXPORTS True)
endif ()
```

...i.e., just assume that the target may not be defined if it's blocklisted for some reason. (Open to better suggestions on this last one, but I felt it was better than spraying lots of checks for "if wasm blah blah" thru this entire file.)
Cherry-picking a change from #5097 so that reviewing that PR will be simpler; this just adds another wasm feature option that the tooling wasn't ready for before, but is now (saturating float-to-int conversion).
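For context, wasm's saturating float-to-int instructions clamp out-of-range inputs rather than trapping. A minimal scalar sketch of the `i32.trunc_sat_f32_s` semantics (this helper is illustrative, not Halide code):

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Illustrative scalar model of wasm's i32.trunc_sat_f32_s semantics:
// NaN maps to 0; out-of-range values clamp to INT32_MIN/INT32_MAX
// instead of trapping (or invoking undefined behavior, as a plain C cast would).
int32_t trunc_sat_f32_s(float x) {
    if (std::isnan(x)) return 0;
    if (x <= -2147483648.0f) return std::numeric_limits<int32_t>::min();
    if (x >= 2147483648.0f) return std::numeric_limits<int32_t>::max();
    return static_cast<int32_t>(x);  // in-range: truncate toward zero
}
```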
re: the GIT_SUBMODULES CMake bug. It was fixed: https://gitlab.kitware.com/cmake/cmake/-/merge_requests/4729

...but presumably in 3.18+, so we can't rely on it yet?

Ah, I misunderstood your comment in the code. I thought that the bug was introduced in 3.18. Regardless, leaving
(force-pushed from 7f10ff7 to 95b3ed5)
There's more wasm-related work to do in the future (notably, getting benchmarks/performance tests working), but I think this is a good enough stopping point that we should look at reviewing and landing it. (Note that wasm testing isn't enabled on the buildbots yet -- it never has been! -- but should be possible now thanks to these changes. After this lands, I'll update the buildbots accordingly.)
README_webassembly.md
```diff
- `HL_JIT_TARGET=wasm-32-wasmrt-wasm_simd128`) and run normally. The test suites
- which we have vetted to work include correctness, performance, error, and
- warning. (Some of the others could likely be made to work with modest effort.)
+ To run the JIT tests, set `HL_JIT_TARGET=wasm-32-wasmrt` (possibly adding `wasm_simd128`, `wasm_signext`, and/or `wasm_sat_float_to_int`) and run CMake/CTest normally. Note that wasm testing is only supported under CMake (not via Make).
```
line wrapping in this file seems necessary. I think @alexreinking used a markdown auto-formatter at some point that might do it?
Yes, it's called prettier.io. You can install it from NPM. I like it quite a bit. I make sure to always pass `--prose-wrap always` at the CLI so that it reflows paragraphs nicely.
```cpp
{Target::WasmSimd128, true, UInt(16, 8), 0, "llvm.wasm.avgr.unsigned.v8i16", u16(((wild_u32x_ + wild_u32x_) + 1) >> 1)},

// TODO: LLVM should support this directly, but doesn't yet.
// To make this work, we need to be able to call the intrinsics with two vecs.
```
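A scalar sketch of what that pattern computes: the rounding unsigned average, done in a wider type so the intermediate sum can't overflow (this helper is illustrative, not the actual Halide pattern code):

```cpp
#include <cstdint>

// Rounding unsigned average: widen to u32, add, add 1 to round up,
// halve, narrow back to u16. This is the per-lane behavior that the
// llvm.wasm.avgr.unsigned.v8i16 intrinsic (wasm i16x8.avgr_u) provides.
inline uint16_t avgr_u16(uint16_t a, uint16_t b) {
    return static_cast<uint16_t>((static_cast<uint32_t>(a) + b + 1) >> 1);
}
```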
The way I've had to do this in the past is with force-inlined implementations that accept the wider vec, e.g. see packsswbx16 in src/runtime/x86.ll
One of the wasm/LLVM folks has been working on improving these codegen issues in their backend, so I haven't expended much energy on trying to improve it here yet. I'll add a note here in case circumstances require us to improve it sooner.
```cpp
String::Utf8Value str(isolate, key);
wdebug(0) << "\t" << i + 1 << ". " << *str << "\n";
}
```

```cpp
// lld will temporarily hijack the signal handlers to ensure that temp files get cleaned up,
```
This sounded painful to debug.
it wasn't the worst thing I've ever had to do, but it wasn't much fun, either
src/WasmExecutor.cpp
```cpp
inline void ExtractAndStoreScalar<int64_t>::operator()(const Local<Context> &context, const Local<Value> &val, void *slot) {
    internal_error << "TODO: 64-bit slots aren't yet supported";
```

```cpp
std::string to_string(const wabt::MemoryStream &m) {
    // TODO: ugh
```
Please elaborate. Are you talking about the weird requirement to const cast?
Yeah, that comment doesn't belong there. Removing.
Can I get final review/approval for this? (Only failure is unrelated Windows GPU failure)

I'd like to check that the commands in the README still work even with an LLVM that doesn't have WebAssembly/WABT/whatever, since the Ubuntu-bundled one doesn't.

cool, will wait
There might be some weirdness if you try to use this with WITH_WASM_SHELL
disabled, but we'll cross that bridge when we come to it.
will be necessary once halide/Halide#5097 lands.
* Support using V8 as the Wasm JIT interpreter

This is a partial revert of #5097. It brings back a bunch of the code in WasmExecutor to set up and use V8 to run Wasm code. All of the code is copy-pasted. There are some small cleanups to move common code (like BDMalloc, structs, asserts) to a common area guarded by `if WITH_WABT || WITH_V8`.

Enabling V8 requires setting 2 CMake options:

- V8_INCLUDE_PATH
- V8_LIB_PATH

The first is a path to the v8 include folder, to find headers; the second is the monolithic v8 library. This is because it's pretty difficult to build v8, and there are various flags you can set. Comments around those options provide some instructions for building v8.

By default, we still use wabt for running Wasm code, but we can use V8 by setting WITH_WABT=OFF WITH_V8=ON. Maybe in the future, with more testing, we can flip this. Right now this requires a locally patched build of V8 due to https://crbug.com/v8/10461, but once that is resolved, the version of V8 that includes the fix will be fine.

Also enable a single test, block_transpose, to run on V8, with these results:

```
$ HL_JIT_TARGET=wasm-32-wasmrt-wasm_simd128 ./test/performance/performance_block_transpose
Dummy Func version: Scalar transpose bandwidth 3.45061e+08 byte/s.
Wrapper version: Scalar transpose bandwidth 3.38931e+08 byte/s.
Dummy Func version: Transpose vectorized in y bandwidth 6.74143e+08 byte/s.
Wrapper version: Transpose vectorized in y bandwidth 3.54331e+08 byte/s.
Dummy Func version: Transpose vectorized in x bandwidth 3.50053e+08 byte/s.
Wrapper version: Transpose vectorized in x bandwidth 6.73421e+08 byte/s.
Success!
```

For comparison, when targeting host:

```
$ ./test/performance/performance_block_transpose
Dummy Func version: Scalar transpose bandwidth 1.33689e+09 byte/s.
Wrapper version: Scalar transpose bandwidth 1.33583e+09 byte/s.
Dummy Func version: Transpose vectorized in y bandwidth 2.20278e+09 byte/s.
Wrapper version: Transpose vectorized in y bandwidth 1.45959e+09 byte/s.
Dummy Func version: Transpose vectorized in x bandwidth 1.45921e+09 byte/s.
Wrapper version: Transpose vectorized in x bandwidth 2.21746e+09 byte/s.
Success!
```

For comparison, running with wabt:

```
Dummy Func version: Scalar transpose bandwidth 828715 byte/s.
Wrapper version: Scalar transpose bandwidth 826204 byte/s.
Dummy Func version: Transpose vectorized in y bandwidth 1.12008e+06 byte/s.
Wrapper version: Transpose vectorized in y bandwidth 874958 byte/s.
Dummy Func version: Transpose vectorized in x bandwidth 879031 byte/s.
Wrapper version: Transpose vectorized in x bandwidth 1.10525e+06 byte/s.
Success!
```

* Add instructions to build V8
* Formatting
* More documentation
* Update README_webassembly.md
* Update README_webassembly.md
* Update WasmExecutor.cpp
* Update WasmExecutor.cpp
* Skip tests
* Update WasmExecutor.cpp
* Skip performance tests
* Update WasmExecutor.cpp
* Address review comments
* 9.8.147 -> 9.8.177

Co-authored-by: Ng Zhi An <zhin@google.com>
Revamp our WebAssembly support to bring it more up-to-date with current spec:
Fixes #4245