Add reference output tests for JS operations #1894

alexcrichton · 2019-12-03T22:48:00Z

This commit starts adding a test suite which checks in, to the
repository, test assertions for both the JS and wasm file outputs of a
Rust crate compiled with #[wasm_bindgen]. These aren't intended to be
exhaustive or large scale tests, but rather micro-tests to help observe
the changes in wasm-bindgen's output over time.

The motivation for this commit is basically overhauling how all the GC
passes work in wasm-bindgen today. The reorganization is also included
in this commit as well.

Previously wasm-bindgen would, in an ad-hoc fashion, run the GC passes
of walrus in a bunch of places to ensure that less "garbage" was seen
by future passes. This not only was a source of slowdown but it also was
pretty brittle since wasm-bindgen kept breaking if extra iteams leaked
through.

The strategy taken in this commit is to have one precise location for a
GC pass, and everything goes through there. This is achieved by:

All internal exports are removed immediately when generating the
nonstandard wasm interface types section. Internal exports,
intrinsics, and runtime support are all referenced by the various
instructions and/or sections that use them. This means that we now
have precise tracking of what an adapter uses.
This in turn enables us to implement the add_gc_roots function for
walrus custom sections, which in turn allows walrus GC passes to do
what unexport_unused_intrinsics did before. That function is now no
longer necessary, but effectively works the same way. All intrinsics
are unexported at the beginning and then they're selectively
re-imported and re-exported through the JS glue generation pass as
necessary and defined by the bindings.
Passes like the anyref pass are now much more precise about the
intrinsics that they work with. The anyref pass also deletes any
internal intrinsics found and also does some rewriting of the adapters
aftewards now to hook up calls to the heap count import to the heap
count intrinsic in the wasm module.

This commit starts adding a test suite which checks in, to the repository, test assertions for both the JS and wasm file outputs of a Rust crate compiled with `#[wasm_bindgen]`. These aren't intended to be exhaustive or large scale tests, but rather micro-tests to help observe the changes in `wasm-bindgen`'s output over time. The motivation for this commit is basically overhauling how all the GC passes work in `wasm-bindgen` today. The reorganization is also included in this commit as well. Previously `wasm-bindgen` would, in an ad-hoc fashion, run the GC passes of `walrus` in a bunch of places to ensure that less "garbage" was seen by future passes. This not only was a source of slowdown but it also was pretty brittle since `wasm-bindgen` kept breaking if extra iteams leaked through. The strategy taken in this commit is to have one precise location for a GC pass, and everything goes through there. This is achieved by: * All internal exports are removed immediately when generating the nonstandard wasm interface types section. Internal exports, intrinsics, and runtime support are all referenced by the various instructions and/or sections that use them. This means that we now have precise tracking of what an adapter uses. * This in turn enables us to implement the `add_gc_roots` function for `walrus` custom sections, which in turn allows walrus GC passes to do what `unexport_unused_intrinsics` did before. That function is now no longer necessary, but effectively works the same way. All intrinsics are unexported at the beginning and then they're selectively re-imported and re-exported through the JS glue generation pass as necessary and defined by the bindings. * Passes like the `anyref` pass are now much more precise about the intrinsics that they work with. The `anyref` pass also deletes any internal intrinsics found and also does some rewriting of the adapters aftewards now to hook up calls to the heap count import to the heap count intrinsic in the wasm module.

The final user of the `require_internal_export` function was `__wbindgen_realloc`. This usage has now been removed by updating how we handle usage of the `realloc` function. The wasm interface types standard doesn't have a `realloc` function slot, nor do I think it ever will. This means that as a polyfill for wasm interface types we'll always have to support the lack of `realloc`. For direct Rust to JS, however, we can still optionally handle `realloc`. This is all handled with a few internal changes. * Custom `StringToMemory` instructions now exist. These have an extra `realloc` slot to store an intrinsic, if found. * Our custom instructions are lowered to the standard instructions when generating an interface types section. * The `realloc` function, if present, is passed as an argument like the malloc function when passing strings to wasm. If it's not present we use a slower fallback, but if it's present we use the faster implementation. This should mean that there's little-to-no impact on existing users of `wasm-bindgen`, but this should continue to still work for wasm interface types polyfills and such. Additionally the GC passes now work in that they don't delete `__wbindgen_realloc` which we later try to reference.

This depends on the anyref table and a function to allocate an index if the anyref pass is running, so be sure to track that in the instruction itself for GC rooting.

Or if you're otherwise not using anyref slices, don't force some intrinsics to exist.

alexcrichton · 2019-12-03T22:51:28Z

For some background on this, I've started working on some simple anyref/wasm-interface-types tests. These two passes are pretty 'early days' at this point and need a fair amount of improvement. One of the first things I identified is that, as usual, it's always been a nightmare managing the GC passes in wasm-bindgen. With the recent refactoring and embedding walrus ids in instructions I wanted to try to fix this once-and-for all. We should now have proper rooting that's not "whatever the JS output happens to do" but rather the wasm module itself can be properly GC'd at any time.

This is not intended to change any user-facing functionality of wasm-bindgen today. In doing all this though I got paranoid that I was regressing small outputs and including extra items in the JS/wasm by accident. To assuage my fear of this I started to add a new test suite to wasm bindgen which checks the exhaustive output of wasm-bindgen itself, both the wasm and the JS file. The wasm is sanitized to not be so dependent on the compiler revision, but the idea here is that in a PR we can see the effect that the change has on the generated wasm and/or the generated JS file. This will also help prevent regressing situations where wasm-bindgen includes an intrinsic in a module by accident.

So overall this is in general a test suite being added, but lots of improvements as well to get the anyref pass working better with GC and such.

Looks like these values adjust in slight but insignificant ways over time

fitzgen

👍

fitzgen · 2019-12-04T17:19:46Z

crates/cli-support/src/anyref.rs

+                    None => continue,
+                };
+                match import {
+                    AuxImport::Intrinsic(Intrinsic::AnyrefHeapLiveCount) => {}


These nested loops are really calling for some kind of traversal / walker or something with filtering support for adapter instructions...

fitzgen · 2019-12-04T17:23:01Z

crates/cli-support/src/lib.rs

@@ -280,15 +279,6 @@ impl Bindgen {
            }
        };

-        // Our threads and multi-value xforms rely on the presence of the stack
-        // pointer, so temporarily export it so that our many GC's don't remove
-        // it before the xform runs.


alexcrichton added 5 commits December 3, 2019 13:40

Add an empty test for the anyref pass

a2613d1

Precisely track I32FromOptionAnyref's dependencies

9631522

This depends on the anyref table and a function to allocate an index if the anyref pass is running, so be sure to track that in the instruction itself for GC rooting.

Trim extraneous exports from nop anyref module

3b8542a

Or if you're otherwise not using anyref slices, don't force some intrinsics to exist.

alexcrichton requested a review from fitzgen December 3, 2019 22:51

alexcrichton added 2 commits December 4, 2019 08:47

Remove globals from reference tests

44cab62

Looks like these values adjust in slight but insignificant ways over time

Update the anyref xform tests

1fb4a58

fitzgen approved these changes Dec 4, 2019

View reviewed changes

alexcrichton merged commit d7a4a77 into rustwasm:master Dec 4, 2019

alexcrichton deleted the reference-tests branch December 4, 2019 18:01

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add reference output tests for JS operations #1894

Add reference output tests for JS operations #1894

alexcrichton commented Dec 3, 2019

alexcrichton commented Dec 3, 2019

fitzgen left a comment

fitzgen Dec 4, 2019

fitzgen Dec 4, 2019

Add reference output tests for JS operations #1894

Add reference output tests for JS operations #1894

Conversation

alexcrichton commented Dec 3, 2019

alexcrichton commented Dec 3, 2019

fitzgen left a comment

Choose a reason for hiding this comment

fitzgen Dec 4, 2019

Choose a reason for hiding this comment

fitzgen Dec 4, 2019

Choose a reason for hiding this comment