WebAssembly Comes to Hermes

Hermes can now compile and run WebAssembly modules. A standard .wasm binary - the same one that runs in a browser or Node.js - can be loaded into Hermes at runtime, or compiled ahead of time into Hermes bytecode (.hbc) so that no compilation happens at startup. Either way, the module ends up in the same compact bytecode format that Hermes already uses for JavaScript, executed by the same interpreter.

Why does this matter? WebAssembly opens the door to reusing existing C, C++, and Rust libraries without writing native modules. Image processing, physics engines, crypto routines - anything compiled to Wasm can now run directly in the JS engine, using the standard WebAssembly API familiar from the browser.

This post walks through the full pipeline - from C source code to running Wasm inside Hermes - and looks at what happens under the hood.

The Simplest Example

Let’s start with a function so small there’s nowhere to hide: computing the average of two integers.

The C Source

int avg(int a, int b) {
  return (a + b) >> 1;
}

One line of real logic. Perfect for seeing the entire pipeline without distraction.

Compiling C to Wasm

Clang can compile C to WebAssembly when it is built with the wasm32 target. The Clang that Apple ships with macOS does not include that target, but the Homebrew version does:

brew install llvm lld

Then compile with the Homebrew Clang, making sure the Homebrew LLD linker is on the PATH:

PATH="/opt/homebrew/opt/lld/bin:$PATH" \
/opt/homebrew/opt/llvm/bin/clang \
  --target=wasm32-unknown-unknown \
  -nostdlib \
  -O2 \
  -Wl,--no-entry \
  -Wl,--export-all \
  -o avg.wasm avg.c

A few of these flags deserve explanation:

  • --target=wasm32-unknown-unknown - compile to 32-bit WebAssembly rather than the host architecture.
  • -nostdlib - do not link the C standard library or startup files. The bare wasm32-unknown-unknown target does not come with a libc, so modules provide their own memory management or import what they need from JavaScript.
  • -Wl,--no-entry - tell the linker there is no main entry point. This is a library, not a standalone program.
  • -Wl,--export-all - export every global symbol, including every non-static function, so it can be called from JavaScript.

The output is a standard .wasm binary - the same format every browser and Wasm runtime understands.

The WAT - What Wasm Looks Like

If we disassemble avg.wasm to the WebAssembly Text Format (WAT), we get something like this:

(module
  (func $avg (export "avg") (param $a i32) (param $b i32) (result i32)
    local.get $a       ;; push first parameter onto the stack
    local.get $b       ;; push second parameter
    i32.add            ;; pop both, push their sum
    i32.const 1        ;; push the shift amount
    i32.shr_s          ;; arithmetic right shift: (a + b) >> 1
  )
)

WebAssembly is a stack machine. Each instruction pops its inputs from the stack and pushes its result. i32.add pops two 32-bit integers and pushes their sum. i32.shr_s pops a value and a shift amount and pushes the result of an arithmetic right shift. That’s the entire function.
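
To make the stack discipline concrete, here is a minimal sketch of a toy evaluator for just the five instructions in the listing above. It is an illustration of the stack model only: the function name runAvg and the instruction encoding are made up for this example, and this is not how Hermes or any real engine executes Wasm.

```javascript
// Toy evaluator for the five instructions in the WAT listing.
// Illustration only; runAvg and the array encoding are hypothetical.
function runAvg(a, b) {
  var locals = [a | 0, b | 0];   // parameters, coerced to i32
  var stack = [];
  var program = [
    ["local.get", 0],
    ["local.get", 1],
    ["i32.add"],
    ["i32.const", 1],
    ["i32.shr_s"]
  ];
  for (var i = 0; i < program.length; i++) {
    var op = program[i][0];
    var arg = program[i][1];
    var lhs, rhs;
    switch (op) {
      case "local.get":
        stack.push(locals[arg]);
        break;
      case "i32.const":
        stack.push(arg | 0);
        break;
      case "i32.add":
        rhs = stack.pop();
        lhs = stack.pop();
        stack.push((lhs + rhs) | 0);    // wrap the sum to 32 bits
        break;
      case "i32.shr_s":
        rhs = stack.pop();              // shift amount was pushed last
        lhs = stack.pop();
        stack.push(lhs >> (rhs & 31));  // shift count is taken modulo 32
        break;
      default:
        throw new Error("unknown instruction: " + op);
    }
  }
  return stack.pop();  // the single value left on the stack is the result
}

console.log(runAvg(10, 20)); // 15
```

Note the operand order: the value pushed last is popped first, so the shift amount comes off the stack before the value being shifted.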

Compiling to Hermes Bytecode

This is the key step. Instead of being parsed and compiled at runtime, the Wasm binary can be compiled ahead of time:

hermesc --wasm -emit-binary -out avg.hbc avg.wasm

The output is a standard .hbc file - the same format Hermes uses for JavaScript. It loads instantly, with no compilation at startup.

Running It

Write a small JS driver that loads the module and calls the exported function:

// avg-run.js
var path = hermescli.getScriptArgs()[0];
var bytes = hermescli.loadFile(path);

var mod = new WebAssembly.Module(bytes);
var instance = new WebAssembly.Instance(mod);

print(instance.exports.avg(10, 20));

Then run it - either from the .wasm file directly, or from the precompiled .hbc:

# From .wasm (compiled at runtime)
hermes -Xhermes-internal-test-methods avg-run.js -- avg.wasm

# From .hbc (precompiled, zero startup cost)
hermes -Xhermes-internal-test-methods avg-run.js -- avg.hbc

Both print 15.

Under the Hood - The Bytecode

What does hermesc --wasm actually produce? Let’s look at the bytecode for our avg function:

Function<wasm_export_avg>(3 params, 3 registers, 2 numbers, 0 non-pointers):
    LoadParam         r2, 1          ;; r2 = first argument (a)
    ToInt32           r1, r2         ;; r1 = ToInt32(a)
    LoadParam         r2, 2          ;; r2 = second argument (b)
    ToInt32           r0, r2         ;; r0 = ToInt32(b)
    AddN              r0, r1, r0     ;; r0 = r1 + r0     (numeric add)
    ToInt32           r1, r0         ;; r1 = ToInt32(sum) (truncate to i32)
    LoadConstUInt8    r0, 1          ;; r0 = 1            (shift amount)
    RShift            r0, r1, r0     ;; r0 = r1 >> r0     (arithmetic shift)
    Ret               r0             ;; return r0

This is regular Hermes bytecode - the same instruction set used for JavaScript. LoadParam fetches function arguments. AddN is numeric addition. RShift is a right shift. ToInt32 coerces values to 32-bit integers, matching Wasm’s i32 semantics.

The key insight: Wasm becomes regular Hermes bytecode. There is no separate Wasm interpreter, no second runtime. The Wasm module is translated into the same bytecode format as JS code, and executed by the same interpreter. This means Wasm modules benefit from the same AOT compilation model, the same compact bytecode format, and the same startup characteristics as JavaScript in Hermes.
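
One way to read the bytecode listing is to transcribe it by hand into plain JavaScript, one statement per instruction. The sketch below does that; it is our own transcription for illustration (the name avgLikeBytecode is made up), not something hermesc emits. It also shows why the second ToInt32 is there: AddN is an ordinary numeric add, so the sum has to be wrapped back into the i32 range before the shift.

```javascript
// Hand-written JavaScript mirroring the bytecode listing step by step.
// Illustration of the semantics only, not output produced by hermesc.
function avgLikeBytecode(a, b) {
  var r1 = a | 0;   // ToInt32 r1, (a)
  var r0 = b | 0;   // ToInt32 r0, (b)
  r0 = r1 + r0;     // AddN: plain numeric add, may leave the i32 range
  r1 = r0 | 0;      // ToInt32: wrap the sum back to 32 bits
  return r1 >> 1;   // RShift by the constant 1
}

console.log(avgLikeBytecode(10, 20));         // 15
console.log(avgLikeBytecode(2147483647, 1));  // -1073741824 (the sum wrapped)
```

The second call is the interesting one: 2147483647 + 1 does not fit in a signed 32-bit integer, so the wrapped sum is negative, which is what Wasm's i32.add would produce as well.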

Standard WebAssembly API - Same Code Runs on Node.js

The WebAssembly JavaScript API is standardized. A single driver script can run on both Node.js and Hermes - only the file-loading code differs, because Hermes is a pure JavaScript VM and does not bundle Node.js built-ins like fs or process:

// universal-avg.js - runs on both Node.js and Hermes

// Load the .wasm bytes (the only part that differs between engines)
var bytes;
if (typeof process !== "undefined" && process.versions && process.versions.node) {
  // Node.js
  bytes = require("fs").readFileSync(process.argv[2]);
} else {
  // Hermes
  bytes = hermescli.loadFile(hermescli.getScriptArgs()[0]);
}

// Everything below is identical on both engines
var mod = new WebAssembly.Module(bytes);
var instance = new WebAssembly.Instance(mod);

// Call the exported Wasm function
var result = instance.exports.avg(10, 20);
console.log(result);

Run it on either engine:

# Node.js
node universal-avg.js avg.wasm

# Hermes
hermes -Xhermes-internal-test-methods universal-avg.js -- avg.wasm

Both print 15. The WebAssembly.Module and WebAssembly.Instance constructors, the .exports object, and the function call are identical on both engines; only the file loading differs.
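
For readers who want to try the API without installing the Clang toolchain, the sketch below embeds a tiny module directly in the script. The bytes are hand-assembled by us to match the WAT listing shown earlier; they are an assumption for illustration and not the file Clang actually produces, which typically also carries a memory and extra exports. We have only reasoned through the standard API behaviour here; whether the Hermes preview accepts a Uint8Array in exactly this way is also an assumption.

```javascript
// Hand-assembled module equivalent to the avg WAT listing (not Clang output).
var bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,        // "\0asm", version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,  // type 0: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                                // one function, of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x76, 0x67, 0x00, 0x00,  // export "avg" = function 0
  0x0a, 0x0c, 0x01, 0x0a, 0x00,                          // code section: one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a,                          // local.get 0, local.get 1, i32.add
  0x41, 0x01, 0x75, 0x0b                                 // i32.const 1, i32.shr_s, end
]);

var mod = new WebAssembly.Module(bytes);
var instance = new WebAssembly.Instance(mod);

// WebAssembly.Module.exports describes what the module makes available.
var exportsList = WebAssembly.Module.exports(mod);
console.log(exportsList[0].name, exportsList[0].kind);  // avg function

console.log(instance.exports.avg(10, 20));  // 15
```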

AOT vs Runtime - Why Precompile?

There are two ways to get Wasm running in Hermes:

Runtime compilation. Load the .wasm binary at startup and compile it on the fly via new WebAssembly.Module(wasmBytes). This works, but the compilation happens every time the app starts.

Ahead-of-time compilation. Run hermesc --wasm during the build step to produce an .hbc file. At runtime, the module loads instantly - no parsing, no compilation, no startup cost.

Both paths produce the same bytecode. The only difference is when the compilation happens. For development and prototyping, loading .wasm directly is convenient. For production, precompiling to .hbc is the better choice.

This is the same model Hermes already uses for JavaScript: it can compile JS source to bytecode at runtime, or precompile it to .hbc for instant startup. Wasm gets the same treatment.

A Real Example - Conway’s Game of Life

To show something more substantial, here’s Conway’s Game of Life running as a Wasm module: a 128x128 toroidal grid, initialized with the R-pentomino pattern (a classic “methuselah” that evolves for over 1000 generations before stabilizing).

The C Source

#define WIDTH 128
#define HEIGHT 128
#define SIZE (WIDTH * HEIGHT)

__attribute__((import_module("env"), import_name("log")))
extern void env_log(int value);

static unsigned char gridA[SIZE];
static unsigned char gridB[SIZE];

static int count_neighbors(const unsigned char *grid, int x, int y) {
  int count = 0;
  for (int dy = -1; dy <= 1; dy++) {
    for (int dx = -1; dx <= 1; dx++) {
      if (dx == 0 && dy == 0)
        continue;
      int nx = (x + dx + WIDTH) % WIDTH;
      int ny = (y + dy + HEIGHT) % HEIGHT;
      count += grid[ny * WIDTH + nx];
    }
  }
  return count;
}

static void step(const unsigned char *src, unsigned char *dst) {
  for (int y = 0; y < HEIGHT; y++) {
    for (int x = 0; x < WIDTH; x++) {
      int n = count_neighbors(src, x, y);
      int alive = src[y * WIDTH + x];
      dst[y * WIDTH + x] = alive ? (n == 2 || n == 3) : (n == 3);
    }
  }
}

static int count_alive(const unsigned char *grid) {
  int count = 0;
  for (int i = 0; i < SIZE; i++)
    count += grid[i];
  return count;
}

static void clear(unsigned char *grid) {
  for (int i = 0; i < SIZE; i++)
    grid[i] = 0;
}

static void set_cell(int x, int y) {
  gridA[y * WIDTH + x] = 1;
}

// Place R-pentomino at center:
//   .##
//   ##.
//   .#.
static void init_pattern(void) {
  clear(gridA);
  clear(gridB);
  int cx = WIDTH / 2;
  int cy = HEIGHT / 2;
  set_cell(cx,     cy - 1);
  set_cell(cx + 1, cy - 1);
  set_cell(cx - 1, cy);
  set_cell(cx,     cy);
  set_cell(cx,     cy + 1);
}

__attribute__((export_name("run")))
void run(int iterations) {
  init_pattern();

  unsigned char *src = gridA;
  unsigned char *dst = gridB;

  for (int i = 0; i < iterations; i++) {
    step(src, dst);
    unsigned char *tmp = src;
    src = dst;
    dst = tmp;
  }

  env_log(count_alive(src));
}

The two static grids live in Wasm linear memory. Modular arithmetic on the coordinates gives the grid its toroidal topology, and a single imported function (env.log) reports the result.
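
The wrap-around is easy to check in isolation. Below is a small JavaScript port of count_neighbors and step, with the grid size turned into a parameter so it can be tried on a 5x5 torus. The port and its function names are ours, written for illustration; the rules are the same as in the C source. The test pattern is a blinker that straddles the left and right edges, which only behaves correctly if the neighbours wrap.

```javascript
// JavaScript port of count_neighbors/step with a configurable grid size.
function countNeighbors(grid, w, h, x, y) {
  var count = 0;
  for (var dy = -1; dy <= 1; dy++) {
    for (var dx = -1; dx <= 1; dx++) {
      if (dx === 0 && dy === 0) continue;
      // Adding w (or h) keeps the left operand of % non-negative:
      // in both C and JavaScript, -1 % 128 is -1, not 127.
      var nx = (x + dx + w) % w;
      var ny = (y + dy + h) % h;
      count += grid[ny * w + nx];
    }
  }
  return count;
}

function step(src, w, h) {
  var dst = new Uint8Array(w * h);
  for (var y = 0; y < h; y++) {
    for (var x = 0; x < w; x++) {
      var n = countNeighbors(src, w, h, x, y);
      var alive = src[y * w + x];
      dst[y * w + x] = alive ? (n === 2 || n === 3) : (n === 3);
    }
  }
  return dst;
}

// Horizontal blinker in row 2 that wraps around the edge: columns 4, 0, 1.
var W = 5, H = 5;
var grid = new Uint8Array(W * H);
[4, 0, 1].forEach(function (x) { grid[2 * W + x] = 1; });

var next = step(grid, W, H);
var alive = [];
for (var i = 0; i < next.length; i++) {
  if (next[i]) alive.push(i);
}
console.log(alive.join(","));  // 5,10,15 (column 0, rows 1 to 3)
```

After one step the blinker has turned vertical in column 0, exactly as it would in the middle of the grid.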

The Pipeline

Compile, convert, run:

# Compile C to Wasm (using Homebrew Clang, as shown earlier)
PATH="/opt/homebrew/opt/lld/bin:$PATH" \
/opt/homebrew/opt/llvm/bin/clang \
  --target=wasm32-unknown-unknown -nostdlib -O2 \
  -Wl,--no-entry -Wl,--export-all \
  -o life.wasm life.c

# Ahead-of-time compile to Hermes bytecode
hermesc --wasm -emit-binary -out life.hbc life.wasm

# Run it
hermes -Xhermes-internal-test-methods life-run.js -- life.hbc

The JS Driver

// life-run.js
var path = hermescli.getScriptArgs()[0];
var bytes = hermescli.loadFile(path);

var mod = new WebAssembly.Module(bytes);
var instance = new WebAssembly.Instance(mod, {
  env: {
    log: function(value) { console.log(value); }
  }
});

instance.exports.run(2000);

The Wasm module imports env.log, so we provide it as a plain JS function in the imports object. After 2000 iterations of the R-pentomino on the 128x128 grid, the output is:

120
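
The import mechanism can also be tried on its own. The sketch below uses a tiny hand-assembled module of our own (an assumption for illustration, not life.wasm): it imports env.log and exports a run function that simply passes its argument to log. WebAssembly.Module.imports shows what a module expects before it is instantiated.

```javascript
// Hand-assembled module: imports env.log, exports run(x) which calls log(x).
// These bytes are written for this example; they are not life.wasm.
var bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,        // "\0asm", version 1
  0x01, 0x05, 0x01, 0x60, 0x01, 0x7f, 0x00,              // type 0: (i32) -> ()
  0x02, 0x0b, 0x01,                                      // import section, one import
  0x03, 0x65, 0x6e, 0x76, 0x03, 0x6c, 0x6f, 0x67,        // module "env", name "log"
  0x00, 0x00,                                            // kind: function, type 0
  0x03, 0x02, 0x01, 0x00,                                // one local function, type 0
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01,  // export "run" = function 1
  0x0a, 0x08, 0x01, 0x06, 0x00,                          // code section: one body, no locals
  0x20, 0x00, 0x10, 0x00, 0x0b                           // local.get 0, call 0, end
]);

var mod = new WebAssembly.Module(bytes);

// Inspect the imports the module declares.
var imp = WebAssembly.Module.imports(mod)[0];
console.log(imp.module + "." + imp.name, imp.kind);  // env.log function

// Provide env.log as a plain JS function that records what it receives.
var seen = [];
var instance = new WebAssembly.Instance(mod, {
  env: { log: function (value) { seen.push(value); } }
});

instance.exports.run(120);
console.log(seen.join(","));  // 120
```

Imported functions are numbered before the module's own functions, which is why run is function 1 and the call targets function 0.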

Current Status

This is an early preview. We are focusing on correctness first, not performance. Wasm support is not yet ready for production use.

What’s supported:

  • Core Wasm 1.0 instruction set
  • Linear memory
  • Tables
  • Globals
  • Imports and exports
  • Exception handling
  • Bulk memory operations
  • i64 via split 32-bit pairs
  • Full WebAssembly JavaScript API (WebAssembly.Module, WebAssembly.Instance, WebAssembly.Memory, etc.)
  • Ahead-of-time compilation to .hbc

What’s not yet supported:

  • SIMD
  • Threads and shared memory
  • Performance optimizations
  • Native i64 representation
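
The "i64 via split 32-bit pairs" entry deserves a short illustration. The sketch below shows the general idea of carrying a 64-bit integer as a low half and a high half, using addition as the example. It is only a sketch of the technique under our own assumptions (the add64 helper and its result shape are made up here); this post does not describe how Hermes actually lowers i64 operations.

```javascript
// Add two 64-bit integers, each given as a (low, high) pair of 32-bit halves.
// Illustration of the split-pair idea only; not Hermes's actual lowering.
function add64(aLo, aHi, bLo, bHi) {
  var lowSum = (aLo >>> 0) + (bLo >>> 0);   // exact in a double: below 2^33
  var carry = lowSum > 0xFFFFFFFF ? 1 : 0;  // did the low half overflow?
  return {
    lo: lowSum >>> 0,                       // low 32 bits, as unsigned
    hi: (aHi + bHi + carry) | 0             // high 32 bits, wrapped, as signed
  };
}

var r = add64(0xFFFFFFFF, 0, 1, 0);  // 0x00000000FFFFFFFF + 1
console.log(r.hi, r.lo);             // 1 0
```

Every 64-bit operation turns into several 32-bit ones plus carry handling, which helps explain why a native i64 representation is listed as future work.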

Closing

WebAssembly in Hermes means C, C++, or Rust code can be compiled to a standard .wasm binary and run inside the same JS engine that powers JavaScript apps. The AOT compilation path - hermesc --wasm - turns Wasm into the same .hbc bytecode format Hermes uses for JavaScript, giving instant module loading with no runtime compilation cost.

The WebAssembly API works the same way it does in the browser or Node.js. Modules are instantiated, exported functions are called, imports are provided - all with familiar JavaScript code. The difference is that Hermes can precompile everything ahead of time.

We’re at the beginning of this work. The instruction set coverage is solid, the API is functional, and the AOT pipeline works end to end. What comes next is optimization, broader feature coverage, and hardening for production use.
