Introduction

What is main?

main is a function like this:

int main(int argc, char *argv[])

It takes a variety of types of data:

  • Integers, booleans, actual strings, etc.
  • Enums, subcommands, etc.
  • Files, network connections, services, etc.

However, these are all communicated as strings.
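
A minimal sketch in Rust of what this means in practice: each program has to turn the strings it receives back into typed values by hand. The three-argument layout and the names `Args` and `parse_args` are hypothetical, chosen only for illustration.

```rust
use std::fs::File;
use std::io;

/// Hypothetical typed view of three command-line arguments.
#[derive(Debug, PartialEq)]
struct Args {
    count: u32,
    verbose: bool,
    path: String,
}

/// Convert raw argument strings into typed values by hand.
fn parse_args(raw: &[String]) -> Result<Args, String> {
    if raw.len() != 3 {
        return Err(format!("expected 3 arguments, got {}", raw.len()));
    }
    let count = raw[0].parse::<u32>().map_err(|e| format!("count: {e}"))?;
    let verbose = raw[1].parse::<bool>().map_err(|e| format!("verbose: {e}"))?;
    Ok(Args { count, verbose, path: raw[2].clone() })
}

fn main() {
    let raw: Vec<String> = std::env::args().skip(1).collect();
    match parse_args(&raw) {
        Ok(args) => {
            // Even the "file" is still just a name until the program opens it.
            let opened: io::Result<File> = File::open(&args.path);
            println!("{args:?}, file opened: {}", opened.is_ok());
        }
        Err(message) => eprintln!("error: {message}"),
    }
}
```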

Filenames

  • User has a file: 😺
  • They type in the name: "cat.jpg"
  • Program gets the name: "cat.jpg"
  • Program opens the file: 😺

What problems does parsing cause?

Command-line parsing is inconsistent and error-prone.

Is -ab the same as -a -b?

Is -a -b the same as -b -a?
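
To make the first question concrete, here is a small Rust sketch of two conventions a program might apply to the same argument. Both function names are hypothetical; nothing in the argument itself says which reading is intended.

```rust
/// Convention 1: "-ab" is a bundle of two boolean flags, "-a" and "-b".
fn as_bundled_flags(arg: &str) -> Vec<String> {
    match arg.strip_prefix('-') {
        Some(letters) => letters.chars().map(|c| format!("-{c}")).collect(),
        None => vec![arg.to_string()],
    }
}

/// Convention 2: "-ab" is the flag "-a" with the attached value "b".
fn as_flag_with_value(arg: &str) -> Option<(String, String)> {
    let rest = arg.strip_prefix('-')?;
    let mut chars = rest.chars();
    let flag = chars.next()?;
    Some((format!("-{flag}"), chars.as_str().to_string()))
}

fn main() {
    // The same input, two incompatible interpretations.
    println!("{:?}", as_bundled_flags("-ab"));
    println!("{:?}", as_flag_with_value("-ab"));
}
```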

Also, security and virtualization:

  • Child process needs same filesystem view as parent
  • Child process needs permissions of parent
  • Child process hard-codes how to turn strings into streams

We can sandbox child processes, but we often don't.

What's different about WebAssembly?

WebAssembly

Is WebAssembly Assembly?

Wasm is like an ISA, but different.

General-purpose CPUs have converged in many areas:

  • 8-bit bytes
  • Two's complement
  • IEEE 754 floating-point

But also:

  • Memory is a big virtual address space of bytes
  • Calls are just jumps (-and-link) with register and memory conventions
  • Syscalls are mode switches and jumps

In WebAssembly, the address space isn't everything.

Call arguments are part of the call instruction.

WebAssembly has a static type system

It has an up-front validation step, rather than just SIGILL on the fly.

Two perspectives

Minimize the differences to maximize compatibility?

Or take advantage of the differences to do new things?

Types

MVP types: i32, i64, f32, f64

WASI is about interfaces, so we also look forward to Interface Types:

  • signed and unsigned integers
  • bool
  • lists
  • variants
  • records
  • strings, aka lists of characters
  • handles

Handles

The way we represent handles in wasm will likely evolve over time.

At the witx level, we can just use the handle type.

Signatures

Functions have signatures.

Programs are functions.

What is the signature of a native program?

  • The binary doesn't say.
  • The OS doesn't know.
  • The shell doesn't know (in general).

Unix imposes a single effective signature on all programs.

Typed Main lets programs declare their signatures.

Command-line usage

Command-line parsing for Typed Main programs happens in the Wasm engine.

Example: an f32 argument

The user might type "6.283185"

In some locales, the user might type "6,283185".

Or "0x1.921fb5p+2".

Parsing in the Wasm engine means that all programs have a consistent interface.
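
A rough sketch, in Rust, of an engine-side parser that accepts all three spellings. The function names are hypothetical, and treating "," as a decimal separator unconditionally is a simplifying assumption; a real engine would presumably consult the user's locale. Negative hexadecimal values are not handled here.

```rust
/// Parse a C99-style hexadecimal float such as "0x1.921fb5p+2".
/// Hypothetical helper: Rust's standard `parse::<f32>()` does not accept this form.
fn parse_hex_float(text: &str) -> Option<f32> {
    let body = text.strip_prefix("0x").or_else(|| text.strip_prefix("0X"))?;
    let (mantissa, exponent) = body.split_once(|c: char| c == 'p' || c == 'P')?;
    let exponent: i32 = exponent.parse().ok()?;
    let (whole, frac) = mantissa.split_once('.').unwrap_or((mantissa, ""));
    if whole.is_empty() && frac.is_empty() {
        return None;
    }
    // Accumulate the hexadecimal digits in f64, then scale by 2^exponent.
    let mut value = 0.0_f64;
    for digit in whole.chars() {
        value = value * 16.0 + f64::from(digit.to_digit(16)?);
    }
    let mut scale = 1.0_f64 / 16.0;
    for digit in frac.chars() {
        value += f64::from(digit.to_digit(16)?) * scale;
        scale /= 16.0;
    }
    Some((value * 2.0_f64.powi(exponent)) as f32)
}

/// Accept decimal, comma-decimal, and hexadecimal spellings of an f32.
fn parse_f32_argument(text: &str) -> Option<f32> {
    if text.starts_with("0x") || text.starts_with("0X") {
        return parse_hex_float(text);
    }
    // Assumption: a comma is always a decimal separator.
    text.replace(',', ".").parse::<f32>().ok()
}

fn main() {
    for text in ["6.283185", "6,283185", "0x1.921fb5p+2"] {
        println!("{text:>15} -> {:?}", parse_f32_argument(text));
    }
}
```

Because the parsing lives in one place, a program that declares an f32 parameter never sees any of these spellings, only the value.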

Example: a handle argument

Many programs read files, but they don't literally need files.

An "input stream" that supports read would often be enough.

Programs written this way:

  • Do one thing and do it well (waves to Unix)
  • Don't depend on a particular filesystem view
  • Don't depend on filesystem privileges
  • Don't depend on a filesystem at all!

Bonus:

  • No implied string comparisons! No:
    • Non-Unicode filenames
    • Unicode normalization
    • Case sensitivity
    • Windows special-case path parsing
    • Filename length limits
    • ...
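
As an illustration in Rust, a program body written against "something readable" rather than a filename might look like the sketch below. The `count_matches` function is hypothetical; `std::io::Read` stands in for the input-stream handle.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Count the lines containing `needle`, given only something readable.
/// The function never sees a path, so it cannot depend on a filesystem.
fn count_matches<R: Read>(input: R, needle: &str) -> io::Result<usize> {
    let mut count = 0;
    for line in BufReader::new(input).lines() {
        if line?.contains(needle) {
            count += 1;
        }
    }
    Ok(count)
}

fn main() -> io::Result<()> {
    // An in-memory buffer stands in for a file; no filesystem is involved.
    let in_memory: &[u8] = b"cat\ndog\ncatalog\n";
    println!("{}", count_matches(in_memory, "cat")?);
    Ok(())
}
```

The caller decides whether the stream comes from a file, a socket, or memory; the program is the same in every case.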

Putting it all together

Typed Main programs are just programs.

With signatures, they're also just functions.

We can use this to compose multiple programs together.
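
A minimal Rust sketch of that idea, modelling two "programs" as functions over streams. The names `grep`, `upper`, and `grep_then_upper` are hypothetical, and the in-memory buffer between them is an assumption made to keep the example short; an engine could just as well connect the two with a pipe.

```rust
use std::io::{self, BufRead, BufReader, Read, Write};

/// Hypothetical "program" 1: copy lines containing `pattern` to `output`.
fn grep(pattern: &str, input: impl Read, mut output: impl Write) -> io::Result<()> {
    for line in BufReader::new(input).lines() {
        let line = line?;
        if line.contains(pattern) {
            writeln!(output, "{line}")?;
        }
    }
    Ok(())
}

/// Hypothetical "program" 2: copy input to output in upper case.
fn upper(input: impl Read, mut output: impl Write) -> io::Result<()> {
    for line in BufReader::new(input).lines() {
        writeln!(output, "{}", line?.to_uppercase())?;
    }
    Ok(())
}

/// Composition is ordinary function composition: the output stream of one
/// program becomes the input stream of the next.
fn grep_then_upper(pattern: &str, input: impl Read, output: impl Write) -> io::Result<()> {
    let mut intermediate = Vec::<u8>::new();
    grep(pattern, input, &mut intermediate)?;
    upper(intermediate.as_slice(), output)
}

fn main() -> io::Result<()> {
    grep_then_upper("cat", &b"cat\ndog\ncatalog\n"[..], io::stdout())
}
```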

Compatibility with existing code

We have three options.

Option A: Out of the box

For porting an existing application with no changes, things work like they do in WASI today:

  • Everything is strings
  • User needs to use --dir preopens

This uses Typed Main, but with a fixed signature.

  • List-of-strings for the args
  • List-of-(handle,string) for the preopens
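
Written as a Rust signature, that fixed shape might look like the sketch below. The `Handle` type, `legacy_main`, and `resolve` are hypothetical stand-ins, and the longest-prefix matching rule is an assumption for illustration. It shows that under Option A the program still turns path strings into handles itself.

```rust
/// Hypothetical stand-in for a handle; a real one would be an opaque,
/// engine-provided reference rather than an integer.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Handle(u32);

/// Find the preopen whose name is the longest prefix of `path`, and return
/// its handle together with the remaining relative path.
fn resolve<'a>(path: &'a str, preopens: &[(Handle, String)]) -> Option<(Handle, &'a str)> {
    preopens
        .iter()
        .filter_map(|(handle, dir)| {
            let rest = path.strip_prefix(dir.as_str())?;
            // Reject partial component matches such as "/data" vs "/datafoo".
            if !(rest.is_empty() || rest.starts_with('/') || dir.ends_with('/')) {
                return None;
            }
            Some((dir.len(), *handle, rest.trim_start_matches('/')))
        })
        .max_by_key(|(len, _, _)| *len)
        .map(|(_, handle, rest)| (handle, rest))
}

/// The fixed signature: a list of strings for the arguments and a list of
/// (handle, string) pairs for the preopens.
fn legacy_main(args: Vec<String>, preopens: Vec<(Handle, String)>) -> i32 {
    for arg in args.iter().skip(1) {
        match resolve(arg, &preopens) {
            Some((handle, rest)) => println!("{arg}: {handle:?} + {rest:?}"),
            None => {
                eprintln!("{arg}: not inside any preopened directory");
                return 1;
            }
        }
    }
    0
}

fn main() {
    let args = vec!["app".to_string(), "/data/cat.jpg".to_string()];
    let preopens = vec![(Handle(3), "/data".to_string())];
    std::process::exit(legacy_main(args, preopens));
}
```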

Option B: Provide a witx description

In this option, the developer writes a witx file to describe their application.

This option involves no changes to the program itself.

The main program will still take strings and use preopens, but it can be wrapped in a wasm interface generated from witx.

Example:

;; Typed main example: a simple grep

(module $grep
  ;;; Main entrypoint for grep.
  (@interface func (export "main")
    ;;; The string to search for.
    (param $pattern string)

    ;;; The output to write to.
    (param $output $output_byte_stream)

    ;;; The inputs to search for it in.
    (param $inputs (list $input_byte_stream))

    ;;; The result: Just indicate if any I/O failed.
    (result $error (expected unit (error $input_byte_stream_error)))
  )
)

Option C: Typed Main in the source language

What if programming languages let you just write a main function which took arbitrary types?

Nameless

Nameless is a Rust crate prototyping Option C, for native code:

https://github.com/sunfishcode/nameless

https://crates.io/crates/nameless

Nameless today works for native code by doing everything itself. But once we hook it up to Typed Main in WASI, it'll be a very thin API.

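// Imports the example relies on. The `nameless` item names are assumed to be
// exported from the crate root.
use nameless::{InputTextStream, LazyOutput, OutputTextStream, Type};
use regex::Regex;
use std::io::{BufRead, BufReader, Write};

/// # Arguments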
///
/// * `pattern` - The regex to search for
/// * `inputs` - Input sources
#[kommand::main]
fn main(
    pattern: Regex,
    output: LazyOutput<OutputTextStream>,
    inputs: Vec<InputTextStream>,
) -> anyhow::Result<()> {
    let mut output = output.materialize(Type::text())?;

    let print_inputs = inputs.len() > 1;

    for input in inputs {
        let pseudonym = input.pseudonym();
        for line in BufReader::new(input).lines() {
            let line = line?;
            if pattern.is_match(&line) {
                if print_inputs {
                    output.write_pseudonym(&pseudonym)?;
                    write!(output, ":")?;
                }
                writeln!(output, "{}", line)?;
            }
        }
    }

    Ok(())
}

Wrap up

Current Status

The witx syntax shown here is supported by the witx parser in the WASI tree.

New-style commands, a toolchain feature that can be one of the building blocks:

Interface types are in phase 1, and there is prototyping underway at a few different levels.

Typed Main uses in WASI

wasi-clocks

Clocks should be capabilities, rather than ambient authorities.

wasi-random

Entropy sources should be capabilities, rather than ambient authorities.
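
A small Rust sketch of the capability style for clocks. The trait and type names are hypothetical and are not taken from any WASI proposal. The program body can only observe time through the clock it is handed, so the caller can pass a real clock, a fixed one, or none at all. The same pattern would apply to an entropy source.

```rust
use std::time::{Duration, Instant};

/// Hypothetical capability: the only way this code can observe time.
trait MonotonicClock {
    fn now(&self) -> Duration;
}

/// A clock backed by the host, handed in by whoever starts the program.
struct HostClock {
    start: Instant,
}

impl MonotonicClock for HostClock {
    fn now(&self) -> Duration {
        self.start.elapsed()
    }
}

/// A deterministic clock, useful for tests or reproducible runs.
struct FixedClock(Duration);

impl MonotonicClock for FixedClock {
    fn now(&self) -> Duration {
        self.0
    }
}

/// The program body receives the clock as an argument instead of reaching
/// for an ambient, globally available time source.
fn run(clock: &dyn MonotonicClock) -> String {
    format!("started at t+{}ms", clock.now().as_millis())
}

fn main() {
    let clock = HostClock { start: Instant::now() };
    println!("{}", run(&clock));
}
```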

New WASI proposals

  • Serial ports
  • Audio devices
  • Database connections
  • etc.

A big question for all capability systems is how a program obtains the first capability. Typed Main is a way to do this.