6618 - new docs

2020-07-05 15:28:37 -07:00 · 2020-07-05 15:28:37 -07:00 · 9a524793ee
parent 3ccb2c8328
commit 9a524793ee
16 changed files with 1132 additions and 1097 deletions
--- a/README.md
+++ b/README.md
@ -10,7 +10,7 @@ Running the code you want to run, and nothing else.
  ```sh
  $ git clone https://github.com/akkartik/mu
  $ cd mu
-  $ ./translate_mu apps/ex2.mu  # emits a.elf
+  $ ./translate_mu apps/ex2.mu  # emit a.elf
  $ ./a.elf  # adds 3 and 4
  $ echo $?
  7
@ -18,10 +18,13 @@ Running the code you want to run, and nothing else.
 [![Build Status](https://api.travis-ci.org/akkartik/mu.svg?branch=master)](https://travis-ci.org/akkartik/mu)
-Mu requires just a Unix-like OS and nothing else. The Mu translator is built
+Rather than start from some syntax and introduce layers of translation to
-up from machine code. You can also bootstrap it from C++. Both C++ and
+implement it, Mu starts from the processor's instruction set and tries to get
-self-hosted versions emit identical binaries. The generated binaries require
+to _some_ safe and clear syntax with as few layers of translation as possible.
-just a Unix-like kernel and nothing else. (There are more details in [this paper](http://akkartik.name/akkartik-convivial-20200607.pdf).)
+The emphasis is on internal consistency at any point in time rather than
 compatibility with the past. ([More details.](http://akkartik.name/akkartik-convivial-20200607.pdf))
 Currently Mu requires a 32-bit x86 processor and Linux kernel.
 ## Goals
@ -56,668 +59,84 @@ In priority order:
  For now it's a thin veneer over machine code. I'm working on memory safety
  before expressive syntax.
-## Source Language
+## Toolchain
-Mu's main source language is [still being built](http://akkartik.name/post/mu-2019-2).
+The Mu stack consists of:
-When completed, it will be type- and memory-safe. At the moment it performs no
+- the Mu type-safe language;
-checks. Here's the program we translated above:
+- SubX, an unsafe notation for a subset of x86 machine code; and
 - _bare_ SubX, a more rudimentary form of SubX without certain syntax sugar.
-<img alt='ex2.mu' src='html/ex2.mu.png'>
+All Mu programs get translated through these layers, and the translators are
 mostly built out of lower levels. The translator from Mu to SubX is written in
 SubX, and the translator from SubX to bare SubX is built in bare SubX.
-There are no expressions; functions consist of only statements that operate on
+To translate Mu programs through these layers into tiny zero-dependency ELF
-variables. Most statements in Mu translate to [a single machine-code instruction](http://akkartik.github.io/mu/html/mu_instructions.html).
+binaries for Linux, use `translate_mu`.
 Variables reside in memory by default. Programs must specify registers when
 they want to use them. Functions must return results in registers. Execution
 begins at the function `main`, which always returns its result in register
 `ebx`. [The paper](http://akkartik.name/akkartik-convivial-20200607.pdf) has
 more details, and there's a [summary](mu_summary) of all supported instructions.
-## SubX
+Mu programs can be run in emulated mode to emit traces, which permit time-travel
 debugging. ([More details.](debugging.md))
-Mu is written in [a notation for a subset of x86 machine code called SubX](http://akkartik.name/post/mu-2019-1).
+### incomplete tools
-Here's a SubX program (`apps/ex1.subx`) that returns 42:
+
-
+The Mu translator is still a work in progress; not all incorrect programs
-  ```sh
+result in good error messages.
-  bb/copy-to-ebx  0x2a/imm32  # 42 in hex
+
-  b8/copy-to-eax  1/imm32/exit
+Once generated, ELF binaries can be packaged up with a Linux kernel into a
-  cd/syscall  0x80/imm8
+bootable disk image:
  ```
 You can generate tiny zero-dependency ELF binaries from SubX that run on Linux.
  ```sh
  $ ./bootstrap translate init.linux apps/ex1.subx -o apps/ex1  # on Linux or BSD or Mac
  $ ./apps/ex1  # only on Linux
  $ echo $?
  42
 ```
 (Running `bootstrap` requires a C++ compiler, transparently invoking it as
 necessary.)
 You can run the generated binaries on an interpreter/VM for better error
 messages.
  ```sh
  $ ./bootstrap run apps/ex1  # on Linux or BSD or Mac
  $ echo $?
  42
  ```
 Emulated runs can generate a trace that permits [time-travel debugging](https://github.com/akkartik/mu/blob/master/tools/browse_trace.readme.md).
  ```sh
  $ ./bootstrap --debug translate init.linux apps/factorial.subx -o apps/factorial
  saving address->label information to 'labels'
  saving address->source information to 'source_lines'
  $ ./bootstrap --trace run apps/factorial
  saving trace to 'last_run'
  $ tools/browse_trace last_run  # text-mode debugger UI
  ```
 You can write tests for your programs. The entire stack is thoroughly covered
 by automated tests. SubX's tagline: tests before syntax.
  ```sh
  $ ./bootstrap test
  $ ./bootstrap run apps/factorial test
  ```
 You can use SubX to translate itself. For example, running natively on Linux:
  ```sh
  # generate translator phases using the C++ translator
  $ ./bootstrap translate init.linux 0*.subx apps/subx-params.subx apps/hex.subx    -o hex
  $ ./bootstrap translate init.linux 0*.subx apps/subx-params.subx apps/survey.subx -o survey
  $ ./bootstrap translate init.linux 0*.subx apps/subx-params.subx apps/pack.subx   -o pack
  $ ./bootstrap translate init.linux 0*.subx apps/subx-params.subx apps/assort.subx -o assort
  $ ./bootstrap translate init.linux 0*.subx apps/subx-params.subx apps/dquotes.subx -o dquotes
  $ ./bootstrap translate init.linux 0*.subx apps/subx-params.subx apps/tests.subx  -o tests
  $ chmod +x hex survey pack assort dquotes tests
  # use the generated translator phases to translate SubX programs
  $ cat init.linux apps/ex1.subx |./tests |./dquotes |./assort |./pack |./survey |./hex > a.elf
  $ chmod +x a.elf
  $ ./a.elf
  $ echo $?
  42
  # or, automating the above steps
  $ ./translate_subx init.linux apps/ex1.subx
  $ ./a.elf
  $ echo $?
  42
  ```
 Or, running in a VM on other platforms (much slower):
  ```sh
  $ ./translate_subx_emulated init.linux apps/ex1.subx  # generates identical a.elf to above
  $ ./bootstrap run a.elf
  $ echo $?
  42
  ```
 You can package up SubX binaries with the minimal hobbyist OS [Soso](https://github.com/ozkl/soso)
 and run them on Qemu. (Requires graphics and sudo access. Currently doesn't
 work on a cloud server.)
  ```sh
  $ ./translate_mu apps/ex2.mu  # emit a.elf
  # dependencies
  $ sudo apt install build-essential util-linux nasm xorriso  # maybe also dosfstools and mtools
  # package up a "hello world" program with a third-party kernel into mu_soso.iso
  # requires sudo
  $ tools/iso/soso init.soso apps/ex6.subx
  # try it out
  $ qemu-system-i386 -cdrom mu_soso.iso
  ```
 You can also package up SubX binaries with a Linux kernel and run them on
 either Qemu or [a cloud server that supports custom images](http://akkartik.name/post/iso-on-linode).
 (Takes 12 minutes with 2GB RAM. Requires 12 million LoC of C for the Linux
 kernel; that number will gradually go down.)
  ```sh
  $ sudo apt install build-essential flex bison wget libelf-dev libssl-dev xorriso
-  $ tools/iso/linux init.linux apps/ex6.subx
+  $ tools/iso/linux a.elf
  $ qemu-system-x86_64 -m 256M -cdrom mu_linux.iso -boot d
  ```
-## The syntax of SubX instructions
+The disk image also runs on [any cloud server that supports custom images](http://akkartik.name/post/iso-on-linode).
-Here is the above SubX example again:
+Mu also runs on the minimal hobbyist OS [Soso](https://github.com/ozkl/soso).
 (Requires graphics and sudo access. Currently doesn't work on a cloud server.)
  ```sh
-  bb/copy-to-ebx  0x2a/imm32  # 42 in hex
+  $ ./translate_mu apps/ex2.mu  # emit a.elf
-  b8/copy-to-eax  1/imm32/exit
+  # dependencies
-  cd/syscall  0x80/imm8
+  $ sudo apt install build-essential util-linux nasm xorriso  # maybe also dosfstools and mtools
  $ tools/iso/soso a.elf  # requires sudo
  $ qemu-system-i386 -cdrom mu_soso.iso
  ```
-Every line contains at most one instruction. Instructions consist of words
+## Syntax
 separated by whitespace. Words may be _opcodes_ (defining the operation being
 performed) or _arguments_ (specifying the data the operation acts on). Any
 word can have extra _metadata_ attached to it after `/`. Some metadata is
 required (like the `/imm32` and `/imm8` above), but unrecognized metadata is
 silently skipped so you can attach comments to words (like the instruction
 name `/copy-to-eax` above, or the `/exit` argument).
-What do all these numbers mean? SubX supports a small subset of the 32-bit x86
+The entire stack shares certain properties and conventions. Programs consist
-instruction set that likely runs on your computer. (Think of the name as short
+of functions and functions consist of statements, each performing a single
-for "sub-x86".) The instruction set contains instructions like `89/copy`,
+operation. Operands to statements are always variables or constants. You can't
-`01/add`, `3d/compare` and `51/push-ecx` which modify registers and a byte-addressable
+say `a + b*c`, you have to break it up into two operations. Variables can live
-memory. For a complete list of supported instructions, run `bootstrap help opcodes`.
+in memory or in registers. Registers must be explicitly specified. There are
 some shared lexical rules; comments always start with '#', and numbers are
 always written in hex.
-The registers instructions operate on are as follows:
+Here's an example program in Mu:
- Six general-purpose 32-bit registers: `0/eax`, `1/ebx`, `2/ecx`, `3/edx`,
+<img alt='ex2.mu' src='html/ex2.mu.png'>
  `6/esi` and `7/edi`.
 - Two additional 32-bit registers: `4/esp` and `5/ebp`. (I suggest you only
  use these to manage the call stack.)
-(SubX doesn't support floating-point registers yet. Intel processors support
+[More details on Mu syntax &rarr;](mu.md)
 an 8-bit mode, 16-bit mode and 64-bit mode. SubX will never support them.
 There are also _many_ more instructions that SubX will never support.)
-While SubX doesn't provide the usual mnemonics for opcodes, it _does_ provide
+Here's an example program in SubX:
 error-checking. If you miss an argument or accidentally add an extra argument,
 you'll get a nice error. SubX won't arbitrarily interpret bytes of data as
 instructions or vice versa.
-It's worth distinguishing between an instruction's arguments and its _operands_.
+  ```sh
-Arguments are provided directly in instructions. Operands are pieces of data
+  == code
-in register or memory that are operated on by instructions.
+  Entry:
-
+    # ebx = 1
-Intel processors typically operate on no more than two operands, and at most
+    bb/copy-to-ebx  1/imm32
-one of them (the 'reg/mem' operand) can access memory. The address of the
+    # increment ebx
-reg/mem operand is constructed by expressions of one of these forms:
+    43/increment-ebx
-
+    # exit(ebx)
-  - `%reg`: operate on just a register, not memory
+    e8/call  syscall_exit/disp32
  - `*reg`: look up memory with the address in some register
  - `*(reg + disp)`: add a constant to the address in some register
  - `*(base + (index << scale) + disp)` where `base` and `index` are registers,
    and `scale` and `disp` are 2- and 32-bit constants respectively.
 Under the hood, SubX turns expressions of these forms into multiple arguments
 with metadata in some complex ways. See [subx\_addressing\_modes.md](subx_addressing_modes.md).
 That covers the complexities of the reg/mem operand. The second operand is
 simpler. It comes from exactly one of the following argument types:
  - `/r32`
  - displacement: `/disp8` or `/disp32`
  - immediate: `/imm8` or `/imm32`
 Putting all this together, here's an example that adds the integer in `eax` to
 the one at address `edx`:
  ```
  01/add %edx 0/r32/eax
  ```
-## The syntax of SubX programs
+[More details on SubX syntax &rarr;](subx.md)
-SubX programs map to the same ELF binaries that a conventional Linux system
+## Forks
 uses. Linux ELF binaries consist of a series of _segments_. In particular, they
 distinguish between code and data. Correspondingly, SubX programs consist of a
 series of segments, each starting with a header line: `==` followed by a name
 and approximate starting address.
 All code must lie in a segment called 'code'.
 Segments can be added to.
 ```sh
 == code 0x09000000  # first mention requires starting address
 ...A...
 == data 0x0a000000
 ...B...
 == code             # no address necessary when adding
 ...C...
 ```
 The `code` segment now contains the instructions of `A` as well as `C`.
 Within the `code` segment, each line contains a comment, label or instruction.
 Comments start with a `#` and are ignored. Labels should always be the first
 word on a line, and they end with a `:`.
 Instructions can refer to labels in displacement or immediate arguments, and
 they'll obtain a value based on the address of the label: immediate arguments
 will contain the address directly, while displacement arguments will contain
 the difference between the address and the address of the current instruction.
 The latter is mostly useful for `jump` and `call` instructions.
 Functions are defined using labels. By convention, labels internal to functions
 (that must only be jumped to) start with a `$`. Any other labels must only be
 called, never jumped to. All labels must be unique.
 Functions are called using the following syntax:
 ```
 (func arg1 arg2 ...)
 ```
 Function arguments must be either literals (integers or strings) or a reg/mem
 operand using the syntax in the previous section.
 A special label is `Entry`, which can be used to specify/override the entry
 point of the program. It doesn't have to be unique, and the latest definition
 will override earlier ones.
 (The `Entry` label, along with duplicate segment headers, allows programs to
 be built up incrementally out of multiple [_layers_](http://akkartik.name/post/wart-layers).)
 Another special pair of labels are the block delimiters `{` and `}`. They can
 be nested, and jump instructions can take arguments `loop` or `break` that
 jump to the enclosing `{` and `}` respectively.
 The data segment consists of labels as before and byte values. Referring to
 data labels in either `code` segment instructions or `data` segment values
 yields their address.
 Automatic tests are an important part of SubX, and there's a simple mechanism
 to provide a test harness: all functions that start with `test-` are called in
 turn by a special, auto-generated function called `run-tests`. How you choose
 to call it is up to you.
 I try to keep things simple so that there's less work to do when implementing
 SubX in SubX. But there _is_ one convenience: instructions can provide a
 string literal surrounded by quotes (`"`) in an `imm32` argument. SubX will
 transparently copy it to the `data` segment and replace it with its address.
 Strings are the only place where a SubX word is allowed to contain spaces.
 That should be enough information for writing SubX programs. The `apps/`
 directory provides some fodder for practice in the `apps/ex*.subx` files,
 giving a more gradual introduction to SubX features. In particular, you should
 work through `apps/factorial4.subx`, which demonstrates all the above ideas in
 concert.
 This repo includes binaries for all examples. At any commit, an example's
 binary should be identical bit for bit with the result of translating the
 corresponding `.subx` file. The binary should also be natively runnable on a
 Linux system running on Intel x86 processors, either 32- or 64-bit. If either
 of these invariants is broken it's a bug on my part.
 ## Running
 `bootstrap` currently has the following sub-commands:
 - `bootstrap help`: some helpful documentation to have at your fingertips.
 - `bootstrap test`: runs all automated tests.
 - `bootstrap translate <input files> -o <output ELF binary>`: translates `.subx`
  files into an executable ELF binary.
 - `bootstrap run <ELF binary> <args>`: simulates running the ELF binaries emitted
  by `bootstrap translate`. Useful for testing and debugging.
  Remember, not all 32-bit Linux binaries are guaranteed to run. I'm not
  building general infrastructure here for all of the x86 instruction set.
  SubX is about programming with a small, regular subset of 32-bit x86.
 ## Getting your editor set up
 If you've read this far, it's time to set up your editor. Mu is really
 intended to be read interactively rather than on a browser.
 There is rudimentary syntax highlighting support for Mu and SubX files for
 various editors. Look for your editor in `mu.*` and `subx.*`, and follow the
 instructions within.
 The Vim files are most developed. In particular, I recommend some optional
 setup in subx.vim to use multiple colors for comments.
 If you use [Exuberant Ctags](http://ctags.sourceforge.net) for jumping easily
 from names to their definitions in your editor, copy the contents of `exuberant_ctags_rc`
 into your `.ctags` file.
 ## A few hints for debugging
 Writing programs in SubX is surprisingly pleasant and addictive. Reading
 programs is a work in progress, and hopefully the extensive unit tests help.
 However, _debugging_ programs is where one really faces up to the low-level
 nature of SubX. Even the smallest modifications need testing to make sure they
 work. In my experience, there is no modification so small that I get it working
 on the first attempt. And when it doesn't work, there are no clear error
 messages. Machine code is too simple-minded for that. You can't use a debugger,
 since SubX's simplistic ELF binaries contain no debugging information. So
 debugging requires returning to basics and practicing with a new, more
 rudimentary but hopefully still workable toolkit:
 - Start by nailing down a concrete set of steps for reproducibly obtaining the
  error or erroneous behavior.
 - If possible, turn the steps into a failing test. It's not always possible,
  but SubX's primary goal is to keep improving the variety of tests one can
  write.
 - Start running the single failing test alone. This involves modifying the top
  of the program (or the final `.subx` file passed in to `bootstrap translate`) by
  replacing the call to `run-tests` with a call to the appropriate `test-`
  function.
 - Generate a trace for the failing test while running your program in emulated
  mode (`bootstrap run`):
  ```
  $ ./bootstrap translate input.subx -o binary
  $ ./bootstrap --trace run binary arg1 arg2  2>trace
  ```
  The ability to generate a trace is the essential reason for the existence of
  `bootstrap run` mode. It gives far better visibility into program internals than
  running natively.
 - As a further refinement, it is possible to render label names in the trace
  by adding a second flag to the `bootstrap translate` command:
  ```
  $ ./bootstrap --debug translate input.subx -o binary
  $ ./bootstrap --trace run binary arg1 arg2  2>trace
  ```
  `bootstrap --debug translate` emits a mapping from label to address in a file
  called `labels`. `bootstrap --trace run` reads in the `labels` file if
  it exists and prints out any matching label name as it traces each instruction
  executed.
  Here's a sample of what a trace looks like, with a few boxes highlighted:
  <img alt='trace example' src='html/trace.png'>
  Each of the green boxes shows the trace emitted for a single instruction.
  It starts with a line of the form `run: inst: ___` followed by the opcode
  for the instruction, the state of registers before the instruction executes,
  and various other facts deduced during execution. Some instructions first
  print a matching label. In the above screenshot, the red boxes show that
  address `0x0900005e` maps to label `$loop` and presumably marks the start of
  some loop. Function names get similar `run: == label` lines.
 - One trick when emitting traces with labels:
  ```
  $ grep label trace
  ```
  This is useful for quickly showing you the control flow for the run, and the
  function executing when the error occurred. I find it useful to start with
  this information, only looking at the complete trace after I've gotten
  oriented on the control flow. Did it get to the loop I just modified? How
  many times did it go through the loop?
 - Once you have SubX displaying labels in traces, it's a short step to modify
  the program to insert more labels just to gain more insight. For example,
  consider the following function:
  <img alt='control example -- before' src='html/control0.png'>
  This function contains a series of jump instructions. If a trace shows
  `is-hex-lowercase-byte?` being encountered, and then `$is-hex-lowercase-byte?:end`
  being encountered, it's still ambiguous what happened. Did we hit an early
  exit, or did we execute all the way through? To clarify this, add temporary
  labels after each jump:
  <img alt='control example -- after' src='html/control1.png'>
  Now the trace should have a lot more detail on which of these labels was
  reached, and precisely when the exit was taken.
 - If you find yourself wondering, "when did the contents of this memory
  address change?", `bootstrap run` has some rudimentary support for _watch
  points_. Just insert a label starting with `$watch-` before an instruction
  that writes to the address, and its value will start getting dumped to the
  trace after every instruction thereafter.
 - Once we have a sense for precisely which instructions we want to look at,
  it's time to look at the trace as a whole. Key is the state of registers
  before each instruction. If a function is receiving bad arguments it becomes
  natural to inspect what values were pushed on the stack before calling it,
  tracing back further from there, and so on.
  I occasionally want to see the precise state of the stack segment, in which
  case I uncomment a commented-out call to `dump_stack()` in the `vm.cc`
  layer. It makes the trace a lot more verbose and a lot less dense, necessitating
  a lot more scrolling around, so I keep it turned off most of the time.
 - If the trace seems overwhelming, try [browsing it](https://github.com/akkartik/mu/blob/master/tools/browse_trace.readme.md)
  in the 'time-travel debugger'.
 - Don't be afraid to slice and dice the trace using Unix tools. For example,
  say you have a SubX binary that dies while running tests. You can see what
  test it's segfaulting at by compiling it with debug information using
  `./translate_subx_debug`, and then running:
  ```
  ./bootstrap --trace --dump run a.elf test 2>&1 |grep 'label test'
  ```
  Just read out the last test printed out before the segfault.
 Hopefully these hints are enough to get you started. The main thing to
 remember is to not be afraid of modifying the sources. A good debugging
 session gets into a nice rhythm of generating a trace, staring at it for a
 while, modifying the sources, regenerating the trace, and so on. Email
 [me](mailto:mu@akkartik.com) if you'd like another pair of eyes to stare at a
 trace, or if you have questions or complaints.
 ## Reference documentation on available primitives
 ### Data Structures
 - Kernel strings: null-terminated regions of memory. Unsafe and to be avoided,
  but needed for interacting with the kernel.
 - Arrays: length-prefixed regions of memory containing multiple elements of a
  single type. Contents are preceded by 4 bytes (32 bits) containing the
  `length` of the array in bytes.
 - Slices: a pair of 32-bit addresses denoting a [half-open](https://en.wikipedia.org/wiki/Interval_(mathematics))
  \[`start`, `end`) interval to live memory with a consistent lifetime.
  Invariant: `start` <= `end`
 - Streams: strings prefixed by 32-bit `write` and `read` indexes that the next
  write or read goes to, respectively.
  - offset 0: write index
  - offset 4: read index
  - offset 8: length of array (in bytes)
  - offset 12: start of array data
  Invariant: 0 <= `read` <= `write` <= `length`
 - File descriptors (fd): Low-level 32-bit integers that the kernel uses to
  track files opened by the program.
 - File: 32-bit value containing either a fd or an address to a stream (fake
  file).
 - Buffered files (buffered-file): Contain a file descriptor and a stream for
  buffering reads/writes. Each `buffered-file` must exclusively perform either
  reads or writes.
 ### 'system calls'
 As I said at the top, a primary design goal of SubX (and Mu more broadly) is
 to explore ways to turn arbitrary manual tests into reproducible automated
 tests. SubX aims for this goal by baking testable interfaces deep into the
 stack, at the OS syscall level. The idea is that every syscall that interacts
 with hardware (and so the environment) should be *dependency injected* so that
 it's possible to insert fake hardware in tests.
 But those are big goals. Here are the syscalls I have so far:
 - `write`: takes two arguments, a file `f` and an address to array `s`.
  Comparing this interface with the Unix `write()` syscall shows two benefits:
  1. SubX can handle 'fake' file descriptors in tests.
  1. `write()` accepts buffer and its length in separate arguments, which
     requires callers to manage the two separately and so can be error-prone.
     SubX's wrapper keeps the two together to increase the chances that we
     never accidentally go out of array bounds.
 - `read`: takes two arguments, a file `f` and an address to stream `s`. Reads
  as much data from `f` as can fit in (the free space of) `s`.
  Like with `write()`, this wrapper around the Unix `read()` syscall adds the
  ability to handle 'fake' file descriptors in tests, and reduces the chances
  of clobbering outside array bounds.
  One bit of weirdness here: in tests we do a redundant copy from one stream
  to another. See [the comments before the implementation](http://akkartik.github.io/mu/html/060read.subx.html)
  for a discussion of alternative interfaces.
 - `stop`: takes two arguments:
  - `ed` is an address to an _exit descriptor_. Exit descriptors allow us to
    `exit()` the program in production, but return to the test harness within
    tests. That allows tests to make assertions about when `exit()` is called.
  - `value` is the status code to `exit()` with.
  For more details on exit descriptors and how to create one, see [the
  comments before the implementation](http://akkartik.github.io/mu/html/059stop.subx.html).
 - `new-segment`
  Allocates a whole new segment of memory for the program, discontiguous with
  both existing code and data (heap) segments. Just a more opinionated form of
  [`mmap`](http://man7.org/linux/man-pages/man2/mmap.2.html).
 - `allocate`: takes two arguments, an address to allocation-descriptor `ad`
  and an integer `n`
  Allocates a contiguous range of memory that is guaranteed to be exclusively
  available to the caller. Returns the starting address to the range in `eax`.
  An allocation descriptor tracks allocated vs available addresses in some
  contiguous range of memory. The int specifies the number of bytes to allocate.
  Explicitly passing in an allocation descriptor allows for nested memory
  management, where a sub-system gets a chunk of memory and further parcels it
  out to individual allocations. Particularly helpful for (surprise) tests.
 - ... _(to be continued)_
 I will continue to import syscalls over time from [the old Mu VM in the parent
 directory](https://github.com/akkartik/mu), which has experimented with
 interfaces for the screen, keyboard, mouse, disk and network.
 ### primitives built atop system calls
 _(Compound arguments are usually passed in by reference. Where the results are
 compound objects that don't fit in a register, the caller usually passes in
 allocated memory for it.)_
 #### assertions for tests
 - `check-ints-equal`: fails current test if given ints aren't equal
 - `check-stream-equal`: fails current test if stream doesn't match string
 - `check-next-stream-line-equal`: fails current test if next line of stream
  until newline doesn't match string
 #### error handling
 - `error`: takes three arguments, an exit-descriptor, a file and a string (message)
  Prints out the message to the file and then exits using the provided
  exit-descriptor.
 - `error-byte`: like `error` but takes an extra byte value that it prints out
  at the end of the message.
 #### predicates
 - `kernel-string-equal?`: compares a kernel string with a string
 - `string-equal?`: compares two strings
 - `stream-data-equal?`: compares a stream with a string
 - `next-stream-line-equal?`: compares with string the next line in a stream, from
  `read` index to newline
 - `slice-empty?`: checks if the `start` and `end` of a slice are equal
 - `slice-equal?`: compares a slice with a string
 - `slice-starts-with?`: compares the start of a slice with a string
 - `slice-ends-with?`: compares the end of a slice with a string
 #### writing to disk
 - `write`: string -> file
  - Can also be used to cat a string into a stream.
  - Will abort the entire program if destination is a stream and doesn't have
    enough room.
 - `write-stream`: stream -> file
  - Can also be used to cat one stream into another.
  - Will abort the entire program if destination is a stream and doesn't have
    enough room.
 - `write-slice`: slice -> stream
  - Will abort the entire program if there isn't enough room in the
    destination stream.
 - `append-byte`: int -> stream
  - Will abort the entire program if there isn't enough room in the
    destination stream.
 - `append-byte-hex`: int -> stream
  - textual representation in hex, no '0x' prefix
  - Will abort the entire program if there isn't enough room in the
    destination stream.
 - `print-int32`: int -> stream
  - textual representation in hex, including '0x' prefix
  - Will abort the entire program if there isn't enough room in the
    destination stream.
 - `write-buffered`: string -> buffered-file
 - `write-slice-buffered`: slice -> buffered-file
 - `flush`: buffered-file
 - `write-byte-buffered`: int -> buffered-file
 - `print-byte-buffered`: int -> buffered-file
  - textual representation in hex, no '0x' prefix
 - `print-int32-buffered`: int -> buffered-file
  - textual representation in hex, including '0x' prefix
 #### reading from disk
 - `read`: file -> stream
  - Can also be used to cat one stream into another.
  - Will silently stop reading when destination runs out of space.
 - `read-byte-buffered`: buffered-file -> byte
 - `read-line-buffered`: buffered-file -> stream
  - Will abort the entire program if there isn't enough room.
 #### non-IO operations on streams
 - `new-stream`: allocates space for a stream of `n` elements, each occupying
  `b` bytes.
  - Will abort the entire program if `n*b` requires more than 32 bits.
 - `clear-stream`: resets everything in the stream to `0` (except its `length`).
 - `rewind-stream`: resets the read index of the stream to `0` without modifying
  its contents.
 #### reading/writing hex representations of integers
 - `is-hex-int?`: takes a slice argument, returns boolean result in `eax`
 - `parse-hex-int`: takes a slice argument, returns int result in `eax`
 - `is-hex-digit?`: takes a 32-bit word containing a single byte, returns
  boolean result in `eax`.
 - `from-hex-char`: takes a hexadecimal digit character in `eax`, returns its
  numeric value in `eax`
 - `to-hex-char`: takes a single-digit numeric value in `eax`, returns its
  corresponding hexadecimal character in `eax`
 #### tokenization
 from a stream:
 - `next-token`: stream, delimiter byte -> slice
 - `skip-chars-matching`: stream, delimiter byte
 - `skip-chars-not-matching`: stream, delimiter byte
 from a slice:
 - `next-token-from-slice`: start, end, delimiter byte -> slice
  - Given a slice and a delimiter byte, returns a new slice inside the input
    that ends at the delimiter byte.
 - `skip-chars-matching-in-slice`: curr, end, delimiter byte -> new-curr (in `eax`)
 - `skip-chars-not-matching-in-slice`:  curr, end, delimiter byte -> new-curr (in `eax`)
 ## Resources
 - [Single-page cheatsheet for the x86 ISA](https://net.cs.uni-bonn.de/fileadmin/user_upload/plohmann/x86_opcode_structure_and_instruction_overview.pdf)
  (pdf; [cached local copy](https://github.com/akkartik/mu/blob/master/cheatsheet.pdf))
 - [Concise reference for the x86 ISA](https://c9x.me/x86)
 - [Intel processor manual](http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf) (pdf)
 - [&ldquo;Bootstrapping a compiler from nothing&rdquo;](http://web.archive.org/web/20061108010907/http://www.rano.org/bcompiler.html) by Edmund Grumley-Evans.
 - [&ldquo;Creating tiny ELF executables&rdquo;](https://www.muppetlabs.com/~breadbox/software/tiny/teensy.html) by Brian Raiter.
 - [StoneKnifeForth](https://github.com/kragen/stoneknifeforth) by [Kragen Sitaker](http://canonical.org/~kragen).
 ## Other forks
 Forks of Mu are encouraged. If you don't like something about this repo, feel
 free to make a fork. If you show it to me, I'll link to it here, so others can
@ -726,64 +145,34 @@ use it. I might even pull your changes into this repo!
 - [mu-normie](https://git.sr.ht/~akkartik/mu-normie): with a more standard
  build system and C++ modules.
-## Conclusion
+## Desiderata
 The hypothesis of Mu and SubX is that designing the entire system to be
 testable from day 1 and from the ground up would radically impact the culture
 of the eco-system in a way that no bolted-on tool or service at higher levels
 can replicate:
 - Tests would make it easier to write programs that can be easily understood
  by newcomers.
 - More broad-based understanding would lead to more forks.
 - Tests would make it easy to share code across forks. Copy the tests over,
  and then copy code over and polish it until the tests pass. Manual work, but
  tractable and without major risks.
 - The community would gain a diversified portfolio of forks for each program,
  a “wavefront” of possible combinations of features and alternative
  implementations of features. Application writers who wrote thorough tests
  for their apps (something they just can’t do today) would be able to bounce
  around between forks more easily without getting locked in to a single one
  as currently happens.
 - There would be a stronger culture of reviewing the code for programs you use
  or libraries you depend on. [More eyeballs would make more bugs shallow.](https://en.wikipedia.org/wiki/Linus%27s_Law)
 To falsify these hypotheses, here's a roadmap of the next few planned features:
 - Testable, dependency-injected vocabulary of primitives
  - Streams: `read()`, `write()`. (✓)
  - `exit()` (✓)
  - Client-like non-blocking socket/file primitives: `load`, `save`
  - Concurrency, and a framework for testing blocking code
  - Server-like blocking socket/file primitives
 - Gradually streamline the bundled kernel, stripping away code we don't need.
 ---
 If you're still reading, here are some more things to check out:
-a) Try running the tests: `./test_apps`
+- The references on [Mu](mu.md) and [SubX](subx.md) syntax, and also [bare
  SubX](subx_bare.md) without any syntax sugar.
-b) There's a handy [summary](mu_instructions) of how the Mu compiler translates
+- [How to get your text editor set up for Mu and SubX programs.](editor.md)
 instructions to SubX.
-c) Check out the online help on SubX. Starting point: `./bootstrap`
+- [Some tips for debugging Mu and SubX programs.](debugging.md)
-d) Familiarize yourself with the list of opcodes supported in SubX: `./bootstrap
+- [Shared vocabulary of data types and functions shared by Mu programs.](vocabulary.md)
-help opcodes`. (It's also [in this repo](https://github.com/akkartik/mu/blob/master/subx_opcodes).)
+  Mu programs can transparently call low-level functions written in SubX.
 [Here](https://lobste.rs/s/qglfdp/subx_minimalist_assembly_language_for#c_o9ddqk)
 are some tips on my setup for quickly finding the right opcode for any
 situation from within Vim.
-e) Try working on [some starter SubX exercises](https://github.com/akkartik/mu/pulls)
+- [A summary](mu_instructions) of how the Mu compiler translates instructions
-(labelled `hello`).
+  to SubX. ([colorized version](http://akkartik.github.io/mu/html/mu_instructions.html))
-f) SubX comes with some useful [syntax sugar](http://akkartik.name/post/mu-2019-1).
+- [Some starter exercises for learning SubX](https://github.com/akkartik/mu/pulls)
  (labelled `hello`). Feel free to [ping me](mailto:ak@akkartik.com) with any questions.
 - [Commandline reference for the bootstrap C++ program.](bootstrap.md)
 - The [list of x86 opcodes](subx_opcodes) supported in SubX: `./bootstrap
  help opcodes`.
 - [Some details on the unconventional organization of this project.](http://akkartik.name/post/four-repos)
 - Previous prototypes: [mu0](https://github.com/akkartik/mu0), [mu1](https://github.com/akkartik/mu1).
 ## Credits
@ -795,36 +184,15 @@ Mu builds on many ideas that have come before, especially:
  and [Richard Gabriel](http://dreamsongs.net/Files/PatternsOfSoftware.pdf) for
  the intellectual tools for reasoning about the higher order design of a
  codebase;
 - Unix and C for showing us how to co-evolve language and OS, and for teaching
  the (much maligned, misunderstood and underestimated) value of concise
  *implementation* in addition to a clean interface;
 - Donald Knuth's [literate programming](http://www.literateprogramming.com/knuthweb.pdf)
  for liberating "code for humans to read" from the tyranny of compiler order;
 - [David Parnas](http://www.cs.umd.edu/class/spring2003/cmsc838p/Design/criteria.pdf)
  and others for highlighting the value of separating concerns and stepwise
  refinement;
 - [Lisp](http://www.paulgraham.com/rootsoflisp.html) for showing the power of
  dynamic languages, late binding and providing the right primitives *a la
  carte*, especially lisp macros;
 - The folklore of debugging by print and the trace facility in many lisp
  systems;
 - Automated tests for showing the value of developing programs inside an
  elaborate harness;
 - [Python doctest](http://docs.python.org/2/library/doctest.html) for
  exemplifying interactive documentation that doubles as tests;
 - [ReStructuredText](https://en.wikipedia.org/wiki/ReStructuredText)
  and [its antecedents](https://en.wikipedia.org/wiki/Setext) for showing that
  markup can be clean;
 - BDD for challenging us all to write tests at a higher level;
 - JavaScript and CSS for demonstrating the power of a DOM for complex
  structured documents;
 - Rust for demonstrating that a system-programming language can be safe;
 - Forth for demonstrating that ergonomics don't require grammar; and
 - [Minimal Linux Live](http://minimal.linux-bg.org) for teaching how to create
  a bootable disk image.
- [Soso](https://github.com/ozkl/soso), a tiny hackable OS.
+- [&ldquo;Bootstrapping a compiler from nothing&rdquo;](http://web.archive.org/web/20061108010907/http://www.rano.org/bcompiler.html) by Edmund Grumley-Evans.
-
+- [&ldquo;Creating tiny ELF executables&rdquo;](https://www.muppetlabs.com/~breadbox/software/tiny/teensy.html) by Brian Raiter.
-## Coda
+- [StoneKnifeForth](https://github.com/kragen/stoneknifeforth) by [Kragen Sitaker](http://canonical.org/~kragen).
 - [Some details on the unconventional organization of this project.](http://akkartik.name/post/four-repos)
 - Previous prototypes: [mu0](https://github.com/akkartik/mu0), [mu1](https://github.com/akkartik/mu1).
--- a/apps/README.md
+++ b/apps/README.md
@ -9,3 +9,9 @@ Some apps written in SubX and Mu, in 3 categories:
  * More ambitious translator for a memory-safe language (in progress): `mu`
 * Miscellaneous test programs.
 All SubX apps include binaries. At any commit, an example's binary should be
 identical bit for bit with the result of translating the corresponding `.subx`
 file. The binary should also be natively runnable on a Linux system running on
 Intel x86 processors, either 32- or 64-bit. If either of these invariants is
 broken, it's a bug.
--- a/apps/ex2
+++ b/apps/ex2
--- a/apps/ex2.2.mu
+++ b/apps/ex2.2.mu
@ -1,19 +0,0 @@
 # Increment a number, and return the result in the exit code.
 #
 # To run:
 #   $ ./translate_mu apps/ex2.2.mu
 #   $ ./a.elf
 # Expected result:
 #   $ echo $?
 #   7
 fn main -> result/ebx: int {
  result <- foo
 }
 fn foo -> result/ebx: int {
  var n: int
  copy-to n, 3
  increment n
  result <- copy n
 }
--- a/apps/ex2.mu
+++ b/apps/ex2.mu
@ -1,4 +1,4 @@
-# Add two numbers, and return the result in the exit code.
+# Add 3 and 4, and return the result in the exit code.
 #
 # To run:
 #   $ ./translate_mu apps/ex2.mu
--- a/apps/ex2.subx
+++ b/apps/ex2.subx
@ -1,4 +1,4 @@
-# Add 1 and 1, and return the result in the exit code.
+# Add 3 and 4, and return the result in the exit code.
 #
 # To run:
 #   $ ./bootstrap translate init.linux apps/ex2.subx -o apps/ex2
@ -10,10 +10,10 @@
 == code
 Entry:
-# ebx = 1
+# ebx = 3
-bb/copy-to-ebx  1/imm32
+bb/copy-to-ebx  3/imm32
-# increment ebx
+# add 4 to ebx
-43/increment-ebx
+81 0/subop/add 3/mod/direct 3/rm32/ebx 4/imm32
 # exit(ebx)
 e8/call  syscall_exit/disp32
--- a/bootstrap.md
+++ b/bootstrap.md
@ -0,0 +1,17 @@
 ## Running
 `bootstrap` currently has the following sub-commands:
 - `bootstrap help`: some helpful documentation to have at your fingertips.
 - `bootstrap test`: runs all automated tests.
 - `bootstrap translate <input files> -o <output ELF binary>`: translates `.subx`
  files into an executable ELF binary.
 - `bootstrap run <ELF binary> <args>`: simulates running the ELF binaries emitted
  by `bootstrap translate`. Useful for testing and debugging.
  Remember, not all 32-bit Linux binaries are guaranteed to run. I'm not
  building general infrastructure here for all of the x86 instruction set.
  SubX is about programming with a small, regular subset of 32-bit x86.
--- a/debugging.md
+++ b/debugging.md
@ -0,0 +1,122 @@
 ## A few hints for debugging
 Writing programs in SubX is surprisingly pleasant and addictive. Reading
 programs is a work in progress, and hopefully the extensive unit tests help.
 However, _debugging_ programs is where one really faces up to the low-level
 nature of SubX. Even the smallest modifications need testing to make sure they
 work. In my experience, there is no modification so small that I get it working
 on the first attempt. And when it doesn't work, there are no clear error
 messages. Machine code is too simple-minded for that. You can't use a debugger,
 since SubX's simplistic ELF binaries contain no debugging information. So
 debugging requires returning to basics and practicing with a new, more
 rudimentary but hopefully still workable toolkit:
 - Start by nailing down a concrete set of steps for reproducibly obtaining the
  error or erroneous behavior.
 - If possible, turn the steps into a failing test. It's not always possible,
  but SubX's primary goal is to keep improving the variety of tests one can
  write.
 - Start running the single failing test alone. This involves modifying the top
  of the program (or the final `.subx` file passed in to `bootstrap translate`) by
  replacing the call to `run-tests` with a call to the appropriate `test-`
  function.
 - Generate a trace for the failing test while running your program in emulated
  mode (`bootstrap run`):
  ```
  $ ./bootstrap translate input.subx -o binary
  $ ./bootstrap --trace run binary arg1 arg2  2>trace
  ```
  The ability to generate a trace is the essential reason for the existence of
  `bootstrap run` mode. It gives far better visibility into program internals than
  running natively.
 - As a further refinement, it is possible to render label names in the trace
  by adding a second flag to the `bootstrap translate` command:
  ```
  $ ./bootstrap --debug translate input.subx -o binary
  $ ./bootstrap --trace run binary arg1 arg2  2>trace
  ```
  `bootstrap --debug translate` emits a mapping from label to address in a file
  called `labels`. `bootstrap --trace run` reads in the `labels` file if
  it exists and prints out any matching label name as it traces each instruction
  executed.
  Here's a sample of what a trace looks like, with a few boxes highlighted:
  <img alt='trace example' src='html/trace.png'>
  Each of the green boxes shows the trace emitted for a single instruction.
  It starts with a line of the form `run: inst: ___` followed by the opcode
  for the instruction, the state of registers before the instruction executes,
  and various other facts deduced during execution. Some instructions first
  print a matching label. In the above screenshot, the red boxes show that
  address `0x0900005e` maps to label `$loop` and presumably marks the start of
  some loop. Function names get similar `run: == label` lines.
 - One trick when emitting traces with labels:
  ```
  $ grep label trace
  ```
  This is useful for quickly showing you the control flow for the run, and the
  function executing when the error occurred. I find it useful to start with
  this information, only looking at the complete trace after I've gotten
  oriented on the control flow. Did it get to the loop I just modified? How
  many times did it go through the loop?
 - Once you have SubX displaying labels in traces, it's a short step to modify
  the program to insert more labels just to gain more insight. For example,
  consider the following function:
  <img alt='control example -- before' src='html/control0.png'>
  This function contains a series of jump instructions. If a trace shows
  `is-hex-lowercase-byte?` being encountered, and then `$is-hex-lowercase-byte?:end`
  being encountered, it's still ambiguous what happened. Did we hit an early
  exit, or did we execute all the way through? To clarify this, add temporary
  labels after each jump:
  <img alt='control example -- after' src='html/control1.png'>
  Now the trace should have a lot more detail on which of these labels was
  reached, and precisely when the exit was taken.
 - If you find yourself wondering, "when did the contents of this memory
  address change?", `bootstrap run` has some rudimentary support for _watch
  points_. Just insert a label starting with `$watch-` before an instruction
  that writes to the address, and its value will start getting dumped to the
  trace after every instruction thereafter.
 - Once we have a sense for precisely which instructions we want to look at,
  it's time to look at the trace as a whole. Key is the state of registers
  before each instruction. If a function is receiving bad arguments it becomes
  natural to inspect what values were pushed on the stack before calling it,
  tracing back further from there, and so on.
  I occasionally want to see the precise state of the stack segment, in which
  case I uncomment a commented-out call to `dump_stack()` in the `vm.cc`
  layer. It makes the trace a lot more verbose and a lot less dense, necessitating
  a lot more scrolling around, so I keep it turned off most of the time.
 - If the trace seems overwhelming, try [browsing it](https://github.com/akkartik/mu/blob/master/tools/browse_trace.readme.md)
  in the 'time-travel debugger'.
 - Don't be afraid to slice and dice the trace using Unix tools. For example,
  say you have a SubX binary that dies while running tests. You can see what
  test it's segfaulting at by compiling it with debug information using
  `./translate_subx_debug`, and then running:
  ```
  ./bootstrap --trace --dump run a.elf test 2>&1 |grep 'label test'
  ```
  Just read out the last test printed out before the segfault.
 Hopefully these hints are enough to get you started. The main thing to
 remember is to not be afraid of modifying the sources. A good debugging
 session gets into a nice rhythm of generating a trace, staring at it for a
 while, modifying the sources, regenerating the trace, and so on. Email
 [me](mailto:mu@akkartik.com) if you'd like another pair of eyes to stare at a
 trace, or if you have questions or complaints.
--- a/editor.md
+++ b/editor.md
@ -0,0 +1,19 @@
 ## Getting your editor set up
 If you've read this far, it's time to set up your editor. Mu is really
 intended to be read interactively rather than on a browser.
 There is rudimentary syntax highlighting support for Mu and SubX files for
 various editors. Look for your editor in `mu.*` and `subx.*`, and follow the
 instructions within.
 The Vim files are most developed. In particular, I recommend some optional
 setup in subx.vim to use multiple colors for comments.
 If you use [Exuberant Ctags](http://ctags.sourceforge.net) for jumping easily
 from names to their definitions in your editor, copy the contents of `exuberant_ctags_rc`
 into your `.ctags` file.
 [Here](https://lobste.rs/s/qglfdp/subx_minimalist_assembly_language_for#c_o9ddqk)
 are some tips on my setup for quickly finding the right opcode for any
 situation from within Vim.
--- a/html/apps/ex2.2.mu.html
+++ b/html/apps/ex2.2.mu.html
@ -1,81 +0,0 @@
 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
 <html>
 <head>
 <meta http-equiv="content-type" content="text/html; charset=UTF-8">
 <title>Mu - apps/ex2.2.mu</title>
 <meta name="Generator" content="Vim/8.1">
 <meta name="plugin-version" content="vim8.1_v1">
 <meta name="syntax" content="none">
 <meta name="settings" content="number_lines,use_css,pre_wrap,no_foldcolumn,expand_tabs,line_ids,prevent_copy=">
 <meta name="colorscheme" content="minimal-light">
 <style type="text/css">
 <!--
 pre { white-space: pre-wrap; font-family: monospace; color: #000000; background-color: #c6c6c6; }
 body { font-size:12pt; font-family: monospace; color: #000000; background-color: #c6c6c6; }
 a { color:inherit; }
 * { font-size:12pt; font-size: 1em; }
 .LineNr { }
 .muFunction { color: #af5f00; text-decoration: underline; }
 .SpecialChar { color: #d70000; }
 .Comment { color: #005faf; }
 .Constant { color: #008787; }
 .Delimiter { color: #c000c0; }
 .PreProc { color: #c000c0; }
 -->
 </style>
 <script type='text/javascript'>
 <!--
 /* function to open any folds containing a jumped-to line before jumping to it */
 function JumpToLine()
 {
  var lineNum;
  lineNum = window.location.hash;
  lineNum = lineNum.substr(1); /* strip off '#' */
  if (lineNum.indexOf('L') == -1) {
    lineNum = 'L'+lineNum;
  }
  var lineElem = document.getElementById(lineNum);
  /* Always jump to new location even if the line was hidden inside a fold, or
   * we corrected the raw number to a line ID.
   */
  if (lineElem) {
    lineElem.scrollIntoView(true);
  }
  return true;
 }
 if ('onhashchange' in window) {
  window.onhashchange = JumpToLine;
 }
 -->
 </script>
 </head>
 <body onload='JumpToLine();'>
 <a href='https://github.com/akkartik/mu/blob/master/apps/ex2.2.mu'>https://github.com/akkartik/mu/blob/master/apps/ex2.2.mu</a>
 <pre id='vimCodeElement'>
 <span id="L1" class="LineNr"> 1 </span><span class="Comment"># Increment a number, and return the result in the exit code.</span>
 <span id="L2" class="LineNr"> 2 </span><span class="Comment">#</span>
 <span id="L3" class="LineNr"> 3 </span><span class="Comment"># To run:</span>
 <span id="L4" class="LineNr"> 4 </span><span class="Comment">#   $ ./translate_mu apps/ex2.2.mu</span>
 <span id="L5" class="LineNr"> 5 </span><span class="Comment">#   $ ./a.elf</span>
 <span id="L6" class="LineNr"> 6 </span><span class="Comment"># Expected result:</span>
 <span id="L7" class="LineNr"> 7 </span><span class="Comment">#   $ echo $?</span>
 <span id="L8" class="LineNr"> 8 </span><span class="Comment">#   7</span>
 <span id="L9" class="LineNr"> 9 </span>
 <span id="L10" class="LineNr">10 </span><span class="PreProc">fn</span> <span class="muFunction">main</span><span class="PreProc"> -&gt; </span>result/<span class="Constant">ebx</span>: int <span class="Delimiter">{</span>
 <span id="L11" class="LineNr">11 </span>  result <span class="SpecialChar">&lt;-</span> <a href='ex2.2.mu.html#L14'>foo</a>
 <span id="L12" class="LineNr">12 </span><span class="Delimiter">}</span>
 <span id="L13" class="LineNr">13 </span>
 <span id="L14" class="LineNr">14 </span><span class="PreProc">fn</span> <span class="muFunction"><a href='ex2.2.mu.html#L14'>foo</a></span><span class="PreProc"> -&gt; </span>result/<span class="Constant">ebx</span>: int <span class="Delimiter">{</span>
 <span id="L15" class="LineNr">15 </span>  <span class="PreProc">var</span> n: int
 <span id="L16" class="LineNr">16 </span>  copy-to n, <span class="Constant">3</span>
 <span id="L17" class="LineNr">17 </span>  increment n
 <span id="L18" class="LineNr">18 </span>  result <span class="SpecialChar">&lt;-</span> copy n
 <span id="L19" class="LineNr">19 </span><span class="Delimiter">}</span>
 </pre>
 </body>
 </html>
 <!-- vim: set foldmethod=manual : -->
--- a/mu.md
+++ b/mu.md
@ -0,0 +1,456 @@
 # Mu Syntax
 Here are two valid statements in Mu:
 ```
 increment x
 y <- increment
 ```
 Understanding when to use one vs the other is the critical idea in Mu. In
 short, the former increments a value in memory, while the latter increments a
 value in a register.
 Most languages start from some syntax and do what it takes to implement it.
 Mu, however, is designed as a safe way to program in [a regular subset of
 32-bit x86 machine code](subx.md), _satisficing_ rather than optimizing for a
 clean syntax. To keep the mapping to machine code lightweight, Mu exclusively
 uses statements and most statements map to a single instruction of machine
 code.
 Since the x86 instruction set restricts how many memory locations an instruction
 can use, Mu makes registers explicit as well. Variables must be explicitly
 mapped to registers; otherwise they live in memory.
 Statements consist of 3 parts: the operation, optional _inouts_ and optional
 _outputs_. Outputs come before the operation name and `<-`.
 Outputs are always registers; memory locations that need to be modified are
 passed in by reference using the `addr` type described below.
 So Mu programmers need to make two new categories of decisions: whether to
 define variables in registers or memory, and whether to put variables to the
 left or right. There's always exactly one way to write any given operation. In
 return for this overhead you get a lightweight and future-proof stack. And Mu
 will provide good error message to support you.
 Further down, this page enumerates all available primitives in Mu, and [a
 separate page](http://akkartik.github.io/mu/html/mu_instructions.html)
 describes how each primitive is translated to machine code.
 ## Functions and calls
 Zooming out from single statements, here's a complete sample program in Mu:
 <img alt='ex2.mu' src='html/ex2.mu.png'>
 Mu programs are lists of functions. Each function has the following form:
  ```
  fn _name_ inout ... -> output ... {
    _statement_
    _statement_
    ...
  }
  ```
 Each function has a header line, and some number of statements, each on a
 separate line. Headers describe inouts and outputs. Inouts can't be registers,
 and outputs _must_ be registers.
 The above program also demonstrates a function call (to the function `do-add`).
 Function calls look the same as primitive statements: they can return (multiple)
 outputs in registers, and modify inouts passed in by reference. In addition,
 there's one more constraint: output registers must match the function header.
 For example:
  ```
  fn f -> x/eax: int {
    ...
  }
  fn g {
    a/eax <- f  # ok
    a/ebx <- f  # wrong
  }
  ```
 The function `main` is special; it is where the program starts running. It
 must always return a single int in register `ebx` (as the exit status of the
 process). It can also optionally accept an array of strings as input (from the
 shell command-line). To be precise, `main` must have one of the following
 two signatures:
 - `fn main -> x/ebx: int`
 - `fn main args: (addr array (addr array byte)) -> x/ebx: int`
 (The name of the output is flexible.)
 ## Blocks
 Blocks are useful for grouping related statements. They're delimited by `{`
 and `}`, both each alone on a line.
 Blocks can nest:
  ```
  {
    _statements_
    {
      _more statements_
    }
  }
  ```
 Blocks can be named (with the name ending in a `:` on the same line as the
 `{`):
  ```
  $name: {
    _statements_
  }
  ```
 Further down we'll see primitive statements for skipping or repeating blocks.
 Besides control flow, the other use for blocks is...
 ## Local variables
 Functions can define new variables at any time with the keyword `var`. There
 are two variants of the `var` statement, for defining variables in registers
 or memory.
  ```
  var name: type
  var name/reg: type <- ...
  ```
 Variables on the stack are never initialized. (They're always implicitly
 zeroed them out.) Variables in registers are always initialized.
 Variables exist from their definition until the end of their containing block.
 Register variables may also die earlier if their register is clobbered by a
 new variable.
 ## Arithmetic primitives
 Here is the list of arithmetic primitive operations supported by Mu. The name
 `n` indicates a literal integer rather than a variable, and `var/reg` indicates
 a variable in a register.
  ```
  var/reg <- increment
  increment var
  var/reg <- decrement
  decrement var
  var1/reg1 <- add var2/reg2
  var/reg <- add var2
  add-to var1, var2/reg
  var/reg <- add n
  add-to var, n
  var1/reg1 <- sub var2/reg2
  var/reg <- sub var2
  sub-from var1, var2/reg
  var/reg <- sub n
  sub-from var, n
  var1/reg1 <- and var2/reg2
  var/reg <- and var2
  and-with var1, var2/reg
  var/reg <- and n
  and-with var, n
  var1/reg1 <- or var2/reg2
  var/reg <- or var2
  or-with var1, var2/reg
  var/reg <- or n
  or-with var, n
  var1/reg1 <- xor var2/reg2
  var/reg <- xor var2
  xor-with var1, var2/reg
  var/reg <- xor n
  xor-with var, n
  var/reg <- copy var2/reg2
  copy-to var1, var2/reg
  var/reg <- copy var2
  var/reg <- copy n
  copy-to var, n
  compare var1, var2/reg
  compare var1/reg, var2
  compare var/eax, n
  compare var, n
  var/reg <- multiply var2
  ```
 Any statement above that takes a variable in memory can be replaced with a
 dereference (`*`) of an address variable (of type `(addr ...)`) in a register.
 (Types can have multiple words, and are wrapped in `()` when they do.) But you
 can't dereference variables in memory. You have to load them into a register
 first.
 Excluding dereferences, the above statements must operate on non-address
 primitive types: `int` or `boolean`. (Booleans are really just `int`s, and Mu
 assumes any value but `0` is true.)
 ## Operating on individual bytes
 A special-case is variables of type 'byte'. Mu is a 32-bit platform so for the
 most part only supports types that are multiples of 32 bits. However, we do
 want to support strings in ASCII and UTF-8, which will be arrays of bytes.
 Since most x86 instructions implicitly load 32 bits at a time from memory,
 variables of type 'byte' are only allowed in registers, not on the stack. Here
 are the possible statements for reading bytes to/from memory:
  ```
  var/reg <- copy-byte var2/reg2      # var: byte, var2: byte
  var/reg <- copy-byte *var2/reg2     # var: byte, var2: (addr byte)
  copy-byte-to *var1/reg1, var2/reg2  # var1: (addr byte), var2: byte
  ```
 In addition, variables of type 'byte' are restricted to (the lowest bytes of)
 just 4 registers: eax, ecx, edx and ebx.
 ## Primitive jumps
 There are two kinds of jumps, both with many variations: `break` and `loop`.
 `break` instructions jump to the end of the containing block. `loop` instructions
 jump to the beginning of the containing block.
 All jumps can take an optional label starting with '$':
  ```
  loop $foo
  ```
 This instruction jumps to the beginning of the block called $foo. The corresponding
 `break` jumps to the end of the block. Either jump statement must lie somewhere
 inside such a block. Jumps are only legal to containing blocks. (Use named
 blocks with restraint; jumps to places far away can get confusing.)
 There are two unconditional jumps:
  ```
  loop
  loop label
  break
  break label
  ```
 The remaining jump instructions are all conditional. Conditional jumps rely on
 the result of the most recently executed `compare` instruction. (To keep
 programs easy to read, keep compare instructions close to the jump that uses
 them.)
  ```
  break-if-=
  break-if-= label
  break-if-!=
  break-if-!= label
  ```
 Inequalities are similar, but have unsigned and signed variants. For simplicity,
 always use signed integers; use the unsigned variants only to compare addresses.
  ```
  break-if-<
  break-if-< label
  break-if->
  break-if-> label
  break-if-<=
  break-if-<= label
  break-if->=
  break-if->= label
  break-if-addr<
  break-if-addr< label
  break-if-addr>
  break-if-addr> label
  break-if-addr<=
  break-if-addr<= label
  break-if-addr>=
  break-if-addr>= label
  ```
 Similarly, conditional loops:
  ```
  loop-if-=
  loop-if-= label
  loop-if-!=
  loop-if-!= label
  loop-if-<
  loop-if-< label
  loop-if->
  loop-if-> label
  loop-if-<=
  loop-if-<= label
  loop-if->=
  loop-if->= label
  loop-if-addr<
  loop-if-addr< label
  loop-if-addr>
  loop-if-addr> label
  loop-if-addr<=
  loop-if-addr<= label
  loop-if-addr>=
  loop-if-addr>= label
  ```
 ## Addresses
 Passing objects by reference requires the `address` operation, which returns
 an object of type `addr`.
  ```
  var/reg: (addr T) <- address var2: T
  ```
 Here `var2` can't live in a register.
 ## Array operations
 Mu arrays are size-prefixed so that operations on them can check bounds as
 necessary at run-time. The `length` statement returns the number of elements
 in an array.
  ```
  var/reg: int <- length arr/reg: (addr array T)
  ```
 The `index` statement takes an `addr` to an `array` and returns an `addr` to
 one of its elements, that can be read from or written to.
  ```
  var/reg: (addr T) <- index arr/reg: (addr array T), n
  var/reg: (addr T) <- index arr: (array T sz), n
  ```
 The index can also be a variable in a register, with a caveat:
  ```
  var/reg: (addr T) <- index arr/reg: (addr array T), idx/reg: int
  var/reg: (addr T) <- index arr: (array T sz), idx/reg: int
  ```
 The caveat: the size of T must be 1, 2, 4 or 8 bytes. The x86 instruction set
 has complex addressing modes that can index into an array in a single instruction
 in these situations.
 For types in general you'll need to split up the work, performing a `compute-offset`
 before the `index`.
  ```
  var/reg: (offset T) <- compute-offset arr: (addr array T), idx/reg: int     # arr can be in reg or mem
  var/reg: (offset T) <- compute-offset arr: (addr array T), idx: int         # arr can be in reg or mem
  ```
 The `compute-offset` statement returns a value of type `(offset T)` after
 performing any necessary bounds checking. Now the offset can be passed to
 `index` as usual:
  ```
  var/reg: (addr T) <- index arr/reg: (addr array T), idx/reg: (offset T)
  ```
 ## Compound types
 Primitive types can be combined together using the `type` keyword. For
 example:
  ```
  type point {
    x: int
    y: int
  }
  ```
 Mu programs are currently sequences of `fn` and `type` definitions.
 To access within a compound type, use the `get` instruction. There are two
 forms. You need either a variable of the type itself (say `T`) in memory, or a
 variable of type `(addr T)` in a register.
  ```
  var/reg: (addr T_f) <- get var/reg: (addr T), f
  var/reg: (addr T_f) <- get var: T, f
  ```
 The `f` here is the field name from the `type` definition, and `T_f` must
 match the type of `f` in the `type` definition. For example, some legal
 instructions for the definition of `point` above:
  ```
  var a/eax: (addr int) <- get p, x
  var a/eax: (addr int) <- get p, y
  ```
 ## Handles for safe access to the heap
 We've seen the `addr` type, but it's intended to be short-lived. In particular,
 you can't save `addr` values inside compound `type`s. To do that you need a
 "fat pointer" called a `handle` that is safe to keep around for extended
 periods and ensures it's used safely without corrupting the heap and causing
 security issues or hard-to-debug misbehavior.
 To actually _use_ a `handle`, we have to turn it into an `addr` first using
 the `lookup` statement.
  ```
  var y/reg: (addr T) <- lookup x
  ```
 Now operate on the `addr` as usual, safe in the knowledge that you can later
 recover any writes to its payload from `x`.
 It's illegal to continue to use this `addr` after a function that reclaims
 heap memory. You have to repeat the lookup from the `handle`. (Luckily Mu
 doesn't implement reclamation yet.)
 Having two kinds of addresses takes some getting used to. Do we pass in
 variables by value, by `addr` or by `handle`? In inputs or outputs? Here are 3
 rules:
  * Functions that need to look at the payload should accept an `(addr ...)`.
  * Functions that need to treat a handle as a value, without looking at its
  payload, should accept a `(handle ...)`. Helpers that save handles into data
  structures are a common example.
  * Functions that need to allocate memory should accept an `(addr handle
  ...)`.
 Try to avoid mixing these use cases.
 You can copy handles to another variable on the stack like this:
  ```
  var x: (handle T)
  # ..some code initializing x..
  var y/eax: (addr handle T) <- address ...
  copy-handle x, y
  ```
 You can also save handles inside compound types like this:
  ```
  var y/reg: (addr handle T_f) <- get var: (addr T), f
  copy-handle-to *y, x
  ```
 Or this:
  ```
  var y/reg: (addr handle T) <- index arr: (addr array handle T), n
  copy-handle-to *y, x
  ```
 ## Conclusion
 Anything not allowed here is forbidden. At least until you modify mu.subx.
 Please [contact me](mailto:ak@akkartik.com) or [report issues](https://github.com/akkartik/mu/issues)
 when you encounter a missing or misleading error message.
--- a/259
+++ b/259
@ -1,259 +0,0 @@
 Mu programs are lists of functions. Each function has the following form:
  fn _name_ _inouts_with_types_ -> _outputs_with_types_ {
    _instructions_
  }
 Each function has a header line, and some number of instructions, each on a
 separate line.
 Instructions may be primitives or function calls. Either way, all instructions
 have one of the following forms:
  # defining variables
  var _name_: _type_
  var _name_/_register_: _type_
  # doing things with variables
  _operation_ _inouts_
  _outputs_ <- _operation_ _inouts_
 Instructions and functions may have inouts and outputs. Both inouts and
 outputs are variables.
 As seen above, variables can be defined to live in a register, like this:
  n/eax
 Variables not assigned a register live in the stack.
 Function inouts must always be on the stack, and outputs must always be in
 registers. A function call must always write to the exact registers its
 definition requires. For example:
  fn foo -> x/eax: int {
    ...
  }
  fn main {
    a/eax <- foo  # ok
    a/ebx <- foo  # wrong
  }
 Primitive inouts may be on the stack or in registers, but outputs must always
 be in registers.
 Functions can contain nested blocks inside { and }. Variables defined in a
 block don't exist outside it.
  {
    _instructions_
    {
      _more instructions_
    }
  }
 Blocks can be named like so:
  $name: {
    _instructions_
  }
 ## Primitive instructions
 Primitive instructions currently supported in Mu ('n' indicates a literal
 integer rather than a variable, and 'var/reg' indicates a variable in a
 register):
  var/reg <- increment
  increment var
  var/reg <- decrement
  decrement var
  var1/reg1 <- add var2/reg2
  var/reg <- add var2
  add-to var1, var2/reg
  var/reg <- add n
  add-to var, n
  var1/reg1 <- sub var2/reg2
  var/reg <- sub var2
  sub-from var1, var2/reg
  var/reg <- sub n
  sub-from var, n
  var1/reg1 <- and var2/reg2
  var/reg <- and var2
  and-with var1, var2/reg
  var/reg <- and n
  and-with var, n
  var1/reg1 <- or var2/reg2
  var/reg <- or var2
  or-with var1, var2/reg
  var/reg <- or n
  or-with var, n
  var1/reg1 <- xor var2/reg2
  var/reg <- xor var2
  xor-with var1, var2/reg
  var/reg <- xor n
  xor-with var, n
  var/reg <- copy var2/reg2
  copy-to var1, var2/reg
  var/reg <- copy var2
  var/reg <- copy n
  copy-to var, n
  compare var1, var2/reg
  compare var1/reg, var2
  compare var/eax, n
  compare var, n
  var/reg <- multiply var2
 Notice that there are no primitive instructions operating on two variables in
 memory. That's a restriction of the underlying x86 processor.
 Any instruction above that takes a variable in memory can be replaced with a
 dereference (`*`) of an address variable in a register. But you can't dereference
 variables in memory.
 ## Byte operations
 A special-case is variables of type 'byte'. Mu is a 32-bit platform so for the
 most part only supports types that are multiples of 32 bits. However, we do
 want to support strings in ASCII and UTF-8, which will be arrays of bytes.
 Since most x86 instructions implicitly load 32 bits at a time from memory,
 variables of type 'byte' are only allowed in registers, not on the stack. Here
 are the possible instructions for reading bytes to/from memory:
  var/reg <- copy-byte var2/reg2      # var: byte, var2: byte
  var/reg <- copy-byte *var2/reg2     # var: byte, var2: (addr byte)
  copy-byte-to *var1/reg1, var2/reg2  # var1: (addr byte), var2: byte
 In addition, variables of type 'byte' are restricted to (the lowest bytes of)
 just 4 registers: eax, ecx, edx and ebx.
 ## Primitive jump instructions
 There are two kinds of jumps, both with many variations: `break` and `loop`.
 `break` instructions jump to the end of the containing block. `loop` instructions
 jump to the beginning of the containing block.
 Jumps can take an optional label starting with '$':
  loop $foo
 This instruction jumps to the beginning of the block called $foo. It must lie
 somewhere inside such a block. Jumps are only legal to containing blocks. Use
 named blocks with restraint; jumps to places far away can get confusing.
 There are two unconditional jumps:
  loop
  loop label
  break
  break label
 The remaining jump instructions are all conditional. Conditional jumps rely on
 the result of the most recently executed `compare` instruction. (To keep
 programs easy to read, keep compare instructions close to the jump that uses
 them.)
  break-if-=
  break-if-= label
  break-if-!=
  break-if-!= label
 Inequalities are similar, but have unsigned and signed variants. We assume
 unsigned variants are only ever used to compare addresses.
  break-if-<
  break-if-< label
  break-if->
  break-if-> label
  break-if-<=
  break-if-<= label
  break-if->=
  break-if->= label
  break-if-addr<
  break-if-addr< label
  break-if-addr>
  break-if-addr> label
  break-if-addr<=
  break-if-addr<= label
  break-if-addr>=
  break-if-addr>= label
 Similarly, conditional loops:
  loop-if-=
  loop-if-= label
  loop-if-!=
  loop-if-!= label
  loop-if-<
  loop-if-< label
  loop-if->
  loop-if-> label
  loop-if-<=
  loop-if-<= label
  loop-if->=
  loop-if->= label
  loop-if-addr<
  loop-if-addr< label
  loop-if-addr>
  loop-if-addr> label
  loop-if-addr<=
  loop-if-addr<= label
  loop-if-addr>=
  loop-if-addr>= label
 ## Address operations
  var/reg: (addr T) <- address var: T         # var must be in mem (on the stack)
 ## Array operations
  var/reg: int <- length arr/reg: (addr array T)
  var/reg: (addr T) <- index arr/reg: (addr array T), idx/reg: int
  var/reg: (addr T) <- index arr: (array T sz), idx/reg: int
  var/reg: (addr T) <- index arr/reg: (addr array T), n
  var/reg: (addr T) <- index arr: (array T sz), n
  var/reg: (offset T) <- compute-offset arr: (addr array T), idx/reg: int     # arr can be in reg or mem
  var/reg: (offset T) <- compute-offset arr: (addr array T), idx: int         # arr can be in reg or mem
  var/reg: (addr T) <- index arr/reg: (addr array T), idx/reg: (offset T)
 ## User-defined types
  var/reg: (addr T_f) <- get var/reg: (addr T), f
    where record (product) type T has elements a, b, c, ... of types T_a, T_b, T_c, ...
  var/reg: (addr T_f) <- get var: T, f
 ## Handles for safe access to the heap
 Say we created a handle like this on the stack (it can't be in a register)
  var x: (handle T)
  allocate Heap, T, x
 You can copy handles to another variable on the stack like this:
  var y: (handle T)
  copy-handle-to y, x
 You can also save handles inside other user-defined types like this:
  var y/reg: (addr handle T_f) <- get var: (addr T), f
  copy-handle-to *y, x
 Or this:
  var y/reg: (addr handle T) <- index arr: (addr array handle T), n
  copy-handle-to *y, x
 Handles can be converted into addresses like this:
  var y/reg: (addr T) <- lookup x
 It's illegal to continue to use this addr after a function that reclaims heap
 memory. You have to repeat the lookup.
--- a/subx.md
+++ b/subx.md
@ -0,0 +1,162 @@
 ## SubX
 SubX is a notation for a subset of x86 machine code. [The Mu translator](http://akkartik.github.io/mu/html/apps/mu.subx.html)
 is implemented in SubX and also emits SubX code.
 Here's an example program in SubX that adds 1 and 1 and returns the result to
 the parent shell process:
  ```sh
  == code
  Entry:
    # ebx = 1
    bb/copy-to-ebx  1/imm32
    # increment ebx
    43/increment-ebx
    # exit(ebx)
    e8/call  syscall_exit/disp32
  ```
 ## The syntax of SubX instructions
 Just like in regular machine code, SubX programs consist mostly of instructions,
 which are basically sequences of numbers (always in hex). Instructions consist
 of words separated by whitespace. Words may be _opcodes_ (defining the
 operation being performed) or _arguments_ (specifying the data the operation
 acts on). Any word can have extra _metadata_ attached to it after `/`. Some
 metadata is required (like the `/imm32` and `/imm8` above), but unrecognized
 metadata is silently skipped so you can attach comments to words (like the
 instruction name `/copy-to-eax` above, or the `/exit` argument).
 What do all these numbers mean? SubX supports a small subset of the 32-bit x86
 instruction set that likely runs on your computer. (Think of the name as short
 for "sub-x86".) The instruction set contains instructions like `89/copy`,
 `01/add`, `3d/compare` and `51/push-ecx` which modify registers and a byte-addressable
 memory. For a complete list of supported instructions, run `bootstrap help
 opcodes`.
 The registers instructions operate on are as follows:
 - Six general-purpose 32-bit registers: `0/eax`, `1/ebx`, `2/ecx`, `3/edx`,
  `6/esi` and `7/edi`.
 - Two additional 32-bit registers: `4/esp` and `5/ebp`. (I suggest you only
  use these to manage the call stack.)
 (SubX doesn't support floating-point registers yet. Intel processors support
 an 8-bit mode, 16-bit mode and 64-bit mode. SubX will never support them.
 There are also _many_ more instructions that SubX will never support.)
 While SubX doesn't provide the usual mnemonics for opcodes, it _does_ provide
 error-checking. If you miss an argument or accidentally add an extra argument,
 you'll get a nice error. SubX won't arbitrarily interpret bytes of data as
 instructions or vice versa.
 It's worth distinguishing between an instruction's arguments and its _operands_.
 Arguments are provided directly in instructions. Operands are pieces of data
 in register or memory that are operated on by instructions.
 Intel processors typically operate on no more than two operands, and at most
 one of them (the 'reg/mem' operand) can access memory. The address of the
 reg/mem operand is constructed by expressions of one of these forms:
  - `%reg`: operate on just a register, not memory
  - `*reg`: look up memory with the address in some register
  - `*(reg + disp)`: add a constant to the address in some register
  - `*(base + (index << scale) + disp)` where `base` and `index` are registers,
    and `scale` and `disp` are 2- and 32-bit constants respectively.
 Under the hood, SubX turns expressions of these forms into multiple arguments
 with metadata in some complex ways. See [the doc on bare SubX](subx_bare.md).
 That covers the complexities of the reg/mem operand. The second operand is
 simpler. It comes from exactly one of the following argument types:
  - `/r32`
  - displacement: `/disp8` or `/disp32`
  - immediate: `/imm8` or `/imm32`
 Putting all this together, here's an example that adds the integer in `eax` to
 the one at address `edx`:
  ```
  01/add %edx 0/r32/eax
  ```
 ## The syntax of SubX programs
 SubX programs map to the same ELF binaries that a conventional Linux system
 uses. Linux ELF binaries consist of a series of _segments_. In particular, they
 distinguish between code and data. Correspondingly, SubX programs consist of a
 series of segments, each starting with a header line: `==` followed by a name
 and approximate starting address.
 All code must lie in a segment called 'code'.
 Segments can be added to.
 ```sh
 == code 0x09000000  # first mention requires starting address
 ...A...
 == data 0x0a000000
 ...B...
 == code             # no address necessary when adding
 ...C...
 ```
 The `code` segment now contains the instructions of `A` as well as `C`.
 Within the `code` segment, each line contains a comment, label or instruction.
 Comments start with a `#` and are ignored. Labels should always be the first
 word on a line, and they end with a `:`.
 Instructions can refer to labels in displacement or immediate arguments, and
 they'll obtain a value based on the address of the label: immediate arguments
 will contain the address directly, while displacement arguments will contain
 the difference between the address and the address of the current instruction.
 The latter is mostly useful for `jump` and `call` instructions.
 Functions are defined using labels. By convention, labels internal to functions
 (that must only be jumped to) start with a `$`. Any other labels must only be
 called, never jumped to. All labels must be unique.
 Functions are called using the following syntax:
 ```
 (func arg1 arg2 ...)
 ```
 Function arguments must be either literals (integers or strings) or a reg/mem
 operand using the syntax in the previous section.
 A special label is `Entry`, which can be used to specify/override the entry
 point of the program. It doesn't have to be unique, and the latest definition
 will override earlier ones.
 (The `Entry` label, along with duplicate segment headers, allows programs to
 be built up incrementally out of multiple [_layers_](http://akkartik.name/post/wart-layers).)
 Another special pair of labels are the block delimiters `{` and `}`. They can
 be nested, and jump instructions can take arguments `loop` or `break` that
 jump to the enclosing `{` and `}` respectively.
 The data segment consists of labels as before and byte values. Referring to
 data labels in either `code` segment instructions or `data` segment values
 yields their address.
 Automatic tests are an important part of SubX, and there's a simple mechanism
 to provide a test harness: all functions that start with `test-` are called in
 turn by a special, auto-generated function called `run-tests`. How you choose
 to call it is up to you.
 I try to keep things simple so that there's less work to do when implementing
 SubX in SubX. But there _is_ one convenience: instructions can provide a
 string literal surrounded by quotes (`"`) in an `imm32` argument. SubX will
 transparently copy it to the `data` segment and replace it with its address.
 Strings are the only place where a SubX word is allowed to contain spaces.
 That should be enough information for writing SubX programs. The `apps/`
 directory provides some fodder for practice in the `apps/ex*.subx` files,
 giving a more gradual introduction to SubX features. In particular, you should
 work through `apps/factorial4.subx`, which demonstrates all the above ideas in
 concert.
--- a/subx_addressing_modes.md
+++ b/subx_addressing_modes.md
@ -1,7 +1,7 @@
-[The Readme](Readme.md) describes SubX notation with some details hidden
+[The SubX documentation](subx.md) describes SubX notation with some details
-behind _syntax sugar_ -- local rewrite rules that make programming in SubX
+hidden behind _syntax sugar_ -- local rewrite rules that make programming in
-less error-prone. However, much low-level SubX (before the syntax sugar is
+SubX less error-prone. However, much low-level SubX (before the syntax sugar
-implemented) is written without syntax sugar. This document describes some
+is implemented) is written without syntax sugar. This document describes some
 details of the syntax sugar: how the reg/mem operand is translated into
 arguments.
@ -123,3 +123,50 @@ $ echo $?
 These details should now be enough information for reading and modifying
 low-level SubX programs.
 ## Translating SubX programs
 This repo includes two translators for bare SubX. The first is [the bootstrap
 translator](bootstrap.md) implemented in C++. In addition, you can use SubX to
 translate itself. For example, running natively on Linux:
  ```sh
  # generate translator phases using the C++ translator
  $ ./bootstrap translate init.linux 0*.subx apps/subx-params.subx apps/hex.subx    -o hex
  $ ./bootstrap translate init.linux 0*.subx apps/subx-params.subx apps/survey.subx -o survey
  $ ./bootstrap translate init.linux 0*.subx apps/subx-params.subx apps/pack.subx   -o pack
  $ ./bootstrap translate init.linux 0*.subx apps/subx-params.subx apps/assort.subx -o assort
  $ ./bootstrap translate init.linux 0*.subx apps/subx-params.subx apps/dquotes.subx -o dquotes
  $ ./bootstrap translate init.linux 0*.subx apps/subx-params.subx apps/tests.subx  -o tests
  $ chmod +x hex survey pack assort dquotes tests
  # use the generated translator phases to translate SubX programs
  $ cat init.linux apps/ex1.subx |./tests |./dquotes |./assort |./pack |./survey |./hex > a.elf
  $ chmod +x a.elf
  $ ./a.elf
  $ echo $?
  42
  # or, automating the above steps
  $ ./translate_subx init.linux apps/ex1.subx
  $ ./a.elf
  $ echo $?
  42
  ```
 Or, running in a VM on other platforms (much slower):
  ```sh
  $ ./translate_subx_emulated init.linux apps/ex1.subx  # generates identical a.elf to above
  $ ./bootstrap run a.elf
  $ echo $?
  42
  ```
 ## Resources
 - [Single-page cheatsheet for the x86 ISA](https://net.cs.uni-bonn.de/fileadmin/user_upload/plohmann/x86_opcode_structure_and_instruction_overview.pdf)
  (pdf; [cached local copy](https://github.com/akkartik/mu/blob/master/cheatsheet.pdf))
 - [Concise reference for the x86 ISA](https://c9x.me/x86)
 - [Concise reference for the x86 ISA #2](http://ref.x86asm.net/coder32.html)
 - [Intel processor manual](http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf) (pdf)
--- a/15
+++ b/15
@ -47,11 +47,11 @@ echo ex2
 test "$1" = 'record'  ||  git diff --exit-code apps/ex2
 test $EMULATED  &&  {
  ./bootstrap run apps/ex2  ||  ret=$?
-  test $ret -eq 2  # 1 + 1
+  test $ret -eq 7  # 3 + 4
 }
 test $NATIVE  &&  {
  apps/ex2  ||  ret=$?
-  test $ret -eq 2  # 1 + 1
+  test $ret -eq 7  # 3 + 4
 }
 echo ex3
@ -365,17 +365,6 @@ test $NATIVE  &&  {
  test $ret -eq 7
 }
 echo ex2.2.mu
 ./translate_mu apps/ex2.2.mu
 test $EMULATED  &&  {
  ./bootstrap run a.elf  ||  ret=$?
  test $ret -eq 4
 }
 test $NATIVE  &&  {
  ./a.elf  ||  ret=$?
  test $ret -eq 4
 }
 echo ex3.mu
 ./translate_mu apps/ex3.mu
 test $EMULATED  &&  {
--- a/vocabulary.md
+++ b/vocabulary.md
@ -0,0 +1,208 @@
 ## Reference documentation on available primitives
 ### Data Structures
 - Kernel strings: null-terminated regions of memory. Unsafe and to be avoided,
  but needed for interacting with the kernel.
 - Arrays: length-prefixed regions of memory containing multiple elements of a
  single type. Contents are preceded by 4 bytes (32 bits) containing the
  `length` of the array in bytes.
 - Slices: a pair of 32-bit addresses denoting a [half-open](https://en.wikipedia.org/wiki/Interval_(mathematics))
  \[`start`, `end`) interval to live memory with a consistent lifetime.
  Invariant: `start` <= `end`
 - Streams: strings prefixed by 32-bit `write` and `read` indexes that the next
  write or read goes to, respectively.
  - offset 0: write index
  - offset 4: read index
  - offset 8: length of array (in bytes)
  - offset 12: start of array data
  Invariant: 0 <= `read` <= `write` <= `length`
 - File descriptors (fd): Low-level 32-bit integers that the kernel uses to
  track files opened by the program.
 - File: 32-bit value containing either a fd or an address to a stream (fake
  file).
 - Buffered files (buffered-file): Contain a file descriptor and a stream for
  buffering reads/writes. Each `buffered-file` must exclusively perform either
  reads or writes.
 ### 'system calls'
 As I said at the top, a primary design goal of SubX (and Mu more broadly) is
 to explore ways to turn arbitrary manual tests into reproducible automated
 tests. SubX aims for this goal by baking testable interfaces deep into the
 stack, at the OS syscall level. The idea is that every syscall that interacts
 with hardware (and so the environment) should be *dependency injected* so that
 it's possible to insert fake hardware in tests.
 But those are big goals. Here are the syscalls I have so far:
 - `write`: takes two arguments, a file `f` and an address to array `s`.
  Comparing this interface with the Unix `write()` syscall shows two benefits:
  1. SubX can handle 'fake' file descriptors in tests.
  1. `write()` accepts buffer and its length in separate arguments, which
     requires callers to manage the two separately and so can be error-prone.
     SubX's wrapper keeps the two together to increase the chances that we
     never accidentally go out of array bounds.
 - `read`: takes two arguments, a file `f` and an address to stream `s`. Reads
  as much data from `f` as can fit in (the free space of) `s`.
  Like with `write()`, this wrapper around the Unix `read()` syscall adds the
  ability to handle 'fake' file descriptors in tests, and reduces the chances
  of clobbering outside array bounds.
  One bit of weirdness here: in tests we do a redundant copy from one stream
  to another. See [the comments before the implementation](http://akkartik.github.io/mu/html/060read.subx.html)
  for a discussion of alternative interfaces.
 - `stop`: takes two arguments:
  - `ed` is an address to an _exit descriptor_. Exit descriptors allow us to
    `exit()` the program in production, but return to the test harness within
    tests. That allows tests to make assertions about when `exit()` is called.
  - `value` is the status code to `exit()` with.
  For more details on exit descriptors and how to create one, see [the
  comments before the implementation](http://akkartik.github.io/mu/html/059stop.subx.html).
 - `new-segment`
  Allocates a whole new segment of memory for the program, discontiguous with
  both existing code and data (heap) segments. Just a more opinionated form of
  [`mmap`](http://man7.org/linux/man-pages/man2/mmap.2.html).
 - `allocate`: takes two arguments, an address to allocation-descriptor `ad`
  and an integer `n`
  Allocates a contiguous range of memory that is guaranteed to be exclusively
  available to the caller. Returns the starting address to the range in `eax`.
  An allocation descriptor tracks allocated vs available addresses in some
  contiguous range of memory. The int specifies the number of bytes to allocate.
  Explicitly passing in an allocation descriptor allows for nested memory
  management, where a sub-system gets a chunk of memory and further parcels it
  out to individual allocations. Particularly helpful for (surprise) tests.
 - ... _(to be continued)_
 I will continue to import syscalls over time from [the old Mu VM in the parent
 directory](https://github.com/akkartik/mu), which has experimented with
 interfaces for the screen, keyboard, mouse, disk and network.
 ### primitives built atop system calls
 _(Compound arguments are usually passed in by reference. Where the results are
 compound objects that don't fit in a register, the caller usually passes in
 allocated memory for it.)_
 #### assertions for tests
 - `check-ints-equal`: fails current test if given ints aren't equal
 - `check-stream-equal`: fails current test if stream doesn't match string
 - `check-next-stream-line-equal`: fails current test if next line of stream
  until newline doesn't match string
 #### error handling
 - `error`: takes three arguments, an exit-descriptor, a file and a string (message)
  Prints out the message to the file and then exits using the provided
  exit-descriptor.
 - `error-byte`: like `error` but takes an extra byte value that it prints out
  at the end of the message.
 #### predicates
 - `kernel-string-equal?`: compares a kernel string with a string
 - `string-equal?`: compares two strings
 - `stream-data-equal?`: compares a stream with a string
 - `next-stream-line-equal?`: compares with string the next line in a stream, from
  `read` index to newline
 - `slice-empty?`: checks if the `start` and `end` of a slice are equal
 - `slice-equal?`: compares a slice with a string
 - `slice-starts-with?`: compares the start of a slice with a string
 - `slice-ends-with?`: compares the end of a slice with a string
 #### writing to disk
 - `write`: string -> file
  - Can also be used to cat a string into a stream.
  - Will abort the entire program if destination is a stream and doesn't have
    enough room.
 - `write-stream`: stream -> file
  - Can also be used to cat one stream into another.
  - Will abort the entire program if destination is a stream and doesn't have
    enough room.
 - `write-slice`: slice -> stream
  - Will abort the entire program if there isn't enough room in the
    destination stream.
 - `append-byte`: int -> stream
  - Will abort the entire program if there isn't enough room in the
    destination stream.
 - `append-byte-hex`: int -> stream
  - textual representation in hex, no '0x' prefix
  - Will abort the entire program if there isn't enough room in the
    destination stream.
 - `print-int32`: int -> stream
  - textual representation in hex, including '0x' prefix
  - Will abort the entire program if there isn't enough room in the
    destination stream.
 - `write-buffered`: string -> buffered-file
 - `write-slice-buffered`: slice -> buffered-file
 - `flush`: buffered-file
 - `write-byte-buffered`: int -> buffered-file
 - `print-byte-buffered`: int -> buffered-file
  - textual representation in hex, no '0x' prefix
 - `print-int32-buffered`: int -> buffered-file
  - textual representation in hex, including '0x' prefix
 #### reading from disk
 - `read`: file -> stream
  - Can also be used to cat one stream into another.
  - Will silently stop reading when destination runs out of space.
 - `read-byte-buffered`: buffered-file -> byte
 - `read-line-buffered`: buffered-file -> stream
  - Will abort the entire program if there isn't enough room.
 #### non-IO operations on streams
 - `new-stream`: allocates space for a stream of `n` elements, each occupying
  `b` bytes.
  - Will abort the entire program if `n*b` requires more than 32 bits.
 - `clear-stream`: resets everything in the stream to `0` (except its `length`).
 - `rewind-stream`: resets the read index of the stream to `0` without modifying
  its contents.
 #### reading/writing hex representations of integers
 - `is-hex-int?`: takes a slice argument, returns boolean result in `eax`
 - `parse-hex-int`: takes a slice argument, returns int result in `eax`
 - `is-hex-digit?`: takes a 32-bit word containing a single byte, returns
  boolean result in `eax`.
 - `from-hex-char`: takes a hexadecimal digit character in `eax`, returns its
  numeric value in `eax`
 - `to-hex-char`: takes a single-digit numeric value in `eax`, returns its
  corresponding hexadecimal character in `eax`
 #### tokenization
 from a stream:
 - `next-token`: stream, delimiter byte -> slice
 - `skip-chars-matching`: stream, delimiter byte
 - `skip-chars-not-matching`: stream, delimiter byte
 from a slice:
 - `next-token-from-slice`: start, end, delimiter byte -> slice
  - Given a slice and a delimiter byte, returns a new slice inside the input
    that ends at the delimiter byte.
 - `skip-chars-matching-in-slice`: curr, end, delimiter byte -> new-curr (in `eax`)
 - `skip-chars-not-matching-in-slice`:  curr, end, delimiter byte -> new-curr (in `eax`)