mu/mu.arc.t

; Mu: An exploration on making the global structure of programs more accessible.
;
;   "Is it a language, or an operating system, or a virtual machine? Mu."
;   (with apologies to Robert Pirsig: http://en.wikipedia.org/wiki/Mu_%28negative%29#In_popular_culture)
;
; I want to live in a world where I can have an itch to tweak a program, clone
; its open-source repository, orient myself on how it's organized, and make
; the simple change I envisioned, all in an afternoon. This codebase tries to
; make this possible for its readers. (More details: http://akkartik.name/about)
;
; What helps comprehend the global structure of programs? For starters, let's
; enumerate what doesn't: idiomatic code, adherence to a style guide or naming
; convention, consistent indentation, API documentation for each class, etc.
; These conventional considerations improve matters in the small, but don't
; help understand global organization. They help existing programmers manage
; day-to-day operations, but they can't turn outsider programmers into
; insiders. (Elaboration: http://akkartik.name/post/readable-bad)
;
; In my experience, two things have improved matters so far: version control
; and automated tests. Version control lets me rewind back to earlier, simpler
; times when the codebase was simpler, when its core skeleton was easier to
; ascertain. Indeed, arguably what came first is by definition the skeleton of
; a program, modulo major rewrites. Once you understand the skeleton, it
; becomes tractable to 'play back' later major features one by one. (Previous
; project that fleshed out this idea: http://akkartik.name/post/wart-layers)
;
; The second and biggest boost to comprehension comes from tests. Tests are
; good for writers for well-understood reasons: they avoid regressions, and
; they can influence code to be more decoupled and easier to change. In
; addition, tests are also good for the outsider reader because they permit
; active reading. If you can't build a program and run its tests it can't help
; you understand it. It hangs limp at best, and might even be actively
; misleading. If you can run its tests, however, it comes alive. You can step
; through scenarios in a debugger. You can add logging and scan logs to make
; sense of them. You can run what-if scenarios: "why is this line not written
; like this?" Make a change, rerun tests: "Oh, that's why." (Elaboration:
; http://akkartik.name/post/literate-programming)
;
; However, tests are only useful to the extent that they exist. Think back to
; your most recent codebase. Do you feel comfortable releasing a new version
; just because the tests pass? I'm not aware of any such project. There's just
; too many situations envisaged by the authors that were never encoded in a
; test. Even disciplined authors can't test for performance or race conditions
; or fault tolerance. If a line is phrased just so because of some subtle
; performance consideration, it's hard to communicate to newcomers.
;
; This isn't an arcane problem, and it isn't just a matter of altruism. As
; more and more such implicit considerations proliferate, and as the original
; authors are replaced by latecomers for day-to-day operations, knowledge is
; actively forgotten and lost. The once-pristine codebase turns into legacy
; code that is hard to modify without expensive and stress-inducing
; regressions.
;
; How to write tests for performance, fault tolerance, race conditions, etc.?
; How can we state and verify that a codepath doesn't ever perform memory
; allocation, or write to disk? It requires better, more observable primitives
; than we currently have. Modern operating systems have their roots in the
; 70s. Their interfaces were not designed to be testable. They provide no way
; to simulate a full disk, or a specific sequence of writes from different
; threads. We need something better.
;
; This project tries to move, groping, towards that 'something better', a
; platform that is both thoroughly tested and allows programs written for it
; to be thoroughly tested. It tries to answer the question:
;
;   If Denis Ritchie and Ken Thompson were to set out today to co-design unix
;   and C, knowing what we know about automated tests, what would they do
;   differently?
;
; To try to impose *some* constraints on this gigantic yak-shave, we'll try to
; keep both language and OS as simple as possible, focused entirely on
; permitting more kinds of tests, on first *collecting* all the information
; about implicit considerations in some form so that readers and tools can
; have at least some hope of making sense of it.
;
; The initial language will be just assembly. We'll try to make it convenient
; to program in with some simple localized rewrite rules inspired by lisp
; macros and literate programming. Programmers will have to do their own
; memory management and register allocation, but we'll provide libraries to
; help with them.
;
; The initial OS will provide just memory management and concurrency
; primitives. No users or permissions (we don't live on mainframes anymore),
; no kernel- vs user-mode, no virtual memory or process abstraction, all
; threads sharing a single address space (use VMs for security and
; sandboxing). The only use case we care about is getting a test harness to
; run some code, feed it data through blocking channels, stop it and observe
; its internals. The code under test is expected to cooperate in such testing,
; by logging important events for the test harness to observe. (More info:
; http://akkartik.name/post/tracing-tests)
;
; The common thread here is elimination of abstractions, and it's not an
; accident. Abstractions help insiders manage the evolution of a codebase, but
; they actively hinder outsiders in understanding it from scratch. This
; matters, because the funnel to turn outsiders into insiders is critical to
; the long-term life of a codebase. Perhaps authors should raise their
; estimation of the costs of abstraction, and go against their instincts for
; introducing it. That's what I'll be trying to do: question every abstraction
; before I introduce it. We'll see how it goes.

; ---

; Mu is currently built atop Racket and Arc, but this is temporary and
; contingent. We want to keep our options open, whether to port to a different
; host language, and easy to rewrite to native code for any platform. So we'll
; try to avoid 'cheating': relying on the host platform for advanced
; functionality.
;
; Other than that, we'll say no more about the code, and focus in the rest of
; this file on the scenarios the code cares about.

(load "mu.arc")

; Our language is assembly-like in that functions consist of series of
; statements, and statements consist of an operation and its arguments (input
; and output).
;
;   oarg1, oarg2, ... <- op arg1, arg2, ...
;
; Args must be atomic, like an integer or a memory address, they can't be
; expressions doing arithmetic or function calls. But we can have any number
; of them.
;
; Since we're building on lisp, our code samples won't look quite like the
; idealized syntax above. For now they will be lists of lists:
;
;   (function-name
;     ((oarg1 oarg2 ... <- op arg1 arg2 ...)
;      ...
;      ...))
;
; Each arg/oarg is itself a list, with the payload value at the head, and
; various metadata in the rest. In this first example the only metadata is types:
; 'integer' for a memory location containing an integer, and 'literal' for a
; value included directly in code. (Assembly languages traditionally call them
; 'immediate' operands.) In the future a simple tool will check that the types
; line up as expected in each op. A different tool might add types where they
; aren't provided. Instead of a monolithic compiler I want to build simple,
; lightweight tools that can be combined in various ways, say for using
; different typecheckers in different subsystems.
;
; In our tests we'll define such mu functions using a call to 'add-fns', so
; look for it. Everything outside 'add-fns' is just test-harness details.

(reset)
(new-trace "literal")
(add-fns
  '((main
      ((1 integer) <- copy (23 literal)))))
(run 'main)
;? (prn memory*)
(if (~is memory*.1 23)
  (prn "F - 'copy' writes its lone 'arg' after the instruction name to its lone 'oarg' or output arg before the arrow. After this test, the value 23 is stored in memory address 1."))
;? (quit)

; Our basic arithmetic ops can operate on memory locations or literals.
; (Ignore hardware details like registers for now.)

(reset)
(new-trace "add")
(add-fns
  '((main
      ((1 integer) <- copy (1 literal))
      ((2 integer) <- copy (3 literal))
      ((3 integer) <- add (1 integer) (2 integer)))))
(run 'main)
(if (~iso memory* (obj 1 1  2 3  3 4))
  (prn "F - 'add' operates on two addresses"))

(reset)
(new-trace "add-literal")
(add-fns
  '((main
      ((1 integer) <- add (2 literal) (3 literal)))))
(run 'main)
(if (~is memory*.1 5)
  (prn "F - ops can take 'literal' operands (but not return them)"))

(reset)
(new-trace "sub-literal")
(add-fns
  '((main
      ((1 integer) <- sub (1 literal) (3 literal)))))
(run 'main)
;? (prn memory*)
(if (~is memory*.1 -2)
  (prn "F - 'sub' subtracts the second arg from the first"))

(reset)
(new-trace "mul-literal")
(add-fns
  '((main
      ((1 integer) <- mul (2 literal) (3 literal)))))
(run 'main)
;? (prn memory*)
(if (~is memory*.1 6)
  (prn "F - 'mul' multiplies like 'add' adds"))

(reset)
(new-trace "div-literal")
(add-fns
  '((main
      ((1 integer) <- div (8 literal) (3 literal)))))
(run 'main)
;? (prn memory*)
(if (~is memory*.1 (/ real.8 3))
  (prn "F - 'div' divides like 'sub' subtracts"))

(reset)
(new-trace "idiv-literal")
(add-fns
  '((main
      ((1 integer) (2 integer) <- idiv (23 literal) (6 literal)))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 3  2 5))
  (prn "F - 'idiv' performs integer division, returning quotient and remainder"))

; Basic boolean operations: and, or, not
; There are easy ways to encode booleans in binary, but we'll skip past those
; details for now.

(reset)
(new-trace "and-literal")
(add-fns
  '((main
      ((1 boolean) <- and (t literal) (nil literal)))))
(run 'main)
;? (prn memory*)
(if (~is memory*.1 nil)
  (prn "F - logical 'and' for booleans"))

; Basic comparison operations: lt, le, gt, ge, eq, neq

(reset)
(new-trace "lt-literal")
(add-fns
  '((main
      ((1 boolean) <- lt (4 literal) (3 literal)))))
(run 'main)
;? (prn memory*)
(if (~is memory*.1 nil)
  (prn "F - 'lt' is the less-than inequality operator"))

(reset)
(new-trace "le-literal-false")
(add-fns
  '((main
      ((1 boolean) <- le (4 literal) (3 literal)))))
(run 'main)
;? (prn memory*)
(if (~is memory*.1 nil)
  (prn "F - 'le' is the <= inequality operator"))

(reset)
(new-trace "le-literal-true")
(add-fns
  '((main
      ((1 boolean) <- le (4 literal) (4 literal)))))
(run 'main)
;? (prn memory*)
(if (~is memory*.1 t)
  (prn "F - 'le' returns true for equal operands"))

(reset)
(new-trace "le-literal-true-2")
(add-fns
  '((main
      ((1 boolean) <- le (4 literal) (5 literal)))))
(run 'main)
;? (prn memory*)
(if (~is memory*.1 t)
  (prn "F - le is the <= inequality operator - 2"))

; Control flow operations: jump, jump-if, jump-unless
; These introduce a new type -- 'offset' -- for literals that refer to memory
; locations relative to the current location.

(reset)
(new-trace "jump-skip")
(add-fns
  '((main
      ((1 integer) <- copy (8 literal))
      (jump (1 offset))
      ((2 integer) <- copy (3 literal))  ; should be skipped
      (reply))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 8))
  (prn "F - 'jump' skips some instructions"))

(reset)
(new-trace "jump-target")
(add-fns
  '((main
      ((1 integer) <- copy (8 literal))
      (jump (1 offset))
      ((2 integer) <- copy (3 literal))  ; should be skipped
      (reply)
      ((3 integer) <- copy (34 literal)))))  ; never reached
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 8))
  (prn "F - 'jump' doesn't skip too many instructions"))
;? (quit)

(reset)
(new-trace "jump-if-skip")
(add-fns
  '((main
      ((2 integer) <- copy (1 literal))
      ((1 boolean) <- eq (1 literal) (2 integer))
      (jump-if (1 boolean) (1 offset))
      ((2 integer) <- copy (3 literal))
      (reply)
      ((3 integer) <- copy (34 literal)))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 t  2 1))
  (prn "F - 'jump-if' is a conditional 'jump'"))

(reset)
(new-trace "jump-if-fallthrough")
(add-fns
  '((main
      ((1 boolean) <- eq (1 literal) (2 literal))
      (jump-if (3 boolean) (1 offset))
      ((2 integer) <- copy (3 literal))
      (reply)
      ((3 integer) <- copy (34 literal)))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 nil  2 3))
  (prn "F - if 'jump-if's first arg is false, it doesn't skip any instructions"))

(reset)
(new-trace "jump-if-backward")
(add-fns
  '((main
      ((1 integer) <- copy (2 literal))
      ((2 integer) <- copy (1 literal))
      ; loop
      ((2 integer) <- add (2 integer) (2 integer))
      ((3 boolean) <- eq (1 integer) (2 integer))
      (jump-if (3 boolean) (-3 offset))  ; to loop
      ((4 integer) <- copy (3 literal))
      (reply)
      ((3 integer) <- copy (34 literal)))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 2  2 4  3 nil  4 3))
  (prn "F - 'jump-if' can take a negative offset to make backward jumps"))

; Data movement relies on addressing modes:
;   'direct' - refers to a memory location; default for most types.
;   'literal' - directly encoded in the code; implicit for some types like 'offset'.

(reset)
(new-trace "direct-addressing")
(add-fns
  '((main
      ((1 integer) <- copy (34 literal))
      ((2 integer) <- copy (1 integer)))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 34  2 34))
  (prn "F - 'copy' performs direct addressing"))

; 'Indirect' addressing refers to an address stored in a memory location.
; Indicated by the metadata 'deref'. Usually requires an address type.
; In the test below, the memory location 1 contains '2', so an indirect read
; of location 1 returns the value of location 2.

(reset)
(new-trace "indirect-addressing")
(add-fns
  '((main
      ((1 integer-address) <- copy (2 literal))  ; unsafe; can't do this in general
      ((2 integer) <- copy (34 literal))
      ((3 integer) <- copy (1 integer-address deref)))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 2  2 34  3 34))
  (prn "F - 'copy' performs indirect addressing"))

; Output args can use indirect addressing. In the test below the value is
; stored at the location stored in location 1 (i.e. location 2).

(reset)
(new-trace "indirect-addressing-oarg")
(add-fns
  '((main
      ((1 integer-address) <- copy (2 literal))
      ((2 integer) <- copy (34 literal))
      ((1 integer-address deref) <- add (2 integer) (2 literal)))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 2  2 36))
  (prn "F - instructions can perform indirect addressing on output arg"))

; Until now we've dealt with scalar types like integers and booleans and
; addresses, where mu looks like other assembly languages. In addition, mu
; provides first-class support for compound types: arrays and records.
;
; 'get' accesses fields in records
; 'index' accesses indices in arrays
;
; Both operations require knowledge about the types being worked on, so all
; types used in mu programs are defined in a single global system-wide table
; (see types* in mu.arc for the complete list of types; we'll add to it over
; time).

(reset)
(new-trace "get-record")
(add-fns
  '((main
      ((1 integer) <- copy (34 literal))
      ((2 boolean) <- copy (nil literal))
      ((3 boolean) <- get (1 integer-boolean-pair) (1 offset))
      ((4 integer) <- get (1 integer-boolean-pair) (0 offset)))))
;? (set dump-trace*)
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 34  2 nil  3 nil  4 34))
  (prn "F - 'get' accesses fields of records"))

(reset)
(new-trace "get-indirect")
(add-fns
  '((main
      ((1 integer) <- copy (34 literal))
      ((2 boolean) <- copy (nil literal))
      ((3 integer-boolean-pair-address) <- copy (1 literal))
      ((4 boolean) <- get (3 integer-boolean-pair-address deref) (1 offset))
      ((5 integer) <- get (3 integer-boolean-pair-address deref) (0 offset)))))
;? (set dump-trace*)
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 34  2 nil  3 1  4 nil  5 34))
  (prn "F - 'get' accesses fields of record address"))

(reset)
(new-trace "get-compound-field")
(add-fns
  '((main
      ((1 integer) <- copy (34 literal))
      ((2 integer) <- copy (35 literal))
      ((3 integer) <- copy (36 literal))
      ((4 integer-integer-pair) <- get (1 integer-point-pair) (1 offset)))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 34  2 35  3 36  4 35  5 36))
  (prn "F - 'get' accesses fields spanning multiple locations"))

(reset)
(new-trace "get-address")
(add-fns
  '((main
      ((1 integer) <- copy (34 literal))
      ((2 boolean) <- copy (t literal))
      ((3 boolean-address) <- get-address (1 integer-boolean-pair) (1 offset)))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 34  2 t  3 2))
  (prn "F - 'get-address' returns address of fields of records"))

(reset)
(new-trace "get-address-indirect")
(add-fns
  '((main
      ((1 integer) <- copy (34 literal))
      ((2 boolean) <- copy (t literal))
      ((3 integer-boolean-pair-address) <- copy (1 literal))
      ((4 boolean-address) <- get-address (3 integer-boolean-pair-address deref) (1 offset)))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 34  2 t  3 1  4 2))
  (prn "F - 'get-address' accesses fields of record address"))

(reset)
(new-trace "index-literal")
(add-fns
  '((main
      ((1 integer) <- copy (2 literal))
      ((2 integer) <- copy (23 literal))
      ((3 boolean) <- copy (nil literal))
      ((4 integer) <- copy (24 literal))
      ((5 boolean) <- copy (t literal))
      ((6 integer-boolean-pair) <- index (1 integer-boolean-pair-array) (1 literal)))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 2  2 23 3 nil  4 24 5 t  6 24 7 t))
  (prn "F - 'index' accesses indices of arrays"))
;? (quit)

(reset)
(new-trace "index-direct")
(add-fns
  '((main
      ((1 integer) <- copy (2 literal))
      ((2 integer) <- copy (23 literal))
      ((3 boolean) <- copy (nil literal))
      ((4 integer) <- copy (24 literal))
      ((5 boolean) <- copy (t literal))
      ((6 integer) <- copy (1 literal))
      ((7 integer-boolean-pair) <- index (1 integer-boolean-pair-array) (6 integer)))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 2  2 23 3 nil  4 24 5 t  6 1  7 24 8 t))
  (prn "F - 'index' accesses indices of arrays"))
;? (quit)

(reset)
(new-trace "index-indirect")
(add-fns
  '((main
      ((1 integer) <- copy (2 literal))
      ((2 integer) <- copy (23 literal))
      ((3 boolean) <- copy (nil literal))
      ((4 integer) <- copy (24 literal))
      ((5 boolean) <- copy (t literal))
      ((6 integer) <- copy (1 literal))
      ((7 integer-boolean-pair-array-address) <- copy (1 literal))
      ((8 integer-boolean-pair) <- index (7 integer-boolean-pair-array-address deref) (6 integer)))))
;? (= dump-trace* (obj blacklist '("sz" "m" "setm" "addr" "cvt0" "cvt1")))
;? (set dump-trace*)
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 2  2 23 3 nil  4 24 5 t  6 1  7 1  8 24 9 t))
  (prn "F - 'index' accesses indices of array address"))
;? (quit)

(reset)
(new-trace "index-address")
(add-fns
  '((main
      ((1 integer) <- copy (2 literal))
      ((2 integer) <- copy (23 literal))
      ((3 boolean) <- copy (nil literal))
      ((4 integer) <- copy (24 literal))
      ((5 boolean) <- copy (t literal))
      ((6 integer) <- copy (1 literal))
      ((7 integer-boolean-pair-address) <- index-address (1 integer-boolean-pair-array) (6 integer)))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 2  2 23 3 nil  4 24 5 t  6 1  7 4))
  (prn "F - 'index-address' returns addresses of indices of arrays"))

(reset)
(new-trace "index-address-indirect")
(add-fns
  '((main
      ((1 integer) <- copy (2 literal))
      ((2 integer) <- copy (23 literal))
      ((3 boolean) <- copy (nil literal))
      ((4 integer) <- copy (24 literal))
      ((5 boolean) <- copy (t literal))
      ((6 integer) <- copy (1 literal))
      ((7 integer-boolean-pair-array-address) <- copy (1 literal))
      ((8 integer-boolean-pair-address) <- index-address (7 integer-boolean-pair-array-address deref) (6 integer)))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 2  2 23 3 nil  4 24 5 t  6 1  7 1  8 4))
  (prn "F - 'index-address' returns addresses of indices of array addresses"))

; Array values know their length. Record lengths are saved in the types table.

(reset)
(new-trace "len-array")
(add-fns
  '((main
      ((1 integer) <- copy (2 literal))
      ((2 integer) <- copy (23 literal))
      ((3 boolean) <- copy (nil literal))
      ((4 integer) <- copy (24 literal))
      ((5 boolean) <- copy (t literal))
      ((6 integer) <- len (1 integer-boolean-pair-array)))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 2  2 23 3 nil  4 24 5 t  6 2))
  (prn "F - 'len' accesses length of array"))

(reset)
(new-trace "len-array-indirect")
(add-fns
  '((main
      ((1 integer) <- copy (2 literal))
      ((2 integer) <- copy (23 literal))
      ((3 boolean) <- copy (nil literal))
      ((4 integer) <- copy (24 literal))
      ((5 boolean) <- copy (t literal))
      ((6 integer-address) <- copy (1 literal))
      ((7 integer) <- len (6 integer-boolean-pair-array-address deref)))))
;? (set dump-trace*)
;? (= dump-trace* (obj blacklist '("sz" "m" "setm" "addr" "cvt0" "cvt1")))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 2  2 23 3 nil  4 24 5 t  6 1  7 2))
  (prn "F - 'len' accesses length of array address"))

; 'sizeof' is a helper to determine the amount of memory required by a type.

(reset)
(new-trace "sizeof-record")
(add-fns
  '((main
      ((1 integer) <- sizeof (integer-boolean-pair literal)))))
(run 'main)
;? (prn memory*)
(if (~is memory*.1 2)
  (prn "F - 'sizeof' returns space required by arg"))

(reset)
(new-trace "sizeof-record-not-len")
(add-fns
  '((main
      ((1 integer) <- sizeof (integer-point-pair literal)))))
(run 'main)
;? (prn memory*)
(if (~is memory*.1 3)
  (prn "F - 'sizeof' is different from number of elems"))

; Regardless of a type's length, you can move it around just like a primitive.

(reset)
(new-trace "compound-operand-copy")
(add-fns
  '((main
      ((1 integer) <- copy (34 literal))
      ((2 boolean) <- copy (nil literal))
      ((4 boolean) <- copy (t literal))
      ((3 integer-boolean-pair) <- copy (1 integer-boolean-pair)))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 34  2 nil  3 34  4 nil))
  (prn "F - ops can operate on records spanning multiple locations"))

(reset)
(new-trace "compound-arg")
(add-fns
  '((test1
      ((4 integer-boolean-pair) <- arg))
    (main
      ((1 integer) <- copy (34 literal))
      ((2 boolean) <- copy (nil literal))
      (test1 (1 integer-boolean-pair)))))
(run 'main)
(if (~iso memory* (obj 1 34  2 nil  4 34  5 nil))
  (prn "F - 'arg' can copy records spanning multiple locations"))

(reset)
(new-trace "compound-arg-indirect")
;? (set dump-trace*)
(add-fns
  '((test1
      ((4 integer-boolean-pair) <- arg))
    (main
      ((1 integer) <- copy (34 literal))
      ((2 boolean) <- copy (nil literal))
      ((3 integer-boolean-pair-address) <- copy (1 literal))
      (test1 (3 integer-boolean-pair-address deref)))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 34  2 nil  3 1  4 34  5 nil))
  (prn "F - 'arg' can copy records spanning multiple locations in indirect mode"))

; A special kind of record is the 'tagged type'. It lets us represent
; dynamically typed values, which save type information in memory rather than
; in the code to use them. This will let us do things like create heterogenous
; lists containing both integers and strings. Tagged values admit two
; operations:
;
;   'save-type' - turns a regular value into a tagged-value of the appropriate type
;   'maybe-coerce' - turns a tagged value into a regular value if the type matches

(reset)
(new-trace "tagged-value")
;? (= dump-trace* (obj blacklist '("sz" "m" "setm" "addr" "cvt0" "cvt1")))
(add-fns
  '((main
      ((1 type) <- copy (integer-address literal))
      ((2 integer-address) <- copy (34 literal))  ; pointer to nowhere
      ((3 integer-address) (4 boolean) <- maybe-coerce (1 tagged-value) (integer-address literal)))))
(run 'main)
;? (prn memory*)
;? (prn completed-routines*)
(let last-routine (deq completed-routines*)
  (aif rep.last-routine!error (prn "error - " it)))
(if (or (~is memory*.3 34) (~is memory*.4 t))
  (prn "F - 'maybe-coerce' copies value only if type tag matches"))
;? (quit)

(reset)
(new-trace "tagged-value-2")
;? (set dump-trace*)
(add-fns
  '((main
      ((1 type) <- copy (integer-address literal))
      ((2 integer-address) <- copy (34 literal))  ; pointer to nowhere
      ((3 integer-address) (4 boolean) <- maybe-coerce (1 tagged-value) (boolean-address literal)))))
(run 'main)
;? (prn memory*)
(if (or (~is memory*.3 0) (~is memory*.4 nil))
  (prn "F - 'maybe-coerce' doesn't copy value when type tag doesn't match"))

(reset)
(new-trace "save-type")
(add-fns
  '((main
      ((1 integer-address) <- copy (34 literal))  ; pointer to nowhere
      ((2 tagged-value) <- save-type (1 integer-address)))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj  1 34  2 'integer-address  3 34))
  (prn "F - 'save-type' saves the type of a value at runtime, turning it into a tagged-value"))

(reset)
(new-trace "new-tagged-value")
(add-fns
  '((main
      ((1 integer-address) <- copy (34 literal))  ; pointer to nowhere
      ((2 tagged-value-address) <- new-tagged-value (integer-address literal) (1 integer-address))
      ((3 integer-address) (4 boolean) <- maybe-coerce (2 tagged-value-address deref) (integer-address literal)))))
;? (= dump-trace* (obj blacklist '("sz" "m" "setm" "addr" "cvt0" "cvt1" "sizeof")))
(run 'main)
;? (prn memory*)
(if (or (~is memory*.3 34) (~is memory*.4 t))
  (prn "F - 'new-tagged-value' is the converse of 'maybe-coerce'"))
;? (quit)

; Now that we can record types for values we can construct a dynamically typed
; list.

(reset)
(new-trace "list")
;? (set dump-trace*)
(add-fns
  '((test1
      ; 1 points at first node: tagged-value (int 34)
      ((1 list-address) <- new (list literal))
      ((2 tagged-value-address) <- list-value-address (1 list-address))
      ((3 type-address) <- get-address (2 tagged-value-address deref) (0 offset))
      ((3 type-address deref) <- copy (integer literal))
      ((4 location) <- get-address (2 tagged-value-address deref) (1 offset))
      ((4 location deref) <- copy (34 literal))
      ((5 list-address-address) <- get-address (1 list-address deref) (1 offset))
      ((5 list-address-address deref) <- new (list literal))
      ; 6 points at second node: tagged-value (boolean t)
      ((6 list-address) <- copy (5 list-address-address deref))
      ((7 tagged-value-address) <- list-value-address (6 list-address))
      ((8 type-address) <- get-address (7 tagged-value-address deref) (0 offset))
      ((8 type-address deref) <- copy (boolean literal))
      ((9 location) <- get-address (7 tagged-value-address deref) (1 offset))
      ((9 location deref) <- copy (t literal))
      ((10 list-address) <- get (6 list-address deref) (1 offset))
      )))
(let first Memory-in-use-until
  (run 'test1)
;?   (prn memory*)
  (if (or (~all first (map memory* '(1 2 3)))
          (~is memory*.first  'integer)
          (~is memory*.4 (+ first 1))
          (~is (memory* (+ first 1))  34)
          (~is memory*.5 (+ first 2))
          (let second memory*.6
            (or
              (~is (memory* (+ first 2)) second)
              (~all second (map memory* '(6 7 8)))
              (~is memory*.second 'boolean)
              (~is memory*.9 (+ second 1))
              (~is (memory* (+ second 1)) t)
              (~is memory*.10 nil))))
    (prn "F - lists can contain elements of different types")))
(add-fns
  '((test2
      ((10 list-address) <- list-next (1 list-address)))))
(run 'test2)
;? (prn memory*)
(if (~is memory*.10 memory*.6)
  (prn "F - 'list-next can move a list pointer to the next node"))

; 'new-list' takes a variable number of args and constructs a list containing
; them.

(reset)
(new-trace "new-list")
(add-fns
  '((main
      ((1 integer) <- new-list (3 literal) (4 literal) (5 literal)))))
;? (= dump-trace* (obj blacklist '("sz" "m" "setm" "addr" "cvt0" "cvt1" "sizeof")))
(run 'main)
;? (prn memory*)
(let first memory*.1
;?   (prn first)
  (if (or (~is memory*.first  'integer)
          (~is (memory* (+ first 1))  3)
          (let second (memory* (+ first 2))
;?             (prn second)
            (or (~is memory*.second 'integer)
                (~is (memory* (+ second 1)) 4)
                (let third (memory* (+ second 2))
;?                   (prn third)
                  (or (~is memory*.third 'integer)
                      (~is (memory* (+ third 1)) 5)
                      (~is (memory* (+ third 2) nil)))))))
    (prn "F - 'new-list' can construct a list of integers")))

; Just like the table of types is centralized, functions are conceptualized as
; a centralized table of operations just like the "primitives" we've seen so
; far. If you create a function you can call it like any other op.

(reset)
(new-trace "new-fn")
(add-fns
  '((test1
      ((3 integer) <- add (1 integer) (2 integer)))
    (main
      ((1 integer) <- copy (1 literal))
      ((2 integer) <- copy (3 literal))
      (test1))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 1  2 3  3 4))
  (prn "F - calling a user-defined function runs its instructions"))
;? (quit)

(reset)
(new-trace "new-fn-once")
(add-fns
  '((test1
      ((1 integer) <- copy (1 literal)))
    (main
      (test1))))
(if (~is 2 (run 'main))
  (prn "F - calling a user-defined function runs its instructions exactly once"))
;? (quit)

; User-defined functions communicate with their callers through two
; primitives:
;
;   'arg' - to access inputs
;   'reply' - to return outputs

(reset)
(new-trace "new-fn-reply")
(add-fns
  '((test1
      ((3 integer) <- add (1 integer) (2 integer))
      (reply)
      ((4 integer) <- copy (34 literal)))
    (main
      ((1 integer) <- copy (1 literal))
      ((2 integer) <- copy (3 literal))
      (test1))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 1  2 3  3 4))
  (prn "F - 'reply' stops executing the current function"))
;? (quit)

(reset)
(new-trace "new-fn-reply-nested")
(add-fns
  `((test1
      ((3 integer) <- test2))
    (test2
      (reply (2 integer)))
    (main
      ((2 integer) <- copy (34 literal))
      (test1))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 2 34  3 34))
  (prn "F - 'reply' stops executing any callers as necessary"))
;? (quit)

(reset)
(new-trace "new-fn-reply-once")
(add-fns
  '((test1
      ((3 integer) <- add (1 integer) (2 integer))
      (reply)
      ((4 integer) <- copy (34 literal)))
    (main
      ((1 integer) <- copy (1 literal))
      ((2 integer) <- copy (3 literal))
      (test1))))
(if (~is 4 (run 'main))  ; last reply sometimes not counted. worth fixing?
  (prn "F - 'reply' executes instructions exactly once"))
;? (quit)

(reset)
(new-trace "new-fn-arg-sequential")
(add-fns
  '((test1
      ((4 integer) <- arg)
      ((5 integer) <- arg)
      ((3 integer) <- add (4 integer) (5 integer))
      (reply)
      ((4 integer) <- copy (34 literal)))
    (main
      ((1 integer) <- copy (1 literal))
      ((2 integer) <- copy (3 literal))
      (test1 (1 integer) (2 integer))
    )))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 1  2 3  3 4
                       ; add-fn's temporaries
                       4 1  5 3))
  (prn "F - 'arg' accesses in order the operands of the most recent function call (the caller)"))
;? (quit)

(reset)
(new-trace "new-fn-arg-random-access")
;? (set dump-trace*)
(add-fns
  '((test1
      ((5 integer) <- arg (1 literal))
      ((4 integer) <- arg (0 literal))
      ((3 integer) <- add (4 integer) (5 integer))
      (reply)
      ((4 integer) <- copy (34 literal)))  ; should never run
    (main
      ((1 integer) <- copy (1 literal))
      ((2 integer) <- copy (3 literal))
      (test1 (1 integer) (2 integer))
    )))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 1  2 3  3 4
                       ; add-fn's temporaries
                       4 1  5 3))
  (prn "F - 'arg' with index can access function call arguments out of order"))
;? (quit)

(reset)
(new-trace "new-fn-arg-status")
(add-fns
  '((test1
      ((4 integer) (5 boolean) <- arg))
    (main
      (test1 (1 literal))
    )))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 4 1  5 t))
  (prn "F - 'arg' sets a second oarg when arg exists"))
;? (quit)

(reset)
(new-trace "new-fn-arg-missing")
(add-fns
  '((test1
      ((4 integer) <- arg)
      ((5 integer) <- arg))
    (main
      (test1 (1 literal))
    )))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 4 1))
  (prn "F - missing 'arg' doesn't cause error"))
;? (quit)

(reset)
(new-trace "new-fn-arg-missing-2")
(add-fns
  '((test1
      ((4 integer) <- arg)
      ((5 integer) (6 boolean) <- arg))
    (main
      (test1 (1 literal))
    )))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 4 1  6 nil))
  (prn "F - missing 'arg' wipes second oarg when provided"))
;? (quit)

(reset)
(new-trace "new-fn-arg-missing-3")
(add-fns
  '((test1
      ((4 integer) <- arg)
      ((5 integer) <- copy (34 literal))
      ((5 integer) (6 boolean) <- arg))
    (main
      (test1 (1 literal))
    )))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 4 1  6 nil))
  (prn "F - missing 'arg' consistently wipes its oarg"))
;? (quit)

(reset)
(new-trace "new-fn-arg-missing-4")
(add-fns
  '((test1
      ; if given two args, adds them; if given one arg, increments
      ((4 integer) <- arg)
      ((5 integer) (6 boolean) <- arg)
      { begin
        (break-if (6 boolean))
        ((5 integer) <- copy (1 literal))
      }
      ((7 integer) <- add (4 integer) (5 integer)))
    (main
      (test1 (34 literal))
    )))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 4 34  5 1  6 nil  7 35))
  (prn "F - function with optional second arg"))
;? (quit)

(reset)
(new-trace "new-fn-arg-by-value")
(add-fns
  '((test1
      ((1 integer) <- copy (0 literal))  ; overwrite caller memory
      ((2 integer) <- arg))  ; arg not clobbered
    (main
      ((1 integer) <- copy (34 literal))
      (test1 (1 integer)))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 0  2 34))
  (prn "F - 'arg' passes by value"))

(reset)
(new-trace "new-fn-reply-oarg")
(add-fns
  '((test1
      ((4 integer) <- arg)
      ((5 integer) <- arg)
      ((6 integer) <- add (4 integer) (5 integer))
      (reply (6 integer))
      ((4 integer) <- copy (34 literal)))
    (main
      ((1 integer) <- copy (1 literal))
      ((2 integer) <- copy (3 literal))
      ((3 integer) <- test1 (1 integer) (2 integer)))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 1  2 3  3 4
                       ; add-fn's temporaries
                       4 1  5 3  6 4))
  (prn "F - 'reply' can take aguments that are returned, or written back into output args of caller"))

(reset)
(new-trace "new-fn-reply-oarg-multiple")
(add-fns
  '((test1
      ((4 integer) <- arg)
      ((5 integer) <- arg)
      ((6 integer) <- add (4 integer) (5 integer))
      (reply (6 integer) (5 integer))
      ((4 integer) <- copy (34 literal)))
    (main
      ((1 integer) <- copy (1 literal))
      ((2 integer) <- copy (3 literal))
      ((3 integer) (7 integer) <- test1 (1 integer) (2 integer)))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 1  2 3  3 4    7 3
                         ; add-fn's temporaries
                         4 1  5 3  6 4))
  (prn "F - 'reply' permits a function to return multiple values at once"))

(reset)
(new-trace "new-fn-prepare-reply")
(add-fns
  '((test1
      ((4 integer) <- arg)
      ((5 integer) <- arg)
      ((6 integer) <- add (4 integer) (5 integer))
      (prepare-reply (6 integer) (5 integer))
      (reply)
      ((4 integer) <- copy (34 literal)))
    (main
      ((1 integer) <- copy (1 literal))
      ((2 integer) <- copy (3 literal))
      ((3 integer) (7 integer) <- test1 (1 integer) (2 integer)))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 1  2 3  3 4    7 3
                         ; add-fn's temporaries
                         4 1  5 3  6 4))
  (prn "F - without args, 'reply' returns values from previous 'prepare-reply'."))

; Our control operators are quite inconvenient to use, so mu provides a
; lightweight tool called 'convert-braces' to work in a slightly more
; convenient format with nested braces:
;
;   {
;     some instructions
;     {
;       more instructions
;     }
;   }
;
; Braces are just labels, they require no special parsing. The operations
; 'break' and 'continue' jump to just after the enclosing '}' and '{'
; respectively.
;
; Conditional and unconditional 'break' and 'continue' should give us 80% of
; the benefits of the control-flow primitives we're used to in other
; languages, like 'if', 'while', 'for', etc.

(reset)
(new-trace "convert-braces")
(if (~iso (convert-braces
            '(((1 integer) <- copy (4 literal))
              ((2 integer) <- copy (2 literal))
              ((3 integer) <- add (2 integer) (2 integer))
              { begin  ; 'begin' is just a hack because racket turns curlies into parens
                ((4 boolean) <- neq (1 integer) (3 integer))
                (break-if (4 boolean))
                ((5 integer) <- copy (34 literal))
              }
              (reply)))
          '(((1 integer) <- copy (4 literal))
            ((2 integer) <- copy (2 literal))
            ((3 integer) <- add (2 integer) (2 integer))
            ((4 boolean) <- neq (1 integer) (3 integer))
            (jump-if (4 boolean) (1 offset))
            ((5 integer) <- copy (34 literal))
            (reply)))
  (prn "F - convert-braces replaces break-if with a jump-if to after the next close-curly"))

(reset)
(new-trace "convert-braces-empty-block")
(if (~iso (convert-braces
            '(((1 integer) <- copy (4 literal))
              ((2 integer) <- copy (2 literal))
              ((3 integer) <- add (2 integer) (2 integer))
              { begin
                (break)
              }
              (reply)))
          '(((1 integer) <- copy (4 literal))
            ((2 integer) <- copy (2 literal))
            ((3 integer) <- add (2 integer) (2 integer))
            (jump (0 offset))
            (reply)))
  (prn "F - convert-braces works for degenerate blocks"))

(reset)
(new-trace "convert-braces-nested-break")
(if (~iso (convert-braces
            '(((1 integer) <- copy (4 literal))
              ((2 integer) <- copy (2 literal))
              ((3 integer) <- add (2 integer) (2 integer))
              { begin
                ((4 boolean) <- neq (1 integer) (3 integer))
                (break-if (4 boolean))
                { begin
                  ((5 integer) <- copy (34 literal))
                }
              }
              (reply)))
          '(((1 integer) <- copy (4 literal))
            ((2 integer) <- copy (2 literal))
            ((3 integer) <- add (2 integer) (2 integer))
            ((4 boolean) <- neq (1 integer) (3 integer))
            (jump-if (4 boolean) (1 offset))
            ((5 integer) <- copy (34 literal))
            (reply)))
  (prn "F - convert-braces balances curlies when converting break"))

(reset)
(new-trace "convert-braces-nested-continue")
(if (~iso (convert-braces
            '(((1 integer) <- copy (4 literal))
              ((2 integer) <- copy (2 literal))
              { begin
                ((3 integer) <- add (2 integer) (2 integer))
                { begin
                  ((4 boolean) <- neq (1 integer) (3 integer))
                }
                (continue-if (4 boolean))
                ((5 integer) <- copy (34 literal))
              }
              (reply)))
          '(((1 integer) <- copy (4 literal))
            ((2 integer) <- copy (2 literal))
            ((3 integer) <- add (2 integer) (2 integer))
            ((4 boolean) <- neq (1 integer) (3 integer))
            (jump-if (4 boolean) (-3 offset))
            ((5 integer) <- copy (34 literal))
            (reply)))
  (prn "F - convert-braces balances curlies when converting continue"))

(reset)
(new-trace "continue")
;? (set dump-trace*)
(add-fns
  '((main
      ((1 integer) <- copy (4 literal))
      ((2 integer) <- copy (1 literal))
      { begin
        ((2 integer) <- add (2 integer) (2 integer))
        ((3 boolean) <- neq (1 integer) (2 integer))
        (continue-if (3 boolean))
        ((4 integer) <- copy (34 literal))
      }
      (reply))))
;? (each stmt function*!main
;?   (prn stmt))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 4  2 4  3 nil  4 34))
  (prn "F - continue correctly loops"))

; todo: fuzz-test invariant: convert-braces offsets should be robust to any
; number of inner blocks inside but not around the continue block.

(reset)
(new-trace "continue-nested")
;? (set dump-trace*)
(add-fns
  '((main
      ((1 integer) <- copy (4 literal))
      ((2 integer) <- copy (1 literal))
      { begin
        ((2 integer) <- add (2 integer) (2 integer))
        { begin
          ((3 boolean) <- neq (1 integer) (2 integer))
        }
        (continue-if (3 boolean))
        ((4 integer) <- copy (34 literal))
      }
      (reply))))
;? (each stmt function*!main
;?   (prn stmt))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 4  2 4  3 nil  4 34))
  (prn "F - continue correctly loops"))

(reset)
(new-trace "continue-fail")
(add-fns
  '((main
      ((1 integer) <- copy (4 literal))
      ((2 integer) <- copy (2 literal))
      { begin
        ((2 integer) <- add (2 integer) (2 integer))
        { begin
          ((3 boolean) <- neq (1 integer) (2 integer))
        }
        (continue-if (3 boolean))
        ((4 integer) <- copy (34 literal))
      }
      (reply))))
(run 'main)
;? (prn memory*)
(if (~iso memory* (obj 1 4  2 4  3 nil  4 34))
  (prn "F - continue might never trigger"))

; A big convenience high-level languages provide is the ability to name memory
; locations. In mu, a lightweight tool called 'convert-names' provides this
; convenience.

(reset)
;? (new-trace "convert-names")
(if (~iso (convert-names
            '(((x integer) <- copy (4 literal))
              ((y integer) <- copy (2 literal))
              ((z integer) <- add (x integer) (y integer))))
          '(((1 integer) <- copy (4 literal))
            ((2 integer) <- copy (2 literal))
            ((3 integer) <- add (1 integer) (2 integer))))
  (prn "F - convert-names renames symbolic names to integer locations"))

(reset)
;? (new-trace "convert-names-compound")
(if (~iso (convert-names
            '(((x integer-boolean-pair) <- copy (4 literal))
              ((y integer) <- copy (2 literal))))
          '(((1 integer-boolean-pair) <- copy (4 literal))
            ((3 integer) <- copy (2 literal))))
  (prn "F - convert-names increments integer locations by the size of the type of the previous var"))

(reset)
;? (new-trace "convert-names-nil")
(if (~iso (convert-names
            '(((x integer) <- copy (4 literal))
              ((y integer) <- copy (2 literal))
              ((nil integer) <- add (x integer) (y integer))))
          '(((1 integer) <- copy (4 literal))
            ((2 integer) <- copy (2 literal))
            ((nil integer) <- add (1 integer) (2 integer))))
  (prn "F - convert-names never renames nil"))

(reset)
;? (new-trace "convert-names-global")
(if (~iso (convert-names
            '(((x integer) <- copy (4 literal))
              ((y integer global) <- copy (2 literal))
              ((default-scope integer) <- add (x integer) (y integer global))))
          '(((1 integer) <- copy (4 literal))
            ((y integer global) <- copy (2 literal))
            ((default-scope integer) <- add (1 integer) (y integer global))))
  (prn "F - convert-names never renames global operands"))

; kludgy support for 'fork'
(reset)
;? (new-trace "convert-names-functions")
(if (~iso (convert-names
            '(((x integer) <- copy (4 literal))
              ((y integer) <- copy (2 literal))
              ((z fn) <- add (x integer) (y integer))))
          '(((1 integer) <- copy (4 literal))
            ((2 integer) <- copy (2 literal))
            ((z fn) <- add (1 integer) (2 integer))))
  (prn "F - convert-names never renames nil"))

(reset)
;? (new-trace "convert-names-record-fields")
(if (~iso (convert-names
            '(((x integer) <- get (34 integer-boolean-pair) (bool offset))))
          '(((1 integer) <- get (34 integer-boolean-pair) (1 offset))))
  (prn "F - convert-names replaces record field offsets"))

(reset)
;? (new-trace "convert-names-record-fields-ambiguous")
(if (errsafe (convert-names
               '(((bool boolean) <- copy (t literal))
                 ((x integer) <- get (34 integer-boolean-pair) (bool offset)))))
  (prn "F - convert-names doesn't allow offsets and variables with the same name in a function"))

(reset)
;? (new-trace "convert-names-record-fields-ambiguous-2")
(if (errsafe (convert-names
               '(((x integer) <- get (34 integer-boolean-pair) (bool offset))
                 ((bool boolean) <- copy (t literal)))))
  (prn "F - convert-names doesn't allow offsets and variables with the same name in a function - 2"))

(reset)
;? (new-trace "convert-names-record-fields-indirect")
(if (~iso (convert-names
            '(((x integer) <- get (34 integer-boolean-pair-address deref) (bool offset))))
          '(((1 integer) <- get (34 integer-boolean-pair-address deref) (1 offset))))
  (prn "F - convert-names replaces field offsets for record addresses"))

; A rudimentary memory allocator. Eventually we want to write this in mu.
;
; No deallocation yet; let's see how much code we can build in mu before we
; feel the need for it.

(reset)
(new-trace "new-primitive")
(add-fns
  '((main
      ((1 integer-address) <- new (integer literal)))))
(let before Memory-in-use-until
  (run 'main)
;?   (prn memory*)
  (if (~iso memory*.1 before)
    (prn "F - 'new' returns current high-water mark"))
  (if (~iso Memory-in-use-until (+ before 1))
    (prn "F - 'new' on primitive types increments high-water mark by their size")))

(reset)
(new-trace "new-array-literal")
(add-fns
  '((main
      ((1 type-array-address) <- new (type-array literal) (5 literal)))))
(let before Memory-in-use-until
  (run 'main)
;?   (prn memory*)
  (if (~iso memory*.1 before)
    (prn "F - 'new' on array with literal size returns current high-water mark"))
  (if (~iso Memory-in-use-until (+ before 6))
    (prn "F - 'new' on primitive arrays increments high-water mark by their size")))

(reset)
(new-trace "new-array-direct")
(add-fns
  '((main
      ((1 integer) <- copy (5 literal))
      ((2 type-array-address) <- new (type-array literal) (1 integer)))))
(let before Memory-in-use-until
  (run 'main)
;?   (prn memory*)
  (if (~iso memory*.2 before)
    (prn "F - 'new' on array with variable size returns current high-water mark"))
  (if (~iso Memory-in-use-until (+ before 6))
    (prn "F - 'new' on primitive arrays increments high-water mark by their (variable) size")))

; Even though our memory locations can now have names, the names are all
; globals, accessible from any function. To isolate functions from their
; callers we need local variables, and mu provides them using a special
; variable called default-scope. When you initialize such a variable (likely
; with a call to our just-defined memory allocator) mu interprets memory
; locations as offsets from its value. If default-scope is set to 1000, for
; example, reads and writes to memory location 1 will really go to 1001.
;
; 'default-scope' is itself hard-coded to be function-local; it's nil in a new
; function, and it's restored when functions return to their callers. But the
; actual scope allocation is independent. So you can define closures, or do
; even more funky things like share locals between two coroutines.

(reset)
(new-trace "set-default-scope")
(add-fns
  '((main
      ((default-scope scope-address) <- new (scope literal) (2 literal))
      ((1 integer) <- copy (23 literal)))))
(let before Memory-in-use-until
;?   (set dump-trace*)
  (run 'main)
;?   (prn memory*)
  (if (~and (~is 23 memory*.1)
            (is 23 (memory* (+ before 1))))
    (prn "F - default-scope implicitly modifies variable locations")))

(reset)
(new-trace "set-default-scope-skips-offset")
(add-fns
  '((main
      ((default-scope scope-address) <- new (scope literal) (2 literal))
      ((1 integer) <- copy (23 offset)))))
(let before Memory-in-use-until
;?   (set dump-trace*)
  (run 'main)
;?   (prn memory*)
  (if (~and (~is 23 memory*.1)
            (is 23 (memory* (+ before 1))))
    (prn "F - default-scope skips 'offset' types just like literals")))

(reset)
(new-trace "default-scope-bounds-check")
(add-fns
  '((main
      ((default-scope scope-address) <- new (scope literal) (2 literal))
      ((2 integer) <- copy (23 literal)))))
;? (set dump-trace*)
(run 'main)
;? (prn memory*)
(let last-routine (deq completed-routines*)
  (if (no rep.last-routine!error)
    (prn "F - default-scope checks bounds")))

(reset)
(new-trace "default-scope-and-get-indirect")
(add-fns
  '((main
      ((default-scope scope-address) <- new (scope literal) (5 literal))
      ((1 integer-boolean-pair-address) <- new (integer-boolean-pair literal))
      ((2 integer-address) <- get-address (1 integer-boolean-pair-address deref) (0 offset))
      ((2 integer-address deref) <- copy (34 literal))
      ((3 integer global) <- get (1 integer-boolean-pair-address deref) (0 offset)))))
;? (= dump-trace* (obj blacklist '("sz" "m" "setm" "addr" "cvt0" "cvt1")))
(run 'main)
;? (prn memory*)
;? (prn (as cons completed-routines*))
(let last-routine (deq completed-routines*)
  (aif rep.last-routine!error (prn "error - " it)))
(if (~is 34 memory*.3)
  (prn "F - indirect 'get' works in the presence of default-scope"))
;? (quit)

(reset)
(new-trace "default-scope-and-index-indirect")
(add-fns
  '((main
      ((default-scope scope-address) <- new (scope literal) (5 literal))
      ((1 integer-array-address) <- new (integer-array literal) (4 literal))
      ((2 integer-address) <- index-address (1 integer-array-address deref) (2 offset))
      ((2 integer-address deref) <- copy (34 literal))
      ((3 integer global) <- index (1 integer-array-address deref) (2 offset)))))
;? (= dump-trace* (obj blacklist '("sz" "m" "setm" "addr" "cvt0" "cvt1")))
(run 'main)
;? (prn memory*)
;? (prn (as cons completed-routines*))
(let last-routine (deq completed-routines*)
  (aif rep.last-routine!error (prn "error - " it)))
(if (~is 34 memory*.3)
  (prn "F - indirect 'index' works in the presence of default-scope"))
;? (quit)

(reset)
;? (new-trace "convert-names-default-scope")
(if (~iso (convert-names
            '(((x integer) <- copy (4 literal))
              ((y integer) <- copy (2 literal))
              ; unsafe in general; don't write random values to 'default-scope'
              ((default-scope integer) <- add (x integer) (y integer))))
          '(((1 integer) <- copy (4 literal))
            ((2 integer) <- copy (2 literal))
            ((default-scope integer) <- add (1 integer) (2 integer))))
  (prn "F - convert-names never renames default-scope"))

(reset)
(new-trace "suppress-default-scope")
(add-fns
  '((main
      ((default-scope scope-address) <- new (scope literal) (2 literal))
      ((1 integer global) <- copy (23 literal)))))
(let before Memory-in-use-until
;?   (set dump-trace*)
  (run 'main)
;?   (prn memory*)
  (if (~and (is 23 memory*.1)
            (~is 23 (memory* (+ before 1))))
    (prn "F - default-scope skipped for locations with metadata 'global'")))

; Putting it all together, here's how you define generic functions that run
; different code based on the types of their args.

(reset)
(new-trace "dispatch-clause")
;? (set dump-trace*)
(add-fns
  '((test1
      ; doesn't matter too much how many locals you allocate space for (here 20)
      ; if it's slightly too many -- memory is plentiful
      ; if it's too few -- mu will raise an error
      ((default-scope scope-address) <- new (scope literal) (20 literal))
      ((first-arg-box tagged-value-address) <- arg)
      ; if given integers, add them
      { begin
        ((first-arg integer) (match? boolean) <- maybe-coerce (first-arg-box tagged-value-address deref) (integer literal))
        (break-unless (match? boolean))
        ((second-arg-box tagged-value-address) <- arg)
        ((second-arg integer) <- maybe-coerce (second-arg-box tagged-value-address deref) (integer literal))
        ((result integer) <- add (first-arg integer) (second-arg integer))
        (reply (result integer))
      }
      (reply (nil literal)))
    (main
      ((1 tagged-value-address) <- new-tagged-value (integer literal) (34 literal))
      ((2 tagged-value-address) <- new-tagged-value (integer literal) (3 literal))
      ((3 integer) <- test1 (1 tagged-value-address) (2 tagged-value-address)))))
(run 'main)
;? (prn memory*)
(if (~is memory*.3 37)
  (prn "F - an example function that checks that its oarg is an integer"))
;? (quit)

; todo - test that reply increments pc for caller frame after popping current frame

(reset)
(new-trace "dispatch-multiple-clauses")
;? (set dump-trace*)
(add-fns
  '((test1
      ((default-scope scope-address) <- new (scope literal) (20 literal))
      ((first-arg-box tagged-value-address) <- arg)
      ; if given integers, add them
      { begin
        ((first-arg integer) (match? boolean) <- maybe-coerce (first-arg-box tagged-value-address deref) (integer literal))
        (break-unless (match? boolean))
        ((second-arg-box tagged-value-address) <- arg)
        ((second-arg integer) <- maybe-coerce (second-arg-box tagged-value-address deref) (integer literal))
        ((result integer) <- add (first-arg integer) (second-arg integer))
        (reply (result integer))
      }
      ; if given booleans, or them (it's a silly kind of generic function)
      { begin
        ((first-arg boolean) (match? boolean) <- maybe-coerce (first-arg-box tagged-value-address deref) (boolean literal))
        (break-unless (match? boolean))
        ((second-arg-box tagged-value-address) <- arg)
        ((second-arg boolean) <- maybe-coerce (second-arg-box tagged-value-address deref) (boolean literal))
        ((result boolean) <- or (first-arg boolean) (second-arg boolean))
        (reply (result integer))
      }
      (reply (nil literal)))
    (main
      ((1 tagged-value-address) <- new-tagged-value (boolean literal) (t literal))
      ((2 tagged-value-address) <- new-tagged-value (boolean literal) (nil literal))
      ((3 boolean) <- test1 (1 tagged-value-address) (2 tagged-value-address)))))
;? (each stmt function*!test-fn
;?   (prn "  " stmt))
(run 'main)
;? (wipe dump-trace*)
;? (prn memory*)
(if (~is memory*.3 t)
  (prn "F - an example function that can do different things (dispatch) based on the type of its args or oargs"))
;? (quit)

(reset)
(new-trace "dispatch-multiple-calls")
(add-fns
  '((test1
      ((default-scope scope-address) <- new (scope literal) (20 literal))
      ((first-arg-box tagged-value-address) <- arg)
      ; if given integers, add them
      { begin
        ((first-arg integer) (match? boolean) <- maybe-coerce (first-arg-box tagged-value-address deref) (integer literal))
        (break-unless (match? boolean))
        ((second-arg-box tagged-value-address) <- arg)
        ((second-arg integer) <- maybe-coerce (second-arg-box tagged-value-address deref) (integer literal))
        ((result integer) <- add (first-arg integer) (second-arg integer))
        (reply (result integer))
      }
      ; if given booleans, or them (it's a silly kind of generic function)
      { begin
        ((first-arg boolean) (match? boolean) <- maybe-coerce (first-arg-box tagged-value-address deref) (boolean literal))
        (break-unless (match? boolean))
        ((second-arg-box tagged-value-address) <- arg)
        ((second-arg boolean) <- maybe-coerce (second-arg-box tagged-value-address deref) (boolean literal))
        ((result boolean) <- or (first-arg boolean) (second-arg boolean))
        (reply (result integer))
      }
      (reply (nil literal)))
    (main
      ((1 tagged-value-address) <- new-tagged-value (boolean literal) (t literal))
      ((2 tagged-value-address) <- new-tagged-value (boolean literal) (nil literal))
      ((3 boolean) <- test1 (1 tagged-value-address) (2 tagged-value-address))
      ((10 tagged-value-address) <- new-tagged-value (integer literal) (34 literal))
      ((11 tagged-value-address) <- new-tagged-value (integer literal) (3 literal))
      ((12 integer) <- test1 (10 tagged-value-address) (11 tagged-value-address)))))
(run 'main)
;? (prn memory*)
(if (~and (is memory*.3 t) (is memory*.12 37))
  (prn "F - different calls can exercise different clauses of the same function"))

; A rudimentary process scheduler. You can 'run' multiple functions at once,
; and they share the virtual processor.
;
; There's also a 'fork' primitive to let functions create new threads of
; execution (we call them routines).
;
; Eventually we want to allow callers to influence how much of their CPU they
; give to their 'children', or to rescind a child's running privileges.

(reset)
(new-trace "scheduler")
(add-fns
  '((f1
      ((1 integer) <- copy (3 literal)))
    (f2
      ((2 integer) <- copy (4 literal)))))
(let ninsts (run 'f1 'f2)
  (when (~iso 2 ninsts)
    (prn "F - scheduler didn't run the right number of instructions: " ninsts)))
(if (~iso memory* (obj 1 3  2 4))
  (prn "F - scheduler runs multiple functions: " memory*))
(check-trace-contents "scheduler orders functions correctly"
  '(("schedule" "f1")
    ("schedule" "f2")
  ))
(check-trace-contents "scheduler orders schedule and run events correctly"
  '(("schedule" "f1")
    ("run" "f1 0")
    ("schedule" "f2")
    ("run" "f2 0")
  ))

; The scheduler needs to keep track of the call stack for each routine.
; Eventually we'll want to save this information in mu's address space itself,
; along with the types array, the magic buffers for args and oargs, and so on.
;
; Eventually we want the right stack-management primitives to build delimited
; continuations in mu.

; Routines can throw errors.
(reset)
(new-trace "array-bounds-check")
(add-fns
  '((main
      ((1 integer) <- copy (2 literal))
      ((2 integer) <- copy (23 literal))
      ((3 integer) <- copy (24 literal))
      ((4 integer) <- index (1 integer-array) (2 literal)))))
;? (set dump-trace*)
(run 'main)
;? (prn memory*)
(let last-routine (deq completed-routines*)
  (if (no rep.last-routine!error)
    (prn "F - 'index' throws an error if out of bounds")))

; Lightweight tools can also operate on quoted lists of statements surrounded
; by square brackets. In the example below, we mimic Go's 'defer' keyword
; using 'convert-quotes'. It lets us write code anywhere in a function, but
; have it run just before the function exits. Great for keeping code to
; reclaim memory or other resources close to the code to allocate it. (C++
; programmers know this as RAII.) We'll use 'defer' when we build a memory
; deallocation routine like C's 'free'.
;
; More powerful reorderings are also possible like in Literate Programming or
; Aspect-Oriented Programming; one advantage of prohibiting arbitrarily nested
; code is that we can naturally name 'join points' wherever we want.

(reset)
;? (new-trace "convert-quotes-defer")
(if (~iso (convert-quotes
            '(((1 integer) <- copy (4 literal))
              (defer [
                       ((3 integer) <- copy (6 literal))
                     ])
              ((2 integer) <- copy (5 literal))))
          '(((1 integer) <- copy (4 literal))
            ((2 integer) <- copy (5 literal))
            ((3 integer) <- copy (6 literal))))
  (prn "F - convert-quotes can handle 'defer'"))

; Synchronization using channels like in Erlang or Go.
; The two ends of a channel will usually belong to different routines, but
; each end should only be used by a single one. Don't try to read from or
; write to it from multiple routines at once.

(reset)
(new-trace "channel-new")
(add-fns
  '((main
      ((1 channel-address) <- new-channel (3 literal)))))
;? (set dump-trace*)
(run 'main)
;? (prn memory*)
(if (or (~is 0 (memory* memory*.1))
        (~is 0 (memory* (+ 1 memory*.1))))
  (prn "F - 'new-channel' initializes 'first-full and 'first-free to 0"))

(reset)
(new-trace "channel-write")
(add-fns
  '((main
      ((1 channel-address) <- new-channel (3 literal))
      ((2 integer-address) <- new (integer literal))
      ((2 integer-address deref) <- copy (34 literal))
      ((3 tagged-value-address) <- new-tagged-value (integer-address literal) (2 integer-address))
      ((1 channel-address deref) <- write (1 channel-address deref) (3 tagged-value-address deref)))))
;? (set dump-trace*)
;? (= dump-trace* (obj blacklist '("sz" "m" "setm" "addr" "array-len" "cvt0" "cvt1")))
(run 'main)
;? (prn memory*)
(if (or (~is 0 (memory* memory*.1))
        (~is 1 (memory* (+ 1 memory*.1))))
  (prn "F - 'write' enqueues item to channel"))

(reset)
(new-trace "channel-read")
(add-fns
  '((main
      ((1 channel-address) <- new-channel (3 literal))
      ((2 integer-address) <- new (integer literal))
      ((2 integer-address deref) <- copy (34 literal))
      ((3 tagged-value-address) <- new-tagged-value (integer-address literal) (2 integer-address))
      ((1 channel-address deref) <- write (1 channel-address deref) (3 tagged-value-address deref))
      ((4 tagged-value) (1 channel-address deref) <- read (1 channel-address deref))
      ((5 integer-address) <- maybe-coerce (4 tagged-value) (integer-address literal)))))
;? (set dump-trace*)
;? (= dump-trace* (obj blacklist '("sz" "m" "setm" "addr" "array-len" "cvt0" "cvt1")))
(run 'main)
;? (prn int-canon.memory*)
(if (~is memory*.5 memory*.2)
  (prn "F - 'read' returns written value"))
(if (or (~is 1 (memory* memory*.1))
        (~is 1 (memory* (+ 1 memory*.1))))
  (prn "F - 'read' dequeues item from channel"))

(reset)  ; end file with this to persist the trace for the final test