Mu programs are lists of functions. Each function has the following form:

  fn _name_ _inouts_with_types_ -> _outputs_with_types_ {
    _instructions_
  }

Each function has a header line, and some number of instructions, each on a
separate line.

Instructions may be primitives or function calls. Either way, all instructions
have one of the following forms:

  # defining variables
  var _name_: _type_
  var _name_/_register_: _type_

  # doing things with variables
  _operation_ _inouts_
  _outputs_ <- _operation_ _inouts_

Instructions and functions may have inouts and outputs. Both inouts and
outputs are variables.

As seen above, variables can be defined to live in a register, like this:

  n/eax

Variables not assigned a register live on the stack.

Function inouts must always be on the stack, and outputs must always be in
registers. A function call must always write to the exact registers its
definition requires. For example:

  fn foo -> x/eax: int {
    ...
  }
  fn main {
    a/eax <- foo  # ok
    a/ebx <- foo  # wrong
  }

Primitive inouts may be on the stack or in registers, but outputs must always
be in registers.

Functions can contain nested blocks inside { and }. Variables defined in a
block don't exist outside it.

  {
    _instructions_
    {
      _more instructions_
    }
  }

Blocks can be named like so:

  $name: {
    _instructions_
  }

## Primitive instructions

Primitive instructions currently supported in Mu ('n' indicates a literal
integer rather than a variable, and 'var/reg' indicates a variable in a
register):

  var/reg <- increment
  increment var
  var/reg <- decrement
  decrement var
  var1/reg1 <- add var2/reg2
  var/reg <- add var2
  add-to var1, var2/reg
  var/reg <- add n
  add-to var, n

  var1/reg1 <- sub var2/reg2
  var/reg <- sub var2
  sub-from var1, var2/reg
  var/reg <- sub n
  sub-from var, n

  var1/reg1 <- and var2/reg2
  var/reg <- and var2
  and-with var1, var2/reg
  var/reg <- and n
  and-with var, n

  var1/reg1 <- or var2/reg2
  var/reg <- or var2
  or-with var1, var2/reg
  var/reg <- or n
  or-with var, n

  var1/reg1 <- xor var2/reg2
  var/reg <- xor var2
  xor-with var1, var2/reg
  var/reg <- xor n
  xor-with var, n

  var1/reg1 <- copy var2/reg2
  copy-to var1, var2/reg
  var/reg <- copy var2
  var/reg <- copy n
  copy-to var, n

  compare var1, var2/reg
  compare var1/reg, var2
  compare var/eax, n
  compare var, n

  var/reg <- multiply var2

Notice that there are no primitive instructions operating on two variables in
memory. That's a restriction of the underlying x86 processor.

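As a minimal sketch of working within this restriction, the function below adds two stack variables by staging one of them in a register first. The name `sum-of-two` is hypothetical; the instructions are limited to forms from the list above, and the code has not been run through the Mu translator.

```mu
# Hypothetical example: a and b are inouts, so both live on the stack.
fn sum-of-two a: int, b: int -> result/eax: int {
  # no primitive adds memory to memory, so copy one operand into a register
  result <- copy a   # var/reg <- copy var2
  result <- add b    # var/reg <- add var2
}
```

A caller has to receive the result in eax, for example `x/eax <- sum-of-two a, b`.
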
Any instruction above that takes a variable in memory can be replaced with a
dereference (`*`) of an address variable in a register. But you can't dereference
variables in memory.

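A short sketch of what this rule implies for an address that arrives as an inout. The function name `increment-through` is hypothetical, and the code is untested; it assumes the combined `var ... <- ...` definition form that appears later in this document.

```mu
# Hypothetical example: p is an inout, so it lives on the stack
# and cannot be dereferenced directly.
fn increment-through p: (addr int) {
  var q/eax: (addr int) <- copy p   # bring the address into a register
  increment *q                      # 'increment var', with var replaced by *q
}
```
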
## Primitive jump instructions

There are two kinds of jumps, both with many variations: `break` and `loop`.
`break` instructions jump to the end of the containing block. `loop` instructions
jump to the beginning of the containing block.

Jumps can take an optional label starting with '$':

  loop $foo

This instruction jumps to the beginning of the block called $foo. It must lie
somewhere inside such a block. Jumps are only legal to containing blocks. Use
named blocks with restraint; jumps to places far away can get confusing.

There are two unconditional jumps, each with an optional label:

  loop
  loop label
  break
  break label

The remaining jump instructions are all conditional. Conditional jumps rely on
the result of the most recently executed `compare` instruction. (To keep
programs easy to read, keep compare instructions close to the jump that uses
them.)

  break-if-=
  break-if-= label
  break-if-!=
  break-if-!= label

Inequalities are similar, but have unsigned and signed variants. We assume
unsigned variants are only ever used to compare addresses.

  break-if-<
  break-if-< label
  break-if->
  break-if-> label
  break-if-<=
  break-if-<= label
  break-if->=
  break-if->= label

  break-if-addr<
  break-if-addr< label
  break-if-addr>
  break-if-addr> label
  break-if-addr<=
  break-if-addr<= label
  break-if-addr>=
  break-if-addr>= label

Similarly, conditional loops:

  loop-if-=
  loop-if-= label
  loop-if-!=
  loop-if-!= label

  loop-if-<
  loop-if-< label
  loop-if->
  loop-if-> label
  loop-if-<=
  loop-if-<= label
  loop-if->=
  loop-if->= label

  loop-if-addr<
  loop-if-addr< label
  loop-if-addr>
  loop-if-addr> label
  loop-if-addr<=
  loop-if-addr<= label
  loop-if-addr>=
  loop-if-addr>= label

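To show how `compare`, `break` and `loop` fit together, here is a rough sketch of a counting loop and of a conditional jump to a named block. The function names `sum-to` and `clamp-to-zero` and the label `$clamp` are hypothetical, as is the use of ecx as a second register; the code sticks to instruction forms listed in this document and has not been run.

```mu
# Hypothetical example: add up the integers from 1 to n.
fn sum-to n: int -> result/eax: int {
  result <- copy 0
  var i/ecx: int <- copy 1
  {
    compare i, n      # compare var1/reg, var2
    break-if->        # signed comparison: leave the block once i > n
    result <- add i   # var1/reg1 <- add var2/reg2
    i <- increment
    loop              # jump back to the start of the block
  }
}

# Hypothetical example: replace negative values with 0.
fn clamp-to-zero n: int -> result/eax: int {
  result <- copy n
  $clamp: {
    compare result, 0     # compare var/eax, n
    break-if->= $clamp    # non-negative: jump to the end of $clamp
    result <- copy 0
  }
}
```

In both functions the `compare` sits directly before the jump that depends on it, as recommended above.
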
## Address operations

  var/reg: (addr T) <- address var: T  # var must be in mem (on the stack)

## Array operations

  var/reg: int <- length arr/reg: (addr array T)
  var/reg: (addr T) <- index arr/reg: (addr array T), idx/reg: int
  var/reg: (addr T) <- index arr: (array T sz), idx/reg: int
  var/reg: (addr T) <- index arr/reg: (addr array T), n
  var/reg: (addr T) <- index arr: (array T sz), n

  var/reg: (offset T) <- compute-offset arr: (addr array T), idx/reg: int  # arr can be in reg or mem
  var/reg: (offset T) <- compute-offset arr: (addr array T), idx: int      # arr can be in reg or mem
  var/reg: (addr T) <- index arr/reg: (addr array T), idx/reg: (offset T)

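The following untested sketch combines `length` and `index` to walk over an array. The function name `sum-array` and the choice of esi, edx, ecx and ebx as registers are assumptions for illustration. The length is deliberately parked on the stack so that the `compare` matches a form listed under primitive instructions.

```mu
# Hypothetical example: add up the elements of an int array.
fn sum-array arr: (addr array int) -> result/eax: int {
  result <- copy 0
  # inouts live on the stack, but 'length' and 'index' take the array address in a register
  var a/esi: (addr array int) <- copy arr
  # keep the length on the stack so the compare below is 'compare var1/reg, var2'
  var len: int
  var tmp/edx: int <- length a
  copy-to len, tmp
  var i/ecx: int <- copy 0
  {
    compare i, len
    break-if->=                              # stop once i >= len
    var elem/ebx: (addr int) <- index a, i   # an address, not the element itself
    result <- add *elem                      # 'var/reg <- add var2', with var2 replaced by *elem
    i <- increment
    loop
  }
}
```

Note that `index` yields an `(addr T)`, so reading the element takes a dereference.
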
## User-defined types

  var/reg: (addr T_f) <- get var/reg: (addr T), f
    where record (product) type T has elements a, b, c, ... of types T_a, T_b, T_c, ...
  var/reg: (addr T_f) <- get var: T, f

## Handles for safe access to the heap

Say we created a handle like this on the stack (it can't be in a register):
  var x: (handle T)
  allocate Heap, T, x

You can copy handles to another variable on the stack like this:
  var y: (handle T)
  copy-handle-to y, x

You can also save handles inside other user-defined types like this:
  var y/reg: (addr handle T_f) <- get var: (addr T), f
  copy-handle-to *y, x

Or this:
  var y/reg: (addr handle T) <- index arr: (addr array handle T), n
  copy-handle-to *y, x

Handles can be converted into addresses like this:
  var y/reg: (addr T) <- lookup x

It's illegal to continue using this addr after calling a function that reclaims
heap memory. You have to repeat the lookup.

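As a rough, untested illustration of repeating the lookup, the sketch below allocates an int, writes through the looked-up address, calls a function that might reclaim heap memory, and then looks the handle up again before touching the payload. Both `handle-demo` and `may-reclaim-memory` are hypothetical names; the latter stands in for any function that reclaims heap memory and is not defined here.

```mu
# Hypothetical example: repeat the lookup after heap memory may have been reclaimed.
fn handle-demo {
  var x: (handle int)                 # handles live on the stack
  allocate Heap, int, x
  var p/eax: (addr int) <- lookup x
  copy-to *p, 42                      # 'copy-to var, n', with var replaced by *p
  may-reclaim-memory                  # hypothetical call; p must not be used after this
  p <- lookup x                       # look the handle up again for a fresh address
  increment *p
}
```
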