This commit is contained in:
Kartik Agaram 2020-02-01 12:14:12 -08:00
parent 9977cfe53c
commit 924ed08aca
4 changed files with 226 additions and 34 deletions

View File

@ -66,8 +66,8 @@ statements in Mu translate to a single machine code instruction. Variables
reside in memory by default. Programs must specify registers when they want to
use them. Functions must return results in registers. Execution begins at the
function `main`, which always returns its result in register `ebx`. [This post](http://akkartik.name/post/mu-2019-2)
has more details. You can see a complete list of supported instructions and
their translations in [this summary](mu_instructions).
has more details, and there's a [summary](mu_summary) of all supported
instructions.
## SubX
@ -715,19 +715,21 @@ If you're still reading, here are some more things to check out:
a) Try running the tests: `./test_apps`
b) Check out the online help. Starting point: `./bootstrap`
b) There's a handy [summary](mu_instructions) of how the Mu compiler translates
instructions to SubX.
c) Familiarize yourself with `./bootstrap help opcodes`. If you program in Mu
you'll spend a lot of time with it. (It's also [in this repo](https://github.com/akkartik/mu/blob/master/subx_opcodes).)
c) Check out the online help on SubX. Starting point: `./bootstrap`
d) Familiarize yourself with the list of opcodes supported in SubX: `./bootstrap
help opcodes`. (It's also [in this repo](https://github.com/akkartik/mu/blob/master/subx_opcodes).)
[Here](https://lobste.rs/s/qglfdp/subx_minimalist_assembly_language_for#c_o9ddqk)
are some tips on my setup for quickly finding the right opcode for any
situation from within Vim.
d) Try working on [the starter exercises](https://github.com/akkartik/mu/pulls)
e) Try working on [some starter SubX exercises](https://github.com/akkartik/mu/pulls)
(labelled `hello`).
e) SubX comes with some useful [syntax sugar](http://akkartik.name/post/mu-2019-1).
Check it out.
f) SubX comes with some useful [syntax sugar](http://akkartik.name/post/mu-2019-1).
## Credits

View File

@ -127,26 +127,14 @@ compare var, n {.name="compare", .inouts=[var, n],
var/reg <- multiply var2 {.name="multiply", .inouts=[var2], .outputs=[reg], .subx-name="0f af/multiply", .rm32="*(ebp+" inouts[0].stack-offset ")", .r32=outputs[0]}
Jumps have a slightly simpler format. Most of the time they take no inouts or
outputs. Occasionally you give them a label for a block to jump to the start
or end of.
outputs. Occasionally you give them a label for a containing block to jump to
the start or end of.
break-if-= {.name="break-if-=", .subx-name="0f 84/jump-if-= break/disp32"}
break-if-= label {.name="break-if-=", .inouts=[label], .subx-name="0f 84/jump-if-=", .disp32=inouts[0] ":break"}
break-if-!= {.name="break-if-!=", .subx-name="0f 85/jump-if-!= break/disp32"}
break-if-!= label {.name="break-if-!=", .inouts=[label], .subx-name="0f 85/jump-if-!=", .disp32=inouts[0] ":break"}
Inequalities are similar, but have unsigned and signed variants. We assume
unsigned variants are only ever used to compare addresses.
break-if-addr< {.name="break-if-addr<", .subx-name="0f 82/jump-if-addr< break/disp32"}
break-if-addr< label {.name="break-if-addr<", .inouts=[label], .subx-name="0f 82/jump-if-addr<", .disp32=inouts[0] ":break"}
break-if-addr> {.name="break-if-addr>", .subx-name="0f 87/jump-if-addr> break/disp32"}
break-if-addr> label {.name="break-if-addr>", .inouts=[label], .subx-name="0f 87/jump-if-addr>", .disp32=inouts[0] ":break"}
break-if-addr<= {.name="break-if-addr<=", .subx-name="0f 86/jump-if-addr<= break/disp32"}
break-if-addr<= label {.name="break-if-addr<=", .inouts=[label], .subx-name="0f 86/jump-if-addr<=", .disp32=inouts[0] ":break"}
break-if-addr>= {.name="break-if-addr>=", .subx-name="0f 83/jump-if-addr>= break/disp32"}
break-if-addr>= label {.name="break-if-addr>=", .inouts=[label], .subx-name="0f 83/jump-if-addr>=", .disp32=inouts[0] ":break"}
break-if-< {.name="break-if-<", .subx-name="0f 8c/jump-if-< break/disp32"}
break-if-< label {.name="break-if-<", .inouts=[label], .subx-name="0f 8c/jump-if-<", .disp32=inouts[0] ":break"}
break-if-> {.name="break-if->", .subx-name="0f 8f/jump-if-> break/disp32"}
@ -156,6 +144,15 @@ break-if-<= label {.name="break-if-<=", .inouts=[label],
break-if->= {.name="break-if->=", .subx-name="0f 8d/jump-if->= break/disp32"}
break-if->= label {.name="break-if->=", .inouts=[label], .subx-name="0f 8d/jump-if->=", .disp32=inouts[0] ":break"}
break-if-addr< {.name="break-if-addr<", .subx-name="0f 82/jump-if-addr< break/disp32"}
break-if-addr< label {.name="break-if-addr<", .inouts=[label], .subx-name="0f 82/jump-if-addr<", .disp32=inouts[0] ":break"}
break-if-addr> {.name="break-if-addr>", .subx-name="0f 87/jump-if-addr> break/disp32"}
break-if-addr> label {.name="break-if-addr>", .inouts=[label], .subx-name="0f 87/jump-if-addr>", .disp32=inouts[0] ":break"}
break-if-addr<= {.name="break-if-addr<=", .subx-name="0f 86/jump-if-addr<= break/disp32"}
break-if-addr<= label {.name="break-if-addr<=", .inouts=[label], .subx-name="0f 86/jump-if-addr<=", .disp32=inouts[0] ":break"}
break-if-addr>= {.name="break-if-addr>=", .subx-name="0f 83/jump-if-addr>= break/disp32"}
break-if-addr>= label {.name="break-if-addr>=", .inouts=[label], .subx-name="0f 83/jump-if-addr>=", .disp32=inouts[0] ":break"}
Finally, we repeat all the 'break' variants almost identically for 'loop'
instructions. This works because the compiler inserts ':loop' labels at the
start of such named blocks, and ':break' labels at the end.
@ -165,15 +162,6 @@ loop-if-= label {.name="loop-if-=", .inouts=[label],
loop-if-!= {.name="loop-if-!=", .subx-name="0f 85/jump-if-!= loop/disp32"}
loop-if-!= label {.name="loop-if-!=", .inouts=[label], .subx-name="0f 85/jump-if-!=", .disp32=inouts[0] ":loop"}
loop-if-addr< {.name="loop-if-addr<", .subx-name="0f 82/jump-if-addr< loop/disp32"}
loop-if-addr< label {.name="loop-if-addr<", .inouts=[label], .subx-name="0f 82/jump-if-addr<", .disp32=inouts[0] ":loop"}
loop-if-addr> {.name="loop-if-addr>", .subx-name="0f 87/jump-if-addr> loop/disp32"}
loop-if-addr> label {.name="loop-if-addr>", .inouts=[label], .subx-name="0f 87/jump-if-addr>", .disp32=inouts[0] ":loop"}
loop-if-addr<= {.name="loop-if-addr<=", .subx-name="0f 86/jump-if-addr<= loop/disp32"}
loop-if-addr<= label {.name="loop-if-addr<=", .inouts=[label], .subx-name="0f 86/jump-if-addr<=", .disp32=inouts[0] ":loop"}
loop-if-addr>= {.name="loop-if-addr>=", .subx-name="0f 83/jump-if-addr>= loop/disp32"}
loop-if-addr>= label {.name="loop-if-addr>=", .inouts=[label], .subx-name="0f 83/jump-if-addr>=", .disp32=inouts[0] ":loop"}
loop-if-< {.name="loop-if-<", .subx-name="0f 8c/jump-if-< loop/disp32"}
loop-if-< label {.name="loop-if-<", .inouts=[label], .subx-name="0f 8c/jump-if-<", .disp32=inouts[0] ":loop"}
loop-if-> {.name="loop-if->", .subx-name="0f 8f/jump-if-> loop/disp32"}
@ -183,10 +171,20 @@ loop-if-<= label {.name="loop-if-<=", .inouts=[label],
loop-if->= {.name="loop-if->=", .subx-name="0f 8d/jump-if->= loop/disp32"}
loop-if->= label {.name="loop-if->=", .inouts=[label], .subx-name="0f 8d/jump-if->=", .disp32=inouts[0] ":loop"}
There are also unconditional loop instructions. So far it doesn't seem like
unconditional breaks have much use.
loop-if-addr< {.name="loop-if-addr<", .subx-name="0f 82/jump-if-addr< loop/disp32"}
loop-if-addr< label {.name="loop-if-addr<", .inouts=[label], .subx-name="0f 82/jump-if-addr<", .disp32=inouts[0] ":loop"}
loop-if-addr> {.name="loop-if-addr>", .subx-name="0f 87/jump-if-addr> loop/disp32"}
loop-if-addr> label {.name="loop-if-addr>", .inouts=[label], .subx-name="0f 87/jump-if-addr>", .disp32=inouts[0] ":loop"}
loop-if-addr<= {.name="loop-if-addr<=", .subx-name="0f 86/jump-if-addr<= loop/disp32"}
loop-if-addr<= label {.name="loop-if-addr<=", .inouts=[label], .subx-name="0f 86/jump-if-addr<=", .disp32=inouts[0] ":loop"}
loop-if-addr>= {.name="loop-if-addr>=", .subx-name="0f 83/jump-if-addr>= loop/disp32"}
loop-if-addr>= label {.name="loop-if-addr>=", .inouts=[label], .subx-name="0f 83/jump-if-addr>=", .disp32=inouts[0] ":loop"}
Finally, unconditional jumps:
loop {.name="loop", .subx-name="e9/jump loop/disp32"}
loop label {.name="loop", .inouts=[label], .subx-name="e9/jump", .disp32=inouts[0] ":loop"}
(So far it doesn't seem like unconditional breaks have much use.)
vim:ft=c:nowrap

192
mu_summary Normal file
View File

@ -0,0 +1,192 @@
Mu programs are lists of functions. Each function has the following form:
fn _name_ _inouts_with_types_ -> _outputs_with_types_ {
_instructions_
}
Instructions may be primitives or function calls. Either way, all instructions
have one of the following forms:
# defining variables
var _name_: _type_
var _name_/_register_: _type_
# doing things with variables
_operation_ _inouts_
_outputs_ <- _operation_ _inouts_
Instructions and functions may have inouts and outputs. Both inouts and
outputs are variables.
As seen above, variables can be defined to live in a register, like this:
n/eax
Variables not assigned a register live in the stack.
Function inouts must always be on the stack, and outputs must always be in
registers. A function call must always write to the exact registers its
definition requires. For example:
fn foo -> x/eax: int {
...
}
fn main {
a/eax <- foo # ok
a/ebx <- foo # wrong
}
Primitive inouts may be on the stack or in registers, but outputs must always
be in registers.
Functions can contain nested blocks inside { and }. Variables defined in a
block don't exist outside it.
## Primitive instructions
Primitive instructions currently supported in Mu:
var/eax <- increment
var/ecx <- increment
var/edx <- increment
var/ebx <- increment
var/esi <- increment
var/edi <- increment
increment var
var/eax <- decrement
var/ecx <- decrement
var/edx <- decrement
var/ebx <- decrement
var/esi <- decrement
var/edi <- decrement
decrement var
var1/reg1 <- add var2/reg2
var/reg <- add var2
add-to var1, var2/reg
var/eax <- add n
var/reg <- add n
add-to var, n
var1/reg1 <- sub var2/reg2
var/reg <- sub var2
sub-from var1, var2/reg
var/eax <- sub n
var/reg <- sub n
sub-from var, n
var1/reg1 <- and var2/reg2
var/reg <- and var2
and-with var1, var2/reg
var/eax <- and n
var/reg <- and n
and-with var, n
var1/reg1 <- or var2/reg2
var/reg <- or var2
or-with var1, var2/reg
var/eax <- or n
var/reg <- or n
or-with var, n
var1/reg1 <- xor var2/reg2
var/reg <- xor var2
xor-with var1, var2/reg
var/eax <- xor n
var/reg <- xor n
xor-with var, n
var/eax <- copy n
var/ecx <- copy n
var/edx <- copy n
var/ebx <- copy n
var/esi <- copy n
var/edi <- copy n
var1/reg1 <- copy var2/reg2
copy-to var1, var2/reg
var/reg <- copy var2
var/reg <- copy n
copy-to var, n
compare var1, var2/reg
compare var1/reg, var2
compare var/eax, n
compare var, n
var/reg <- multiply var2
## Primitive jump instructions
There are two kinds of jumps, both with many variations: `break` and `loop`.
`break` instructions jump to the end of the containing block. `loop` instructions
jump to the beginning of the containing block.
Jumps can take an optional label starting with '$':
loop $foo
This instruction jumps to the beginning of the block called $foo. It must lie
somewhere inside such a box. Jumps are only legal to containing blocks.
There are two unconditional jumps:
loop
loop label
# unconditional break instructions don't seem useful
The remaining jump instructions are all conditional. Conditional jumps rely on
the result of the most recently executed `compare` instruction. (To keep
programs easy to read, keep compare instructions close to the jump that uses
them.)
break-if-=
break-if-= label
break-if-!=
break-if-!= label
Inequalities are similar, but have unsigned and signed variants. We assume
unsigned variants are only ever used to compare addresses.
break-if-<
break-if-< label
break-if->
break-if-> label
break-if-<=
break-if-<= label
break-if->=
break-if->= label
break-if-addr<
break-if-addr< label
break-if-addr>
break-if-addr> label
break-if-addr<=
break-if-addr<= label
break-if-addr>=
break-if-addr>= label
Similarly, conditional loops:
loop-if-=
loop-if-= label
loop-if-!=
loop-if-!= label
loop-if-<
loop-if-< label
loop-if->
loop-if-> label
loop-if-<=
loop-if-<= label
loop-if->=
loop-if->= label
loop-if-addr<
loop-if-addr< label
loop-if-addr>
loop-if-addr> label
loop-if-addr<=
loop-if-addr<= label
loop-if-addr>=
loop-if-addr>= label

View File

@ -100,7 +100,7 @@ Opcodes currently supported by SubX:
0f 8e: jump disp32 bytes away if lesser or equal (signed), if ZF is set or SF != OF (jcc/jle/jng)
0f 8f: jump disp32 bytes away if greater (signed), if ZF is unset and SF == OF (jcc/jg/jnle)
0f af: multiply rm32 into r32 (imul)
Run `subx help instructions` for details on words like 'r32' and 'disp8'.
Run `bootstrap help instructions` for details on words like 'r32' and 'disp8'.
For complete details on these instructions, consult the IA-32 manual (volume 2).
There's various versions of it online, such as https://c9x.me/x86.
The mnemonics in brackets will help you locate each instruction.