5968
This commit is contained in:
parent
9977cfe53c
commit
924ed08aca
18
README.md
18
README.md
|
@ -66,8 +66,8 @@ statements in Mu translate to a single machine code instruction. Variables
|
||||||
reside in memory by default. Programs must specify registers when they want to
|
reside in memory by default. Programs must specify registers when they want to
|
||||||
use them. Functions must return results in registers. Execution begins at the
|
use them. Functions must return results in registers. Execution begins at the
|
||||||
function `main`, which always returns its result in register `ebx`. [This post](http://akkartik.name/post/mu-2019-2)
|
function `main`, which always returns its result in register `ebx`. [This post](http://akkartik.name/post/mu-2019-2)
|
||||||
has more details. You can see a complete list of supported instructions and
|
has more details, and there's a [summary](mu_summary) of all supported
|
||||||
their translations in [this summary](mu_instructions).
|
instructions.
|
||||||
|
|
||||||
## SubX
|
## SubX
|
||||||
|
|
||||||
|
@ -715,19 +715,21 @@ If you're still reading, here are some more things to check out:
|
||||||
|
|
||||||
a) Try running the tests: `./test_apps`
|
a) Try running the tests: `./test_apps`
|
||||||
|
|
||||||
b) Check out the online help. Starting point: `./bootstrap`
|
b) There's a handy [summary](mu_instructions) of how the Mu compiler translates
|
||||||
|
instructions to SubX.
|
||||||
|
|
||||||
c) Familiarize yourself with `./bootstrap help opcodes`. If you program in Mu
|
c) Check out the online help on SubX. Starting point: `./bootstrap`
|
||||||
you'll spend a lot of time with it. (It's also [in this repo](https://github.com/akkartik/mu/blob/master/subx_opcodes).)
|
|
||||||
|
d) Familiarize yourself with the list of opcodes supported in SubX: `./bootstrap
|
||||||
|
help opcodes`. (It's also [in this repo](https://github.com/akkartik/mu/blob/master/subx_opcodes).)
|
||||||
[Here](https://lobste.rs/s/qglfdp/subx_minimalist_assembly_language_for#c_o9ddqk)
|
[Here](https://lobste.rs/s/qglfdp/subx_minimalist_assembly_language_for#c_o9ddqk)
|
||||||
are some tips on my setup for quickly finding the right opcode for any
|
are some tips on my setup for quickly finding the right opcode for any
|
||||||
situation from within Vim.
|
situation from within Vim.
|
||||||
|
|
||||||
d) Try working on [the starter exercises](https://github.com/akkartik/mu/pulls)
|
e) Try working on [some starter SubX exercises](https://github.com/akkartik/mu/pulls)
|
||||||
(labelled `hello`).
|
(labelled `hello`).
|
||||||
|
|
||||||
e) SubX comes with some useful [syntax sugar](http://akkartik.name/post/mu-2019-1).
|
f) SubX comes with some useful [syntax sugar](http://akkartik.name/post/mu-2019-1).
|
||||||
Check it out.
|
|
||||||
|
|
||||||
## Credits
|
## Credits
|
||||||
|
|
||||||
|
|
|
@ -127,26 +127,14 @@ compare var, n {.name="compare", .inouts=[var, n],
|
||||||
var/reg <- multiply var2 {.name="multiply", .inouts=[var2], .outputs=[reg], .subx-name="0f af/multiply", .rm32="*(ebp+" inouts[0].stack-offset ")", .r32=outputs[0]}
|
var/reg <- multiply var2 {.name="multiply", .inouts=[var2], .outputs=[reg], .subx-name="0f af/multiply", .rm32="*(ebp+" inouts[0].stack-offset ")", .r32=outputs[0]}
|
||||||
|
|
||||||
Jumps have a slightly simpler format. Most of the time they take no inouts or
|
Jumps have a slightly simpler format. Most of the time they take no inouts or
|
||||||
outputs. Occasionally you give them a label for a block to jump to the start
|
outputs. Occasionally you give them a label for a containing block to jump to
|
||||||
or end of.
|
the start or end of.
|
||||||
|
|
||||||
break-if-= {.name="break-if-=", .subx-name="0f 84/jump-if-= break/disp32"}
|
break-if-= {.name="break-if-=", .subx-name="0f 84/jump-if-= break/disp32"}
|
||||||
break-if-= label {.name="break-if-=", .inouts=[label], .subx-name="0f 84/jump-if-=", .disp32=inouts[0] ":break"}
|
break-if-= label {.name="break-if-=", .inouts=[label], .subx-name="0f 84/jump-if-=", .disp32=inouts[0] ":break"}
|
||||||
break-if-!= {.name="break-if-!=", .subx-name="0f 85/jump-if-!= break/disp32"}
|
break-if-!= {.name="break-if-!=", .subx-name="0f 85/jump-if-!= break/disp32"}
|
||||||
break-if-!= label {.name="break-if-!=", .inouts=[label], .subx-name="0f 85/jump-if-!=", .disp32=inouts[0] ":break"}
|
break-if-!= label {.name="break-if-!=", .inouts=[label], .subx-name="0f 85/jump-if-!=", .disp32=inouts[0] ":break"}
|
||||||
|
|
||||||
Inequalities are similar, but have unsigned and signed variants. We assume
|
|
||||||
unsigned variants are only ever used to compare addresses.
|
|
||||||
|
|
||||||
break-if-addr< {.name="break-if-addr<", .subx-name="0f 82/jump-if-addr< break/disp32"}
|
|
||||||
break-if-addr< label {.name="break-if-addr<", .inouts=[label], .subx-name="0f 82/jump-if-addr<", .disp32=inouts[0] ":break"}
|
|
||||||
break-if-addr> {.name="break-if-addr>", .subx-name="0f 87/jump-if-addr> break/disp32"}
|
|
||||||
break-if-addr> label {.name="break-if-addr>", .inouts=[label], .subx-name="0f 87/jump-if-addr>", .disp32=inouts[0] ":break"}
|
|
||||||
break-if-addr<= {.name="break-if-addr<=", .subx-name="0f 86/jump-if-addr<= break/disp32"}
|
|
||||||
break-if-addr<= label {.name="break-if-addr<=", .inouts=[label], .subx-name="0f 86/jump-if-addr<=", .disp32=inouts[0] ":break"}
|
|
||||||
break-if-addr>= {.name="break-if-addr>=", .subx-name="0f 83/jump-if-addr>= break/disp32"}
|
|
||||||
break-if-addr>= label {.name="break-if-addr>=", .inouts=[label], .subx-name="0f 83/jump-if-addr>=", .disp32=inouts[0] ":break"}
|
|
||||||
|
|
||||||
break-if-< {.name="break-if-<", .subx-name="0f 8c/jump-if-< break/disp32"}
|
break-if-< {.name="break-if-<", .subx-name="0f 8c/jump-if-< break/disp32"}
|
||||||
break-if-< label {.name="break-if-<", .inouts=[label], .subx-name="0f 8c/jump-if-<", .disp32=inouts[0] ":break"}
|
break-if-< label {.name="break-if-<", .inouts=[label], .subx-name="0f 8c/jump-if-<", .disp32=inouts[0] ":break"}
|
||||||
break-if-> {.name="break-if->", .subx-name="0f 8f/jump-if-> break/disp32"}
|
break-if-> {.name="break-if->", .subx-name="0f 8f/jump-if-> break/disp32"}
|
||||||
|
@ -156,6 +144,15 @@ break-if-<= label {.name="break-if-<=", .inouts=[label],
|
||||||
break-if->= {.name="break-if->=", .subx-name="0f 8d/jump-if->= break/disp32"}
|
break-if->= {.name="break-if->=", .subx-name="0f 8d/jump-if->= break/disp32"}
|
||||||
break-if->= label {.name="break-if->=", .inouts=[label], .subx-name="0f 8d/jump-if->=", .disp32=inouts[0] ":break"}
|
break-if->= label {.name="break-if->=", .inouts=[label], .subx-name="0f 8d/jump-if->=", .disp32=inouts[0] ":break"}
|
||||||
|
|
||||||
|
break-if-addr< {.name="break-if-addr<", .subx-name="0f 82/jump-if-addr< break/disp32"}
|
||||||
|
break-if-addr< label {.name="break-if-addr<", .inouts=[label], .subx-name="0f 82/jump-if-addr<", .disp32=inouts[0] ":break"}
|
||||||
|
break-if-addr> {.name="break-if-addr>", .subx-name="0f 87/jump-if-addr> break/disp32"}
|
||||||
|
break-if-addr> label {.name="break-if-addr>", .inouts=[label], .subx-name="0f 87/jump-if-addr>", .disp32=inouts[0] ":break"}
|
||||||
|
break-if-addr<= {.name="break-if-addr<=", .subx-name="0f 86/jump-if-addr<= break/disp32"}
|
||||||
|
break-if-addr<= label {.name="break-if-addr<=", .inouts=[label], .subx-name="0f 86/jump-if-addr<=", .disp32=inouts[0] ":break"}
|
||||||
|
break-if-addr>= {.name="break-if-addr>=", .subx-name="0f 83/jump-if-addr>= break/disp32"}
|
||||||
|
break-if-addr>= label {.name="break-if-addr>=", .inouts=[label], .subx-name="0f 83/jump-if-addr>=", .disp32=inouts[0] ":break"}
|
||||||
|
|
||||||
Finally, we repeat all the 'break' variants almost identically for 'loop'
|
Finally, we repeat all the 'break' variants almost identically for 'loop'
|
||||||
instructions. This works because the compiler inserts ':loop' labels at the
|
instructions. This works because the compiler inserts ':loop' labels at the
|
||||||
start of such named blocks, and ':break' labels at the end.
|
start of such named blocks, and ':break' labels at the end.
|
||||||
|
@ -165,15 +162,6 @@ loop-if-= label {.name="loop-if-=", .inouts=[label],
|
||||||
loop-if-!= {.name="loop-if-!=", .subx-name="0f 85/jump-if-!= loop/disp32"}
|
loop-if-!= {.name="loop-if-!=", .subx-name="0f 85/jump-if-!= loop/disp32"}
|
||||||
loop-if-!= label {.name="loop-if-!=", .inouts=[label], .subx-name="0f 85/jump-if-!=", .disp32=inouts[0] ":loop"}
|
loop-if-!= label {.name="loop-if-!=", .inouts=[label], .subx-name="0f 85/jump-if-!=", .disp32=inouts[0] ":loop"}
|
||||||
|
|
||||||
loop-if-addr< {.name="loop-if-addr<", .subx-name="0f 82/jump-if-addr< loop/disp32"}
|
|
||||||
loop-if-addr< label {.name="loop-if-addr<", .inouts=[label], .subx-name="0f 82/jump-if-addr<", .disp32=inouts[0] ":loop"}
|
|
||||||
loop-if-addr> {.name="loop-if-addr>", .subx-name="0f 87/jump-if-addr> loop/disp32"}
|
|
||||||
loop-if-addr> label {.name="loop-if-addr>", .inouts=[label], .subx-name="0f 87/jump-if-addr>", .disp32=inouts[0] ":loop"}
|
|
||||||
loop-if-addr<= {.name="loop-if-addr<=", .subx-name="0f 86/jump-if-addr<= loop/disp32"}
|
|
||||||
loop-if-addr<= label {.name="loop-if-addr<=", .inouts=[label], .subx-name="0f 86/jump-if-addr<=", .disp32=inouts[0] ":loop"}
|
|
||||||
loop-if-addr>= {.name="loop-if-addr>=", .subx-name="0f 83/jump-if-addr>= loop/disp32"}
|
|
||||||
loop-if-addr>= label {.name="loop-if-addr>=", .inouts=[label], .subx-name="0f 83/jump-if-addr>=", .disp32=inouts[0] ":loop"}
|
|
||||||
|
|
||||||
loop-if-< {.name="loop-if-<", .subx-name="0f 8c/jump-if-< loop/disp32"}
|
loop-if-< {.name="loop-if-<", .subx-name="0f 8c/jump-if-< loop/disp32"}
|
||||||
loop-if-< label {.name="loop-if-<", .inouts=[label], .subx-name="0f 8c/jump-if-<", .disp32=inouts[0] ":loop"}
|
loop-if-< label {.name="loop-if-<", .inouts=[label], .subx-name="0f 8c/jump-if-<", .disp32=inouts[0] ":loop"}
|
||||||
loop-if-> {.name="loop-if->", .subx-name="0f 8f/jump-if-> loop/disp32"}
|
loop-if-> {.name="loop-if->", .subx-name="0f 8f/jump-if-> loop/disp32"}
|
||||||
|
@ -183,10 +171,20 @@ loop-if-<= label {.name="loop-if-<=", .inouts=[label],
|
||||||
loop-if->= {.name="loop-if->=", .subx-name="0f 8d/jump-if->= loop/disp32"}
|
loop-if->= {.name="loop-if->=", .subx-name="0f 8d/jump-if->= loop/disp32"}
|
||||||
loop-if->= label {.name="loop-if->=", .inouts=[label], .subx-name="0f 8d/jump-if->=", .disp32=inouts[0] ":loop"}
|
loop-if->= label {.name="loop-if->=", .inouts=[label], .subx-name="0f 8d/jump-if->=", .disp32=inouts[0] ":loop"}
|
||||||
|
|
||||||
There are also unconditional loop instructions. So far it doesn't seem like
|
loop-if-addr< {.name="loop-if-addr<", .subx-name="0f 82/jump-if-addr< loop/disp32"}
|
||||||
unconditional breaks have much use.
|
loop-if-addr< label {.name="loop-if-addr<", .inouts=[label], .subx-name="0f 82/jump-if-addr<", .disp32=inouts[0] ":loop"}
|
||||||
|
loop-if-addr> {.name="loop-if-addr>", .subx-name="0f 87/jump-if-addr> loop/disp32"}
|
||||||
|
loop-if-addr> label {.name="loop-if-addr>", .inouts=[label], .subx-name="0f 87/jump-if-addr>", .disp32=inouts[0] ":loop"}
|
||||||
|
loop-if-addr<= {.name="loop-if-addr<=", .subx-name="0f 86/jump-if-addr<= loop/disp32"}
|
||||||
|
loop-if-addr<= label {.name="loop-if-addr<=", .inouts=[label], .subx-name="0f 86/jump-if-addr<=", .disp32=inouts[0] ":loop"}
|
||||||
|
loop-if-addr>= {.name="loop-if-addr>=", .subx-name="0f 83/jump-if-addr>= loop/disp32"}
|
||||||
|
loop-if-addr>= label {.name="loop-if-addr>=", .inouts=[label], .subx-name="0f 83/jump-if-addr>=", .disp32=inouts[0] ":loop"}
|
||||||
|
|
||||||
|
Finally, unconditional jumps:
|
||||||
|
|
||||||
loop {.name="loop", .subx-name="e9/jump loop/disp32"}
|
loop {.name="loop", .subx-name="e9/jump loop/disp32"}
|
||||||
loop label {.name="loop", .inouts=[label], .subx-name="e9/jump", .disp32=inouts[0] ":loop"}
|
loop label {.name="loop", .inouts=[label], .subx-name="e9/jump", .disp32=inouts[0] ":loop"}
|
||||||
|
|
||||||
|
(So far it doesn't seem like unconditional breaks have much use.)
|
||||||
|
|
||||||
vim:ft=c:nowrap
|
vim:ft=c:nowrap
|
||||||
|
|
|
@ -0,0 +1,192 @@
|
||||||
|
Mu programs are lists of functions. Each function has the following form:
|
||||||
|
|
||||||
|
fn _name_ _inouts_with_types_ -> _outputs_with_types_ {
|
||||||
|
_instructions_
|
||||||
|
}
|
||||||
|
|
||||||
|
Instructions may be primitives or function calls. Either way, all instructions
|
||||||
|
have one of the following forms:
|
||||||
|
|
||||||
|
# defining variables
|
||||||
|
var _name_: _type_
|
||||||
|
var _name_/_register_: _type_
|
||||||
|
|
||||||
|
# doing things with variables
|
||||||
|
_operation_ _inouts_
|
||||||
|
_outputs_ <- _operation_ _inouts_
|
||||||
|
|
||||||
|
Instructions and functions may have inouts and outputs. Both inouts and
|
||||||
|
outputs are variables.
|
||||||
|
|
||||||
|
As seen above, variables can be defined to live in a register, like this:
|
||||||
|
|
||||||
|
n/eax
|
||||||
|
|
||||||
|
Variables not assigned a register live in the stack.
|
||||||
|
|
||||||
|
Function inouts must always be on the stack, and outputs must always be in
|
||||||
|
registers. A function call must always write to the exact registers its
|
||||||
|
definition requires. For example:
|
||||||
|
|
||||||
|
fn foo -> x/eax: int {
|
||||||
|
...
|
||||||
|
}
|
||||||
|
fn main {
|
||||||
|
a/eax <- foo # ok
|
||||||
|
a/ebx <- foo # wrong
|
||||||
|
}
|
||||||
|
|
||||||
|
Primitive inouts may be on the stack or in registers, but outputs must always
|
||||||
|
be in registers.
|
||||||
|
|
||||||
|
Functions can contain nested blocks inside { and }. Variables defined in a
|
||||||
|
block don't exist outside it.
|
||||||
|
|
||||||
|
## Primitive instructions
|
||||||
|
|
||||||
|
Primitive instructions currently supported in Mu:
|
||||||
|
|
||||||
|
var/eax <- increment
|
||||||
|
var/ecx <- increment
|
||||||
|
var/edx <- increment
|
||||||
|
var/ebx <- increment
|
||||||
|
var/esi <- increment
|
||||||
|
var/edi <- increment
|
||||||
|
increment var
|
||||||
|
|
||||||
|
var/eax <- decrement
|
||||||
|
var/ecx <- decrement
|
||||||
|
var/edx <- decrement
|
||||||
|
var/ebx <- decrement
|
||||||
|
var/esi <- decrement
|
||||||
|
var/edi <- decrement
|
||||||
|
decrement var
|
||||||
|
|
||||||
|
var1/reg1 <- add var2/reg2
|
||||||
|
var/reg <- add var2
|
||||||
|
add-to var1, var2/reg
|
||||||
|
var/eax <- add n
|
||||||
|
var/reg <- add n
|
||||||
|
add-to var, n
|
||||||
|
|
||||||
|
var1/reg1 <- sub var2/reg2
|
||||||
|
var/reg <- sub var2
|
||||||
|
sub-from var1, var2/reg
|
||||||
|
var/eax <- sub n
|
||||||
|
var/reg <- sub n
|
||||||
|
sub-from var, n
|
||||||
|
|
||||||
|
var1/reg1 <- and var2/reg2
|
||||||
|
var/reg <- and var2
|
||||||
|
and-with var1, var2/reg
|
||||||
|
var/eax <- and n
|
||||||
|
var/reg <- and n
|
||||||
|
and-with var, n
|
||||||
|
|
||||||
|
var1/reg1 <- or var2/reg2
|
||||||
|
var/reg <- or var2
|
||||||
|
or-with var1, var2/reg
|
||||||
|
var/eax <- or n
|
||||||
|
var/reg <- or n
|
||||||
|
or-with var, n
|
||||||
|
|
||||||
|
var1/reg1 <- xor var2/reg2
|
||||||
|
var/reg <- xor var2
|
||||||
|
xor-with var1, var2/reg
|
||||||
|
var/eax <- xor n
|
||||||
|
var/reg <- xor n
|
||||||
|
xor-with var, n
|
||||||
|
|
||||||
|
var/eax <- copy n
|
||||||
|
var/ecx <- copy n
|
||||||
|
var/edx <- copy n
|
||||||
|
var/ebx <- copy n
|
||||||
|
var/esi <- copy n
|
||||||
|
var/edi <- copy n
|
||||||
|
var1/reg1 <- copy var2/reg2
|
||||||
|
copy-to var1, var2/reg
|
||||||
|
var/reg <- copy var2
|
||||||
|
var/reg <- copy n
|
||||||
|
copy-to var, n
|
||||||
|
|
||||||
|
compare var1, var2/reg
|
||||||
|
compare var1/reg, var2
|
||||||
|
compare var/eax, n
|
||||||
|
compare var, n
|
||||||
|
|
||||||
|
var/reg <- multiply var2
|
||||||
|
|
||||||
|
## Primitive jump instructions
|
||||||
|
|
||||||
|
There are two kinds of jumps, both with many variations: `break` and `loop`.
|
||||||
|
`break` instructions jump to the end of the containing block. `loop` instructions
|
||||||
|
jump to the beginning of the containing block.
|
||||||
|
|
||||||
|
Jumps can take an optional label starting with '$':
|
||||||
|
|
||||||
|
loop $foo
|
||||||
|
|
||||||
|
This instruction jumps to the beginning of the block called $foo. It must lie
|
||||||
|
somewhere inside such a box. Jumps are only legal to containing blocks.
|
||||||
|
|
||||||
|
There are two unconditional jumps:
|
||||||
|
|
||||||
|
loop
|
||||||
|
loop label
|
||||||
|
# unconditional break instructions don't seem useful
|
||||||
|
|
||||||
|
The remaining jump instructions are all conditional. Conditional jumps rely on
|
||||||
|
the result of the most recently executed `compare` instruction. (To keep
|
||||||
|
programs easy to read, keep compare instructions close to the jump that uses
|
||||||
|
them.)
|
||||||
|
|
||||||
|
break-if-=
|
||||||
|
break-if-= label
|
||||||
|
break-if-!=
|
||||||
|
break-if-!= label
|
||||||
|
|
||||||
|
Inequalities are similar, but have unsigned and signed variants. We assume
|
||||||
|
unsigned variants are only ever used to compare addresses.
|
||||||
|
|
||||||
|
break-if-<
|
||||||
|
break-if-< label
|
||||||
|
break-if->
|
||||||
|
break-if-> label
|
||||||
|
break-if-<=
|
||||||
|
break-if-<= label
|
||||||
|
break-if->=
|
||||||
|
break-if->= label
|
||||||
|
|
||||||
|
break-if-addr<
|
||||||
|
break-if-addr< label
|
||||||
|
break-if-addr>
|
||||||
|
break-if-addr> label
|
||||||
|
break-if-addr<=
|
||||||
|
break-if-addr<= label
|
||||||
|
break-if-addr>=
|
||||||
|
break-if-addr>= label
|
||||||
|
|
||||||
|
Similarly, conditional loops:
|
||||||
|
|
||||||
|
loop-if-=
|
||||||
|
loop-if-= label
|
||||||
|
loop-if-!=
|
||||||
|
loop-if-!= label
|
||||||
|
|
||||||
|
loop-if-<
|
||||||
|
loop-if-< label
|
||||||
|
loop-if->
|
||||||
|
loop-if-> label
|
||||||
|
loop-if-<=
|
||||||
|
loop-if-<= label
|
||||||
|
loop-if->=
|
||||||
|
loop-if->= label
|
||||||
|
|
||||||
|
loop-if-addr<
|
||||||
|
loop-if-addr< label
|
||||||
|
loop-if-addr>
|
||||||
|
loop-if-addr> label
|
||||||
|
loop-if-addr<=
|
||||||
|
loop-if-addr<= label
|
||||||
|
loop-if-addr>=
|
||||||
|
loop-if-addr>= label
|
|
@ -100,7 +100,7 @@ Opcodes currently supported by SubX:
|
||||||
0f 8e: jump disp32 bytes away if lesser or equal (signed), if ZF is set or SF != OF (jcc/jle/jng)
|
0f 8e: jump disp32 bytes away if lesser or equal (signed), if ZF is set or SF != OF (jcc/jle/jng)
|
||||||
0f 8f: jump disp32 bytes away if greater (signed), if ZF is unset and SF == OF (jcc/jg/jnle)
|
0f 8f: jump disp32 bytes away if greater (signed), if ZF is unset and SF == OF (jcc/jg/jnle)
|
||||||
0f af: multiply rm32 into r32 (imul)
|
0f af: multiply rm32 into r32 (imul)
|
||||||
Run `subx help instructions` for details on words like 'r32' and 'disp8'.
|
Run `bootstrap help instructions` for details on words like 'r32' and 'disp8'.
|
||||||
For complete details on these instructions, consult the IA-32 manual (volume 2).
|
For complete details on these instructions, consult the IA-32 manual (volume 2).
|
||||||
There's various versions of it online, such as https://c9x.me/x86.
|
There's various versions of it online, such as https://c9x.me/x86.
|
||||||
The mnemonics in brackets will help you locate each instruction.
|
The mnemonics in brackets will help you locate each instruction.
|
||||||
|
|
Loading…
Reference in New Issue