7251
parent
8d2dece291
commit
3d31467c0d
|
@ -23296,7 +23296,7 @@ $check-mu-populate-stmt:get-length:
|
|||
89/<- %esi 0/r32/eax
|
||||
# 1 inout
|
||||
3d/compare-eax-and 0/imm32
|
||||
0f 84/jump-if-= $check-mu-copy-object-stmt:error-incorrect-inouts/disp32
|
||||
0f 84/jump-if-= $check-mu-populate-stmt:error-incorrect-inouts/disp32
|
||||
# > 2 inouts
|
||||
(lookup *(esi+8) *(esi+0xc)) # Stmt-var-next Stmt-var-next => eax
|
||||
3d/compare-eax-and 0/imm32
|
||||
|
@ -23446,7 +23446,7 @@ $check-mu-populate-stream-stmt:get-length:
|
|||
89/<- %esi 0/r32/eax
|
||||
# 1 inout
|
||||
3d/compare-eax-and 0/imm32
|
||||
0f 84/jump-if-= $check-mu-copy-object-stmt:error-incorrect-inouts/disp32
|
||||
0f 84/jump-if-= $check-mu-populate-stream-stmt:error-incorrect-inouts/disp32
|
||||
# > 2 inouts
|
||||
(lookup *(esi+8) *(esi+0xc)) # Stmt-var-next Stmt-var-next => eax
|
||||
3d/compare-eax-and 0/imm32
|
||||
|
|
|
@ -21906,7 +21906,7 @@ if ('onhashchange' in window) {
|
|||
<span id="L23296" class="LineNr">23296 </span> 89/<- %esi 0/r32/eax
|
||||
<span id="L23297" class="LineNr">23297 </span> <span class="subxComment"># 1 inout</span>
|
||||
<span id="L23298" class="LineNr">23298 </span> 3d/compare-eax-and 0/imm32
|
||||
<span id="L23299" class="LineNr">23299 </span> 0f 84/jump-if-= $check-mu-copy-object-stmt:error-incorrect-inouts/disp32
|
||||
<span id="L23299" class="LineNr">23299 </span> 0f 84/jump-if-= $check-mu-populate-stmt:error-incorrect-inouts/disp32
|
||||
<span id="L23300" class="LineNr">23300 </span> <span class="subxComment"># > 2 inouts</span>
|
||||
<span id="L23301" class="LineNr">23301 </span> (<a href='../120allocate.subx.html#L256'>lookup</a> *(esi+8) *(esi+0xc)) <span class="subxComment"># Stmt-var-next Stmt-var-next => eax</span>
|
||||
<span id="L23302" class="LineNr">23302 </span> 3d/compare-eax-and 0/imm32
|
||||
|
@ -22056,7 +22056,7 @@ if ('onhashchange' in window) {
|
|||
<span id="L23446" class="LineNr">23446 </span> 89/<- %esi 0/r32/eax
|
||||
<span id="L23447" class="LineNr">23447 </span> <span class="subxComment"># 1 inout</span>
|
||||
<span id="L23448" class="LineNr">23448 </span> 3d/compare-eax-and 0/imm32
|
||||
<span id="L23449" class="LineNr">23449 </span> 0f 84/jump-if-= $check-mu-copy-object-stmt:error-incorrect-inouts/disp32
|
||||
<span id="L23449" class="LineNr">23449 </span> 0f 84/jump-if-= $check-mu-populate-stream-stmt:error-incorrect-inouts/disp32
|
||||
<span id="L23450" class="LineNr">23450 </span> <span class="subxComment"># > 2 inouts</span>
|
||||
<span id="L23451" class="LineNr">23451 </span> (<a href='../120allocate.subx.html#L256'>lookup</a> *(esi+8) *(esi+0xc)) <span class="subxComment"># Stmt-var-next Stmt-var-next => eax</span>
|
||||
<span id="L23452" class="LineNr">23452 </span> 3d/compare-eax-and 0/imm32
|
||||
|
|
95
mu.md
|
@ -45,7 +45,7 @@ and [vocabulary.md](vocabulary.md).
|
|||
|
||||
Zooming out from single statements, here's a complete sample program in Mu:
|
||||
|
||||
<img alt='ex2.mu' src='html/ex2.mu.png'>
|
||||
<img alt='ex2.mu' src='html/ex2.mu.png' width='400px'>
|
||||
|
||||
Mu programs are lists of functions. Each function has the following form:
|
||||
|
||||
|
@ -59,9 +59,7 @@ fn _name_ _inout_ ... -> _output_ ... {
|
|||
|
||||
Each function has a header line, and some number of statements, each on a
|
||||
separate line. Headers describe inouts and outputs. Inouts can't be registers,
|
||||
and outputs _must_ be registers. Outputs can't take names. In the above
|
||||
example, the outputs of both `do-add` and `main` have type `int` and are
|
||||
available in register `ebx` at the end of the respective calls.
|
||||
and outputs _must_ be registers. Outputs can't take names.
|
||||
|
||||
The above program also demonstrates a function call (to the function `do-add`).
|
||||
Function calls look the same as primitive statements: they can return (multiple)
|
||||
|
@ -92,7 +90,7 @@ two signatures:
|
|||
- `fn main -> _/ebx: int`
|
||||
- `fn main args: (addr array (addr array byte)) -> _/ebx: int`
|
||||
|
||||
(The names of the inout and output are flexible.)
|
||||
(The name of the inout is flexible.)
|
||||
|
||||
Mu encloses multi-word types in parentheses, and types can get quite expressive.
|
||||
For example, you read `main`'s inout type as "an address to an array of
|
||||
|
@ -103,7 +101,7 @@ always strings in Mu, you'll quickly learn to mentally shorten this type to
|
|||
## Blocks
|
||||
|
||||
Blocks are useful for grouping related statements. They're delimited by `{`
|
||||
and `}`, both each alone on a line.
|
||||
and `}`, each alone on a line.
|
||||
|
||||
Blocks can nest:
|
||||
|
||||
|
@ -225,9 +223,8 @@ var/reg <- multiply var2
|
|||
|
||||
Any statement above that takes a variable in memory can be replaced with a
|
||||
dereference (`*`) of an address variable (of type `(addr ...)`) in a register.
|
||||
(Types can have multiple words, and are wrapped in `()` when they do.) But you
|
||||
can't dereference variables in memory. You have to load them into a register
|
||||
first.
|
||||
You can't dereference variables in memory. You have to load them into a
|
||||
register first.
|
||||
|
||||
Excluding dereferences, the above statements must operate on non-address
|
||||
values with primitive types: `int`, `boolean` or `byte`. (Booleans are really
|
||||
|
@ -238,7 +235,7 @@ to int variables, but not the other way around.
|
|||
|
||||
These instructions may use the floating-point registers `xmm0` ... `xmm7`
|
||||
(denoted by `/xreg2` or `/xrm32`). They also use integer values on occasion
|
||||
(`/rm32` and `/r32`). They can't take literal floating-point values.
|
||||
(`/rm32` and `/r32`).
|
||||
|
||||
```
|
||||
var/xreg <- add var2/xreg2
|
||||
|
@ -308,9 +305,8 @@ There are no instructions accepting floating-point literals. To obtain integer
|
|||
literals in floating-point registers, copy them to general-purpose registers
|
||||
and then convert them to floating-point.
|
||||
|
||||
One pattern you may have noticed above is that the floating-point instructions
|
||||
above always write to registers. The only exceptions are `copy` instructions,
|
||||
which can write to memory locations.
|
||||
The floating-point instructions above always write to registers. The only
|
||||
instructions that can write floats to memory are `copy` instructions.
|
||||
|
||||
```
|
||||
var/xreg <- copy var2/xreg2
|
||||
|
@ -319,7 +315,8 @@ var/xreg <- copy var2
|
|||
var/xreg <- copy *var2/reg2
|
||||
```
|
||||
|
||||
Floating-point comparisons always put a register on the left-hand side:
|
||||
Finally, there are floating-point comparisons. They must always put a register
|
||||
on the left-hand side:
|
||||
|
||||
```
|
||||
compare var1/xreg1, var2/xreg2
|
||||
|
@ -328,7 +325,7 @@ compare var1/xreg1, var2
|
|||
|
||||
## Operating on individual bytes
|
||||
|
||||
A special-case is variables of type `byte`. Mu is a 32-bit platform so for the
|
||||
A special case is variables of type `byte`. Mu is a 32-bit platform so for the
|
||||
most part only supports types that are multiples of 32 bits. However, we do
|
||||
want to support strings in ASCII and UTF-8, which will be arrays of 8-bit
|
||||
bytes.
|
||||
|
@ -375,7 +372,7 @@ break label
|
|||
|
||||
The remaining jump instructions are all conditional. Conditional jumps rely on
|
||||
the result of the most recently executed `compare` instruction. (To keep
|
||||
programs easy to read, keep compare instructions close to the jump that uses
|
||||
programs easy to read, keep `compare` instructions close to the jump that uses
|
||||
them.)
|
||||
|
||||
```
|
||||
|
@ -571,7 +568,7 @@ type point {
|
|||
|
||||
Mu programs are currently sequences of `fn` and `type` definitions.
|
||||
|
||||
Compound types can't include `addr` types for safety (use `handle` instead,
|
||||
Compound types can't include `addr` types for safety reasons (use `handle` instead,
|
||||
which is described below). They also can't currently include `array`, `stream`
|
||||
or `byte` types. Since arrays and streams carry their size with them, supporting
|
||||
them in compound types complicates variable initialization. Instead of
|
||||
|
@ -596,39 +593,55 @@ var a/eax: (addr int) <- get p, x
|
|||
var a/eax: (addr int) <- get p, y
|
||||
```
|
||||
|
||||
You can clear arbitrary types using the `clear-object` function:
|
||||
|
||||
```
|
||||
clear-object var: (addr T)
|
||||
```
|
||||
|
||||
Don't clear arrays or streams using `clear-object`; doing so will irreversibly
|
||||
make their length 0 as well.
|
||||
|
||||
You can shallow-copy arbitrary types using the `copy-object` function:
|
||||
|
||||
```
|
||||
copy-object src: (addr T), dest: (addr T)
|
||||
```
|
||||
|
||||
## Handles for safe access to the heap
|
||||
|
||||
We've seen the `addr` type, but it's intended to be short-lived. `addr` values
|
||||
should never escape from functions. In particular, save `addr` values inside
|
||||
compound `type`s. To do that you need a "fat pointer" called a `handle` that
|
||||
is safe to keep around for extended periods and ensures it's used safely
|
||||
without corrupting the heap and causing security issues or hard-to-debug
|
||||
misbehavior.
|
||||
should never escape from functions. Function outputs can't be `addr`s,
|
||||
function inouts can't include `addr` in their payload type. Finally, you can't
|
||||
save `addr` values inside compound `type`s. To do that you need a "fat
|
||||
pointer" called a `handle` that is safe to keep around for extended periods
|
||||
and ensures it's used safely without corrupting the heap and causing security
|
||||
issues or hard-to-debug misbehavior.
|
||||
|
||||
To actually _use_ a `handle`, we have to turn it into an `addr` first using
|
||||
the `lookup` statement.
|
||||
|
||||
```
|
||||
var y/reg: (addr T) <- lookup x
|
||||
var y/reg: (addr T) <- lookup x: (handle T)
|
||||
```
|
||||
|
||||
Now operate on the `addr` as usual, safe in the knowledge that you can later
|
||||
recover any writes to its payload from `x`.
|
||||
Now operate on `y` as usual, safe in the knowledge that you can later recover
|
||||
any writes to its payload from `x`.
|
||||
|
||||
It's illegal to continue to use this `addr` after a function that reclaims
|
||||
heap memory. You have to repeat the lookup from the `handle`. (Luckily Mu
|
||||
doesn't implement reclamation yet.)
|
||||
It's illegal to continue to use an `addr` after a function that reclaims heap
|
||||
memory. You have to repeat the lookup from the `handle`. (Luckily Mu doesn't
|
||||
implement reclamation yet.)
|
||||
|
||||
Having two kinds of addresses takes some getting used to. Do we pass in
|
||||
variables by value, by `addr` or by `handle`? In inputs or outputs? Here are 3
|
||||
rules of thumb:
|
||||
|
||||
* Functions that need to look at the payload should accept an `(addr ...)`.
|
||||
* Functions that need to look at the payload should accept an `(addr ...)`
|
||||
where possible.
|
||||
* Functions that need to treat a handle as a value, without looking at its
|
||||
payload, should accept a `(handle ...)`. Helpers that save handles into data
|
||||
structures are a common example.
|
||||
* Functions that need to allocate memory should accept an `(addr handle
|
||||
...)`.
|
||||
payload, should accept a `(handle ...)`. Helpers that save handles into
|
||||
data structures are a common example.
|
||||
* Functions that need to allocate memory should accept an `(addr handle ...)`.
|
||||
|
||||
Try to avoid mixing these use cases.
|
||||
|
||||
|
@ -655,7 +668,7 @@ var x: (addr handle T)
|
|||
allocate x
|
||||
```
|
||||
|
||||
To create handles to array types, use `populate`:
|
||||
To create handles to array types (of potentially dynamic size), use `populate`:
|
||||
|
||||
```
|
||||
var x: (addr handle array T)
|
||||
|
@ -663,15 +676,6 @@ var x: (addr handle array T)
|
|||
populate x, 3 # array of 3 T's
|
||||
```
|
||||
|
||||
You can copy handles to another variable on the stack like this:
|
||||
|
||||
```
|
||||
var x: (handle T)
|
||||
# ..some code initializing x..
|
||||
var y/eax: (addr handle T) <- address ...
|
||||
copy-handle x, y
|
||||
```
|
||||
|
||||
## Seams
|
||||
|
||||
I said at the start that most instructions map 1:1 to x86 machine code. To
|
||||
|
@ -689,8 +693,7 @@ the above exceptions.
|
|||
|
||||
## Conclusion
|
||||
|
||||
Anything not allowed here is forbidden. Even if the compiler doesn't currently
|
||||
Anything not allowed here is forbidden, even if the compiler doesn't currently
|
||||
detect and complain about it. Please [contact me](mailto:ak@akkartik.com) or
|
||||
[report issues](https://github.com/akkartik/mu/issues) when you encounter a
|
||||
missing or misleading error message. Thank you for bearing with the dust! I'm
|
||||
here for the long haul, and everything will be clean and checked in due time.
|
||||
missing or misleading error message.
|
||||
|
|