Commit Graph

412 Commits

Author SHA1 Message Date
Kartik Agaram
8bf95d294a 6637
Be more consistent about what we interpret as integer literals.
2020-07-11 22:02:31 -07:00
Kartik Agaram
5c86f9be66 6636 2020-07-11 21:33:51 -07:00
Kartik Agaram
355073129b 6635 - bugfix 2020-07-11 21:30:45 -07:00
Kartik Agaram
c5a3f65502 6630 - define type signatures for SubX functions
This was easier than I'd feared.
2020-07-10 23:41:34 -07:00
Kartik Agaram
c1b6ecc874 6628 2020-07-10 21:23:32 -07:00
Kartik Agaram
c532373e29 6626 2020-07-09 12:32:07 -07:00
Kartik Agaram
f16f569060 6622 - new syscalls: time and ntime
As a side-effect I find that my Linode can print ~100k chars/s. At 50 rows
and 200 columns per screen, it's 10 frames/s.
2020-07-08 22:14:42 -07:00
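The frame-rate estimate in commit 6622 follows from simple arithmetic. A quick check, taking the message's round figures as given (the variable names are illustrative only):

```python
# Approximate figures quoted in the commit message.
chars_per_second = 100_000
rows, columns = 50, 200

chars_per_frame = rows * columns                    # characters in one full screen
frames_per_second = chars_per_second / chars_per_frame

print(chars_per_frame)    # 10000
print(frames_per_second)  # 10.0
```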
Kartik Agaram
996402e8fd 6604 - new app
https://archive.org/details/akkartik-2min-2020-07-01

In the process I found a bug, added a new syscall, and 'emulated' it.
2020-07-01 16:47:20 -07:00
Kartik Agaram
59cf3ae983 6597 2020-06-29 18:33:52 -07:00
Kartik Agaram
1afc882890 6596 2020-06-29 18:31:17 -07:00
Kartik Agaram
690fa191f1 6595 2020-06-29 18:01:44 -07:00
Kartik Agaram
05dabd816a 6594 - start standardizing the meaning of 'print' 2020-06-29 17:58:01 -07:00
Kartik Agaram
8ce50909c4 6592 - error-checking for integer stmts feels done 2020-06-28 23:25:18 -07:00
Kartik Agaram
9483fdc1fb 6591 2020-06-28 23:17:34 -07:00
Kartik Agaram
4ddc2620f7 6590 2020-06-28 23:10:45 -07:00
Kartik Agaram
896e3bcfb2 6589 2020-06-28 23:09:10 -07:00
Kartik Agaram
d31bd529f7 6588 2020-06-28 18:49:50 -07:00
Kartik Agaram
52d3ee0326 6587 2020-06-28 14:35:45 -07:00
Kartik Agaram
f7a174c2a1 6586 - error-checking for 'get' stmts feels done 2020-06-28 14:23:31 -07:00
Kartik Agaram
1a89b13b9c 6585 2020-06-28 11:50:50 -07:00
Kartik Agaram
bf4fbab76d 6584 2020-06-28 10:19:02 -07:00
Kartik Agaram
60bffdaa49 6583 2020-06-28 09:53:32 -07:00
Kartik Agaram
76a669f689 6582 2020-06-28 09:10:16 -07:00
Kartik Agaram
fbc4544ff8 6581 2020-06-28 08:49:08 -07:00
Kartik Agaram
f58a6acc3f 6579 2020-06-28 00:55:31 -07:00
Kartik Agaram
1da16dd6cb 6578 - redo error if 'get' on unknown field
This commit reimplements commit 6515 to happen during type-checking rather
than as early as possible. That way we naturally get a more informative
error message.
2020-06-27 21:03:43 -07:00
Kartik Agaram
ab42709e14 6577 2020-06-27 14:30:07 -07:00
Kartik Agaram
14d9b56668 6576 2020-06-27 14:23:50 -07:00
Kartik Agaram
ca9dec80b6 6575 2020-06-27 12:13:43 -07:00
Kartik Agaram
3b02c3dfa2 6572
Small change to mu.subx to keep the treeshaker working with it. That's
currently the only place where we prevent jumps across 'functions'.
2020-06-21 17:31:38 -07:00
Kartik Agaram
6bfb565819 6570 - error on use of a clobbered var
All tests now passing, and factorial.mu and all other apps now working.
The new checks caught one problem in a few prototypes.
2020-06-21 17:08:03 -07:00
Kartik Agaram
c70beadc7a 6562
The new failing test is now passing, and so is this manual test that had
been throwing a spurious error:
  fn foo {
    var a/eax: int <- copy 0
    var b/ebx: int <- copy 0
    {
      var a1/eax: int <- copy 0
      var b1/ebx: int <- copy a1
    }
    b <- copy a
  }

However, factorial.mu is still throwing a spurious error.

Some history on this commit's fix: When I moved stack-location tracking
out of the parsing phase (commit 6116, Mar 10) I thoughtlessly moved block-depth
tracking as well. And the reason that happened: I'd somehow gotten by without
ever cleaning up vars from a block during parsing. For all my tests, this
is a troubling sign that I'm not testing enough.

The good news: clean-up-blocks works perfectly during parsing.
2020-06-21 11:06:25 -07:00
Kartik Agaram
d1ad96e038 6556 - check for uses of clobbered vars
Now all tests passing again. In the process I found a bug where one of
my tests actually generated incorrect code.
2020-06-19 23:25:43 -07:00
Kartik Agaram
4d2f171ce1 6553 - Mu: disallow registers esp and ebp 2020-06-19 20:52:37 -07:00
Kartik Agaram
88908608b0 6550 - type-checking for function calls
There were a couple of benign type errors in arith.mu but nowhere else.
2020-06-18 08:57:48 -07:00
Kartik Agaram
9ab613b233 6549 - starting on type checking 2020-06-17 13:59:22 -07:00
Kartik Agaram
5a6d2d0db7 6528 2020-06-15 16:57:39 -07:00
Kartik Agaram
43957d3ce5 6525 2020-06-15 14:16:38 -07:00
Kartik Agaram
39bcd3a8bc 6524 2020-06-15 14:11:11 -07:00
Kartik Agaram
532068bfc6 6523
Still some issues; add some tests. I have more that were passing a couple
of days ago but aren't currently.
2020-06-15 14:06:30 -07:00
Kartik Agaram
89c2b59a5f 6522 - redo support for 'byte'
Before: bytes can't live on the stack, so size(byte) == 1 just for array
elements.

After: bytes mostly can't live on the stack except for function args (which
seem too useful to disallow), so size(byte) == 4 except there's now a new
primitive called element-size for array elements where size(byte) == 1.

Now apps/browse.subx starts working again.
2020-06-15 14:02:41 -07:00
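The two notions of size in commit 6522 can be sketched as a pair of lookups. This is only an illustration in Python, not the code in mu.subx; the table names, the helper `array_payload_size`, and the 4-byte size of `int` are assumptions.

```python
# Space a value of each type takes when passed on the stack (assumed table).
STACK_SIZES = {"int": 4, "byte": 4}
# Space a value of each type takes as an array element (assumed table).
ELEMENT_SIZES = {"int": 4, "byte": 1}

def size_of(type_name):
    """Size used for stack slots: a byte is widened to a full word."""
    return STACK_SIZES[type_name]

def element_size(type_name):
    """Size used inside arrays: bytes stay packed."""
    return ELEMENT_SIZES[type_name]

def array_payload_size(type_name, length):
    """Bytes occupied by `length` elements (hypothetical helper)."""
    return length * element_size(type_name)

print(size_of("byte"), element_size("byte"))  # 4 1
print(array_payload_size("byte", 10))         # 10
```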
Kartik Agaram
5945986cc5 6521 - new primitive: array size in bytes 2020-06-14 00:40:16 -07:00
Kartik Agaram
ad61776f49 6520 - new app: parse-int
Several bugs fixed in the process, and expectation of further bugs is growing.
I'd somehow started assuming I don't need to have separate cases for rm32
as a register vs mem. That's not right. We might need more reg-reg Primitives.
2020-06-14 00:28:23 -07:00
Kartik Agaram
57f92a9334 6518 - extra args through a whole swathe of places
Most unbelievably, I'd forgotten to pass the output 'out' arg to 'lookup-var'
long before the recent additions of 'err' and 'ed' args. But things continued
to work because an earlier call just happened to leave the arg at just
the right place on the stack. So we only caught all these places when we
had to provide error messages.
2020-06-13 23:05:51 -07:00
Kartik Agaram
ef845524e9 6516 - operations on bytes
Byte-oriented addressing is only supported in a couple of instructions
in SubX. As a result, variables of type 'byte' can't live on the stack,
or in registers 'esi' and 'edi'.
2020-06-13 20:23:51 -07:00
Kartik Agaram
7e55a20ff4 6515 - error if 'get' on unknown field
We can't yet say in the error message precisely where the 'get' occurs.
2020-06-12 23:04:22 -07:00
Kartik Agaram
0e20925078 6511 - start of error-checking
We now raise an error if a variable is declared on the stack with an initializer.
And there are unit tests for this functionality.
2020-06-12 00:43:50 -07:00
Kartik Agaram
73cec3939f 6509 - mu.subx: exit-descriptors everywhere 2020-06-11 08:01:37 -07:00
Kartik Agaram
80f53f4a18 6508 - support null exit-descriptor 2020-06-10 23:34:42 -07:00
Kartik Agaram
7dac9ade15 6507 - use syscall names everywhere 2020-06-10 23:09:30 -07:00
Kartik Agaram
8a065c536e 6479
Fix a stray copy-paste when deciding whether to emit spills for registers
(commit 6464).
2020-06-05 22:06:13 -07:00
Kartik Agaram
8122f5d391 6477
I had a little "optimization" to avoid creating nested blocks if "they weren't
needed". Except, of course, they were. Lose the optimization. Sometimes
we create multiple jumps when a single one would suffice. Ignore that for
now.
2020-06-05 21:54:33 -07:00
Kartik Agaram
cde591e03e 6466 2020-06-04 21:25:38 -07:00
Kartik Agaram
20411cc442 6464 - support temporaries in fn output registers
The rule: emit spills for a register unless the output is written somewhere
in the current block after the current instruction. Including in nested
blocks.

Let's see if this is right.
2020-06-04 20:28:27 -07:00
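The spill rule in commit 6464 can be sketched as a recursive scan of the rest of the current block. The representation below (a `Stmt` record listing written variables, with nested blocks as nested lists) is an assumption made for illustration, not the data model used in mu.subx.

```python
from dataclasses import dataclass

@dataclass
class Stmt:
    outputs: list  # names of the variables this statement writes

def writes_later(stmts, name):
    """True if any statement in `stmts`, including nested blocks, writes `name`."""
    for stmt in stmts:
        if isinstance(stmt, list):        # a nested block
            if writes_later(stmt, name):
                return True
        elif name in stmt.outputs:
            return True
    return False

def needs_spill(block, index, fn_output):
    """Spill unless the function output is written after position `index`."""
    return not writes_later(block[index + 1:], fn_output)

body = [
    Stmt(outputs=["tmp"]),          # temporary living in the output register
    [Stmt(outputs=["result"])],     # nested block writes the function output
    Stmt(outputs=["other"]),
]
print(needs_spill(body, 0, "result"))  # False: output is written later
print(needs_spill(body, 1, "result"))  # True: no later write
```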
Kartik Agaram
04b3afd67e 6463 - clean up some duplication
Rather than have two ways to decide whether to emit push/pop instructions,
just record for each var on the 'vars' stack whether we emitted a push
for it, and reuse the decision to emit a pop.
2020-06-03 23:40:52 -07:00
Kartik Agaram
da1a29f121 6462
Stack bug.
2020-06-03 23:09:00 -07:00
Kartik Agaram
91c80ca32c 6461 2020-06-03 22:36:35 -07:00
Kartik Agaram
5aed53d809 6460 2020-06-03 22:32:24 -07:00
Kartik Agaram
69a29fc62c 6459 2020-06-03 00:27:01 -07:00
Kartik Agaram
e38a95ecd9 6458 2020-06-03 00:20:05 -07:00
Kartik Agaram
242a50f229 6441
Just always flush Stdout when printing numbers.
2020-05-29 16:44:40 -07:00
Kartik Agaram
eb40c70e26 6440
Minor reordering; the hacky flush-stdout is now only needed if we ever
call print-int32-to-screen.
2020-05-29 16:41:51 -07:00
Kartik Agaram
967d11f102 6424 2020-05-28 22:48:40 -07:00
Kartik Agaram
b22fa8afd8 6423 - done with sample app 'print-file'
Observations:
  - the orchestration from 'in' to 'addr-in' to '_in-addr' to 'in-addr'
    is quite painful. Once to turn a handle into its address, once to turn
    a handle into the address of its payload, and a third time to switch
    a variable out of the overloaded 'eax' variable to make room for read-byte-buffered.
  - I'm starting to use SubX as an escape hatch for features missing in Mu:
    - access to syscalls (which pass args in registers)
    - access to global variables
2020-05-28 22:41:26 -07:00
Kartik Agaram
583a966d3e 6422 - size-of for handles 2020-05-28 22:31:52 -07:00
Kartik Agaram
3962ac5959 6419 - new primitive for opening files 2020-05-28 21:33:04 -07:00
Kartik Agaram
dd6e4bc789 6415 2020-05-27 23:48:27 -07:00
Kartik Agaram
d3bca9dbfe 6413
Fix CI.
2020-05-27 01:53:16 -07:00
Kartik Agaram
9511ff5cd7 6409 - primitives for text-mode UIs 2020-05-27 00:09:22 -07:00
Kartik Agaram
3b5b19df66 6406 - primitive 'copy-handle' 2020-05-25 19:26:18 -07:00
Kartik Agaram
d1179723a9 6394 - a catastrophic bug
How did new-literal ever work?! Somehow we had eax silently being clobbered
without affecting behavior over like 5 apps. Unsafe languages suck.

Anyways, factorial.mu is now part of CI.
2020-05-24 20:54:12 -07:00
Kartik Agaram
4d14c3fefd 6393 - start running .mu apps in CI 2020-05-24 20:36:31 -07:00
Kartik Agaram
27b1e19ebe 6392 - 'length' instruction done in all complexity 2020-05-24 19:46:47 -07:00
Kartik Agaram
1eecd0934f 6391 2020-05-24 16:42:26 -07:00
Kartik Agaram
156763463d 6390 - return length in elements 2020-05-24 14:48:41 -07:00
Kartik Agaram
4335309233 6388 - fix regression #1 2020-05-24 00:51:52 -07:00
Kartik Agaram
e992b940e4 6387 2020-05-23 13:01:39 -07:00
Kartik Agaram
5bc9a5b72e update binaries
CI should start passing again now.
2020-05-22 16:59:53 -07:00
Kartik Agaram
1f38b75e31 6219 2020-05-18 00:43:34 -07:00
Kartik Agaram
6a1cc7d65a 6216 2020-05-05 14:55:52 -07:00
Kartik Agaram
84a0297229 6208 2020-04-22 16:34:08 -07:00
Kartik Agaram
d6f9813650 6205
Rip out scaffolding for function overloading.
2020-04-15 16:25:04 -07:00
Kartik Agaram
d70fca1c3f 6204 2020-04-15 16:09:04 -07:00
Kartik Agaram
0671315c1a 6203 2020-04-12 02:33:01 -07:00
Kartik Agaram
677f98c6f2 6184
Why the heck are we bumping this pointer? Seems like a bug.
2020-04-05 00:05:36 -07:00
Kartik Agaram
bfcc0f858a 6182 - start of support for safe handles
So far it's unclear how to do this in a series of small commits. Still
nibbling around the edges. In this commit we standardize some terminology:

The length of an array or stream is denominated in the high-level elements.
The _size_ is denominated in bytes.

The thing we encode into the type is always the size, not the length.

There's still an open question of what to do about the Mu `length` operator.
I'd like to modify it to provide the length. Currently it provides the
size. If I can't fix that I'll rename it.
2020-04-03 12:35:53 -07:00
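The length/size terminology from commit 6182 amounts to a multiplication by the element size. A minimal Python sketch, with hypothetical function names:

```python
def size_in_bytes(length, element_size):
    """Size: how many bytes `length` elements occupy."""
    return length * element_size

def length_in_elements(size, element_size):
    """Length: how many whole elements fit in `size` bytes."""
    if size % element_size != 0:
        raise ValueError("size is not a multiple of the element size")
    return size // element_size

print(size_in_bytes(3, 4))        # 12
print(length_in_elements(12, 4))  # 3
```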
Kartik Agaram
f730f2f2c7 6181 2020-04-03 01:05:01 -07:00
Kartik Agaram
df609237c1 6158 - standardize opcode names
At the lowest level, SubX without syntax sugar uses names without prepositions.
For example, 01 and 03 are both called 'add', irrespective of source and
destination operand. Horizontal space is at a premium, and we rely on the
comments at the end of each line to fully describe what is happening.

Above that, however, we standardize on a slightly different naming convention
across:
  a) SubX with syntax sugar,
  b) Mu, and
  c) the SubX code that the Mu compiler emits.

Conventions, in brief:
  - by default, the source is on the left and destination on the right.
    e.g. add %eax, 1/r32/ecx ("add eax to ecx")
  - prepositions reverse the direction.
    e.g. add-to %eax, 1/r32/ecx ("add ecx to eax")
         subtract-from %eax, 1/r32/ecx ("subtract ecx from eax")
  - by default, comparisons are left to right while 'compare<-' reverses.

Before, I was sometimes swapping args to make the operation more obvious,
but that would complicate the code-generation of the Mu compiler, and it's
nice to be able to read the output of the compiler just like hand-written
code.

One place where SubX differs from Mu: copy opcodes are called '<-' and
'->'. Hopefully that fits with the spirit of Mu rather than the letter
of the 'copy' and 'copy-to' instructions.
2020-03-21 16:51:52 -07:00
Kartik Agaram
c6886c1c97 6157 2020-03-21 15:07:59 -07:00
Kartik Agaram
c48ce3c8bf 6153 - switch 'main' to use Mu strings
At the SubX level we have to put up with null-terminated kernel strings
for commandline args. But so far we haven't done much with them. Rather
than try to support them we'll just convert them transparently to standard
length-prefixed strings.

In the process I realized that it's not quite right to treat the combination
of argc and argv as an array of kernel strings. Argc counts the number
of elements, whereas the length of an array is usually denominated in bytes.
2020-03-15 21:03:12 -07:00
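The conversion described in commit 6153 can be sketched as follows. The 4-byte little-endian length prefix is an assumption about the layout, and `kernel_string_to_length_prefixed` is a hypothetical name.

```python
import struct

def kernel_string_to_length_prefixed(memory, start):
    """Copy the null-terminated string at `start` into length-prefixed form."""
    end = memory.index(b"\0", start)       # find the terminating null
    payload = memory[start:end]
    # Assumed layout: 4-byte little-endian length followed by the bytes.
    return struct.pack("<I", len(payload)) + payload

args = b"abc\0de\0"
print(kernel_string_to_length_prefixed(args, 0))  # b'\x03\x00\x00\x00abc'
print(kernel_string_to_length_prefixed(args, 4))  # b'\x02\x00\x00\x00de'
```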
Kartik Agaram
f559236bdf 6152 - fix regression in factorial.mu
I had to amend commit 6148 three times yesterday as I kept finding bugs
by inspection. And yet I stubbornly thought I didn't need a test.
2020-03-15 16:44:55 -07:00
Kartik Agaram
abb66df2c7 6150 - call-by-reference is working 2020-03-14 16:04:57 -07:00
Kartik Agaram
afa4d6bb4c 6149 - pass multi-word objects to functions
This is quite inefficient; don't use it for very large objects.
2020-03-14 15:46:38 -07:00
Kartik Agaram
9655878e1b 6148 2020-03-14 15:31:35 -07:00
Kartik Agaram
999292dfdd 6147 2020-03-14 15:15:46 -07:00
Kartik Agaram
114641e2c8 6145 - 'address' operator
This could be a can of worms, but I think I have a set of checks that will
keep use of addresses type-safe.
2020-03-14 14:46:45 -07:00
Kartik Agaram
ff0f216b41 6133 2020-03-12 00:45:17 -07:00
Kartik Agaram
b50c0e3279 6132 2020-03-12 00:36:36 -07:00
Kartik Agaram
92ca78429c 6131 - operating on arrays on the stack 2020-03-12 00:17:35 -07:00
Kartik Agaram
0aa7420745 6128 - arrays on the stack 2020-03-11 19:55:45 -07:00
Kartik Agaram
15655a1246 6126 - support 8-bit register names
Using these is quite unsafe. But what isn't, here?
2020-03-11 18:16:56 -07:00
Kartik Agaram
39eb1e4963 6125 2020-03-11 17:34:48 -07:00
Kartik Agaram
f2d6bb1cb8 6124 2020-03-11 17:30:06 -07:00
Kartik Agaram
28746b3666 6123 - runtime helper for initializing arrays
I built this in 3 phases:
a) create a helper in the bootstrap VM to render the state of the stack.
b) interactively arrive at the right function (tools/stack_array.subx)
c) pull the final solution into the standard library (093stack_allocate.subx)

As the final layer says, this may not be the fastest approach for most
(or indeed any) Mu programs. Perhaps it's better on balance for the compiler
to just emit n/4 `push` instructions.

(I'm sure this solution can be optimized further.)
2020-03-11 17:21:59 -07:00
Kartik Agaram
bfb7c60135 6122 2020-03-10 19:30:03 -07:00
Kartik Agaram
3ca9742e6e 6118 - support records on the stack 2020-03-10 16:39:06 -07:00
Kartik Agaram
fed9e7135c 6116 - stack locations now computed during codegen
We can't do it during parsing time because we may not have all type definitions
available yet. Mu supports using types before defining them.

At first I thought I should do it in populate-mu-type-sizes (appropriately
renamed). But there's enough complexity to tracking when stuff lands on
the stack that it's easiest to do while emitting code.

I don't think we need this information earlier in the compiler. If I'm
right, it seems simpler to colocate the computation of state close to where
it's used.
2020-03-10 16:20:33 -07:00
Kartik Agaram
9f8524a95c 6115 2020-03-10 15:19:18 -07:00
Kartik Agaram
fdce202105 6113 2020-03-10 14:08:59 -07:00
Kartik Agaram
dd0cdc6b80 6112
Move computation of offsets to record fields into the new phase as well.
Now we should be robust to type definitions in any order.
2020-03-08 22:44:56 -07:00
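Commits 6111 and 6112 move size and offset computation into a phase that runs after all types are parsed. One way to see why that makes definition order irrelevant is the Python sketch below; the dictionary layout, the 4-byte `int`, and the function names are assumptions, and recursive type definitions are not handled.

```python
PRIMITIVE_SIZES = {"int": 4}  # assumed primitive table

def total_size(type_name, definitions, cache):
    """Size of a type in bytes, computed on demand from its fields."""
    if type_name in PRIMITIVE_SIZES:
        return PRIMITIVE_SIZES[type_name]
    if type_name not in cache:
        cache[type_name] = sum(
            total_size(field_type, definitions, cache)
            for _, field_type in definitions[type_name]
        )
    return cache[type_name]

def field_offsets(type_name, definitions, cache):
    """Byte offset of each field within a record type."""
    offsets, offset = {}, 0
    for field_name, field_type in definitions[type_name]:
        offsets[field_name] = offset
        offset += total_size(field_type, definitions, cache)
    return offsets

# 'line' uses 'point' before 'point' is defined.
definitions = {
    "line": [("start", "point"), ("end", "point")],
    "point": [("x", "int"), ("y", "int")],
}
cache = {}
print(total_size("line", definitions, cache))     # 16
print(field_offsets("line", definitions, cache))  # {'start': 0, 'end': 8}
```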
Kartik Agaram
c8784d1c0f 6111
Move out total-size computation from parsing to a separate phase.

I don't have any new tests yet, but it's encouraging that existing tests
continue to pass.

This may be the first time I've ever written this much machine code (with
mutual recursion!) and gotten it to work the first time.
2020-03-08 18:20:16 -07:00
Kartik Agaram
ab74038c0d 6109 2020-03-08 17:12:06 -07:00
Kartik Agaram
c67b26bbcb 6108 2020-03-08 17:09:27 -07:00
Kartik Agaram
a2a9d19f89 6107
Finally we're now able to track the index of a field in a record/struct/product
type.
2020-03-08 16:56:51 -07:00
Kartik Agaram
8c2eda1333 6106
Free up eax using the newly available register.
2020-03-08 16:49:22 -07:00
Kartik Agaram
b5615912fa 6105
Create space for another local.
2020-03-08 16:30:07 -07:00
Kartik Agaram
e33feb03a0 6104
parse-mu-types has a lot of local state. Move a local to the stack to free
up a register.
2020-03-08 16:24:25 -07:00
Kartik Agaram
b12e8c593f 6103 2020-03-08 16:02:32 -07:00
Kartik Agaram
826e054954 6101
Make room for additional information for each field in a record/product
type.

Fields can be used before they're declared, and we may not know the offsets
they correspond to at that point. This is going to necessitate a lot of
restructuring.
2020-03-08 15:47:50 -07:00
Kartik Agaram
39aea17926 6096
A new test, and a new bugfix.
2020-03-07 18:59:57 -08:00
Kartik Agaram
30f844ee8f 6095 2020-03-07 18:32:36 -08:00
Kartik Agaram
3cf0315859 6094 - new 'compute-offset' instruction
If indexing into a type with power-of-2-sized elements we can access them
in one instruction:

  x/reg1: (addr int) <- index A/reg2: (addr array int), idx/reg3: int

This translates to a single instruction because x86 instructions support
an addressing mode with left-shifts.

For non-powers-of-2, however, we need a multiply. To keep things type-safe,
it is performed like this:

  x/reg1: (offset T) <- compute-offset A: (addr array T), idx: int
  y/reg2: (addr T) <- index A, x

An offset is just an int that is guaranteed to be a multiple of size-of(T).
Offsets can only be used in index instructions, and the types will eventually
be required to line up.

In the process, I have to expand Input-size because mu.subx is growing
big.
2020-03-07 17:40:45 -08:00
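The two indexing paths in commit 6094 can be sketched in Python. Here `base` is assumed to be the address of the first element, and the function names are hypothetical.

```python
def is_power_of_2(n):
    return n > 0 and n & (n - 1) == 0

def compute_offset(idx, element_size):
    """General case: a multiply, always yielding a multiple of element_size."""
    return idx * element_size

def index_address(base, idx, element_size):
    """Address of element `idx`, assuming `base` points at element 0."""
    if is_power_of_2(element_size):
        shift = element_size.bit_length() - 1   # log2 of the element size
        return base + (idx << shift)            # shift instead of multiply
    return base + compute_offset(idx, element_size)

print(index_address(1000, 3, 4))   # 1012
print(index_address(1000, 3, 12))  # 1036
```

The shift branch stands in for the x86 addressing mode, whose scale factor covers only sizes 1, 2, 4 and 8; the sketch does not model that limit.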
Kartik Agaram
9ee4b34e06 6093
Some much-needed reorganization.
2020-03-07 15:48:11 -08:00
Kartik Agaram
5c26afb1de 6088 - start using setCC instructions 2020-03-06 17:42:17 -08:00
Kartik Agaram
7c109dffc8 6086 - index into arrays with a literal 2020-03-06 13:50:12 -08:00
Kartik Agaram
b5fbf20556 6085
Support parsing ints from strings rather than slices.
2020-03-06 13:44:54 -08:00
Kartik Agaram
c1737cbaae 6083 2020-03-06 12:08:42 -08:00
Kartik Agaram
4032286f9b 6082 - bugfix in spilling register vars
In the process I'm starting to realize that my approach to avoiding spills
isn't ideal. It works for local variables but not to avoid spilling outputs.

To correctly decide whether to spill to an output register or not, we really
need to analyze when a variable is live. If we don't do that, we'll end
up in one of two bad situations:

a) Don't spill the outermost use of an output register (or just the outermost
scope in a function). This is weird because it's hard to explain to the
programmer why they can overwrite a local with an output above a '{' but
not below.

b) Disallow overwriting entirely. This is easier to communicate but quite
inconvenient. It's nice to be able to use eax for some temporary purpose
before overwriting it with the final result of a function.

If we instead track liveness, things are convenient and also easier to
explain. If a temporary is used after the output has been written that's
an obvious problem: "you clobbered the output". (It seems more reasonable
to disallow multiple live ranges for the output. Once an output is written
it can only be shadowed in a nested block.)

That's the bad news. Now for some good news:

One lovely property Mu the language has at the moment is that live ranges
are guaranteed to be linear segments of code. We don't need to analyze
loop-carried dependences. This means that we can decide whether a variable
is live purely by scanning later statements for its use. (Defining 'register
use' is slightly non-trivial; primitives must somehow specify when they
read their output register.)

So we don't actually need to worry about a loop reading a register with
one type and writing to another type at the end of an iteration. The only
way that can happen is if the write at the end was to a local variable,
and we're guaranteeing that local variables will be reclaimed at the end
of the iteration.

So, the sequence of tasks:
  a) compute register liveness
  b1) verify that all register variables used at any point in a program
  are always the topmost use of that register.
  b2) decide whether to spill/shadow, clobber or flag an error.

There's still the open question of where to attach liveness state. It can't
be on a var, because liveness varies by use of the var. It can't be on a
statement because we may want to know the liveness of variables not referenced
in a given statement. Conceptually we want a matrix of locals x stmts (flattened).
But I think it's simpler than that. We just want to know liveness at the
time of variable declarations. A new register variable can be in one of
three states w.r.t. its previous definition: either it's shadowing it,
or it can clobber it, or there's a conflict and we need to raise an error.

I think we can compute this information for each variable definition by
an analysis similar to existing ones, maintaining a stack of variable definitions.
The major difference is that we don't pop variables when a block ends.
Details to be worked out. But when we do I hope to get these pending tests
passing.
2020-03-06 00:06:42 -08:00
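The observation in commit 6082 that liveness can be decided "purely by scanning later statements" can be sketched as below. Representing each statement as the set of variables it reads, in flattened program order, is an assumption made for illustration.

```python
def is_live(var, reads_per_stmt, index):
    """True if `var` is read by any statement after position `index`."""
    return any(var in reads for reads in reads_per_stmt[index + 1:])

# Each entry is the set of variables one statement reads.
reads_per_stmt = [
    {"n"},
    {"n", "tmp"},
    {"result"},
]
print(is_live("tmp", reads_per_stmt, 0))  # True: read by statement 1
print(is_live("tmp", reads_per_stmt, 1))  # False: never read again
```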
Kartik Agaram
0c89528a38 6079 - optimize register spills
The second var to the same register in a block doesn't need to spill. We're
never going to restore the var it's shadowing.
2020-03-05 18:06:55 -08:00
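The optimization in commit 6079 can be sketched as a single pass over the variables declared in one block. The tuple representation and the name `vars_to_spill` are hypothetical.

```python
def vars_to_spill(block_vars):
    """Return the names of variables whose register needs a spill.

    `block_vars` lists (name, register) pairs declared in one block, in order.
    Only the first variable to claim a register in the block spills it.
    """
    seen_registers = set()
    spills = []
    for name, register in block_vars:
        if register not in seen_registers:
            spills.append(name)
            seen_registers.add(register)
    return spills

print(vars_to_spill([("a", "eax"), ("b", "ebx"), ("c", "eax")]))  # ['a', 'b']
```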
Kartik Agaram
c984ace5c5 6074 2020-02-29 23:22:32 -08:00
Kartik Agaram
667865a95f 6073 2020-02-29 22:48:13 -08:00
Kartik Agaram
7ae5b71368 6071 - array indexing for non-int power-of-2 types 2020-02-29 06:49:47 -08:00
Kartik Agaram
af326d9e39 6070 2020-02-29 05:53:13 -08:00
Kartik Agaram
c51f590273 6069 2020-02-29 05:33:03 -08:00
Kartik Agaram
17c46e0b8c 6064
Fix CI.
2020-02-27 21:28:02 -08:00
Kartik Agaram
8dc8f2f65d 6062 2020-02-27 18:49:50 -08:00
Kartik Agaram
6163a55370 6061 2020-02-27 17:31:35 -08:00
Kartik Agaram
127a3dae73 6055 - record types and the 'get' instruction
This is a lot of code for a single test, and it took a long time to get
my data model just right. But the test coverage seems ok because it feels
mostly like straight-line code. We'll see.

I've also had to add a lot of prints. We really need app-level trace generation
pretty urgently. That requires deciding how to turn it on/off from the
commandline. And I've been reluctant to start relying on the hairy interface
that is POSIX open().
2020-02-27 16:43:00 -08:00
Kartik Agaram
5f3c324783 6054 2020-02-24 22:46:45 -08:00
Kartik Agaram
08b9511af5 6053 2020-02-23 14:42:22 -08:00
Kartik Agaram
d9b2b78e96 6051 2020-02-23 00:35:02 -08:00
Kartik Agaram
5a405cb2e0 6050 2020-02-23 00:30:31 -08:00
Kartik Agaram
43aa0fe310 6048 2020-02-21 20:29:08 -08:00
Kartik Agaram
f19abd4dd1 6047 2020-02-21 19:38:11 -08:00
Kartik Agaram
331909e744 6044 2020-02-21 15:36:18 -08:00
Kartik Agaram
4fc03f219c 6043
Test for 'index'.
2020-02-21 15:19:34 -08:00
Kartik Agaram
fee1bbd8b4 6041 - array indexing starting to work
And we're using it now in factorial.mu!

In the process I had to fix a couple of bugs in pointer dereferencing.

There are still some limitations:
a) Indexing by a literal doesn't work yet.
b) Only arrays of ints supported so far.

Looking ahead, I'm not sure how I can support indexing arrays by non-literals
(variables in registers) unless the element size is a power of 2.
2020-02-21 10:09:59 -08:00
Kartik Agaram
7fd79cdbc0 6037 - first passing test for pointer lookup 2020-02-20 18:43:45 -08:00
Kartik Agaram
f3ced2d488 6036 2020-02-20 17:55:44 -08:00
Kartik Agaram
f624fcbd1a 6035 2020-02-20 17:51:49 -08:00
Kartik Agaram
3fd4de7058 6034 2020-02-20 16:23:06 -08:00
Kartik Agaram
bf03f1a904 6033 - save pointer lookup state while parsing 2020-02-20 16:12:27 -08:00
Kartik Agaram
07bf3696aa 6032 - make room for '*' pointer lookups in stmts 2020-02-20 15:59:37 -08:00
Kartik Agaram
0d754bd6d9 6031 - bugfix in selecting codegen pattern 2020-02-20 11:56:50 -08:00
Kartik Agaram
5d1d61012d 6030 2020-02-20 11:54:19 -08:00
Kartik Agaram
227c1eaf57 6029 2020-02-20 00:22:11 -08:00
Kartik Agaram
a5469a3a7e 6023 - bug: vars with both stack-offset and reg
This was initially disquieting; was I writing enough tests? Then I noticed
I had TODOs for some missing checks.
2020-02-18 01:30:24 -08:00
Kartik Agaram
54b2ed1e4e 6022 - initial sketch of array length
This is a particularly large abstraction leak: SubX arrays track their
lengths in bytes, and therefore Mu as well.
2020-02-18 01:00:41 -08:00
Kartik Agaram
54fc4d952d 6022
Forgot to actually use the new type-dispatch in commit 6017.
2020-02-18 00:37:37 -08:00
Kartik Agaram
5c4eb680c0 6021 2020-02-18 00:08:20 -08:00
Kartik Agaram
18949b8689 6020
Some deduplication, though this may be a premature abstraction.
2020-02-18 00:07:11 -08:00
Kartik Agaram
01a28c56c7 6019 - finish supporting all branch primitives
I'd been thinking I didn't need unconditional `break` instructions, but
I just realized that non-local unconditional breaks have a use. Stop over-thinking
this, just support everything.

The code is quite duplicated.
2020-02-18 00:07:11 -08:00
Kartik Agaram
c089e33853 6017 - simplify type-dispatch for primitives
We'll be doing type-checking in a separate phase in future. For now we
need only to distinguish between literals and non-literals for x86 primitive
instructions.

I was tempted to support x86 set__ instructions for this change:
  https://c9x.me/x86/html/file_module_x86_id_288.html

That will happen at some point. And I'll simplify a bunch of branches for
results of predicate functions when it happens.
2020-02-17 20:16:28 -08:00
Kartik Agaram
df781efa7d 6011 2020-02-16 20:14:32 -08:00
Kartik Agaram
d9ff5c3fb8 6009 - significantly cleaner lexing
This cleans up a bunch of little warts that had historically accumulated
because of my bull-headedness in not designing a grammar up front. Let's
see if the lack of a grammar comes up again.

We now require that there be no space in variable declarations between
the name and the colon separating it from its type.
2020-02-16 01:44:29 -08:00
Kartik Agaram
deacf2c94e 6008
Allow comments at the end of all kinds of statements.

To do this I replaced all calls to next-word with next-mu-token.. except
one. I'm not seeing any bugs yet, any places where comments break things.
But this exception makes me nervous.
2020-02-16 01:14:06 -08:00
Kartik Agaram
1f029c3b57 6005
Support calling SubX code from Mu. I have _zero_ idea how to make this
safe.

Now we can start writing tests. We can't use commandline args yet. That
requires support for kernel strings.
2020-02-14 01:46:37 -08:00
Kartik Agaram
2c966386d1 6000 - clean up after non-local branches 2020-02-09 20:39:19 -08:00
Kartik Agaram
6c059c7ef3 5999
Fix CI. apps/survey was running out of space in the trace segment when
translating apps/mu.subx
2020-02-09 18:38:55 -08:00
Kartik Agaram
7b1786be40 5998 - redo code-generation for 'break'
I've been saying that we can convert this:

  {
    var x: int
    break-if-=
    ...
  }

..into this:

  {
    68/push 0/imm32
    {
      0f 84/jump-if-= break/disp32
      ...
    }
    81 0/subop/add %esp 4/imm32
  }

All subsequent instructions go into a nested block, so that they can be
easily skipped without skipping the stack cleanup.

However, I've been growing aware that this is a special case. Most of the
time we can't use this trick:
  for loops
  for non-local breaks
  for non-local loops

In most cases we need to figure out all the intervening variables on the
stack and emit code to clean them up.

And now it turns out even for local breaks like above, the trick doesn't
work. Consider what happens when there's a loop later in the block:

  {
    var x: int
    break-if-=
    ...
  }

If we emitted a nested block for the break, the local loop would become
non-local. So we replace one kind of state with another.

Easiest course of action is to just emit the exact same cleanup code for
all conditional branches.
2020-02-09 17:29:52 -08:00
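The general case in commit 5998, figuring out the intervening variables before a jump, can be sketched as a walk over a stack of variables tagged with block depth. The tuple layout and function names are assumptions, and only stack-allocated variables are modeled (no register spills).

```python
def vars_to_clean_up(var_stack, target_depth):
    """Variables declared at or inside the block being jumped out of."""
    return [name for name, _, depth in var_stack if depth >= target_depth]

def bytes_to_reclaim(var_stack, target_depth):
    """Total stack space the cleanup code must release before the jump."""
    return sum(size for _, size, depth in var_stack if depth >= target_depth)

# (name, size in bytes, block depth), in declaration order.
var_stack = [
    ("x", 4, 1),
    ("y", 4, 2),
    ("z", 4, 3),
]
print(vars_to_clean_up(var_stack, 2))  # ['y', 'z']
print(bytes_to_reclaim(var_stack, 2))  # 8
```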
Kartik Agaram
ab6a6ed997 5997 - clean up after unconditional loops
Turns out we can't handle them like conditional loops.

This function to emit cleanup code for jumps is getting quite terrible.
I don't yet know what subsidiary abstractions it needs.
2020-02-09 16:49:04 -08:00
Kartik Agaram
0da12d59b7 5993 - support for unlabeled loop instructions
Now that we have the infrastructure for emitting cleanup blocks, the labeled
variants should be easy as well.
2020-02-08 16:23:23 -08:00
Kartik Agaram
0b636ffe72 5992 2020-02-07 00:14:17 -08:00
Kartik Agaram
f3d054032d 5991 2020-02-07 00:02:09 -08:00
Kartik Agaram
a3dfc8ba82 5989
Start pushing dummy vars for labels on the stack as we encounter them.
This won't affect cleanup code, but will make it easy to ensure that jumps
are well-structured.
2020-02-06 19:23:28 -08:00
Kartik Agaram
56d83f7915 5988
Clean up data structures and eliminate the notion of named blocks.

Named blocks still exist in the Mu language. But they get parsed into a
uniform block data structure, same as unnamed blocks.
2020-02-06 16:39:56 -08:00
Kartik Agaram
83bf1291e0 5987 2020-02-06 16:32:42 -08:00
Kartik Agaram
52f5ce1fd3 5986 2020-02-06 16:29:40 -08:00
Kartik Agaram
8dbffb83a8 5985
Standardize on a single block name. This simplifies some code and will
also help in the next couple of steps.
2020-02-06 16:24:19 -08:00
Kartik Agaram
0e203a3120 5984 - start labeling all blocks
This will come in handy for the remaining cases where we need to clean
up locals on the stack:
  loop after var
  non-local break with vars in intervening blocks
  non-local loop with vars in intervening blocks
2020-02-05 22:35:28 -08:00
Kartik Agaram
b9d666eff5 5982 - start putting block labels on the var stack
Before:
  we detected labels using a '$' at the start of an arg, and turned them
  into literals.

After:
  we put labels on the var stack and let the regular lookup of the var
  stack handle labels.

This adds complexity in one place and removes it from another. The crucial
benefit is that it allows us to store a block depth for each label. That
will come in handy later.

All this works only because of a salubrious coincidence: Mu labels are
always at the start of a block, and jumps always refer to the name at the
start of a block, even when the jump is in the forwards direction. So we
never see label uses before definitions.

Note on CI: this currently only works natively, not emulated.
2020-02-05 14:55:18 -08:00
Kartik Agaram
6b6b6851cc 5981 - decompose block cleanup into two traversals
Momentarily less efficient, but we will soon need the ability to emit cleanup
code without losing all our state.
2020-02-02 18:41:15 -08:00
Kartik Agaram
8099ed348d 5977 2020-02-02 07:55:19 -08:00
Kartik Agaram
84fd02c907 5974 - support for simple early exits
So far we only handle unlabeled break instructions correctly. That part
is elegance itself. But the rest will need more work:

a) For labeled breaks we need to insert code to unwind all intervening
blocks.
b) For unlabeled loops we need to insert code to unwind the current block
and then loop.
c) For labeled loops we need to insert code to unwind all intervening blocks
and then loop.

Is this even worth doing? I think so. It's pretty common for a conditional
block inside a loop to 'continue'. That requires looping to somewhere non-local.
2020-02-02 00:19:19 -08:00
Kartik Agaram
2ea560deac 5971 - emit code with indentation
This is easy now that we're tracking block depths everywhere.
2020-02-01 22:55:30 -08:00
Kartik Agaram
f0d3519a0c 5970 - support block-scoped variables 2020-02-01 17:05:59 -08:00
Kartik Agaram
aeac1e061d 5966 - document all supported Mu instructions 2020-01-31 18:55:37 -08:00
Kartik Agaram
4caddd3bb3 5962 - string literals 2020-01-30 01:16:59 -08:00
Kartik Agaram
c428664958 5961 2020-01-30 01:06:37 -08:00
Kartik Agaram
264cef5ed4 5953 - 'multiply' instruction 2020-01-29 23:10:08 -08:00
Kartik Agaram
9d06fcd3fd 5951 - 'compare' instructions 2020-01-29 19:43:20 -08:00
Kartik Agaram
d20fbf71c3 5948 - branching to named blocks 2020-01-29 17:34:07 -08:00
Kartik Agaram
c913d04dfa 5947 - add a new field to primitives
For supporting branches with a target.
2020-01-29 16:26:13 -08:00
Kartik Agaram
d1e76aaa9d 5946 2020-01-29 00:02:30 -08:00
Kartik Agaram
261a1b7480 5945 - branches 2020-01-28 23:47:19 -08:00
Kartik Agaram
fb6a36e862 5943 - initial support for named blocks 2020-01-28 21:41:02 -08:00
Kartik Agaram
5e6500a759 5942 - initial support for blocks
This was too easy. But there are dragons ahead.
2020-01-28 19:18:25 -08:00
Kartik Agaram
9640e4b91c 5940 - local vars in registers starting to work 2020-01-27 16:16:01 -08:00
Kartik Agaram
d9e98256fa 5936 - permit commas everywhere 2020-01-27 14:14:40 -08:00
Kartik Agaram
2e2ab53c49 5929 - local variables kinda working 2020-01-27 02:07:00 -08:00