akkartik/mu - mu - tildegit

Commit Graph

Author	SHA1	Message	Date
Kartik Agaram	bfcc0f858a	6182 - start of support for safe handles So far it's unclear how to do this in a series of small commits. Still nibbling around the edges. In this commit we standardize some terminology: The length of an array or stream is denominated in the high-level elements. The _size_ is denominated in bytes. The thing we encode into the type is always the size, not the length. There's still an open question of what to do about the Mu `length` operator. I'd like to modify it to provide the length. Currently it provides the size. If I can't fix that I'll rename it.	2020-04-03 12:35:53 -07:00
Kartik Agaram	f730f2f2c7	6181	2020-04-03 01:05:01 -07:00
Kartik Agaram	df609237c1	6158 - standardize opcode names At the lowest level, SubX without syntax sugar uses names without prepositions. For example, 01 and 03 are both called 'add', irrespective of source and destination operand. Horizontal space is at a premium, and we rely on the comments at the end of each line to fully describe what is happening. Above that, however, we standardize on a slightly different naming convention across: a) SubX with syntax sugar, b) Mu, and c) the SubX code that the Mu compiler emits. Conventions, in brief: - by default, the source is on the left and destination on the right. e.g. add %eax, 1/r32/ecx ("add eax to ecx") - prepositions reverse the direction. e.g. add-to %eax, 1/r32/ecx ("add ecx to eax") subtract-from %eax, 1/r32/ecx ("subtract ecx from eax") - by default, comparisons are left to right while 'compare<-' reverses. Before, I was sometimes swapping args to make the operation more obvious, but that would complicate the code-generation of the Mu compiler, and it's nice to be able to read the output of the compiler just like hand-written code. One place where SubX differs from Mu: copy opcodes are called '<-' and '->'. Hopefully that fits with the spirit of Mu rather than the letter of the 'copy' and 'copy-to' instructions.	2020-03-21 16:51:52 -07:00
Kartik Agaram	c6886c1c97	6157	2020-03-21 15:07:59 -07:00
Kartik Agaram	c48ce3c8bf	6153 - switch 'main' to use Mu strings At the SubX level we have to put up with null-terminated kernel strings for commandline args. But so far we haven't done much with them. Rather than try to support them we'll just convert them transparently to standard length-prefixed strings. In the process I realized that it's not quite right to treat the combination of argc and argv as an array of kernel strings. Argc counts the number of elements, whereas the length of an array is usually denominated in bytes.	2020-03-15 21:03:12 -07:00
Kartik Agaram	f559236bdf	6152 - fix regression in factorial.mu I had to amend commit 6148 three times yesterday as I kept finding bugs by inspection. And yet I stubbornly thought I didn't need a test.	2020-03-15 16:44:55 -07:00
Kartik Agaram	abb66df2c7	6150 - call-by-reference is working	2020-03-14 16:04:57 -07:00
Kartik Agaram	afa4d6bb4c	6149 - pass multi-word objects to functions This is quite inefficient; don't use it for very large objects.	2020-03-14 15:46:38 -07:00
Kartik Agaram	9655878e1b	6148	2020-03-14 15:31:35 -07:00
Kartik Agaram	999292dfdd	6147	2020-03-14 15:15:46 -07:00
Kartik Agaram	114641e2c8	6145 - 'address' operator This could be a can of worms, but I think I have a set of checks that will keep use of addresses type-safe.	2020-03-14 14:46:45 -07:00
Kartik Agaram	ff0f216b41	6133	2020-03-12 00:45:17 -07:00
Kartik Agaram	b50c0e3279	6132	2020-03-12 00:36:36 -07:00
Kartik Agaram	92ca78429c	6131 - operating on arrays on the stack	2020-03-12 00:17:35 -07:00
Kartik Agaram	0aa7420745	6128 - arrays on the stack	2020-03-11 19:55:45 -07:00
Kartik Agaram	15655a1246	6126 - support 8-byte register names Using these is quite unsafe. But what isn't, here?	2020-03-11 18:16:56 -07:00
Kartik Agaram	39eb1e4963	6125	2020-03-11 17:34:48 -07:00
Kartik Agaram	f2d6bb1cb8	6124	2020-03-11 17:30:06 -07:00
Kartik Agaram	28746b3666	6123 - runtime helper for initializing arrays I built this in 3 phases: a) create a helper in the bootstrap VM to render the state of the stack. b) interactively arrive at the right function (tools/stack_array.subx) c) pull the final solution into the standard library (093stack_allocate.subx) As the final layer says, this may not be the fastest approach for most (or indeed any) Mu programs. Perhaps it's better on balance for the compiler to just emit n/4 `push` instructions. (I'm sure this solution can be optimized further.)	2020-03-11 17:21:59 -07:00
Kartik Agaram	bfb7c60135	6122	2020-03-10 19:30:03 -07:00
Kartik Agaram	3ca9742e6e	6118 - support records on the stack	2020-03-10 16:39:06 -07:00
Kartik Agaram	fed9e7135c	6116 - stack locations now computed during codegen We can't do it during parsing time because we may not have all type definitions available yet. Mu supports using types before defining them. At first I thought I should do it in populate-mu-type-sizes (appropriately renamed). But there's enough complexity to tracking when stuff lands on the stack that it's easiest to do while emitting code. I don't think we need this information earlier in the compiler. If I'm right, it seems simpler to colocate the computation of state close to where it's used.	2020-03-10 16:20:33 -07:00
Kartik Agaram	9f8524a95c	6115	2020-03-10 15:19:18 -07:00
Kartik Agaram	fdce202105	6113	2020-03-10 14:08:59 -07:00
Kartik Agaram	dd0cdc6b80	6112 Move computation of offsets to record fields into the new phase as well. Now we should be robust to type definitions in any order.	2020-03-08 22:44:56 -07:00
Kartik Agaram	c8784d1c0f	6111 Move out total-size computation from parsing to a separate phase. I don't have any new tests yet, but it's encouraging that existing tests continue to pass. This may be the first time I've ever written this much machine code (with mutual recursion!) and gotten it to work the first time.	2020-03-08 18:20:16 -07:00
Kartik Agaram	ab74038c0d	6109	2020-03-08 17:12:06 -07:00
Kartik Agaram	c67b26bbcb	6108	2020-03-08 17:09:27 -07:00
Kartik Agaram	a2a9d19f89	6107 Finally we're now able to track the index of a field in a record/struct/product type.	2020-03-08 16:56:51 -07:00
Kartik Agaram	8c2eda1333	6106 Free up eax using the newly available register.	2020-03-08 16:49:22 -07:00
Kartik Agaram	b5615912fa	6105 Create space for another local.	2020-03-08 16:30:07 -07:00
Kartik Agaram	e33feb03a0	6104 parse-mu-types has a lot of local state. Move a local to the stack to free up a register.	2020-03-08 16:24:25 -07:00
Kartik Agaram	b12e8c593f	6103	2020-03-08 16:02:32 -07:00
Kartik Agaram	826e054954	6101 Make room for additional information for each field in a record/product type. Fields can be used before they're declared, and we may not know the offsets they correspond to at that point. This is going to necessitate a lot of restructuring.	2020-03-08 15:47:50 -07:00
Kartik Agaram	39aea17926	6096 A new test, and a new bugfix.	2020-03-07 18:59:57 -08:00
Kartik Agaram	30f844ee8f	6095	2020-03-07 18:32:36 -08:00
Kartik Agaram	3cf0315859	6094 - new 'compute-offset' instruction If indexing into a type with power-of-2-sized elements we can access them in one instruction: x/reg1: (addr int) <- index A/reg2: (addr array int), idx/reg3: int This translates to a single instruction because x86 instructions support an addressing mode with left-shifts. For non-powers-of-2, however, we need a multiply. To keep things type-safe, it is performed like this: x/reg1: (offset T) <- compute-offset A: (addr array T), idx: int y/reg2: (addr T) <- index A, x An offset is just an int that is guaranteed to be a multiple of size-of(T). Offsets can only be used in index instructions, and the types will eventually be required to line up. In the process, I have to expand Input-size because mu.subx is growing big.	2020-03-07 17:40:45 -08:00
Kartik Agaram	9ee4b34e06	6093 Some much-needed reorganization.	2020-03-07 15:48:11 -08:00
Kartik Agaram	5c26afb1de	6088 - start using setCC instructions	2020-03-06 17:42:17 -08:00
Kartik Agaram	7c109dffc8	6086 - `index` into arrays with a literal	2020-03-06 13:50:12 -08:00
Kartik Agaram	b5fbf20556	6085 Support parsing ints from strings rather than slices.	2020-03-06 13:44:54 -08:00
Kartik Agaram	c1737cbaae	6083	2020-03-06 12:08:42 -08:00
Kartik Agaram	4032286f9b	6082 - bugfix in spilling register vars In the process I'm starting to realize that my approach to avoiding spills isn't ideal. It works for local variables but not to avoid spilling outputs. To correctly decide whether to spill to an output register or not, we really need to analyze when a variable is live. If we don't do that, we'll end up in one of two bad situations: a) Don't spill the outermost use of an output register (or just the outermost scope in a function). This is weird because it's hard to explain to the programmer why they can overwrite a local with an output above a '{' but not below. b) Disallow overwriting entirely. This is easier to communicate but quite inconvenient. It's nice to be able to use eax for some temporary purpose before overwriting it with the final result of a function. If we instead track liveness, things are convenient and also easier to explain. If a temporary is used after the output has been written that's an obvious problem: "you clobbered the output". (It seems more reasonable to disallow multiple live ranges for the output. Once an output is written it can only be shadowed in a nested block.) That's the bad news. Now for some good news: One lovely property Mu the language has at the moment is that live ranges are guaranteed to be linear segments of code. We don't need to analyze loop-carried dependences. This means that we can decide whether a variable is live purely by scanning later statements for its use. (Defining 'register use' is slightly non-trivial; primitives must somehow specify when they read their output register.) So we don't actually need to worry about a loop reading a register with one type and writing to another type at the end of an iteration. The only way that can happen is if the write at the end was to a local variable, and we're guaranteeing that local variables will be reclaimed at the end of the iteration. So, the sequence of tasks: a) compute register liveness b1) verify that all register variables used at any point in a program are always the topmost use of that register. b2) decide whether to spill/shadow, clobber or flag an error. There's still the open question of where to attach liveness state. It can't be on a var, because liveness varies by use of the var. It can't be on a statement because we may want to know the liveness of variables not referenced in a given statement. Conceptually we want a matrix of locals x stmts (flattened). But I think it's simpler than that. We just want to know liveness at the time of variable declarations. A new register variable can be in one of three states w.r.t. its previous definition: either it's shadowing it, or it can clobber it, or there's a conflict and we need to raise an error. I think we can compute this information for each variable definition by an analysis similar to existing ones, maintaining a stack of variable definitions. The major difference is that we don't pop variables when a block ends. Details to be worked out. But when we do I hope to get these pending tests passing.	2020-03-06 00:06:42 -08:00
Kartik Agaram	0c89528a38	6079 - optimize register spills The second var to the same register in a block doesn't need to spill. We're never going to restore the var it's shadowing.	2020-03-05 18:06:55 -08:00
Kartik Agaram	c984ace5c5	6074	2020-02-29 23:22:32 -08:00
Kartik Agaram	667865a95f	6073	2020-02-29 22:48:13 -08:00
Kartik Agaram	7ae5b71368	6071 - array indexing for non-int power-of-2 types	2020-02-29 06:49:47 -08:00
Kartik Agaram	af326d9e39	6070	2020-02-29 05:53:13 -08:00
Kartik Agaram	c51f590273	6069	2020-02-29 05:33:03 -08:00
Kartik Agaram	17c46e0b8c	6064 Fix CI.	2020-02-27 21:28:02 -08:00
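
The two indexing paths described in commit 6094 above (`index` for power-of-2 element sizes, `compute-offset` followed by `index` otherwise) can be sketched roughly as follows. This is an illustrative Python model, not the Mu compiler's code: the function names, the flat integer "addresses", and the choice to ignore any array header are all assumptions made for the example.

```python
def is_power_of_2(n: int) -> bool:
    """True if n is a positive power of 2."""
    return n > 0 and (n & (n - 1)) == 0


def index_power_of_2(base: int, idx: int, elem_size: int) -> int:
    """Model of `index` when the element size is a power of 2.

    The scaling is a left shift, mirroring the shifted addressing mode
    the commit message refers to. (Hypothetical helper, not a Mu API.)
    """
    if not is_power_of_2(elem_size):
        raise ValueError("element size must be a power of 2")
    shift = elem_size.bit_length() - 1  # log2 of elem_size
    return base + (idx << shift)


def compute_offset(idx: int, elem_size: int) -> int:
    """Model of `compute-offset`: an explicit multiply.

    The result is always a multiple of elem_size, which is the
    guarantee the commit message attaches to the `offset T` type.
    """
    return idx * elem_size


def index_with_offset(base: int, offset: int, elem_size: int) -> int:
    """Model of `index` taking a precomputed offset."""
    if offset % elem_size != 0:
        raise ValueError("offset is not a multiple of the element size")
    return base + offset


if __name__ == "__main__":
    # 4-byte elements: one shifted access.
    print(index_power_of_2(1000, 3, 4))
    # 12-byte elements: multiply first, then index.
    off = compute_offset(3, 12)
    print(index_with_offset(1000, off, 12))
```

The check inside `index_with_offset` stands in for what the commit describes as a type-level guarantee; in Mu the intent is for the type checker, not a runtime test, to enforce it.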

177 Commits