83 Commits

Author SHA1 Message Date
Kartik Agaram c6b928be29 7841 2021-03-03 11:04:07 -08:00
Kartik Agaram b628655722 7526 2021-01-16 09:29:21 -08:00
Kartik Agaram 18d5bab2b6 7329 - snapshot: advent day 4 part 2
I've found two bugs in SubX libraries:

1. next-word had an out-of-bounds read
2. next-word was skipping comments, because that's what I need during bootstrapping.
I've created a new variant called next-raw-word that doesn't skip comments.
These really need better names.

We're now at the point where 4b.mu has the right structure and returns
the same result as 4a.mu.
2020-12-04 23:02:53 -08:00
Kartik Agaram 0e0a60013d 7238 - mu.subx: final restrictions on 'addr'
I had to tweak one app that wasn't following the rules.
2020-11-15 13:18:38 -08:00
Kartik Agaram 307745bcc2 7225
Both manual tests described in commit 7222 now work.

To make them work I had to figure out how to copy a file. It
requires a dependency on a new syscall: lseek.
2020-11-11 23:25:55 -08:00
Kartik Agaram bd57b37fc5 7173
All tests passing again.
2020-11-03 21:31:48 -08:00
Kartik Agaram 4cbe0ba1ac 7138 - type-check array 'length' instruction 2020-10-29 00:03:19 -07:00
Kartik Agaram a148b23a22 7101 - tile: remove quotes when evaluating strings
This found several bugs due to me not checking for null strings.
2020-10-25 18:45:11 -07:00
Kartik Agaram b8d613e7c2 6946 - print floats somewhat intuitively in hex 2020-10-04 11:18:23 -07:00
Kartik Agaram 72c42f90cf 6908 - compiling all floating-point operations
We don't yet support emulating these instructions in `bootstrap`. But generated
binaries containing them run natively just fine.
2020-09-30 21:17:37 -07:00
Kartik Agaram 37fd51f754 6783
An extra test that should have been in commit 6781.
2020-09-16 08:31:10 -07:00
Kartik Agaram ae470b42f1 6781 - new app: RPN (postfix) calculator
This was surprisingly hard; bugs discovered all over the place.
2020-09-15 22:52:41 -07:00
Kartik Agaram cd94852dbc 6733 - read utf-8 'grapheme' from byte stream
No support for combining characters. Graphemes are currently just utf-8
encodings of a single Unicode code-point. No support for code-points that
require more than 32 bits in utf-8.
2020-08-28 23:24:04 -07:00
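
A note on commit 6733: the idea of reading one utf-8 encoded code-point from a byte stream can be sketched as below. This is a minimal Python illustration, not the Mu implementation; the function name `read_grapheme` and the choice to return raw bytes are assumptions made for the example. It decides how many bytes to read from the leading byte and caps the encoding at 4 bytes (32 bits), matching the limit the commit message describes.

```python
import io

def read_grapheme(stream):
    """Read the utf-8 encoding of one code-point from a binary stream.

    Returns the encoded bytes, or b"" at end of stream.
    Hypothetical helper: continuation bytes are not validated here.
    """
    first = stream.read(1)
    if not first:
        return b""
    b0 = first[0]
    if b0 < 0x80:              # 0xxxxxxx: single byte
        n = 1
    elif b0 >> 5 == 0b110:     # 110xxxxx: two bytes
        n = 2
    elif b0 >> 4 == 0b1110:    # 1110xxxx: three bytes
        n = 3
    elif b0 >> 3 == 0b11110:   # 11110xxx: four bytes
        n = 4
    else:
        raise ValueError("not a valid leading byte for a 1-4 byte encoding")
    rest = stream.read(n - 1)
    if len(rest) != n - 1:
        raise ValueError("stream ended in the middle of a code-point")
    return first + rest

# Usage: pull code-points off a stream one at a time.
stream = io.BytesIO("a\u00e9\u20ac".encode("utf-8"))
graphemes = []
while True:
    g = read_grapheme(stream)
    if not g:
        break
    graphemes.append(g)
```

Combining characters are not merged here, in line with the commit's note that a grapheme is currently just a single code-point.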
Kartik Agaram e8ffaf29ce 6719 - error-checking for 'index' instructions
1000+ LoC spent; just 300+ excluding tests.

Still one known gap: we don't check the entirety of an array's element
type if it's a compound. So far we just check whether, say, both sides start
with 'addr'. Obviously that's not good enough.
2020-08-21 21:32:48 -07:00
Kartik Agaram f16f569060 6622 - new syscalls: time and ntime
As a side-effect I find that my Linode can print ~100k chars/s. At 50 rows
and 200 columns per screen, it's 10 frames/s.
2020-07-08 22:14:42 -07:00
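
A note on commit 6622: the frame-rate figure follows directly from the numbers in the message. The short Python check below just restates that arithmetic; the variable names are made up for the example.

```python
chars_per_second = 100_000        # approximate throughput quoted in the commit
rows, cols = 50, 200              # screen dimensions quoted in the commit
chars_per_frame = rows * cols     # characters needed to repaint one full screen
frames_per_second = chars_per_second / chars_per_frame
```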
Kartik Agaram 996402e8fd 6604 - new app
https://archive.org/details/akkartik-2min-2020-07-01

In the process I found a bug, added a new syscall, and 'emulated' it.
2020-07-01 16:47:20 -07:00
Kartik Agaram 59cf3ae983 6597 2020-06-29 18:33:52 -07:00
Kartik Agaram 1afc882890 6596 2020-06-29 18:31:17 -07:00
Kartik Agaram 690fa191f1 6595 2020-06-29 18:01:44 -07:00
Kartik Agaram 05dabd816a 6594 - start standardizing the meaning of 'print' 2020-06-29 17:58:01 -07:00
Kartik Agaram 5a6d2d0db7 6528 2020-06-15 16:57:39 -07:00
Kartik Agaram ad61776f49 6520 - new app: parse-int
Several bugs fixed in the process, and expectation of further bugs is growing.
I'd somehow started assuming I don't need to have separate cases for rm32
as a register vs mem. That's not right. We might need more reg-reg Primitives.
2020-06-14 00:28:23 -07:00
Kartik Agaram 80f53f4a18 6508 - support null exit-descriptor 2020-06-10 23:34:42 -07:00
Kartik Agaram 7dac9ade15 6507 - use syscall names everywhere 2020-06-10 23:09:30 -07:00
Kartik Agaram 9511ff5cd7 6409 - primitives for text-mode UIs 2020-05-27 00:09:22 -07:00
Kartik Agaram 3b5b19df66 6406 - primitive 'copy-handle' 2020-05-25 19:26:18 -07:00
Kartik Agaram 06b6e9d813 6382 - re-enable mu.subx in CI
I thought I'd done this in the previous commit, but I hadn't. And, what's
more, there was a bug that seemed pretty tough for a time. Turns out my
self-hosted translator doesn't support '.' comment tokens in data segments.

Hopefully I'm past the valley of the shadow of death now.

      "I HAVE NO TOOLS BECAUSE I’VE DESTROYED MY TOOLS WITH MY TOOLS."
      -- James Mickens (https://www.usenix.org/system/files/1311_05-08_mickens.pdf)
2020-05-22 22:50:39 -07:00
Kartik Agaram 5bc9a5b72e update binaries
CI should start passing again now.
2020-05-22 16:59:53 -07:00
Kartik Agaram e5118fa9fb handle nulls in lookup
Cleaner abstraction, but adds 3 instructions to our overhead for handles,
including one potentially-hard-to-predict jump :/

I wish I could have put the alloc id in eax for the comparison as well,
to save a few bytes of instruction space. But that messes up the non-null
case.
2020-05-18 00:44:49 -07:00
Kartik Agaram f7360e493a support 'fake' handles allocated statically
Mystery solved of why the syntax sugar phases don't work even though they
don't use any functions whose signatures changed in the migration to handles.

The answer: they use the Registers table, and it needs to use handles rather
than raw strings.
2020-05-18 00:44:46 -07:00
Kartik Agaram d56ce7e771 support 'fake' handles allocated statically
Mystery solved of why the syntax sugar phases don't work even though they
don't use any functions whose signatures changed in the migration to handles.

The answer: they use the Registers table, and it currently doesn't use
handles.

Rather than create a whole new set of functions that operate on addresses,
I'm going to create fake handles that are never intended to be reclaimed.
Which raises the question of the best way to do that. I'd like to continue
using string syntax, so I'm going to use a prefix in the payload that can
also be rendered as a string. But printable characters start at
0x20, and we don't currently have escape sequences for null or any other
non-printable characters.

I _could_ use newlines, but that seems overly clever. So instead I'll once
again not worry about some hypothetical problem with running out of alloc-ids,
and just carve out half of the id space that can't be used for real alloc
ids. ASCII doesn't use the most significant bit of bytes, so it seems like
a natural separation.
2020-05-18 00:44:46 -07:00
Kartik Agaram 5dc0ddfc9d Rebuild phases of self-hosted SubX translator
For this one commit we need to bootstrap ourselves with subx_translate_debug.
2020-05-18 00:44:46 -07:00
Kartik Agaram 84a0297229 6208 2020-04-22 16:34:08 -07:00
Kartik Agaram bfcc0f858a 6182 - start of support for safe handles
So far it's unclear how to do this in a series of small commits. Still
nibbling around the edges. In this commit we standardize some terminology:

The _length_ of an array or stream is denominated in high-level elements.
The _size_ is denominated in bytes.

The thing we encode into the type is always the size, not the length.

There's still an open question of what to do about the Mu `length` operator.
I'd like to modify it to provide the length. Currently it provides the
size. If I can't fix that I'll rename it.
2020-04-03 12:35:53 -07:00
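
A note on commit 6182: the length/size terminology amounts to a conversion by the element size. The Python sketch below illustrates it; the function names are hypothetical and it is not taken from the Mu sources.

```python
def size_in_bytes(length, elem_size):
    """Size (bytes) of an array holding `length` elements of `elem_size` bytes each."""
    return length * elem_size

def length_in_elements(size, elem_size):
    """Length (elements) of an array occupying `size` bytes."""
    if size % elem_size != 0:
        raise ValueError("size is not a whole number of elements")
    return size // elem_size

# An array of 3 four-byte ints has length 3 and size 12.
int_array_size = size_in_bytes(3, 4)
int_array_length = length_in_elements(int_array_size, 4)
```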
Kartik Agaram f730f2f2c7 6181 2020-04-03 01:05:01 -07:00
Kartik Agaram c48ce3c8bf 6153 - switch 'main' to use Mu strings
At the SubX level we have to put up with null-terminated kernel strings
for commandline args. But so far we haven't done much with them. Rather
than try to support them we'll just convert them transparently to standard
length-prefixed strings.

In the process I realized that it's not quite right to treat the combination
of argc and argv as an array of kernel strings. Argc counts the number
of elements, whereas the length of an array is usually denominated in bytes.
2020-03-15 21:03:12 -07:00
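
A note on commit 6153: converting a null-terminated kernel string to a length-prefixed one can be sketched as follows. This is an illustrative Python version only. The function name and the 4-byte little-endian length prefix are assumptions for the example; the commit message does not spell out the layout.

```python
import struct

def kernel_string_to_length_prefixed(buf, start=0):
    """Copy the null-terminated string at buf[start:] into a length-prefixed form.

    Assumed layout: 4-byte little-endian byte count, then the bytes, no terminator.
    """
    end = buf.index(b"\0", start)   # raises ValueError if no terminator is found
    payload = buf[start:end]
    return struct.pack("<I", len(payload)) + payload

# Usage: two command-line args laid out back to back, as a kernel might pass them.
raw_args = b"prog\0input.txt\0"
arg0 = kernel_string_to_length_prefixed(raw_args, 0)
arg1 = kernel_string_to_length_prefixed(raw_args, 5)
```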
Kartik Agaram 3cf0315859 6094 - new 'compute-offset' instruction
If indexing into a type with power-of-2-sized elements we can access them
in one instruction:

  x/reg1: (addr int) <- index A/reg2: (addr array int), idx/reg3: int

This translates to a single instruction because x86 instructions support
an addressing mode with left-shifts.

For non-powers-of-2, however, we need a multiply. To keep things type-safe,
it is performed like this:

  x/reg1: (offset T) <- compute-offset A: (addr array T), idx: int
  y/reg2: (addr T) <- index A, x

An offset is just an int that is guaranteed to be a multiple of size-of(T).
Offsets can only be used in index instructions, and the types will eventually
be required to line up.

In the process, I have to expand Input-size because mu.subx is growing
big.
2020-03-07 17:40:45 -08:00
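
A note on commit 6094: the two indexing paths can be modeled with plain address arithmetic. The Python sketch below is an illustration under simplifying assumptions, not the code the Mu translator emits: the function names are hypothetical, `base` stands for the address of the first element, and any array header is ignored.

```python
def index_pow2(base, idx, elem_size):
    """Element address when elem_size is a power of 2: one shift and one add.

    On x86 the shift can be folded into a scaled addressing mode.
    """
    shift = elem_size.bit_length() - 1
    if elem_size <= 0 or (1 << shift) != elem_size:
        raise ValueError("elem_size must be a power of 2")
    return base + (idx << shift)

def compute_offset(idx, elem_size):
    """Byte offset of element idx; always a multiple of elem_size."""
    return idx * elem_size

def index_with_offset(base, offset, elem_size):
    """Element address from a precomputed offset; rejects misaligned offsets."""
    if offset % elem_size != 0:
        raise ValueError("offset is not a multiple of the element size")
    return base + offset

# A 4-byte element uses the shift path; a 12-byte element needs the multiply.
addr_int = index_pow2(0x1000, 3, 4)
addr_rec = index_with_offset(0x1000, compute_offset(3, 12), 12)
```

The check in `index_with_offset` plays the role that the `(offset T)` type plays in the commit: it keeps arbitrary ints from being used as offsets.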
Kartik Agaram b5fbf20556 6085
Support parsing ints from strings rather than slices.
2020-03-06 13:44:54 -08:00
Kartik Agaram c1737cbaae 6083 2020-03-06 12:08:42 -08:00
Kartik Agaram af326d9e39 6070 2020-02-29 05:53:13 -08:00
Kartik Agaram 17c46e0b8c 6064
Fix CI.
2020-02-27 21:28:02 -08:00
Kartik Agaram 2c966386d1 6000 - clean up after no-local branches 2020-02-09 20:39:19 -08:00
Kartik Agaram 6c059c7ef3 5999
Fix CI. apps/survey was running out of space in the trace segment when
translating apps/mu.subx
2020-02-09 18:38:55 -08:00
Kartik Agaram d20fbf71c3 5948 - branching to named blocks 2020-01-29 17:34:07 -08:00
Kartik Agaram cfdd5b8bf3 5933
Expand some buffer sizes to continue building mu.subx natively.
2020-01-27 02:35:35 -08:00
Kartik Agaram 622f1be099 5898 - strengthen slice-empty? check
Anytime we create a slice, the first check tends to be whether it's empty.
If we handle ill-formed slices here where start > end, that provides a
measure of safety.

In the Mu translator (mu.subx) we often check for a trailing ':' or ','
and decrement slice->end to ignore it. But that could conceivably yield
ill-formed slices if the slice started out empty. Now we make sure we never
operate on such ill-formed slices.
2020-01-19 17:37:11 -08:00
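
A note on commit 5898: the strengthened check treats any slice whose start is at or past its end as empty. The Python sketch below is a minimal model with hypothetical names (`Slice`, `slice_empty`), not the SubX source.

```python
from dataclasses import dataclass

@dataclass
class Slice:
    start: int   # address of the first byte
    end: int     # address one past the last byte

def slice_empty(s):
    """True for well-formed empty slices (start == end) and ill-formed ones (start > end)."""
    return s.start >= s.end

# Dropping a trailing ':' or ',' by decrementing `end` on an already-empty
# slice yields start > end; the check still reports it as empty.
s = Slice(start=100, end=100)
s.end -= 1
still_empty = slice_empty(s)
```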
Kartik Agaram 51858e5d46 5887 - reorganize library
Layers 0-89 are used in self-hosting SubX.
Layers 90-99 are not needed for self-hosting SubX, and therefore could
use transitional levels of syntax sugar.
Layers 100 and up use all SubX syntax sugar.
2020-01-14 01:52:54 -08:00
Kartik Agaram a9baaac00b 5847 - literal inputs 2019-12-31 21:58:52 -08:00
Kartik Agaram 2a2a5b1e43 5804
Try to make the comments consistent with the type system we'll eventually
have.
2019-12-08 23:31:05 -08:00
Kartik Agaram a93cd189c9 5803 2019-12-07 20:50:23 -08:00