Commit Graph

109 Commits

Author SHA1 Message Date
Kartik Agaram
7397dc2ad3 5451
Even though the standard library is building and passing tests, the
binaries it generates aren't exactly bit for bit identical with the
originals. Comparing using `diff_ntranslate`, it looks like the data
segment starting address isn't computed right in survey.subx
(`compute-addresses`) when I start translating layer 058. Deleting some
tests brings the code segment to a p_offset where bits 8-11 (the lowest
4 bits excluding the lowermost byte) are cleared and everything works.
However, if bits 8-11 are set, then they don't make it to p_vaddr and
p_paddr.

Tried reproducing with a unit test, but the unit test passes fine.
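
For reference, the ELF specification requires a loadable segment's p_vaddr and p_offset to agree modulo p_align. The following Python sketch checks that rule and reports which of bits 8-11 went missing. The 0x1000 alignment, the segment base, and all function names are assumptions for illustration; none of them come from survey.subx.

```python
PAGE_ALIGN = 0x1000  # assumed p_align; not taken from survey.subx


def segment_start_address(segment_base, file_offset, align=PAGE_ALIGN):
    """Return a start address that keeps the low bits of the file offset."""
    return segment_base + (file_offset % align)


def is_congruent(p_vaddr, p_offset, align=PAGE_ALIGN):
    """ELF loadable-segment rule: p_vaddr and p_offset agree modulo p_align."""
    return p_vaddr % align == p_offset % align


def dropped_bits(p_vaddr, p_offset, mask=0xF00):
    """Bits under `mask` (bits 8-11 by default) set in p_offset but not in p_vaddr."""
    return (p_offset & mask) & ~p_vaddr
```

A check like `is_congruent` could run over each program header of the generated binary to catch the mismatch without a full binary diff.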
2019-07-22 18:07:51 -07:00
Kartik Agaram
37cf4e0581 5443 - standard library is now self-hosted
Translates 5k lines of input in 26 seconds.

I'm not sure why I need to grow the label table. It was already 512 entries
long, and I'm only using 373 so far.
2019-07-22 01:25:33 -07:00
Kartik Agaram
6d79c2bbf6 5435 - redo 5426
We can now translate layers 49-55 using translate and ntranslate. Next
step is to support '\n' in dquotes.subx.
2019-07-21 01:09:51 -07:00
Kartik Agaram
47a9d22a05 5426 2019-07-20 09:48:57 -07:00
Kartik Agaram
88cbe04d7e 5422
Various buffer sizes needed to be grown for ex11. But the next
bottleneck is that we need to code-generate run-tests.
2019-07-19 23:30:33 -07:00
Kartik Agaram
31cb01daf4 5419
Bugfix fourteen: we need different address computation logic for code vs
data labels.

It's really about different categories of instructions having different
address computation logic. This subtle distinction will make good error
messages hard. But that's a problem for later.
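
A rough Python sketch of the distinction, under a simplifying assumption: references to code labels come from jumps and calls, which x86 encodes relative to the address of the next instruction, while references to data labels are emitted as absolute addresses. The function name and parameters are hypothetical and are not the ones in survey.subx.

```python
def resolve_disp32(label_address, is_code_label, next_instruction_address):
    """Return the 32-bit value to emit for a label reference.

    Assumption: code labels are resolved relative to the address of the
    next instruction (as x86 jumps and calls expect); data labels are
    emitted as absolute addresses.
    """
    if is_code_label:
        value = label_address - next_instruction_address
    else:
        value = label_address
    return value & 0xFFFFFFFF  # wrap negative displacements to 32 bits
```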

Now there's just one example program not translating.
2019-07-19 11:29:52 -07:00
Kartik Agaram
3943b27a00 5417
Clean up.
2019-07-18 23:29:36 -07:00
Kartik Agaram
8da4c8c300 5416
Figured out what's going on with bug fourteen: displacement operands
aren't always used relative to the PC. Does this mean I need to track
instruction boundaries past pack? :'(

No, I just need different logic for labels in code vs data segments.

This was an interesting bug for reminding me of the difference between
the emulator-level trace and the application-level trace. The former has
1.5 million lines, while the latter has a dozen. Luckily, just dumping
the latter immediately made obvious what the issue was.

Though this experience does suggest some further ideas for debugging
tools:

  slice trace by line and phase
    slice trace by start and end label

  debug UI for SubX translator
    2D layout: rows = lines of code;  columns = translator phases
    each 'cell' in this layout contains a list of log lines
    shows what came in, what was emitted
    easily collapse any cell

These are domain-specific tools. Special-cased to the SubX translator
phases.
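
As one possible shape for the first idea, here is a small Python sketch that slices a trace by start and end label. It assumes the trace is a list of text lines and that a label is matched as a plain substring; the real trace format may differ.

```python
def slice_trace(lines, start_label, end_label):
    """Return lines from the first mention of start_label through the
    next mention of end_label (inclusive).

    Assumption: labels are matched as plain substrings of each line.
    """
    selected = []
    inside = False
    for line in lines:
        if not inside and start_label in line:
            inside = True
        if inside:
            selected.append(line)
            if end_label in line:
                break
    return selected
```

If the end label never appears, the slice runs to the end of the trace.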
2019-07-18 23:24:49 -07:00
Kartik Agaram
5030d67c85 5415
Bugfix thirteen: displacement calculations were wrong because current
offset was not being updated properly as words were being read and
emitted.
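
The invariant can be sketched in Python: advance the current offset by the width of every emitted word, and record a label's offset without advancing. The widths below are assumptions keyed off the `/metadata` suffix of each word, and the function names are hypothetical.

```python
def word_width(word):
    """Bytes emitted for one word; widths are assumed from its /metadata."""
    metadata = word.split("/")[1:]
    if "disp32" in metadata or "imm32" in metadata:
        return 4
    if "disp16" in metadata or "imm16" in metadata:
        return 2
    return 1


def label_offsets(words):
    """Map each label (a word ending in ':') to the offset of the next byte."""
    offsets = {}
    current = 0
    for word in words:
        if word.endswith(":"):
            offsets[word[:-1]] = current
            continue  # labels emit no bytes
        current += word_width(word)  # advance for every emitted word
    return offsets
```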

Now 10/12 example programs are translated correctly.
2019-07-17 23:04:45 -07:00
Kartik Agaram
4d37fb5213 5409
Bugfix eleven: segment flags were incorrectly computed. examples/ex1 now
verified! Added to CI.
2019-07-17 00:29:52 -07:00
Kartik Agaram
c2a74205d6 5408
Bugfix ten: type error in `convert`. I was calling `rewind-stream` on a
`buffered-file`.

examples/ex1 is now just one nibble off the canonical.

I *have* found one missing feature in the self-hosted translator,
though: dquotes doesn't support newlines in strings, even though the C++
version does. dquotes parses them right, but the value initialized in
the data segment is wrong.
2019-07-16 23:40:25 -07:00
Kartik Agaram
70a999aaeb 5407
Bugfix nine: flush(out) after translation is done.

Still one remaining bug from comparing ELF binaries: emit-segments
prints nothing for some reason.
2019-07-15 17:03:39 -07:00
Kartik Agaram
5490993845 5406
Bugfix eight: incorrect segment count in ELF header.

The generated examples/ex1 is still not right. But it has the second
segment now. Or almost all of it. Final byte is missing for some reason.
2019-07-15 16:39:13 -07:00
Kartik Agaram
aef4efb959 5404 - subx/examples/ex1 now translating
The result isn't an identical binary to before, and it segfaults when
run. But it's bugfix seven.

A couple of places where we make .subx files a little more strict:

a) All .subx files must define a data segment. Even if they have no
data.

b) All .subx files must define an `Entry` label for the binary to start
at. Earlier we defaulted to the start of the code segment. That's
not too hard to add; we'd just need to:
  i) rename `get` to `get-or-abort`
  ii) clone a third variant of `get-or-insert` called `get` that returns
     null if the key is not found.
  iii) use `get` rather than `get-or-abort` when looking up the `Entry`
     label.
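
The three variants in (i)-(iii) could look like this in Python. This is only a sketch: the `Table` class, the `address` field, and `entry_address` are hypothetical stand-ins, and an abort is modeled as a raised `LookupError`.

```python
class Table:
    """A label table with three lookup variants."""

    def __init__(self):
        self.rows = {}

    def get_or_insert(self, key):
        """Return the row for key, creating an empty row if it is missing."""
        return self.rows.setdefault(key, {})

    def get_or_abort(self, key):
        """Return the row for key, or fail loudly if it is missing."""
        if key not in self.rows:
            raise LookupError(f"get-or-abort: key not found: {key}")
        return self.rows[key]

    def get(self, key):
        """Return the row for key, or None if it is missing."""
        return self.rows.get(key)


def entry_address(labels, code_start):
    """Use the `Entry` label if defined, else default to the code start."""
    row = labels.get("Entry")
    if row is None:
        return code_start
    return row["address"]
```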
2019-07-15 12:26:41 -07:00
Kartik Agaram
c4412d299e . 2019-07-13 22:35:15 -07:00
Kartik Agaram
7f23be0107 .
Clean up.
2019-07-13 19:49:55 -07:00
Kartik Agaram
2773d5a48a survey.subx now passing all tests 2019-07-13 19:41:16 -07:00
Kartik Agaram
c6f91e15a4 test-convert-computes-addresses bugfix six
map of how far we've gotten by now (functions with '*' independently tested):
✓ compute-offsets*
✓ compute-addresses*
✓ emit-output
✓   emit-headers
✓     emit-elf-header
✓       emit-hex-array*
✓     first emit-elf-program-header-entry
✓       emit-hex-array*
?     second emit-elf-program-header-entry
        emit-hex-array*
    emit-segments*
2019-07-13 19:25:52 -07:00
Kartik Agaram
58c643c2c2 fixed fifth bug, hit sixth 2019-07-13 15:43:32 -07:00
Kartik Agaram
d30c716db2 .
Clean up.
2019-07-13 15:40:25 -07:00
Kartik Agaram
50ac5cab9c fixed fourth bug, hit fifth 2019-07-13 15:18:00 -07:00
Kartik Agaram
195a0d7d7d fixed third bug, hit fourth 2019-07-13 14:47:03 -07:00
Kartik Agaram
4c81119344 .
Clean up.
2019-07-13 12:30:38 -07:00
Kartik Agaram
a2593892de fixed second bug, hit third 2019-07-13 12:27:54 -07:00
Kartik Agaram
f518bd972e . 2019-07-13 08:56:15 -07:00
Kartik Agaram
fb935eaa7e fixed one bug, hit another
I carefully logged the segment a label is declared in but forgot to
actually save it in the table. This has been a theoretical concern for
some time, but I've never seen it actually happen until now. SubX is
just too low level.

Now I get past the first two phases but code generation fails to find
the 'Entry' label.
2019-07-12 23:41:23 -07:00
Kartik Agaram
8ba17d839e .
Snapshot at a random moment, showing a new debugging trick: hacking on
the C++ level to dump memory contents on specific labels.

For some reason label 'x' doesn't have a segment assigned by the time we
get to compute-addresses.
2019-07-12 23:14:13 -07:00
Kartik Agaram
f0eb631428 compute-offsets test now passing
The final integration test-convert-computes-addresses is still failing.
2019-07-12 11:39:35 -07:00
Kartik Agaram
e150e6e46e the pseudocode is pretty long, so add an outline 2019-07-12 11:26:50 -07:00
Kartik Agaram
38a314d320 rearrange compute-offsets cases
Now they're in the order you expect to see them at runtime: first you
see a segment header, then you see labels.
2019-07-12 11:23:23 -07:00
Kartik Agaram
0be794d440 .
move trace dump to before checks
2019-07-12 11:17:00 -07:00
Kartik Agaram
022075e59e . 2019-07-11 23:45:03 -07:00
Kartik Agaram
a0c877f8cc one failure remaining in test-compute-offsets
'curr-segment-name' is now a string, and it's stored in a register
rather than a global.

Paradoxically, this leaks *less* than before. Before, every call to
`get-or-insert-slice` leaked memory. Now we leak one string for every
new segment. Which is trivial.
2019-07-11 23:41:37 -07:00
Kartik Agaram
7fed7232c4 . 2019-07-11 22:27:01 -07:00
Kartik Agaram
2c45de094b .
Pseudocode is a little more truthful now about what variables are on the
stack.
2019-07-11 21:52:56 -07:00
Kartik Agaram
98994d5bcc the problem: curr-segment-name is stale
It's a slice into the 'line' stream. But we want to preserve the current
segment name across lines.

Let's leak some memory.
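
A small Python illustration of the staleness, assuming the line buffer is reused for every line read. The `memoryview` plays the role of a slice into the 'line' stream, and the `bytes` copy plays the role of the leaked string. The buffer size and helper name are made up for the example.

```python
def read_line_into(buffer, text):
    """Overwrite a fixed-size line buffer in place, padding with spaces."""
    data = text.encode("ascii").ljust(len(buffer))[:len(buffer)]
    for i, byte in enumerate(data):
        buffer[i] = byte


line = bytearray(16)                 # one buffer, reused for every line
read_line_into(line, "== data")
name_view = memoryview(line)[3:7]    # a view into the buffer: goes stale
name_copy = bytes(name_view)         # a copy: survives the next read

read_line_into(line, "x: 00 00")     # reading the next line clobbers the view
```

After the second read, `name_copy` still holds the segment name while `name_view` shows whatever the new line put at those positions.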
2019-07-11 21:47:00 -07:00
Kartik Agaram
bbfa2acaca .
Make the trace a little more consistent.
2019-07-11 21:42:54 -07:00
Kartik Agaram
554bd09968 label offset computation still has a bug
I changed the test a little to make it obvious.
Basically there's no way we can compute the segment offset correctly
without knowing the segment name in the previous assertion.
2019-07-11 21:38:25 -07:00
Kartik Agaram
be995e2193 revert compute-offsets to segment-relative offsets
The pseudocode was a mess here :/ I was saving the segment-offset but
tracing the file-offset.

Segments need file offsets (to tweak their starting address).
Labels need segment offsets (to add to segment starting address).
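
The two kinds of offsets can be sketched in Python as a single pass that records a file offset per segment and a segment-relative offset per label. The item format, the `initial_offset` parameter, and the function names are assumptions for illustration, not the structure of compute-offsets.

```python
def compute_offsets(items, initial_offset=0):
    """Walk ("segment", name), ("label", name) and ("bytes", n) items.

    Returns (segments, labels):
      segments maps name -> file offset where the segment starts
      labels maps name -> (segment name, offset within that segment)
    """
    segments = {}
    labels = {}
    file_offset = initial_offset
    segment_start = initial_offset
    current_segment = None
    for kind, value in items:
        if kind == "segment":
            current_segment = value
            segment_start = file_offset
            segments[value] = file_offset
        elif kind == "label":
            labels[value] = (current_segment, file_offset - segment_start)
        else:  # "bytes": value is the number of bytes emitted
            file_offset += value
    return segments, labels


def label_address(name, labels, segment_addresses):
    """Label address = segment starting address + segment-relative offset."""
    segment, offset = labels[name]
    return segment_addresses[segment] + offset
```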
2019-07-11 21:31:00 -07:00
Kartik Agaram
f57a458e7b . 2019-07-11 21:24:09 -07:00
nc
3a3e7a90d7 move discard instruction to correct spot 2019-07-11 22:31:56 -04:00
nc
2805308785 updated test so 'x' is relative to file-offset not segment offset 2019-07-11 22:21:32 -04:00
nc
c2725b65b3 made test 2 pass 2019-07-11 22:13:03 -04:00
Kartik Agaram
538f24c296 . 2019-07-10 11:57:08 -07:00
Kartik Agaram
c5f7d9dd57 .
I think we're calling the wrong variant here.
2019-07-10 11:38:24 -07:00
Kartik Agaram
304bce5895 start distinguishing table lookups from inserts 2019-07-10 11:32:46 -07:00
Kartik Agaram
84aa2fad2d .
Fix infinite loop in the 2 remaining failing tests; now it's a segfault.
2019-07-10 09:39:38 -07:00
Kartik Agaram
7e9cdf3c7a . 2019-07-10 09:35:24 -07:00
Kartik Agaram
48aabc860a mostly done with emit-output
Some nooks and crannies will need light final debugging with xxd, but
emit-hex-output covers most of the logic.
2019-07-09 23:41:32 -07:00
Kartik Agaram
20a527702b done with emit-segments
Only failures now are the first two tests in survey.subx.
2019-07-09 22:11:44 -07:00