mu/subx
Kartik Agaram 071afeff5d 4445 - support labels 2018-07-27 13:30:19 -07:00
html 4351 2018-07-16 07:55:07 -07:00
teensy 4323 2018-07-07 13:54:16 -07:00
000organization.cc 4426 - error on unrecognized sub-commands 2018-07-26 16:58:54 -07:00
001help.cc 4436 2018-07-27 10:50:33 -07:00
002test.cc 4426 - error on unrecognized sub-commands 2018-07-26 16:58:54 -07:00
003trace.cc 4427 - support for '--trace' argv 2018-07-26 17:00:37 -07:00
003trace.test.cc 3930 - experimental bytecode interpreter 2017-06-19 21:47:07 -07:00
010vm.cc 4442 2018-07-27 11:55:47 -07:00
011run.cc 4444 2018-07-27 12:35:49 -07:00
012direct_addressing.cc 4442 2018-07-27 11:55:47 -07:00
013indirect_addressing.cc 4442 2018-07-27 11:55:47 -07:00
014immediate_addressing.cc 4442 2018-07-27 11:55:47 -07:00
015index_addressing.cc 4442 2018-07-27 11:55:47 -07:00
016jump_relative.cc 4442 2018-07-27 11:55:47 -07:00
017jump_relative.cc 4442 2018-07-27 11:55:47 -07:00
018functions.cc 4442 2018-07-27 11:55:47 -07:00
019syscalls.cc 4442 2018-07-27 11:55:47 -07:00
020elf.cc 4426 - error on unrecognized sub-commands 2018-07-26 16:58:54 -07:00
021translate.cc 4426 - error on unrecognized sub-commands 2018-07-26 16:58:54 -07:00
022check_instruction.cc 4444 2018-07-27 12:35:49 -07:00
023check_operand_bounds.cc 4445 - support labels 2018-07-27 13:30:19 -07:00
024pack_operands.cc 4445 - support labels 2018-07-27 13:30:19 -07:00
025non_code_segment.cc 4444 2018-07-27 12:35:49 -07:00
026labels.cc 4445 - support labels 2018-07-27 13:30:19 -07:00
Readme.md 4433 2018-07-27 08:28:01 -07:00
build 4403 2018-07-25 13:07:01 -07:00
build_and_test_until 4335 2018-07-10 20:18:11 -07:00
cheatsheet.pdf 4026 2017-10-12 09:36:55 -07:00
clean 4314 2018-07-06 22:50:30 -07:00
edit 4355 2018-07-16 10:04:27 -07:00
ex1 4356 - subx: first program with a data segment 2018-07-16 11:05:19 -07:00
ex1.1.subx 4430 2018-07-26 20:22:18 -07:00
ex1.2.subx 4431 - operate exclusively in hex 2018-07-26 22:32:57 -07:00
ex2 4356 - subx: first program with a data segment 2018-07-16 11:05:19 -07:00
ex2.subx 4431 - operate exclusively in hex 2018-07-26 22:32:57 -07:00
ex3 4356 - subx: first program with a data segment 2018-07-16 11:05:19 -07:00
ex3.subx 4445 - support labels 2018-07-27 13:30:19 -07:00
ex4 4396 2018-07-24 20:39:59 -07:00
ex4.subx 4431 - operate exclusively in hex 2018-07-26 22:32:57 -07:00
ex5 4365 2018-07-17 22:20:16 -07:00
ex5.subx 4424 2018-07-26 12:19:02 -07:00
ex6 4414 - subx: syntax checking 2018-07-25 20:53:43 -07:00
ex6.subx 4424 2018-07-26 12:19:02 -07:00
g 4321 2018-07-07 10:57:56 -07:00
gen 4321 2018-07-07 10:57:56 -07:00
gg 4321 2018-07-07 10:57:56 -07:00
ggdiff 4349 2018-07-15 22:29:01 -07:00
nrun 4321 2018-07-07 10:57:56 -07:00
opcodes 3968 2017-07-11 21:41:15 -07:00
run 4323 2018-07-07 13:54:16 -07:00
subx 4211 2018-02-20 01:38:15 -08:00
subx.vim 4299 2018-06-30 23:05:10 -07:00
test_layers 4024 - attempt to get CI working for SubX 2017-10-11 03:22:13 -07:00
vimrc.vim 4020 2017-10-11 02:32:38 -07:00
xdiff 4349 2018-07-15 22:29:01 -07:00

Readme.md

What is this?

A suite of tools for directly programming in (32-bit x86) machine code without a compiler. The generated ELF binaries require just a Unix-like kernel to run. (It isn't self-hosted yet, so generating the binaries requires a C++ compiler and runtime.)

Why in the world?

  1. It seems wrong-headed that our computers look polished but are plagued by foundational problems of security and reliability. I'd like to learn to walk before I try to run. The plan: start out using the computer only to check my program for errors rather than to hide low-level details. Force myself to think about security by living with raw machine code for a while. Reintroduce high-level languages (HLLs) only after confidence is regained in the foundations (and when the foundations are ergonomic enough to support developing a compiler in them). Delegate only when I can verify with confidence.

  2. The software in our computers has grown incomprehensible. Nobody understands it all, not even experts. Even simple programs written by a single author require lots of time for others to comprehend. Compilers are a prime example, growing so complex that programmers have to choose to either program them or use them. I think they may also contribute to the incomprehensibility of the stack above them. I'd like to explore how much of an HLL I can build without a monolithic optimizing compiler, and see if deconstructing the work of the compiler can make the stack as a whole more comprehensible to others.

  3. I want to learn about the internals of the infrastructure we all rely on in our lives.

Running

$ git clone https://github.com/akkartik/mu
$ cd mu/subx
$ ./subx

Running subx will transparently compile it as necessary.

Usage

subx currently has the following sub-commands:

  • subx test: runs all automated tests.

  • subx translate <input file> <output ELF binary>: translates a text file containing hex bytes and macros into an executable ELF binary.

  • subx run <ELF binary>: simulates running the ELF binaries emitted by subx translate. Useful for debugging, and also enables more thorough testing of translate.
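To give a feel for the simplest part of what translate has to do, here is a minimal sketch that turns whitespace-separated hex bytes into raw bytes. It is an illustration only, not SubX's actual parser: `parse_hex_bytes` is a hypothetical name, and the sketch ignores macros, segments, labels and the ELF header entirely.

```cpp
#include <cctype>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of SubX): convert text like "bb 2a 00 00 00"
// into the bytes it spells out. Each word must be exactly two hex digits.
std::vector<uint8_t> parse_hex_bytes(const std::string& text) {
  std::vector<uint8_t> out;
  std::istringstream in(text);
  std::string word;
  while (in >> word) {
    const bool ok = word.size() == 2 &&
                    std::isxdigit(static_cast<unsigned char>(word[0])) &&
                    std::isxdigit(static_cast<unsigned char>(word[1]));
    if (!ok) throw std::invalid_argument("not a hex byte: " + word);
    out.push_back(static_cast<uint8_t>(std::stoi(word, nullptr, 16)));
  }
  return out;
}
```

For example, `parse_hex_bytes("bb 2a 00 00 00")` yields five bytes, which on x86 encode copying the immediate 0x2a into EBX (opcode bb followed by a little-endian 32-bit immediate).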

Putting them together, build and run one of the example programs:

ex1.1.subx
$ ./subx translate ex1.1.subx ex1
$ ./subx run ex1

If you're running on Linux, ex1 will also be runnable directly:

$ chmod +x ex1
$ ./ex1

There are a few such example programs here. At any commit, an example's binary should be bit-for-bit identical to the output of translating its .subx file. The binary should also be natively runnable on a 32-bit Linux system. If either of these invariants is broken, it's a bug on my part. The binary should also be runnable on a 64-bit Linux system. I can't guarantee it, but I'd appreciate hearing if it doesn't run.
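One way to check the bit-for-bit invariant yourself is to translate an example to a scratch file and compare it with the checked-in binary. The comparison could be sketched as below; `identical_files` is a hypothetical helper and is not part of the subx tooling.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper (not part of SubX): true only if both files can be
// opened and contain exactly the same bytes.
bool identical_files(const std::string& path_a, const std::string& path_b) {
  std::ifstream a(path_a, std::ios::binary);
  std::ifstream b(path_b, std::ios::binary);
  if (!a || !b) return false;  // a missing file never counts as a match
  const std::vector<char> bytes_a((std::istreambuf_iterator<char>(a)),
                                  std::istreambuf_iterator<char>());
  const std::vector<char> bytes_b((std::istreambuf_iterator<char>(b)),
                                  std::istreambuf_iterator<char>());
  return bytes_a == bytes_b;
}
```

After running something like `./subx translate ex1.1.subx ex1.tmp` (the scratch name `ex1.tmp` is just an assumption), `identical_files("ex1", "ex1.tmp")` should return true if the invariant holds.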

However, there are a few more binaries in the teensy/ directory. They are not guaranteed to be runnable by subx. I'm not building general infrastructure here for all of the x86 ISA and the ELF format. SubX is about programming with a small, regular subset of 32-bit x86:

  • Only instructions that operate on the 32-bit E*X registers. (No floating-point yet.)
  • Only instructions that assume a flat address space; no instructions that use segment registers.
  • No instructions that check the carry or parity flags; arithmetic operations always operate on signed integers (while bitwise operations always operate on unsigned integers).
  • Only relative jump instructions (with 8-bit or 16-bit offsets).
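Relative jumps take some getting used to when writing raw bytes by hand, so here is a small sketch of how the target of an 8-bit relative jump is computed on x86. It assumes the 2-byte short jump (opcode eb followed by one displacement byte); `jump_target_rel8` is a hypothetical name and this is not code from SubX.

```cpp
#include <cstdint>

// Hypothetical helper (not part of SubX): where does a 2-byte short jump land?
// The displacement is a signed 8-bit value counted from the address of the
// instruction *after* the jump.
uint32_t jump_target_rel8(uint32_t jump_address, uint8_t disp8) {
  const uint32_t next_instruction = jump_address + 2;  // opcode byte + displacement byte
  // Sign-extend the 8-bit displacement: 0x80..0xff mean -128..-1.
  const int32_t offset = disp8 < 0x80 ? static_cast<int32_t>(disp8)
                                      : static_cast<int32_t>(disp8) - 0x100;
  return next_instruction + static_cast<uint32_t>(offset);  // wraps modulo 2^32
}
```

With this sketch, a jump at address 0x1000 with displacement 05 lands at 0x1007, and displacement fe jumps back to the jump instruction itself.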

The ELF binaries generated are statically linked and missing a lot of advanced ELF features as well. But they will run.

For more details on programming in this subset, consult the online help:

$ ./subx help

Resources

Inspirations