Soul of a tiny new machine. More thorough tests → More comprehensible and rewrite-friendly software → More resilient society.
Kartik K. Agaram 1b18ec6ee9 import a few more unicode blocks from Unifont
shell/ is currently broken; we've overflowed available contiguous space
for code.

Block names based on https://www.compart.com/en/unicode/block:
  0x0000 - 0x007f Basic Latin 128
  0x0080 - 0x00ff Latin-1 Supplement 128
  0x0100 - 0x017f Latin Extended-A 128
  0x0180 - 0x024f Latin Extended-B 208
  0x0250 - 0x02af IPA Extensions 96
  0x02b0 - 0x02ff Spacing Modifier Letters 80
  0x0300 - 0x036f Combining Diacritical Marks 112
  0x0370 - 0x03ff Greek and Coptic 135
  0x0400 - 0x04ff Cyrillic 256
  0x0500 - 0x052f Cyrillic Supplement 48
  0x0530 - 0x058f Armenian 91
  0x0590 - 0x05ff Hebrew 88
  0x0600 - 0x06ff Arabic 255
  0x0700 - 0x074f Syriac 77
  0x0750 - 0x077f Arabic Supplement 48
  0x0780 - 0x07bf Thaana 50
  0x07c0 - 0x07ff NKo 62
  0x0800 - 0x083f Samaritan 61
  0x0840 - 0x085f Mandaic 29
  0x0860 - 0x086f Syriac Supplement 11
  0x08a0 - 0x08ff Arabic Extended-A 84
  0x0900 - 0x097f Devanagari 128
  0x0980 - 0x09ff Bengali 96
  0x0a00 - 0x0a7f Gurmukhi 80
  0x0a80 - 0x0aff Gujarati 91
  0x0b00 - 0x0b7f Oriya 91
  0x0b80 - 0x0bff Tamil 72
  0x0c00 - 0x0c7f Telugu 98
  0x0c80 - 0x0cff Kannada 89
  0x0d00 - 0x0d7f Malayalam 118
  0x0d80 - 0x0dff Sinhala 91
  0x0e00 - 0x0e7f Thai 87
  0x0e80 - 0x0eff Lao 82
  0x0f00 - 0x0fff Tibetan 211
  0x1000 - 0x109f Myanmar 160
  0x10a0 - 0x10ff Georgian 88

But don't trust the block sizes above. Thanks to gdb[1] for this helper:

define z
  print 2 * (0x$arg1 - 0x$arg0 + 1)
end

e.g:
  (gdb) z 10a0 10ff
  192

[1] https://sourceware.org/gdb/current/onlinedocs/gdb/Define.html
2021-08-29 11:20:47 -07:00
apps render wide glyphs in the font 2021-08-29 01:04:26 -07:00
archive 6206 2020-04-17 01:33:51 -07:00
browse-slack move gap buffer code to top-level 2021-08-15 21:09:17 -07:00
editor instructions for Universal Ctags 2021-07-07 19:34:54 -07:00
html . 2021-08-15 23:44:43 -07:00
linux compute-offset: literal index 2021-08-25 22:19:24 -07:00
shell . 2021-08-28 22:57:22 -07:00
tools . 2021-08-11 19:44:20 -07:00
.gitattributes 6690 2020-07-30 21:39:19 -07:00
.gitignore create .gitignore 2021-06-17 21:25:06 -07:00
101screen.subx helper to render fonts outside video RAM, take 2 2021-06-12 22:22:54 -07:00
102keyboard.subx . 2021-05-14 23:15:46 -07:00
103grapheme.subx import a few more unicode blocks from Unifont 2021-08-29 11:20:47 -07:00
104test.subx . 2021-06-08 15:06:08 -07:00
105string-equal.subx 7842 - new directory organization 2021-03-03 22:21:03 -08:00
106stream.subx 7842 - new directory organization 2021-03-03 22:21:03 -08:00
108write.subx periodic run of misc_checks 2021-06-12 22:34:22 -07:00
109stream-equal.subx snapshot 2021-06-20 20:36:47 -07:00
112read-byte.subx . 2021-08-10 05:52:48 -07:00
113write-stream.subx reading from streams 2021-07-03 18:27:01 -07:00
115write-byte.subx print call stack on all low-level errors 2021-05-15 00:15:24 -07:00
117write-int-hex.subx 7842 - new directory organization 2021-03-03 22:21:03 -08:00
118parse-hex-int.subx print call stack on all low-level errors 2021-05-15 00:15:24 -07:00
120allocate.subx debugging helper: heap remaining 2021-08-10 22:47:30 -07:00
121new-stream.subx print call stack on all low-level errors 2021-05-15 00:15:24 -07:00
123slice.subx print call stack on all low-level errors 2021-05-15 00:15:24 -07:00
124next-token.subx 7842 - new directory organization 2021-03-03 22:21:03 -08:00
126write-int-decimal.subx clarify a corner case in 2's complement integers 2021-07-14 01:15:10 -07:00
127next-word.subx support non-line-oriented processing in next-word 2021-07-29 20:07:13 -07:00
301array-equal.subx 7842 - new directory organization 2021-03-03 22:21:03 -08:00
302stack_allocate.subx 7254 2020-11-17 01:06:43 -08:00
308allocate-array.subx 7254 2020-11-17 01:06:43 -08:00
309stream.subx . 2021-07-08 16:12:27 -07:00
310copy-bytes.subx reading from streams 2021-07-03 18:27:01 -07:00
311decimal-int.subx . 2021-05-14 23:15:46 -07:00
312copy.subx 7842 - new directory organization 2021-03-03 22:21:03 -08:00
313index-bounds-check.subx start double-buffering 2021-05-18 10:23:54 -07:00
314divide.subx 7290 2020-11-27 21:37:20 -08:00
315stack-debug.subx . 2021-05-14 23:15:46 -07:00
316colors.subx primitive: read r/g/b for color 2021-05-01 21:11:40 -07:00
317abort.subx . 2021-08-28 22:57:22 -07:00
318debug-counter.subx . 2021-06-29 22:37:00 -07:00
319timer.subx more general timer interface 2021-06-29 22:46:26 -07:00
400.mu width-aware drawing primitives 2021-08-29 00:01:08 -07:00
403unicode.mu 7842 - new directory organization 2021-03-03 22:21:03 -08:00
408float.mu maintain aspect ratio when rendering images 2021-07-29 08:24:27 -07:00
411string.mu 7690 2021-02-07 00:20:29 -08:00
412render-float-decimal.mu 7842 - new directory organization 2021-03-03 22:21:03 -08:00
500fake-screen.mu bugfix in commit 8e182e394 2021-08-29 00:51:57 -07:00
501draw-text.mu width-aware drawing primitives 2021-08-29 00:01:08 -07:00
502test.mu update vocabulary documentation 2021-03-08 23:50:35 -08:00
503manhattan-line.mu 7842 - new directory organization 2021-03-03 22:21:03 -08:00
504test-screen.mu width-aware drawing primitives 2021-08-29 00:01:08 -07:00
505colors.mu . 2021-07-14 01:25:18 -07:00
506math.mu press '+' and '-' to zoom in and out respectively 2021-05-16 21:58:13 -07:00
507line.mu animate transition from sum to filter node 2021-05-16 16:09:09 -07:00
508circle.mu reimplement Bresenham circle in Mu 2021-05-15 20:14:20 -07:00
509bezier.mu first bit of animation 2021-05-16 15:29:53 -07:00
510disk.mu more powerful load-sectors 2021-07-16 08:58:15 -07:00
511image.mu slack: first rendering of test data 2021-08-10 22:17:05 -07:00
512array.mu render functions in MRU order 2021-07-19 15:39:36 -07:00
513grapheme-stack.mu . 2021-08-28 21:53:37 -07:00
514gap-buffer.mu . 2021-08-28 21:53:37 -07:00
LICENSE.txt 7489 - include GNU Unifont 2021-01-09 18:20:28 -08:00
README.md . 2021-07-16 09:24:02 -07:00
boot.subx reorganize font before adding non-ASCII 2021-08-27 08:41:15 -07:00
cheatsheet.pdf 5485 - promote SubX to top-level 2019-07-27 17:47:59 -07:00
font.subx import a few more unicode blocks from Unifont 2021-08-29 11:20:47 -07:00
help make online help more obvious 2021-04-04 20:12:43 -07:00
misc_checks some hacky checks for common errors 2021-03-31 23:16:01 -07:00
misc_checks.subx some hacky checks for common errors 2021-03-31 23:16:01 -07:00
modrm.pdf 5485 - promote SubX to top-level 2019-07-27 17:47:59 -07:00
mu-init.subx . 2021-07-19 19:46:04 -07:00
mu.md . 2021-05-16 21:48:24 -07:00
mu_instructions compute-offset: literal index 2021-08-25 22:19:24 -07:00
sib.pdf 5485 - promote SubX to top-level 2019-07-27 17:47:59 -07:00
subx.md start throwing error on duplicate label 2021-08-22 21:09:28 -07:00
subx_bare.md . 2021-03-29 18:47:52 -07:00
subx_opcodes support checking overflow flag everywhere 2021-05-08 21:49:50 -07:00
translate reorganize font before adding non-ASCII 2021-08-27 08:41:15 -07:00
translate_emulated reorganize font before adding non-ASCII 2021-08-27 08:41:15 -07:00
translate_subx shell: literal images 2021-07-28 23:28:29 -07:00
translate_subx_emulated shell: literal images 2021-07-28 23:28:29 -07:00
vimrc.vim . 2021-08-07 22:57:15 -07:00
vocabulary.md font data structure now supports 16-bit glyphs 2021-08-28 21:11:45 -07:00

README.md

Mu: a human-scale computer

Mu is a minimal-dependency hobbyist computing stack (everything above the processor).

Mu is not designed to operate in large clusters providing services for millions of people. Mu is designed for you, to run one computer. (Or a few.) Running the code you want to run, and nothing else.

Here's the Mu computer running Conway's Game of Life.

git clone https://github.com/akkartik/mu
cd mu
./translate apps/life.mu  # emit a bootable code.img
qemu-system-i386 code.img
screenshot of Game of Life running on the Mu computer

(Colorized sources. This is memory-safe code, and most statements map to a single instruction of machine code.)

Rather than start from some syntax and introduce layers of translation to implement it, Mu starts from the processor's instruction set and tries to get to some safe and clear syntax with as few layers of translation as possible. The emphasis is on internal consistency at any point in time rather than compatibility with the past. (More details.)

Tests are a key mechanism here for creating a computer that others can make their own. I want to encourage a style of active and interactive reading with Mu. If something doesn't make sense, try changing it and see what tests break. Any breaking change should cause a failure in some well-named test somewhere.

Currently Mu requires a 32-bit x86 processor.

Goals

In priority order:

  • Reward curiosity.
  • Safe.
    • Thorough test coverage. If you break something you should immediately see an error message. If you can manually test for something you should be able to write an automated test for it.
    • Memory leaks over memory corruption.
  • Teach the computer bottom-up.

Thorough test coverage in particular deserves some elaboration. It implies that any manual test should be easy to turn into a reproducible automated test. Mu has some unconventional methods for providing this guarantee. It exposes testable interfaces for hardware using dependency injection so that tests can run on -- and make assertions against -- fake hardware. It also performs automated white-box testing which enables robust tests for performance, concurrency, fault-tolerance, etc.
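As a rough illustration of the dependency-injection idea, here is a minimal sketch in Python rather than Mu. The names `FakeScreen` and `draw_text` are hypothetical and are not Mu's actual interfaces; the point is only that the drawing code receives its screen as an argument, so a test can hand it an in-memory fake and assert on what was drawn.

```python
class FakeScreen:
    """Hypothetical in-memory stand-in for a screen: a grid of characters."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.cells = [[" "] * width for _ in range(height)]

    def draw(self, x, y, ch):
        self.cells[y][x] = ch

    def row(self, y):
        return "".join(self.cells[y])


def draw_text(screen, x, y, text):
    """Draw text left to right, clipping at the right edge.

    The screen is a parameter rather than a global, so tests can inject a fake.
    """
    for i, ch in enumerate(text):
        if x + i >= screen.width:
            break
        screen.draw(x + i, y, ch)


# A test runs against the fake and makes assertions on its contents.
screen = FakeScreen(width=5, height=2)
draw_text(screen, 1, 0, "hello")
assert screen.row(0) == " hell"   # the final 'o' is clipped
assert screen.row(1) == "     "   # the second row is untouched
```

The same function could be handed a real screen object in production; nothing in `draw_text` needs to know which one it got.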

Non-goals

  • Speed. Staying close to machine code should naturally keep Mu fast enough.
  • Efficiency. Controlling the number of abstractions should naturally keep Mu using far less than the gigabytes of memory modern computers have.
  • Portability. Mu will run on any computer as long as it's x86. I will enthusiastically contribute to support for other processors -- in separate forks. Readers shouldn't have to think about processors they don't have.
  • Compatibility. The goal is to get off mainstream stacks, not to perpetuate them. Sometimes the right long-term solution is to bump the major version number.
  • Syntax. Mu code is meant to be comprehended by running, not just reading. For now it's a thin memory-safe veneer over machine code. I'm working on a high-level "shell" for the Mu computer.

Toolchain

The Mu stack consists of:

  • the Mu type-safe and memory-safe language;
  • SubX, an unsafe notation for a subset of x86 machine code; and
  • bare SubX, a more rudimentary form of SubX without certain syntax sugar.

All Mu programs get translated through these layers into tiny zero-dependency binaries that run natively. The translators for most levels are built out of lower levels. The translator from Mu to SubX is written in SubX, and the translator from SubX to bare SubX is built in bare SubX. There is also an emulator for Mu's supported subset of x86 that's useful for debugging SubX programs.

Mu programs build natively either on Linux or on Windows using WSL 2. For Macs and other Unix-like systems, use the (much slower) emulator:

./translate_emulated apps/ex2.mu  # ~2 mins to emit code.img

Mu programs can be written for two very different environments:

  • At the top level, Mu programs emit a bootable image that runs without an OS (under emulation; I haven't tested on native hardware yet). There's rudimentary support for some core peripherals: a 1024x768 screen, a keyboard with some key-combinations, a PS/2 mouse that must be polled, a slow ATA disk drive. No hardware acceleration, no virtual memory, no process separation, no multi-tasking, no network. Boot always runs all tests, and only gets to main if all tests pass.

  • The top level is built using tools created under the linux/ sub-directory. This sub-directory contains an entirely separate set of libraries intended for building programs that run with just a Linux kernel, reading from stdin and writing to stdout. The Mu compiler is such a program, at linux/mu.subx. Individual programs typically run tests if given a command-line argument called test.
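The two test-running conventions described above can be sketched in Python. This is only an illustration of the control flow, not how Mu implements it; `boot`, `run`, and `run_tests` are hypothetical names.

```python
def run_tests(tests):
    """Run every test; return the names of the ones that fail."""
    failed = []
    for test in tests:
        try:
            test()
        except AssertionError:
            failed.append(test.__name__)
    return failed


def boot(tests, main):
    """Bare-metal style: always run all tests, reach main only if all pass."""
    failed = run_tests(tests)
    if failed:
        return "tests failed: " + ", ".join(failed)
    return main()


def run(argv, tests, main):
    """Linux style: run tests only when the first argument is the word 'test'."""
    if len(argv) > 1 and argv[1] == "test":
        failed = run_tests(tests)
        return "tests failed: " + ", ".join(failed) if failed else "all tests passed"
    return main()


def check_passes():
    assert 1 + 1 == 2


def check_fails():
    assert 1 + 1 == 3


def main():
    return "main ran"


assert boot([check_passes], main) == "main ran"
assert boot([check_passes, check_fails], main) == "tests failed: check_fails"
assert run(["prog", "test"], [check_passes], main) == "all tests passed"
assert run(["prog"], [check_fails], main) == "main ran"
```

In the first convention a failing test keeps the program from ever starting; in the second, tests are opt-in per invocation.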

The largest program built in Mu today is its prototyping environment for writing slow, interpreted programs in a Lisp-based high-level language.

screenshot of the Mu shell

(For more details, see the shell/ directory.)

While I currently focus on programs without an OS, the linux/ sub-directory is fairly ergonomic. There are a couple dozen example programs to try out there. It is likely to be the option for a network stack in the foreseeable future; I have no idea how to interact on the network without Linux.

Syntax

The entire stack shares certain properties and conventions. Programs consist of functions and functions consist of statements, each performing a single operation. Operands to statements are always variables or constants. You can't perform a + b*c in a single statement; you have to break it up into two. Variables can live in memory or in registers. Registers must be explicitly specified. There are some shared lexical rules. Comments always start with '#'. Numbers are always written in hex. Many terms can have context-dependent metadata attached after '/'.
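To make the one-operation-per-statement rule concrete, here is a small sketch in Python (not Mu syntax). The variable names are made up for illustration; the numbers are written in hex to mirror the convention described above.

```python
a, b, c = 0x2, 0x3, 0x4

# A compound expression like a + b*c is not a single statement in this style.
# It is broken into two statements, each performing one operation on
# variables or constants.
tmp = b * c          # first statement: multiply
result = a + tmp     # second statement: add

assert result == a + b * c
print(hex(result))   # 0xe
```

Each statement here could plausibly correspond to one machine instruction, which is the property the rule is meant to preserve.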

Here's an example program in Mu:

ex2.mu

More resources on Mu:

Here's an example program in SubX:

== code
Entry:
  # ebx = 1
  bb/copy-to-ebx  1/imm32
  # increment ebx
  43/increment-ebx
  # exit(ebx)
  e8/call  syscall_exit/disp32
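To show how the metadata after '/' relates to the bytes that end up in the binary, here is a toy encoder in Python for the first two instructions above. It is a sketch under stated assumptions, not SubX's actual translator: `encode_word` and `encode_line` are hypothetical helpers, every datum is assumed to be a hex number, and only the `imm32` metadata is treated specially (as 4 little-endian bytes, per x86). The `call` line is left out because resolving the `syscall_exit` label needs address computation that this sketch does not attempt.

```python
def encode_word(word):
    """Split a word into its datum and '/'-separated metadata, then emit bytes."""
    datum, *metadata = word.split("/")
    value = int(datum, 16)                   # numbers are always hex
    if "imm32" in metadata:
        return value.to_bytes(4, "little")   # 32-bit immediate, little-endian
    return value.to_bytes(1, "little")       # otherwise assume a single byte


def encode_line(line):
    """Drop any '#' comment, then encode each remaining word in order."""
    code = line.split("#", 1)[0]
    return b"".join(encode_word(w) for w in code.split())


program = [
    "# ebx = 1",
    "bb/copy-to-ebx  1/imm32",
    "# increment ebx",
    "43/increment-ebx",
]
image = b"".join(encode_line(line) for line in program)
assert image.hex(" ") == "bb 01 00 00 00 43"
```

Metadata such as `copy-to-ebx` has no effect on the output in this sketch; it acts as a comment for the reader, while `imm32` controls how wide the operand is.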

More resources on SubX:

Forks

Forks of Mu are encouraged. If you don't like something about this repo, feel free to make a fork. If you show it to me, I'll link to it here. I might even pull features upstream!

  • uCISC: a 16-bit processor being designed from scratch by Robert Butler and programmed with a SubX-like syntax.
  • subv: experimental SubX-like syntax by s-ol bekic for the RISC-V instruction set.
  • mu-x86_64: experimental fork for 64-bit x86 in collaboration with Max Bernstein. It's brought up a few concrete open problems that I don't have good solutions for yet.
  • mu-normie: with a more standard build system for the linux/bootstrap/ directory that organizes the repo by header files and compilation units. Stays in sync with this repo.

Desiderata

If you're still reading, here are some more things to check out:

Credits

Mu builds on many ideas that have come before, especially:

  • Peter Naur for articulating the paramount problem of programming: communicating a codebase to others;
  • Christopher Alexander and Richard Gabriel for the intellectual tools for reasoning about the higher order design of a codebase;
  • David Parnas and others for highlighting the value of separating concerns and stepwise refinement;
  • The folklore of debugging by print and the trace facility in many Lisp systems;
  • Automated tests for showing the value of developing programs inside an elaborate harness;

On a more tactical level, this project has made progress in a series of bursts as I discovered the following resources. In autobiographical order, with no claims of completeness: