Still some spurious warnings.
This was an insane experience building out generics. Time to reflect.
Where did I go wrong? How did I end up writing no tests? Let's take some
time and go over the last 50 commits with a fine-tooth comb.
Generics seems to be the feature that has moved mu from a VM project to
a compiler project.
Now we're back to trying to rerun idempotent transforms on
specialized recipes. Still doesn't work, but at least we don't see
different results depending on whether the trace is enabled inside the
test or right at the start. That got fixed by the more disciplined
insertion into maps, looks like.
I'm still seeing all sorts of failures in turning on layer 11 of edit/,
so I'm backing away and nailing down every culprit I run into. First up:
stop accidentally inserting empty objects into maps during lookups.
Commands run:
$ sed -i 's/\(Recipe_ordinal\|Recipe\|Type_ordinal\|Type\|Memory\)\[\([^]]*\)\] = \(.*\);/put(\1, \2, \3);/' 0[1-9]*
$ vi 075scenario_console.cc # manually fix up Memory[Memory[CONSOLE]]
$ sed -i 's/\(Memory\)\[\([^]]*\)\]/get_or_insert(\1, \2)/' 0[1-9]*
$ sed -i 's/\(Recipe_ordinal\|Type_ordinal\)\[\([^]]*\)\]/get(\1, \2)/' 0[1-9]*
$ sed -i 's/\(Recipe\|Type\)\[\([^]]*\)\]/get(\1, \2)/' 0[1-9]*
Now mu dies pretty quickly because of all the places I try to look up a
missing value.
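For reference, a minimal sketch of what the three helpers could look like.
The names get/put/get_or_insert come from the sed commands above; the bodies,
and the extra contains_key helper, are my assumptions rather than the actual
code in the layers.

```cpp
#include <cassert>
#include <map>
#include <string>

// Assumed helper: membership test that never inserts.
template <typename T>
bool contains_key(const T& m, const typename T::key_type& key) {
  return m.find(key) != m.end();
}

// Write path: the only place (besides get_or_insert) that may add a key.
template <typename T>
void put(T& m, const typename T::key_type& key,
         const typename T::mapped_type& value) {
  m[key] = value;
}

// Read path: never inserts. A missing key trips the assert instead of
// silently creating an empty object, which is why callers now die early.
template <typename T>
const typename T::mapped_type& get(const T& m,
                                   const typename T::key_type& key) {
  typename T::const_iterator p = m.find(key);
  assert(p != m.end());
  return p->second;
}

// Explicit opt-in to the old operator[] behavior, e.g. for Memory.
template <typename T>
typename T::mapped_type& get_or_insert(T& m,
                                       const typename T::key_type& key) {
  return m[key];
}
```

With these, a lookup like get(Recipe, r) leaves the map's size unchanged,
while get_or_insert(Memory, x) still default-constructs a missing entry.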
I've been growing lax on white-box testing when it's one of the three
big thrusts of this whole effort. Perhaps it was because I got too
obsessed with keeping traces stable and didn't notice that stable
doesn't mean "not changing". Or perhaps it's because I still don't have
a zoomable trace browser that can parse traces from disk. Or perhaps
$trace-browser is too clunky and discourages me from using it.
Regardless, I need to make the trace useable again before I work much
more on the next few rewriting transforms.
There were several places where we push a call on to a routine without
incrementing call-stack depth, which was used to compute the depth at
which to trace an instruction. So sometimes you ended up one depth lower
than you started a call with. Do this enough times and instructions that
should be traced at level 100 end up at level 0 and pop up as errors.
Solution: since call-stack depth is only used for tracing, include it in
the trace stream and make sure we reset it along with the trace stream.
Then catch all places where we forget to increment call-stack depth and
make sure we catch such places in the future.
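A rough sketch of the idea, assuming a trace stream that owns the depth
counter. All names here (trace_line, trace_stream, enter_call, exit_call,
emit) are hypothetical and only illustrate the shape of the fix.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One line of trace, tagged with the depth it was emitted at.
struct trace_line {
  int depth;
  std::string label;
  std::string contents;
};

// The depth lives inside the stream, so creating a fresh stream
// resets the depth along with the trace itself.
struct trace_stream {
  int callstack_depth;
  std::vector<trace_line> past_lines;

  trace_stream() : callstack_depth(0) {}

  // Every push of a call onto a routine should go through here.
  void enter_call() { ++callstack_depth; }

  // Every pop should go through here; the assert catches unbalanced
  // pushes, where a call was pushed without bumping the depth.
  void exit_call() {
    --callstack_depth;
    assert(callstack_depth >= 0);
  }

  void emit(const std::string& label, const std::string& contents) {
    trace_line line = {callstack_depth, label, contents};
    past_lines.push_back(line);
  }
};
```

If some code path pushes a call without calling enter_call(), the matching
exit_call() eventually drives the depth negative and the assert fires,
rather than lines quietly drifting toward level 0.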
When I first ran into this with Caleb I thought there must be some way
that we're writing some output into the warnings result. I didn't
recognize the spurious output as part of the trace, just at the
wrong level.
At the lowest level I'm reluctantly starting to see the need for errors
that stop the program in its tracks. Only way to avoid memory corruption
and security issues. But beyond that core I still want to be as lenient
as possible at higher levels of abstraction.
Always show recipe name where error occurred. But don't show internal
'interactive' name for sandboxes, that's just confusing.
What started out as warnings are now ossifying into errors that halt all
execution. Is this how things went with C and Unix as well?
Front-loads it a bit more than I'd like, but the payoff is that other
recipes will now be able to describe the type checks right next to their
operation.
I'm also introducing a new use of /raw with literals to indicate unsafe
typecasts.
Turns out the default format for printing floating point numbers is
neither 'scientific' nor 'fixed' even though those are the only two
options offered. Reading the C++ standard I found out that the default
(modulo locale changes) is basically the same as the printf "%g" format.
And "%g" is basically the shorter of:
a) %f with trailing zeros trimmed
b) %e
So we'll just do %f and trim trailing zeros.
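A minimal sketch of the "%f, then trim" approach. The function name
print_number is made up for illustration, and the buffer size is an
assumption chosen to fit the largest finite double.

```cpp
#include <cstdio>
#include <string>

// Format with %f, then drop trailing zeros and any dangling decimal point.
std::string print_number(double x) {
  char buf[512];  // %f of the largest double needs a bit over 300 chars
  std::snprintf(buf, sizeof(buf), "%f", x);
  std::string s(buf);
  // No decimal point means something like "inf" or "nan"; leave it alone.
  if (s.find('.') == std::string::npos) return s;
  // Trim trailing zeros: "1.500000" -> "1.5", "3.000000" -> "3."
  s.erase(s.find_last_not_of('0') + 1);
  // Trim the decimal point if nothing is left after it: "3." -> "3"
  if (s[s.size() - 1] == '.') s.erase(s.size() - 1);
  return s;
}
```

Unlike "%g" this never switches to exponent notation, so very large or very
small numbers print long (or as 0) instead of compactly.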
Ingredients of 'main' are always strings (type address:array:character),
and are separated from the .mu files to load by a "--", e.g.:
$ ./mu x.mu y.mu -- a b c
Here 'main' must be defined in one of x.mu and y.mu, and will receive
the ingredients "a", "b", and "c".
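The split at "--" can be sketched like this. Command_line and split_args
are hypothetical names; how mu actually walks argv may differ.

```cpp
#include <string>
#include <vector>

// Hypothetical container for the two halves of the command line.
struct Command_line {
  std::vector<std::string> files;             // .mu files to load
  std::vector<std::string> main_ingredients;  // strings passed to 'main'
};

// Everything before the first "--" is a file to load; everything
// after it becomes an ingredient of 'main'. argv[0] is skipped.
Command_line split_args(int argc, const char* const argv[]) {
  Command_line result;
  bool seen_separator = false;
  for (int i = 1; i < argc; ++i) {
    std::string arg(argv[i]);
    if (!seen_separator && arg == "--") {
      seen_separator = true;
      continue;
    }
    if (seen_separator)
      result.main_ingredients.push_back(arg);
    else
      result.files.push_back(arg);
  }
  return result;
}
```

For the command above this yields files x.mu and y.mu, and ingredients
"a", "b" and "c". With no "--" at all, 'main' gets no ingredients.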
edit.mu is now over 9k lines long. Only 2.6k of them are code. Plan:
chunk it into multiple files inside say an 'edit' directory. Then you
can run it with:
$ mu edit/*
I also want to be able to test just a few layers:
$ mu edit/00[1-5]*
When I try to chunk it into files, the first issue I run into is that
before/after can't refer back to previous layers. Solution:
transform_all at one shot after loading all files.
Finally terminate the experiment of keeping debug prints around. I'm
also going to give up on maintaining counts.
What we really need is two kinds of tracing:
a) For tests, just the domain-specific facts, organized by labels.
b) For debugging, just transient dumps to stdout.
b) only works if stdout is clean by default.
Hmm, I think this means 'stash' should be the transient kind of trace.
Termbox had been taking shortcuts when it thinks the screen hasn't
changed, which doesn't work if some other process messes up the screen.
The Go version has a Sync method in addition to Flush/tb_present for
precisely this eventuality. But it feels like an unnecessary
optimization given C's general speed. Just drop it altogether.
---
This took me a long time to track down, and interestingly I found myself
writing a new tracing primitive before I remembered how to selectively
trace just certain layers during manual tests. I'm scared of generating
traces not because of performance but because of the visual noise. Be
aware of this. I'm going to clean up $log now.
Maybe I should also stop using $print..
If I try to run a single test and it triggers an error the trace gets
saved in the current directory, as if I was trying to log an interactive
run. Then when I try to rerun the test the trace tries to load as mu
code, and hilarity ensues. Just log interactive runs in .traces/ as well.
Still iterating on the right way to handle incorrect number of
ingredients. My first idea of creating null results doesn't really work
once they're used in later instructions. Just add a warning at one place
in the run loop, but otherwise only add products when there's something
to save in them.
Undoes some work around commit 1886.
Region to click on to edit is now reduced to just the menu bar for the
sandbox (excluding the 'x' for deleting the sandbox). The symmetry there
might be useful, but we'll see if the relative click area is
in line with how commonly the actions are performed.
More verbose, but it saves trouble when debugging; there's never
something you thought should be traced but just never came out the other
end.
Also got rid of fatal errors entirely. Everything's a warning now, and
code after a warning isn't guaranteed to run.
Speeds up edit.mu tests by 10x, and shrinks memory usage by 100x.
We need a more efficient implementation of traces, but we can keep going
for now.
We didn't really need to reclaim memory just yet, after all. Mu is
pretty memory-efficient.
It comes up pretty early in the codebase, but hopefully won't come up
in the mu level until we get to higher-order recipes. Potentially
intimidating name, but such prime real estate with no confusing
overloadings in other projects!
chessboard finally passing all its tests. What made this hard was that
for some reason one of the background routines in the main chessboard
test wasn't terminating like it used to. And so it was polluting *later*
tests. Just clean up that source of contamination for now. Later we'll
think about routine termination.
Just figured out why a first keystroke of backspace was sending me out
for a spin: run_interactive needs all early exits that don't actually
run anything to increment the current_step_index(). FML, this is lousy..
Implement warnings for types without definitions without constraining
where type definitions must appear.
We also eliminate the anti-pattern where a change in layer 10 had its
test in layer 11 (commit 1383).
This bit me in the last commit for the first time.
Layer 010vm.cc is starting to look weird. It has references to stuff
that gets implemented much later, like containers and exclusive
containers. Its helpers are getting an increasing amount of logic. And
it has no tests.
I'm still inclined to think it's useful to have major data structures in
one place, even if they aren't used for a bit. But those helpers should
perhaps move out somehow or get some tests in the same layer.
After like 40 seconds (because of the 120-column screen), but whatever.
The final bug was that clear-screen wasn't actually working right for
fake screens.
(The trace is too large for github, so I'm going to leave it out for
now.)
This is a far cleaner way to provide *some* floating-point support. We
can only represent signed integers up to 2^51 rather than 2^63. But in
exchange we don't have to worry about it elsewhere, and it's probably
faster than checking tag bits in every operation.
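A small way to poke at the trade-off, assuming every number is stored as a
C++ double: check whether a given integer survives the round trip. The
helper name is made up, and this only tests individual values; it doesn't
establish the cutoff.

```cpp
// Returns true if n is exactly representable as a double, i.e. converting
// to double and back gives the same integer.
bool survives_double(long long n) {
  double d = static_cast<double>(n);
  // 2^63 doesn't fit in a long long, so converting it back would be
  // undefined behavior. It also can't equal any long long value.
  if (d >= 9223372036854775808.0) return false;
  return static_cast<long long>(d) == n;
}
```

Small integers and values around 2^51 round-trip fine; something like
2^60 + 1 does not, because the double has to round it.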
Hmm, yeah, surprised how easy this was. I think I'll give up on the
other approach.
I still don't have non-integer literals. But we won't bother with those
until we need them. `3.14159:literal` seems ugly.
I added one test to check that divide can return a float, then hacked at
the rippling failures across the entire codebase until all tests
pass. Now I need to look at the changes I made and see if there's a
system to them, identify other places that I missed, and figure out the
best way to cover all cases. I also need to show real rather than
encoded values in the traces, but I can't use value() inside reagent
methods because of the name clash with the member variable. So let's
take a snapshot before we attempt any refactoring. This was non-trivial
to get right.
Even if I convince myself that I've gotten it right, I might back this
all out if I can't easily *persuade others* that I've gotten it right.
On my ubuntu 14.04.1 + gcc 4.8.2 machine, ifstream doesn't actually
raise an error on trying to open a non-existent file until you try to do
something with it. Garbage!
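A sketch of the workaround, assuming an explicit check right after opening.
By default iostreams don't throw; a failed open only sets failbit on the
stream, and nothing complains until you test it. load_file is a
hypothetical helper name.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Reads the whole file into 'contents'. Returns false if the file
// couldn't be opened, instead of silently yielding an empty stream.
bool load_file(const std::string& filename, std::string& contents) {
  std::ifstream fin(filename.c_str());
  if (!fin) return false;  // failbit is set when the open fails
  contents.assign(std::istreambuf_iterator<char>(fin),
                  std::istreambuf_iterator<char>());
  return true;
}
```

Another option would be fin.exceptions(std::ifstream::failbit) to get an
exception instead, but the explicit check keeps the failure local.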
All primitives now always write to all their products. If a product is
not used that's fine, but if an instruction seems to expect too many
products mu will complain.
In the process, many primitives can operate on more than two ingredients
where it seems intuitive. You can add or divide more than two numbers
together, copy or negate multiple corresponding locations, etc.
There's one remaining bit of ugliness. Some instructions, like
get/get-address, index/index-address and wait-for-location, can
unnecessarily load values from memory when they don't need to.
Useful vim commands:
%s/ingredients\[\([^\]]*\)\]/ingredients.at(\1)/gc
%s/products\[\([^\]]*\)\]/products.at(\1)/gc
.,$s/\[\(.\)]/.at(\1)/gc
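What the substitution buys, as a small sketch: operator[] on a vector does
no bounds checking, while at() throws std::out_of_range. The helper name
below is made up for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if index i is out of bounds for v, as detected by at().
// With v[i] the same access would be undefined behavior rather than
// a catchable error.
bool out_of_bounds(const std::vector<int>& v, std::size_t i) {
  try {
    (void)v.at(i);  // throws std::out_of_range when i >= v.size()
    return false;
  } catch (const std::out_of_range&) {
    return true;
  }
}
```

So an instruction with too few ingredients now fails loudly at
ingredients.at(1) instead of reading past the end of the vector.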
Just to put all our new test primitives through their paces, and iron
out any kinks.
Just the one chessboard scenario is taking 1.5-2.5x as long as all the
tests we've written so far. But we're starting from a faster baseline; that
was the point of the C++ port. I also have -O3 optimizations in my back pocket.