Just clarified for myself why `subx translate` and `subx run` need to share
code: emulation supports the tests first and foremost.

In the process, we clean up our architecture for levels of layers. It's
a good idea, but it goes unused once we reconceive of "level 1" as just
part of the test harness.
Kartik Agaram 2020-01-02 01:28:24 -08:00
parent 01013f2ad2
commit d02aa9ac0b
5 changed files with 25 additions and 102 deletions
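
A minimal sketch of the idea in the commit message, under stated assumptions: the translator and the test emulator both call one shared `transform` step that runs a list of registered passes in order. `program` is reduced here to a vector of strings, and `expand_nop2`, `translate_words` and `run_words` are hypothetical names for illustration, not SubX's real functions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for SubX's `program` type: just a list of words.
struct program {
  std::vector<std::string> words;
};

typedef void (*transform_fn)(program&);

// Registry of passes; they run in the order they were registered.
std::vector<transform_fn> Transform;

// Hypothetical pass: expand the made-up word "nop2" into two "90" bytes,
// passing through every word it does not understand.
void expand_nop2(program& p) {
  std::vector<std::string> out;
  for (const std::string& w : p.words) {
    if (w == "nop2") {
      out.push_back("90");
      out.push_back("90");
    } else {
      out.push_back(w);
    }
  }
  p.words.swap(out);
}

// Shared step: run every registered pass once, in order.
void transform(program& p) {
  for (std::size_t t = 0; t < Transform.size(); ++t)
    (*Transform.at(t))(p);
}

// Translator path (assumption: writing the ELF binary is elided, so the
// transformed words are simply returned).
std::vector<std::string> translate_words(program p) {
  transform(p);
  return p.words;
}

// Test/emulation path (assumption: "emulating" just counts the bytes that
// would be loaded). It reuses the same transform step as the translator.
std::size_t run_words(program p) {
  transform(p);
  return p.words.size();
}
```

Because both paths go through `transform`, a pass registered once (for example with `Transform.push_back(expand_nop2)`) affects translation and tests alike, which is the sharing the commit message describes.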

@@ -1,11 +1,5 @@
-//: Core data structures for simulating the SubX VM (subset of an x86 processor)
-//:
-//: At the lowest level ("level 1") of abstraction, SubX executes x86
-//: instructions provided in the form of an array of bytes, loaded into memory
-//: starting at a specific address.
-//:
-//: SubX is fundamentally a translator. But having a VM to execute its
-//: translations affords greater confidence in it.
+//: Core data structures for simulating the SubX VM (subset of an x86 processor),
+//: either in tests or debug aids.
 //:: registers
 //: assume segment registers are hard-coded to 0

@@ -78,15 +78,14 @@ void test_copy_imm32_to_EAX() {
   );
 }
-// top-level helper for scenarios: parse the input, transform any macros, load
-// the final hex bytes into memory, run it
+// top-level helper for tests: parse the input, load the hex bytes into memory, run
 void run(const string& text_bytes) {
   program p;
   istringstream in(text_bytes);
+  // Loading Test Program
   parse(in, p);
   if (trace_contains_errors()) return; // if any stage raises errors, stop immediately
-  transform(p);
-  if (trace_contains_errors()) return;
+  // Running Test Program
   load(p);
   if (trace_contains_errors()) return;
   // convenience to keep tests concise: 'Entry' label need not be provided
@@ -244,19 +243,6 @@ void test_detect_duplicate_segments() {
   );
 }
-//:: transform
-:(before "End Types")
-typedef void (*transform_fn)(program&);
-:(before "End Globals")
-vector<transform_fn> Transform;
-:(code)
-void transform(program& p) {
-  for (int t = 0; t < SIZE(Transform); ++t)
-    (*Transform.at(t))(p);
-}
 //:: load
 void load(const program& p) {

@@ -1,20 +1,9 @@
-//: The bedrock level 1 of abstraction is now done, and we're going to start
-//: building levels above it that make programming in x86 machine code a
-//: little more ergonomic.
-//:
-//: All levels will be "pass through by default". Whatever they don't
-//: understand they will silently pass through to lower levels.
-//:
-//: Since raw hex bytes of machine code are always possible to inject, SubX is
-//: not a language, and we aren't building a compiler. This is something
-//: deliberately leakier. Levels are more for improving auditing, checks and
-//: error messages rather than for hiding low-level details.
+//: After that lengthy prelude to define an x86 emulator, we are now ready to
+//: start translating SubX notation.
 //: Translator workflow: read 'source' file. Run a series of transforms on it,
 //: each passing through what it doesn't understand. The final program should
-//: be just machine code, suitable to write to an ELF binary.
-//:
-//: Higher levels usually transform code on the basis of metadata.
+//: be just machine code, suitable to emulate, or to write to an ELF binary.
 :(before "End Main")
 if (is_equal(argv[1], "translate")) {
@@ -69,6 +58,10 @@ if (is_equal(argv[1], "translate")) {
 }
 :(code)
+void transform(program& p) {
+  // End transform(program& p)
+}
 void print_translate_usage() {
   cerr << "Usage: subx translate file1 file2 ... -o output\n";
 }

@@ -1,64 +1,11 @@
-//: Ordering transforms is a well-known hard problem when building compilers.
-//: In our case we also have the additional notion of layers. The ordering of
-//: layers can have nothing in common with the ordering of transforms when
-//: SubX is tangled and run. This can be confusing for readers, particularly
-//: if later layers start inserting transforms at arbitrary points between
-//: transforms introduced earlier. Over time adding transforms can get harder
-//: and harder, having to meet the constraints of everything that's come
-//: before. It's worth thinking about organization up-front so the ordering is
-//: easy to hold in our heads, and it's obvious where to add a new transform.
-//: Some constraints:
-//:
-//: 1. Layers force us to build SubX bottom-up; since we want to be able to
-//: build and run SubX after stopping loading at any layer, the overall
-//: organization has to be to introduce primitives before we start using
-//: them.
-//:
-//: 2. Transforms usually need to be run top-down, converting high-level
-//: representations to low-level ones so that low-level layers can be
-//: oblivious to them.
-//:
-//: 3. When running we'd often like new representations to be checked before
-//: they are transformed away. The whole reason for new representations is
-//: often to add new kinds of automatic checking for our machine code
-//: programs.
-//:
-//: Putting these constraints together, we'll use the following broad
-//: organization:
-//:
-//: a) We'll divide up our transforms into "levels", each level consisting
-//: of multiple transforms, and dealing in some new set of representational
-//: ideas. Levels will be added in reverse order to the one their transforms
-//: will be run in.
-//:
-//: To run all transforms:
-//: Load transforms for level n
-//: Load transforms for level n-1
-//: ...
-//: Load transforms for level 2
-//: Run code at level 1
-//:
-//: b) *Within* a level we'll usually introduce transforms in the order
-//: they're run in.
-//:
-//: To run transforms for level n:
-//: Perform transform of layer l
-//: Perform transform of layer l+1
-//: ...
-//:
-//: c) Within a level it's often most natural to introduce a new
-//: representation by showing how it's transformed to the level below. To
-//: make such exceptions more obvious checks usually won't be first-class
-//: transforms; instead code that keeps the program unmodified will run
-//: within transforms before they mutate the program. As an example:
-//:
-//: Layer l introduces a transform
-//: Layer l+1 adds precondition checks for the transform
-//:
-//: This may all seem abstract, but will hopefully make sense over time. The
-//: goals are basically to always have a working program after any layer, to
-//: have the order of layers make narrative sense, and to order transforms
-//: correctly at runtime.
+:(before "End Types")
+typedef void (*transform_fn)(program&);
+:(before "End Globals")
+vector<transform_fn> Transform;
+:(before "End transform(program& p)")
+for (int t = 0; t < SIZE(Transform); ++t)
+  (*Transform.at(t))(p);
 :(before "End One-time Setup")
 // Begin Transforms

@@ -1,5 +1,4 @@
-//: Beginning of "level 2": tagging bytes with metadata around what field of
-//: an x86 instruction they're for.
+//: Metadata for fields of an x86 instruction.
 //:
 //: The x86 instruction set is variable-length, and how a byte is interpreted
 //: affects later instruction boundaries. A lot of the pain in programming
@@ -27,6 +26,10 @@ put_new(Help, "instructions",
 :(before "End Help Contents")
 cerr << " instructions\n";
+:(before "Running Test Program")
+transform(p);
+if (trace_contains_errors()) return;
 :(code)
 void test_pack_immediate_constants() {
   run(