mu/003trace.cc

//: The goal of layers is to make programs easier to understand and more
//: malleable, easy to rewrite in radical ways without accidentally breaking
//: some corner case. Tests further both goals. They help understandability by
//: letting one make small changes and get feedback. What if I wrote this line
//: like so? What if I removed this function call, is it really necessary?
//: Just try it, see if the tests pass. Want to explore rewriting this bit in
//: this way? Tests put many refactorings on a firmer footing.
//:
//: But the usual way we write tests seems incomplete. Refactorings tend to
//: work in the small, but don't help with changes to function boundaries. If
//: you want to extract a new function you have to manually test-drive it to
//: create tests for it. If you want to inline a function its tests are no
//: longer valid. In both cases you end up having to reorganize code as well as
//: tests, an error-prone activity.
//:
//: In response, this layer introduces the notion of domain-driven *white-box*
//: testing. We focus on the domain of inputs the whole program needs to
//: handle rather than the correctness of individual functions. All white-box
//: tests (we call them 'scenarios') invoke the program in a single way: by
//: calling run() with some input. As the program operates on the input, it
//: traces out a list of _facts_ deduced about the domain:
//:   trace("label") << "fact 1: " << val << end();
//:
//: Scenarios can now check these facts:
//:   :(scenario foo)
//:   34  # call run() with this input
//:   +label: fact 1: 34  # 'run' should have deduced this fact
//:   -label: fact 1: 35  # the trace should not contain such a fact
//:
//: Since we never call anything but the run() function directly, we never have
//: to rewrite the scenarios when we reorganize the internals of the program. We
//: just have to make sure our rewrite deduces the same facts about the domain,
//: and that's something we're going to have to do anyway.
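//:
//: A minimal sketch of the idea in plain C++, outside this layer's machinery.
//: Every name in it (Facts, run, trace_has) is a hypothetical stand-in, and
//: this run() merely echoes its input as a single fact:

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature: the program records (label, fact) pairs as it runs,
// and tests inspect those pairs instead of calling internal functions.
std::vector<std::pair<std::string, std::string> > Facts;

// The single entry point that every test goes through.
void run(int input) {
  Facts.clear();
  Facts.push_back(std::make_pair("label", "fact 1: " + std::to_string(input)));
}

// Counterpart of a '+label: ...' line; negate it for a '-label: ...' line.
bool trace_has(const std::string& label, const std::string& fact) {
  for (std::size_t i = 0; i < Facts.size(); ++i)
    if (Facts[i].first == label && Facts[i].second == fact) return true;
  return false;
}
```

//: After run(34), trace_has("label", "fact 1: 34") holds and
//: trace_has("label", "fact 1: 35") does not, however run() is organized
//: internally.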
//:
//: To avoid the combinatorial explosion of integration tests, each layer
//: mainly logs facts to the trace with a common *label*. All scenarios in a
//: layer tend to check facts with this label. Validating the facts logged
//: with a specific label is like calling functions of that layer directly.
//:
//: To build robust scenarios, trace facts about your domain rather than details of
//: how you computed them.
//:
//: More details: http://akkartik.name/blog/tracing-tests
//:
//: ---
//:
//: Between layers and domain-driven testing, programming starts to look like a
//: fundamentally different activity. Instead of focusing on a) superficial,
//: b) local rules on c) code [like say http://blog.bbv.ch/2013/06/05/clean-code-cheat-sheet],
//: we allow programmers to engage with the a) deep, b) global structure of
//: the c) domain. If you can systematically track discontinuities in the
//: domain, you don't care if the code used gotos as long as it passed all
//: scenarios. If scenarios become more robust to run, it becomes easier to
//: try out radically different implementations for the same program. If code
//: is super-easy to rewrite, it becomes less important what indentation style
//: it uses, or that the objects are appropriately encapsulated, or that the
//: functions are referentially transparent.
//:
//: Instead of plumbing, programming becomes building and gradually refining a
//: map of the environment the program must operate under. Whether a program
//: is 'correct' at a given point in time is a red herring; what matters is
//: avoiding regression by monotonically nailing down the more 'eventful'
//: parts of the terrain. It helps readers new and old, and rewards curiosity,
//: to organize large programs in self-similar hierarchies of example scenarios
//: colocated with the code that makes them work.
//:
//: "Programming properly should be regarded as an activity by which
//: programmers form a mental model, rather than as production of a program."
//: -- Peter Naur (http://alistair.cockburn.us/ASD+book+extract%3A+%22Naur,+Ehn,+Musashi%22)

//:: == Core data structures

:(before "End Globals")
trace_stream* Trace_stream = NULL;

:(before "End Types")
struct trace_stream {
  vector<trace_line> past_lines;
  // End trace_stream Fields

  trace_stream() {
    // End trace_stream Constructor
  }
  ~trace_stream() {
    // End trace_stream Destructor
  }
  // End trace_stream Methods
};

//:: == Adding to the trace

//: Top-level method is trace() which can be used like an ostream. Usage:
//:   trace(depth, label) << ... << end();
//: Don't forget the 'end()' to actually append to the trace.
:(before "End Includes")
// No parentheses around the expansion: '<<' binds tighter than '?:', so when
// Trace_stream isn't initialized the whole chain of insertions is skipped
// and nothing is printed.
#define trace(...)  !Trace_stream ? cerr : Trace_stream->stream(__VA_ARGS__)

:(before "End trace_stream Fields")
// accumulator for current trace_line
ostringstream* curr_stream;
string curr_label;
int curr_depth;
// other stuff
int collect_depth;  // avoid tracing lower levels for speed
ofstream null_stream;  // never opened, so writes to it silently fail

//: Some constants.
:(before "struct trace_stream")  // include constants in all cleaved compilation units
const int Max_depth = 9999;
// Most important traces are printed to the screen by default
const int Error_depth = 0;
:(before "End Globals")
bool Hide_errors = false;  // if set, don't print errors to screen
:(before "End trace_stream Constructor")
curr_stream = NULL;
curr_depth = Max_depth;
collect_depth = Max_depth;
:(before "End Reset")
Hide_errors = false;

:(before "struct trace_stream")
struct trace_line {
  string contents;
  string label;
  int depth;  // 0 is 'sea level'; positive integers are progressively 'deeper' and lower level
  trace_line(string c, string l) {
    contents = c;
    label = l;
    depth = 0;
  }
  trace_line(string c, string l, int d) {
    contents = c;
    label = l;
    depth = d;
  }
};

//: Starting a new trace line.
:(before "End trace_stream Methods")
ostream& stream(string label) {
  return stream(Max_depth, label);
}

ostream& stream(int depth, string label) {
  if (depth > collect_depth) return null_stream;
  curr_stream = new ostringstream;
  curr_label = label;
  curr_depth = depth;
  return *curr_stream;
}
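
//: The depth filter above hands back a never-opened ofstream for lines that
//: are too deep to collect. A standalone sketch of that trick follows;
//: sink_for and collect are hypothetical names, not part of this layer:

```cpp
#include <fstream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: pick the destination for a message at 'depth'.
std::ostream& sink_for(int depth, int collect_depth,
                       std::ostream& real, std::ofstream& null_stream) {
  if (depth > collect_depth) return null_stream;  // too deep: discard
  return real;
}

// Write one shallow and one deep message, then return what was kept.
std::string collect(int collect_depth) {
  std::ofstream null_stream;  // never opened, so writes to it fail silently
  std::ostringstream real;
  sink_for(1, collect_depth, real, null_stream) << "shallow ";
  sink_for(5, collect_depth, real, null_stream) << "deep";
  return real.str();
}
```

//: With a collect depth of 1 only the shallow message survives; with a
//: collect depth of 9 both do. Callers never need to know which sink they got.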

//: End of a trace line; append it to the trace.
:(before "End Types")
struct end {};
:(code)
ostream& operator<<(ostream& os, end /*unused*/) {
  if (Trace_stream) Trace_stream->newline();
  return os;
}

:(before "End trace_stream Methods")
void newline();
:(code)
void trace_stream::newline() {
  if (!curr_stream) return;
  string curr_contents = curr_stream->str();
  if (!curr_contents.empty()) {
    past_lines.push_back(trace_line(curr_contents, trim(curr_label), curr_depth));  // preserve indent in contents
    // maybe incrementally dump trace
    trace_line& t = past_lines.back();
    if (!Hide_errors && curr_depth == Error_depth) {
      cerr << std::setw(4) << t.depth << ' ' << t.label << ": " << t.contents << '\n';
    }
    // End trace Commit
  }

  // clean up
  delete curr_stream;
  curr_stream = NULL;
  curr_label.clear();
  curr_depth = Max_depth;
}
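
//: To see the accumulate-then-commit pattern in isolation, here is a pared-down
//: sketch. The names (mini_trace, Mini_trace, mini_end, mini_trace_line) are
//: hypothetical, depths are left out, and the 'delete' at the top of stream()
//: is an addition of the sketch to cover lines that never reach mini_end():

```cpp
#include <cstddef>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

struct mini_trace {
  std::vector<std::string> lines;  // committed lines, as "label: contents"
  std::ostringstream* curr;        // line being accumulated, or NULL
  std::string curr_label;
  mini_trace() : curr(NULL) {}
  ~mini_trace() { delete curr; }
  // Start accumulating a new line.
  std::ostream& stream(const std::string& label) {
    delete curr;  // assumption of this sketch: drop any uncommitted line
    curr = new std::ostringstream;
    curr_label = label;
    return *curr;
  }
  // Commit the accumulated line, skipping empty ones.
  void newline() {
    if (!curr) return;
    if (!curr->str().empty()) lines.push_back(curr_label + ": " + curr->str());
    delete curr;  // freed even when nothing was committed
    curr = NULL;
  }
};

mini_trace* Mini_trace = NULL;

// Inserting a mini_end into the stream commits the current line.
struct mini_end {};
std::ostream& operator<<(std::ostream& os, mini_end /*unused*/) {
  if (Mini_trace) Mini_trace->newline();
  return os;
}

// No parentheses, so the '<<' chain belongs to the ':' branch only.
#define mini_trace_line(label)  !Mini_trace ? std::cerr : Mini_trace->stream(label)
```

//: With no trace installed, a statement like
//:   mini_trace_line("app") << "x" << mini_end();
//: evaluates to just 'std::cerr' and prints nothing. Once Mini_trace points at
//: a mini_trace, the same statement appends "app: x" to its lines.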

//:: == Initializing the trace in scenarios

:(before "End Includes")
#define START_TRACING_UNTIL_END_OF_SCOPE  lease_tracer leased_tracer;
:(before "End Test Setup")
START_TRACING_UNTIL_END_OF_SCOPE

//: Trace_stream is a resource; lease_tracer uses RAII to manage it.
:(before "End Types")
struct lease_tracer {
  lease_tracer();
  ~lease_tracer();
};
:(code)
lease_tracer::lease_tracer() { Trace_stream = new trace_stream; }
lease_tracer::~lease_tracer() {
  delete Trace_stream;
  Trace_stream = NULL;
}
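
//: The same lease pattern, sketched standalone. The names (resource,
//: Global_resource, lease_resource, resource_live_inside_scope) are
//: hypothetical; the point is only that the global is valid for exactly the
//: lifetime of the guard object:

```cpp
#include <cstddef>

struct resource {
  int value;
  resource() : value(0) {}
};

resource* Global_resource = NULL;

// Hypothetical guard mirroring lease_tracer: acquire in the constructor,
// release in the destructor.
struct lease_resource {
  lease_resource() { Global_resource = new resource; }
  ~lease_resource() {
    delete Global_resource;
    Global_resource = NULL;
  }
};

// The global is usable while 'lease' is in scope and is released on every
// path out of the function, including early returns.
bool resource_live_inside_scope() {
  lease_resource lease;
  return Global_resource != NULL;
}
```

//: Global_resource is NULL before and after a call to
//: resource_live_inside_scope(), and non-NULL during it.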

//:: == Errors using traces

:(before "End Includes")
#define raise  (!Trace_stream ? (scroll_to_bottom_and_close_console(),++Trace_errors,cerr) /*do print*/ : Trace_stream->stream(Error_depth, "error"))
:(before "End Globals")
int Trace_errors = 0;  // used only when Trace_stream is NULL

// Fail scenarios that displayed (unexpected) errors.
// Expected errors should always be hidden and silently checked for.
:(before "End Test Teardown")
if (Passed && !Hide_errors && trace_contains_errors()) {
  Passed = false;
}
:(code)
bool trace_contains_errors() {
  return Trace_errors > 0 || trace_count("error") > 0;
}

:(before "End Includes")
// If we aren't yet sure how to deal with some corner case, use assert_for_now
// to indicate that it isn't an inviolable invariant.
#define assert_for_now assert
#define raise_for_now raise

//: Automatically close the console in some situations.
:(before "End One-time Setup")
atexit(scroll_to_bottom_and_close_console);
:(code)
void scroll_to_bottom_and_close_console() {
  if (!tb_is_active()) return;
  // leave the screen in a relatively clean state
  tb_set_cursor(tb_width()-1, tb_height()-1);
  cout << "\r\n";
  tb_shutdown();
}
:(before "End Includes")
#include "termbox/termbox.h"

//:: == Other assertions on traces
//: Primitives:
//:   - CHECK_TRACE_CONTENTS(lines)
//:     Assert that the trace contains the given lines (separated by newlines)
//:     in order. There can be other intervening lines between them.
//:   - CHECK_TRACE_DOESNT_CONTAIN(line)
//:   - CHECK_TRACE_DOESNT_CONTAIN(label, contents)
//:     Assert that the trace doesn't contain the given (single) line.
//:   - CHECK_TRACE_COUNT(label, count)
//:     Assert that the trace contains exactly 'count' lines with the given
//:     'label'.
//:   - CHECK_TRACE_CONTAINS_ERRORS()
//:   - CHECK_TRACE_DOESNT_CONTAIN_ERRORS()
//:   - trace_count_prefix(label, prefix)
//:     Count the number of trace lines with the given 'label' that start with
//:     the given 'prefix'.

:(before "End Includes")
#define CHECK_TRACE_CONTENTS(...)  check_trace_contents(__FUNCTION__, __FILE__, __LINE__, __VA_ARGS__)

#define CHECK_TRACE_DOESNT_CONTAIN(...)  CHECK(trace_doesnt_contain(__VA_ARGS__))

#define CHECK_TRACE_COUNT(label, count) \
  if (Passed && trace_count(label) != (count)) { \
    cerr << "\nF - " << __FUNCTION__ << "(" << __FILE__ << ":" << __LINE__ << "): trace_count of " << label << " should be " << count << '\n'; \
    cerr << "  got " << trace_count(label) << '\n';  /* multiple eval */ \
    DUMP(label); \
    Passed = false; \
    return;  /* Currently we stop at the very first failure. */ \
  }

#define CHECK_TRACE_CONTAINS_ERRORS()  CHECK(trace_contains_errors())
#define CHECK_TRACE_DOESNT_CONTAIN_ERRORS() \
  if (Passed && trace_contains_errors()) { \
    cerr << "\nF - " << __FUNCTION__ << "(" << __FILE__ << ":" << __LINE__ << "): unexpected errors\n"; \
    DUMP("error"); \
    Passed = false; \
    return; \
  }

// Allow scenarios to ignore trace lines generated during setup.
#define CLEAR_TRACE  delete Trace_stream, Trace_stream = new trace_stream

:(code)
bool check_trace_contents(string FUNCTION, string FILE, int LINE, string expected) {
  if (!Passed) return false;
  if (!Trace_stream) return false;
  vector<string> expected_lines = split(expected, "\n");
  int curr_expected_line = 0;
  while (curr_expected_line < SIZE(expected_lines) && expected_lines.at(curr_expected_line).empty())
    ++curr_expected_line;
  if (curr_expected_line == SIZE(expected_lines)) return true;
  string label, contents;
  split_label_contents(expected_lines.at(curr_expected_line), &label, &contents);
  for (vector<trace_line>::iterator p = Trace_stream->past_lines.begin(); p != Trace_stream->past_lines.end(); ++p) {
    if (label != p->label) continue;
    if (contents != trim(p->contents)) continue;
    ++curr_expected_line;
    while (curr_expected_line < SIZE(expected_lines) && expected_lines.at(curr_expected_line).empty())
      ++curr_expected_line;
    if (curr_expected_line == SIZE(expected_lines)) return true;
    split_label_contents(expected_lines.at(curr_expected_line), &label, &contents);
  }

  if (line_exists_anywhere(label, contents)) {
    cerr << "\nF - " << FUNCTION << "(" << FILE << ":" << LINE << "): line [" << label << ": " << contents << "] out of order in trace:\n";
    DUMP("");
  }
  else {
    cerr << "\nF - " << FUNCTION << "(" << FILE << ":" << LINE << "): missing [" << contents << "] in trace:\n";
    DUMP(label);
  }
  Passed = false;
  return false;
}
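
//: Stripped of labels, trimming and error reporting, the check above is an
//: ordered-subsequence match. A minimal sketch of just that core follows;
//: contains_in_order is a hypothetical name, and lines are compared as whole
//: strings:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true if every string in 'expected' occurs in 'trace' in the same
// relative order. Unrelated trace lines may sit in between.
bool contains_in_order(const std::vector<std::string>& trace,
                       const std::vector<std::string>& expected) {
  std::size_t next = 0;  // index of the first expected line not yet matched
  for (std::size_t i = 0; i < trace.size() && next < expected.size(); ++i)
    if (trace[i] == expected[next]) ++next;
  return next == expected.size();
}
```

//: A single forward pass is enough because each expected line only needs to
//: match somewhere after the previous match. An empty expectation always
//: passes.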

bool trace_doesnt_contain(string expected) {
  vector<string> tmp = split_first(expected, ": ");
  if (SIZE(tmp) == 1) {
    raise << expected << ": missing label or contents in trace line\n" << end();
    assert(false);
  }
  return trace_count(tmp.at(0), tmp.at(1)) == 0;
}

int trace_count(string label) {
  return trace_count(label, "");
}

int trace_count(string label, string line) {
  if (!Trace_stream) return 0;
  long result = 0;
  for (vector<trace_line>::iterator p = Trace_stream->past_lines.begin(); p != Trace_stream->past_lines.end(); ++p) {
    if (label == p->label) {
      if (line == "" || trim(line) == trim(p->contents))
        ++result;
    }
  }
  return result;
}

int trace_count_prefix(string label, string prefix) {
  if (!Trace_stream) return 0;
  long result = 0;
  for (vector<trace_line>::iterator p = Trace_stream->past_lines.begin(); p != Trace_stream->past_lines.end(); ++p) {
    if (label == p->label) {
      if (starts_with(trim(p->contents), trim(prefix)))
        ++result;
    }
  }
  return result;
}

void split_label_contents(const string& s, string* label, string* contents) {
  static const string delim(": ");
  size_t pos = s.find(delim);
  if (pos == string::npos) {
    *label = "";
    *contents = trim(s);
  }
  else {
    *label = trim(s.substr(0, pos));
    *contents = trim(s.substr(pos+SIZE(delim)));
  }
}

bool line_exists_anywhere(const string& label, const string& contents) {
  for (vector<trace_line>::iterator p = Trace_stream->past_lines.begin(); p != Trace_stream->past_lines.end(); ++p) {
    if (label != p->label) continue;
    if (contents == trim(p->contents)) return true;
  }
  return false;
}

vector<string> split(string s, string delim) {
  vector<string> result;
  size_t begin=0, end=s.find(delim);
  while (true) {
    if (end == string::npos) {
      result.push_back(string(s, begin, string::npos));
      break;
    }
    result.push_back(string(s, begin, end-begin));
    begin = end+SIZE(delim);
    end = s.find(delim, begin);
  }
  return result;
}

vector<string> split_first(string s, string delim) {
  vector<string> result;
  size_t end=s.find(delim);
  result.push_back(string(s, 0, end));
  if (end != string::npos)
    result.push_back(string(s, end+SIZE(delim), string::npos));
  return result;
}
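
//: split() assumes a non-empty delimiter: s.find("") matches at every
//: position, so 'begin' would never advance. A standalone variant for
//: experimenting with that edge case follows; split_on is a hypothetical name,
//: and returning the whole string for an empty delimiter is a choice of this
//: sketch, not behavior of the layer:

```cpp
#include <string>
#include <vector>

// Split 's' at every occurrence of 'delim', keeping empty pieces.
std::vector<std::string> split_on(const std::string& s, const std::string& delim) {
  std::vector<std::string> result;
  if (delim.empty()) {  // guard added by this sketch
    result.push_back(s);
    return result;
  }
  std::string::size_type begin = 0, end = s.find(delim);
  while (end != std::string::npos) {
    result.push_back(s.substr(begin, end - begin));
    begin = end + delim.size();
    end = s.find(delim, begin);
  }
  result.push_back(s.substr(begin));  // the tail after the last delimiter
  return result;
}
```

//: Like split(), this always returns at least one element, and a trailing
//: delimiter yields a trailing empty string.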

//:: == Helpers for debugging using traces

:(before "End Includes")
// To debug why a scenario is failing, dump its trace using DUMP.
#define DUMP(label)  if (Trace_stream) cerr << Trace_stream->readable_contents(label);

// To add temporary prints to the trace, use 'dbg'.
// `git log` should never show any calls to 'dbg'.
#define dbg trace(0, "a")

//: Dump the entire trace to a file where it can be browsed offline.
//: Dump the trace as it happens; that way you get something even if the
//: program crashes.

:(before "End Globals")
ofstream Trace_file;
:(before "End Commandline Options(*arg)")
else if (is_equal(*arg, "--trace")) {
  Trace_stream = new trace_stream;
  cerr << "saving trace to 'last_run'\n";
  Trace_file.open("last_run");
  // Add a dummy line up top; otherwise the `browse_trace` tool currently has
  // no way to expand any lines above an error.
  Trace_file << "   0 dummy: start\n";
}
:(before "End trace Commit")
if (Trace_file) {
  Trace_file << std::setw(4) << t.depth << ' ' << t.label << ": " << t.contents << '\n';
}
:(before "End One-time Setup")
atexit(cleanup_main);
:(code)
void cleanup_main() {
  if (Trace_file) Trace_file.close();
  // End cleanup_main
}

:(before "End trace_stream Methods")
string readable_contents(string label) {
  string trim(const string& s);  // prototype
  ostringstream output;
  label = trim(label);
  for (vector<trace_line>::iterator p = past_lines.begin(); p != past_lines.end(); ++p)
    if (label.empty() || label == p->label)
      output << std::setw(4) << p->depth << ' ' << p->label << ": " << p->contents << '\n';
  return output.str();
}

//: Miscellaneous helpers.

:(code)
string trim(const string& s) {
  string::const_iterator first = s.begin();
  while (first != s.end() && isspace(*first))
    ++first;
  if (first == s.end()) return "";

  string::const_iterator last = --s.end();
  while (last != s.begin() && isspace(*last))
    --last;
  ++last;
  return string(first, last);
}

:(before "End Includes")
#include <vector>
using std::vector;
#include <list>
using std::list;
#include <set>
using std::set;

#include <sstream>
using std::istringstream;
using std::ostringstream;

#include <fstream>
using std::ifstream;
using std::ofstream;