mu/archive/1.vm/016dilated_reagent.cc

//: An alternative syntax for reagents that permits whitespace in properties,
//: grouped by brackets. We'll use this ability in the next layer, when we
//: generalize types from lists to trees of properties.
5001 - drop the :(scenario) DSL I've been saying for a while[1][2][3] that adding extra abstractions makes things harder for newcomers, and adding new notations doubly so. And then I notice this DSL in my own backyard. Makes me feel like a hypocrite. [1] https://news.ycombinator.com/item?id=13565743#13570092 [2] https://lobste.rs/s/to8wpr/configuration_files_are_canary_warning [3] https://lobste.rs/s/mdmcdi/little_languages_by_jon_bentley_1986#c_3miuf2 The implementation of the DSL was also highly hacky: a) It was happening in the tangle/ tool, but was utterly unrelated to tangling layers. b) There were several persnickety constraints on the different kinds of lines and the specific order they were expected in. I kept finding bugs where the translator would silently do the wrong thing. Or the error messages sucked, and readers may be stuck looking at the generated code to figure out what happened. Fixing error messages would require a lot more code, which is one of my arguments against DSLs in the first place: they may be easy to implement, but they're hard to design to go with the grain of the underlying platform. They require lots of iteration. Is that effort worth prioritizing in this project? On the other hand, the DSL did make at least some readers' life easier, the ones who weren't immediately put off by having to learn a strange syntax. There were fewer quotes to parse, fewer backslash escapes. Anyway, since there are also people who dislike having to put up with strange syntaxes, we'll call that consideration a wash and tear this DSL out. --- This commit was sheer drudgery. Hopefully it won't need to be redone with a new DSL because I grow sick of backslashes.
2019-03-13 01:56:55 +00:00
void test_dilated_reagent() {
  load(
      "def main [\n"
      " {1: number, foo: bar} <- copy 34\n"
      "]\n"
  );
  CHECK_TRACE_CONTENTS(
      "parse: product: {1: \"number\", \"foo\": \"bar\"}\n"
  );
}
void test_load_trailing_space_after_curly_bracket() {
  load(
      "def main [\n"
      " # line below has a space at the end\n"
      " { \n"
      "]\n"
      "# successfully parsed\n"
  );
}
void test_dilated_reagent_with_comment() {
  load(
      "def main [\n"
      " {1: number, foo: bar} <- copy 34 # test comment\n"
      "]\n"
  );
  CHECK_TRACE_CONTENTS(
      "parse: product: {1: \"number\", \"foo\": \"bar\"}\n"
  );
  CHECK_TRACE_COUNT("error", 0);
}
void test_dilated_reagent_with_comment_immediately_following() {
  load(
      "def main [\n"
      " 1:number <- copy {34: literal} # test comment\n"
      "]\n"
  );
  CHECK_TRACE_COUNT("error", 0);
}
//: First augment next_word to group balanced brackets together.
:(before "End next_word Special-cases")
if (in.peek() == '(')
  return slurp_balanced_bracket(in);
// treat curlies mostly like parens, but don't mess up labels
if (start_of_dilated_reagent(in))
  return slurp_balanced_bracket(in);
:(code)
// A curly is considered a label if it's the last thing on a line. Dilated
// reagents should remain all on one line.
bool start_of_dilated_reagent(istream& in) {
  if (in.peek() != '{') return false;
  int pos = in.tellg();
  in.get();  // slurp '{'
  skip_whitespace_but_not_newline(in);
  char next = in.peek();
  in.seekg(pos);
  return next != '\n';
}
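//: A minimal standalone sketch of the peek-and-restore check above, for
//: readers who want to try it outside the layer system. The names
//: skip_blanks(), looks_like_dilated_reagent() and check() are hypothetical
//: stand-ins, not part of this codebase; skip_blanks() only approximates
//: skip_whitespace_but_not_newline().

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in: skip spaces and tabs, but stop at a newline.
void skip_blanks(std::istream& in) {
  while (in.peek() == ' ' || in.peek() == '\t') in.get();
}

// True if 'in' is at a '{' whose next non-blank character is not a newline.
// The read position is restored before returning.
bool looks_like_dilated_reagent(std::istream& in) {
  if (in.peek() != '{') return false;
  std::streampos pos = in.tellg();
  in.get();  // consume '{'
  skip_blanks(in);
  int next = in.peek();
  in.clear();  // peeking at end of input sets eofbit; clear it before seeking
  in.seekg(pos);
  return next != '\n';
}

// Convenience wrapper so the check can be exercised on a plain string.
bool check(const std::string& s) {
  std::istringstream in(s);
  return looks_like_dilated_reagent(in);
}
```

//: Under these assumptions, check("{1: number}") accepts the brace as the
//: start of a dilated reagent, while check("{ \n") treats it as a label.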
// Assume the first character is an open bracket, and read everything until
// the matching close bracket.
// We balance {} () and [].
string slurp_balanced_bracket(istream& in) {
  ostringstream result;
  char c;
  list<char> open_brackets;
  while (in >> c) {
    if (c == '(') open_brackets.push_back(c);
    if (c == ')') {
      if (open_brackets.empty() || open_brackets.back() != '(') {
        raise << "unbalanced ')'\n" << end();
        continue;
      }
      assert(open_brackets.back() == '(');
      open_brackets.pop_back();
    }
    if (c == '[') open_brackets.push_back(c);
    if (c == ']') {
      if (open_brackets.empty() || open_brackets.back() != '[') {
        raise << "unbalanced ']'\n" << end();
        continue;
      }
      open_brackets.pop_back();
    }
    if (c == '{') open_brackets.push_back(c);
    if (c == '}') {
      if (open_brackets.empty() || open_brackets.back() != '{') {
        raise << "unbalanced '}'\n" << end();
        continue;
      }
      open_brackets.pop_back();
    }
    result << c;
    if (open_brackets.empty()) break;
  }
  skip_whitespace_and_comments_but_not_newline(in);
  return result.str();
}
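//: The bracket balancing above can be sketched in isolation with a stack of
//: expected closers. This is an illustration under simplifying assumptions,
//: not the layer's implementation: closer_for(), read_balanced() and
//: read_balanced_from() are hypothetical names, and on a mismatched closer
//: the sketch simply stops reading instead of raising an error and skipping
//: the character.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Closing bracket that matches 'open', or 0 if 'open' opens nothing.
char closer_for(char open) {
  switch (open) {
    case '(': return ')';
    case '[': return ']';
    case '{': return '}';
    default: return 0;
  }
}

// Read from the current position through the close bracket that matches the
// first character. Whitespace inside the brackets is preserved.
std::string read_balanced(std::istream& in) {
  std::string result;
  std::vector<char> expected;  // closers still owed, innermost last
  char c;
  while (in.get(c)) {
    if (closer_for(c)) {
      expected.push_back(closer_for(c));
    } else if (c == ')' || c == ']' || c == '}') {
      if (expected.empty() || expected.back() != c) break;  // mismatch: stop
      expected.pop_back();
    }
    result += c;
    if (expected.empty()) break;
  }
  return result;
}

// Convenience wrapper for trying the reader on a plain string.
std::string read_balanced_from(const std::string& s) {
  std::istringstream in(s);
  return read_balanced(in);
}
```

//: Storing the expected closer rather than the opener keeps the mismatch
//: check to a single comparison against the top of the stack.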
:(after "Parsing reagent(string s)")
if (starts_with(s, "{")) {
  assert(properties.empty());
  istringstream in(s);
  in >> std::noskipws;
  in.get();  // skip '{'
  name = slurp_key(in);
  if (name.empty()) {
    raise << "invalid reagent '" << s << "' without a name\n" << end();
    return;
  }
  if (name == "}") {
    raise << "invalid empty reagent '" << s << "'\n" << end();
    return;
  }
  {
    string s = next_word(in);
    if (s.empty()) {
      assert(!has_data(in));
      raise << "incomplete dilated reagent at end of file (0)\n" << end();
      return;
    }
    string_tree* type_names = new string_tree(s);
    // End Parsing Dilated Reagent Type Property(type_names)
    type = new_type_tree(type_names);
    delete type_names;
  }
  while (has_data(in)) {
    string key = slurp_key(in);
    if (key.empty()) continue;
    if (key == "}") continue;
    string s = next_word(in);
    if (s.empty()) {
      assert(!has_data(in));
      raise << "incomplete dilated reagent at end of file (1)\n" << end();
      return;
    }
    string_tree* value = new string_tree(s);
    // End Parsing Dilated Reagent Property(value)
    properties.push_back(pair<string, string_tree*>(key, value));
  }
  return;
}
:(code)
string slurp_key(istream& in) {
  string result = next_word(in);
  if (result.empty()) {
    assert(!has_data(in));
    raise << "incomplete dilated reagent at end of file (2)\n" << end();
    return result;
  }
  while (!result.empty() && *result.rbegin() == ':')
    strip_last(result);
  while (isspace(in.peek()) || in.peek() == ':')
    in.get();
  return result;
}
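//: To make the key/value shape of a dilated reagent concrete, here is a rough
//: standalone sketch that splits a flat reagent such as
//: "{1: number, foo: bar}" into pairs. It is not how this layer parses:
//: parse_flat_dilated() and Properties are hypothetical names, and the sketch
//: assumes whitespace, ',', ':' and the outer braces are plain separators, so
//: it does not handle nested brackets in values.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::pair<std::string, std::string> > Properties;

// Split a flat dilated reagent into (key, value) pairs, in order.
// The first pair corresponds to the reagent's name and its type.
Properties parse_flat_dilated(const std::string& s) {
  std::vector<std::string> words;
  std::string curr;
  for (std::size_t i = 0; i < s.size(); ++i) {
    char c = s[i];
    bool separator = std::isspace(static_cast<unsigned char>(c))
        || c == ',' || c == ':' || c == '{' || c == '}';
    if (!separator) {
      curr += c;
      continue;
    }
    if (!curr.empty()) {
      words.push_back(curr);
      curr.clear();
    }
  }
  if (!curr.empty()) words.push_back(curr);
  // Pair words up; a trailing key without a value is dropped.
  Properties result;
  for (std::size_t i = 0; i+1 < words.size(); i += 2)
    result.push_back(std::make_pair(words[i], words[i+1]));
  return result;
}
```

//: For "{1: number, foo: bar}" the sketch yields the pairs ("1", "number")
//: and ("foo", "bar"), mirroring the trace line checked in the tests above.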