GRU DeerTree specifications

GRU DeerTree standard library specifications

DeerTree is a high-level programming language focused on simplifying low-level development.

Lexical items

Comments

  • A single-line comment starts with // and ends at the next newline.
  • A multi-line comment starts with /* and ends with */.
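
The two comment forms can be illustrated with a small Python sketch of a comment stripper. This is only an illustration under stated assumptions: the function name strip_comments is hypothetical, multi-line comments are assumed not to nest, and removed comments are assumed to leave nothing behind (the specification does not say whether a comment acts as whitespace).

```python
def strip_comments(source: str) -> str:
    """Remove // and /* */ comments, leaving string and rune literals intact."""
    out = []
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        if ch in "\"'":
            # Copy a quoted literal verbatim, honoring backslash escapes,
            # so that comment markers inside literals are not touched.
            j = i + 1
            while j < n and source[j] != ch:
                j += 2 if source[j] == "\\" else 1
            out.append(source[i:j + 1])
            i = j + 1
        elif source.startswith("//", i):
            # A single-line comment runs up to the newline; the newline stays.
            end = source.find("\n", i)
            i = n if end == -1 else end
        elif source.startswith("/*", i):
            # Assumption: the first */ closes the comment (no nesting).
            end = source.find("*/", i + 2)
            if end == -1:
                raise ValueError("unterminated multi-line comment")
            i = end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Tracking quoted literals matters because a sequence such as "// no" inside a string is text, not a comment.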

Tokens

  • A token is the minimal language unit in DeerTree.
  • There are several types of tokens in DeerTree: /identifiers/, /keywords/ and /punctuation/.

Identifiers

  • Identifiers name program elements such as variables, types and functions.
  • Keywords can't be used as identifiers.

Keywords

  • Keywords are predefined and reserved words.
  • They are associated with specific features.
  • DeerTree keywords:
    fn, import, int, double, long, float, char, byte,
    string, bool, true, false, return, if, else, const,
    switch, case, for, do, while, break, continue, struct,
    enum, union, type, goto, unsigned, extern, strict
    
    (auto), (until), (define) -- these are not needed
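
As a rough illustration of the rule that keywords can't be used as identifiers, a lexer might classify each word against the reserved set. The sketch below is in Python; the keyword set is copied from the list above, while the helper name classify_word is hypothetical.

```python
# Reserved words copied from the keyword list in this specification.
KEYWORDS = frozenset("""
fn import int double long float char byte
string bool true false return if else const
switch case for do while break continue struct
enum union type goto unsigned extern strict
""".split())


def classify_word(word: str) -> str:
    """Return 'keyword' for a reserved word, otherwise 'identifier'."""
    return "keyword" if word in KEYWORDS else "identifier"
```

With this sketch, classify_word("fn") yields "keyword", and a name such as inc_number yields "identifier".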
    

Punctuation

  • Punctuation is a character sequence that represents an operator or a punctuation mark.
  • DeerTree punctuation:
    '+', '-', '*', '/', '%', '++', '--',
    '&', '&&', '|', '||', '^', '<<', '>>',
    '+=', '-=', '*=', '/=', '%=', '&=', '|=', '^=',
    '>', '>=', '<', '<=', '=', '==', '!', '~',
    '(', ')', '{', '}', '[', ']', ',', '.', ';', ':'
    

Integer literals

  • An integer literal is a sequence of digits that represents an integer constant.
  • A prefix before the digits sets a non-decimal base: 0b or 0B for binary, 0o or 0O for octal, 0x or 0X for hexadecimal.
  • An underscore may appear after the base prefix or between successive digits; it doesn't change the literal's value. An underscore can't start or end an integer literal.
  • Integer literal examples:
    123
    12_34
    0b0101010
    0B0101010
    0o20
    0O20
    0xff
    0xFF
    0Xff
    0XFF
    0xFF_FF_FF
    
    _12 // invalid: handled as an identifier
    12_ // invalid: an underscore can't end an integer literal, it must separate digits
    12__34 // invalid: there can't be two underscores in a row
    0_xff // invalid: an underscore can't appear inside the base prefix
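
The integer literal rules can be sketched as a regular expression in Python. This is a minimal sketch, not a reference lexer: the names INT_LITERAL and parse_int_literal are hypothetical, and it assumes that a single underscore may directly follow the base prefix (for example 0x_FF), which the text above suggests but does not show in an example.

```python
import re

# Assumption: one underscore may follow the base prefix or sit between digits.
INT_LITERAL = re.compile(
    r"""
    (?: 0[bB] _? [01]+        (?: _ [01]+ )*          # binary
      | 0[oO] _? [0-7]+       (?: _ [0-7]+ )*         # octal
      | 0[xX] _? [0-9a-fA-F]+ (?: _ [0-9a-fA-F]+ )*   # hexadecimal
      | [0-9]+                (?: _ [0-9]+ )*         # decimal
    )
    """,
    re.VERBOSE,
)


def parse_int_literal(text: str) -> int:
    """Return the value of an integer literal, or raise ValueError."""
    if not INT_LITERAL.fullmatch(text):
        raise ValueError(f"invalid integer literal: {text!r}")
    digits = text.replace("_", "")  # underscores don't change the value
    base = {"0b": 2, "0o": 8, "0x": 16}.get(digits[:2].lower(), 10)
    return int(digits, base)
```

Each alternative requires a digit on both sides of every separating underscore, which is what rejects 12_, 12__34 and 0_xff.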
    

Float literals

  • A floating-point literal is a decimal or hexadecimal representation of a floating-point constant.
  • A decimal floating-point literal consists of an integer part (decimal digits), a decimal point, a fractional part (decimal digits), and an exponent part (e or E followed by an optional sign and decimal digits). The examples below show that the fractional part and the exponent part may be omitted.
  • A hexadecimal floating-point literal consists of a 0x or 0X prefix, an integer part (hexadecimal digits), a radix point, a fractional part (hexadecimal digits), and an exponent part (p or P followed by an optional sign and decimal digits).
  • An underscore may appear after the base prefix or between successive digits; it doesn't change the literal's value. An underscore can't start or end a float literal.
  • Float literal examples:
    0.
    12.34
    1.23456
    1.e+10
    3_4.
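
For decimal float literals, a Python sketch of the same idea follows. It covers only the decimal form and rests on assumptions taken from the examples rather than from a stated rule: the integer part and the decimal point are required, while the fractional part and the exponent are optional. The names DECIMAL_FLOAT and parse_decimal_float are hypothetical.

```python
import re

# Assumption: digits may be separated by single underscores, as in 3_4.
DIGITS = r"[0-9]+(?:_[0-9]+)*"

# Assumption: integer part and decimal point are required; the fractional
# part and the exponent are optional, matching the examples above.
DECIMAL_FLOAT = re.compile(rf"{DIGITS}\.(?:{DIGITS})?(?:[eE][+-]?[0-9]+)?")


def parse_decimal_float(text: str) -> float:
    """Return the value of a decimal float literal, or raise ValueError."""
    if not DECIMAL_FLOAT.fullmatch(text):
        raise ValueError(f"invalid float literal: {text!r}")
    return float(text.replace("_", ""))  # underscores don't change the value
```

Under these assumptions a token without a decimal point, such as 1234, is left to the integer literal rules.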
    

Rune literals

  • A rune literal represents a rune constant, an integer value identifying a Unicode code point.
  • A rune literal is expressed as one or more characters enclosed in single quotes, as in 'x' or '\n'.
  • Within the quotes, any character may appear except newline and unescaped single quote.
  • A single quoted character represents the Unicode value of the character itself, while multi-character sequences beginning with a backslash encode values in various formats.
  • After a backslash, certain single-character escapes represent special values:
    \a   U+0007 alert or bell
    \b   U+0008 backspace
    \f   U+000C form feed
    \n   U+000A line feed or newline
    \r   U+000D carriage return
    \t   U+0009 horizontal tab
    \v   U+000B vertical tab
    \\   U+005C backslash
    \'   U+0027 single quote  (valid escape only within rune literals)
    \"   U+0022 double quote  (valid escape only within string literals)
    
  • Rune literal examples:
    'a'
    '\n'
    '\000'
    '\xFFF'
    'aa' // invalid: too many characters
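
The single-character escapes can be sketched as a lookup table in Python. This is a partial illustration only: it handles plain characters and the single-character escapes listed above, and it deliberately rejects numeric forms such as '\000' because their format is not defined in this section. The names SIMPLE_ESCAPES and rune_value are hypothetical.

```python
# Single-character escapes valid in rune literals, mapped to code points.
# The \" escape is left out because it is valid only in string literals.
SIMPLE_ESCAPES = {
    "a": 0x07, "b": 0x08, "f": 0x0C, "n": 0x0A, "r": 0x0D,
    "t": 0x09, "v": 0x0B, "\\": 0x5C, "'": 0x27,
}


def rune_value(literal: str) -> int:
    """Return the code point of a rune literal, quotes included in the input."""
    if len(literal) < 3 or literal[0] != "'" or literal[-1] != "'":
        raise ValueError("rune literal must be enclosed in single quotes")
    body = literal[1:-1]
    # A lone character stands for itself, except newline, backslash and quote.
    if len(body) == 1 and body not in "\\'\n":
        return ord(body)
    # A backslash followed by one known character is a simple escape.
    if len(body) == 2 and body[0] == "\\" and body[1] in SIMPLE_ESCAPES:
        return SIMPLE_ESCAPES[body[1]]
    raise ValueError(f"unsupported or invalid rune literal: {literal!r}")
```

Given the literal 'a' the sketch returns 97, and given the escape for a line feed it returns 10; 'aa' is rejected as too long.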
    

String literals

  • A string literal represents a string constant.
  • A string is a sequence of characters.
  • The text between the double quotes forms the value of the literal, with backslash escapes interpreted as they are in rune literals.
  • String literal examples:
    "a"
    "aaa"
    "\n"
    "\""
    "Hello, World\n"
    

TODO Constants

Variables

  • A variable is a location for storing a value.
  • Variables have /types/.
  • Each element of an array acts as a variable.

Variable examples

int a;        // declaration
float b;
char[] c;     // array declaration
int d = 12;   // declaration with an initial value
a = 34;       // assignment
b = 56;
a = b;

TODO Types

  • A type can be defined using the type keyword.

Boolean types

  • A boolean type represents the set of Boolean truth values denoted by the predeclared constants true and false.
  • bool is the predeclared boolean type.

Numeric types

  • A numeric type represents a set of integer or floating-point values.

Predeclared numeric types

// TODO: unsigned and others
int    - 32-bit integer
double - 64-bit floating-point number
long   - 64-bit integer
float  - 32-bit floating-point number
char   - 8-bit integer
byte   - alias for char
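
To make the integer widths concrete, the Python sketch below computes the value range each width would give. It assumes the plain integer types are signed and use two's complement, which this specification does not state (unsigned types are still marked TODO). The names INTEGER_BITS and signed_range are hypothetical.

```python
# Bit widths of the predeclared integer types listed above.
INTEGER_BITS = {"int": 32, "long": 64, "char": 8, "byte": 8}


def signed_range(bits: int) -> tuple[int, int]:
    """Return (min, max) for a signed two's-complement integer of this width."""
    # Assumption: signed two's complement; not confirmed by the specification.
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
```

Under that assumption, int would span -2147483648 to 2147483647, and char and byte would span -128 to 127.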

String types

  • A string type represents the set of string values.
  • A string value is a (possibly empty) sequence of characters.
  • string is the predeclared string type.

Array types

Function types

Dictionary types

TODO Declarations

Label scopes

Constant declarations

Type declarations

Variable declarations

[type] [var-name];
[type] [var-name] = [value];

Function declarations

  • A function declaration binds an identifier, the /function name/, to a function.
fn [function-name] '(' { arguments_list } ')' { '(' { type } ')' } '{' body '}'

Example

fn inc_number(int num) (int) {
  return num+1;
}

TODO Expressions

TODO Statements

Empty statements

Labeled statements

[identifier] ':' body

Expression statements

expression

Increment/Decrement statements

expression ( "++" | "--" )

Assignments

[identifier] ( '+=' | '-=' | '*=' | '/=' | '%=' | '|=' | '^=' | '&=' | '=' ) expression

If statements

"if" '(' condition ')' '{' body '}' ( ( "else if" | "else" ) )

switch statements

"switch" '(' expression ')' '{' ( "case" expression | "else" ) body '}'

for statements

"for" '(' condition ')' '{' body '}'

while statements

"while" '(' condition ')' '{' body '}'

return statements

  • A "return" statement terminates execution of the function and optionally provides result values.
return { expression }

break statements

"break"

continue statements

"continue"

goto statements

  • A "goto" statement transfers control to the statement with the corresponding label.
goto [label]

TODO Built-in functions

sizeof()

len()

compare()

Printing functions

print()

println()

fprint()

fprintln()

sprint()

sprintln()

eprint()

eprintln()

Converting functions

int()

float()

char()

str()

TODO Headers