ZOBOS Language Details and Nuances
zlang.lr
zlang.cfg defines an SLR(1) language. Its LR table is provided here (a typeset table is here, though I doubt it would be helpful), and the format of the LR table is described in lga-code-prep.pdf. For the curious among you, the CFSM "item set graph" can be examined (with a very large screen).
The formatting details of zlang.lr are specified in lga-silly-lexing.pdf as well.
In the text and slides, R-N is written once across the entire row of the LR table; the .lr format instead puts an R-N in each column of the appropriate parsing state.
Declaration Semantics
Declarations of the form
int x = y = z = 3;
declare three integer variables, x, y and z, all of them with a value of three.
int g = h = i;
declares three variables as well, but all of them uninitialized. And
int m, n, p = 2;
declares three variables as well, with p initialized to two and m and n uninitialized.
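The three cases above can be summarized in a short Python sketch. It assumes the parser has already split a declaration into comma-separated items, each holding a chain of identifiers and an optional initializer expression; the function name declared_symbols and that input shape are hypothetical, not part of the ZOBOS materials.

```python
from typing import Dict, List, Optional, Tuple

# One declaration item: the chain of identifiers joined by '=' and the
# initializer expression that ends the chain (None when there is none).
# This shape is an assumption made for illustration only.
DeclItem = Tuple[List[str], Optional[str]]


def declared_symbols(items: List[DeclItem]) -> Dict[str, bool]:
    """Map each declared name to True if it is initialized, else False."""
    result: Dict[str, bool] = {}
    for names, initializer in items:
        for name in names:
            # Every identifier in a chain shares the chain's initializer.
            result[name] = initializer is not None
    return result


# int x = y = z = 3;   one item, three names, initializer "3"
print(declared_symbols([(["x", "y", "z"], "3")]))
# int g = h = i;       one item, three names, no initializer
print(declared_symbols([(["g", "h", "i"], None)]))
# int m, n, p = 2;     three items, only the last has an initializer
print(declared_symbols([(["m"], None), (["n"], None), (["p"], "2")]))
```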
Expression Tree Semantics
The value type of an expression tree is always one of bool, float, int or string. A value type originates at the leaves or at a FUNCALL and is implied by the type of the literal value (intval, stringval, floatval), the type of the variable, or the return type of the function.
The value type may be altered in expression trees (as the calculation progresses up) by CAST operations, through int and float arithmetic combinations, or relational BOOL operators such as <= that create bool results.
UNARY operations do not change the value type.
PLUS and TIMES operations between int and float values yield float results.
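As a rough illustration of how a value type moves up an expression tree, here is a minimal Python sketch. The Node class and result_type function are hypothetical names, and two behaviors are assumptions not stated above: operands of the same type keep that type, and any other mixed combination is rejected.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str             # "LEAF", "CAST", "UNARY", "PLUS", "TIMES" or "BOOLS"
    value_type: str = ""  # LEAF: literal/variable/FUNCALL type; CAST: target type
    children: List["Node"] = field(default_factory=list)


def result_type(node: Node) -> str:
    """Return the value type produced at this node of the tree."""
    if node.kind in ("LEAF", "CAST"):
        return node.value_type           # leaves and casts state their type
    if node.kind == "UNARY":
        return result_type(node.children[0])  # UNARY never changes the type
    if node.kind == "BOOLS":
        return "bool"                    # relational operators such as <=
    if node.kind in ("PLUS", "TIMES"):
        left, right = (result_type(child) for child in node.children)
        if {left, right} == {"int", "float"}:
            return "float"               # int combined with float gives float
        if left == right:
            return left                  # assumption: same types are kept
        raise ValueError(f"unsupported operand types: {left}, {right}")
    raise ValueError(f"unknown node kind: {node.kind}")


# 3 + 2*4.5
tree = Node("PLUS", children=[
    Node("LEAF", "int"),
    Node("TIMES", children=[Node("LEAF", "int"), Node("LEAF", "float")]),
])
print(result_type(tree))
```

A real implementation would more likely report the mixed-type case through the project's own error mechanism than raise a Python exception.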
Function Semantics
In order to reduce the effort required for this project, we'll work with rather simple function semantics.
Function return types and parameters are only int, float, or bool.
C/C++-like function prototypes are permitted, but there can be only one per function in a source listing.
- Functions are not required to have a prototype (they can simply be defined), as in C/C++.
- Function names cannot be overloaded with a different argument list.
Functions use a Visual Basic-like syntax that requires a valued "returns variable" to be specified before the function statement body scope is opened. In allgood-5.src below, the returns variable is r.
allgood-5.src (available in the ZOBOS archive files)
int an_undefined_function( float t );
float aFunction( int q, bool k ) returns r = 3.14159*q
{
if ( int(k) == q ) {
r = q*q + (an_undefined_function(r));
emit symtable;
}
}
The type of a function is a combination of type names and forward slashes (the Unix path separator symbol); this is the regular expression(ish) pattern:
returnType//(param1type(/param2type(/...)))
For the aFunction above the type or signature is
float//int/bool
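A minimal Python sketch of building such a signature string follows. The helper name function_signature is hypothetical, and the result for a function with no parameters (the return type followed by a bare //) is my reading of the pattern rather than something confirmed above.

```python
from typing import List


def function_signature(return_type: str, param_types: List[str]) -> str:
    """Join the return type and parameter types into returnType//p1/p2/..."""
    # Assumption: with no parameters this yields just "returnType//".
    return return_type + "//" + "/".join(param_types)


print(function_signature("float", ["int", "bool"]))  # float//int/bool
print(function_signature("int", ["float"]))          # int//float
```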
The "location" of a function prototype, definition, or function call is the location of the function identifier (line and column); for aFunction in allgood-5.src above, that is 2 7 (line 2, column 7).
Function parameters are implicitly const and initialized --- the latter might be important if the semester's ZOBOS project has you testing for UNINIT or CONST issues.
All function names exist in the global scope (this is enforced by syntax). A function's parameters and returns identifier are in scope level 1, and the function statement body begins scope 2. For the example above, the emit symtable instruction in allgood-5.src would generate the following output (see also):
allgood-5.sym (available in the ZOBOS archive files)
0,const float//int/bool,aFunction
0,int//float,an_undefined_function
1,const bool,k
1,const int,q
1,float,r
Notice that parameters q and k must have been inserted into the symbol table before r in order for r to be valued by the expression using q (at least without unnecessary hoop jumping by your implementation).
Symbol table outputs shown in ZOBOS related pages have non-deterministic ordering within scopes, so don't be bothered that the listing above shows the prototype after the definition. This does not mean that aFunction went into the symbol table before an_undefined_function.
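To make the scope,type,name row layout concrete, here is a small Python sketch that formats symbol table entries. The function name emit_symtable, the tuple layout, and the insertion order used in the example are all assumptions; only the row format and the type strings come from the listing above.

```python
from typing import List, Tuple

Row = Tuple[int, str, str]  # (scope level, type, name)


def emit_symtable(rows: List[Row]) -> List[str]:
    """Format rows as 'scope,type,name', grouped by scope level."""
    # sorted() is stable, so entries within one scope keep their insertion
    # order; the write-up does not promise any particular order there.
    ordered = sorted(rows, key=lambda row: row[0])
    return [f"{scope},{type_name},{name}" for scope, type_name, name in ordered]


# Hypothetical insertion order: q and k go in before r, so that r can be
# valued by an expression that uses q.
rows = [
    (0, "int//float", "an_undefined_function"),
    (0, "const float//int/bool", "aFunction"),
    (1, "const int", "q"),
    (1, "const bool", "k"),
    (1, "float", "r"),
]
for line in emit_symtable(rows):
    print(line)
```

The printed rows match the listing above as a set; only the order within each scope differs, which is fine given the non-deterministic ordering.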
zlang Special Statements
There are two types of special statements in zlang: emit and rand. These are special in that they don't fit into the typical control structure or assignment pattern.
emit
emit symtable is detailed in the ZOBOS main write-up.
emit id requires id to be an arithmetic expression of type bool, int or float.
The emit id, offset, length form requires id to be a string variable and offset and length to be int values --- these might be as simple as an int variable, an int literal, or as complicated as an int valued expression with function calls.
Any identifier on the right-hand side of an emit statement should be considered USED.
In the final project, CZAR, the latter two emit forms will generate OUTPUT statements according to the usual course requirements.
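The type requirements for the two valued emit forms can be checked with something like the following Python sketch. The function name emit_is_valid and the use of plain type-name strings are assumptions for illustration; emit symtable is not covered here.

```python
from typing import Optional


def emit_is_valid(arg_type: str,
                  offset_type: Optional[str] = None,
                  length_type: Optional[str] = None) -> bool:
    """Check operand types for 'emit id' and 'emit id, offset, length'."""
    if offset_type is None and length_type is None:
        # emit id: an arithmetic expression of type bool, int or float
        return arg_type in ("bool", "int", "float")
    # emit id, offset, length: a string variable plus two int values
    return arg_type == "string" and offset_type == "int" and length_type == "int"


print(emit_is_valid("float"))                  # True
print(emit_is_valid("string", "int", "int"))   # True
print(emit_is_valid("string"))                 # False
```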
rand
rand id requires id to be a bool or float non-const variable. After the statement, the identifier should be considered valued (no longer UNINIT).
The rand id, low, high form requires id to be an int non-const variable. The low and high arguments should be int valued expressions (like offset, length for the aforementioned emit statement). The id should subsequently be considered valued, and any identifiers in the low and high expressions should be considered USED.
In the final project, CZAR, the virtual machine will place a random $Bernoulli()$ value in a bool variable, a $Uniform(0,1)$ value in a float variable, and an $Equalikely(low,high)$ value in an int variable.
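The rand rules above can be sketched in Python as a validity check plus the bookkeeping that follows a valid statement. The Symbol class, its field names, and both function names are hypothetical; your own symbol table will certainly look different.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Sequence


@dataclass
class Symbol:
    type_name: str
    is_const: bool = False
    initialized: bool = False
    used: bool = False


def rand_is_valid(target: Symbol,
                  bound_types: Optional[Sequence[str]] = None) -> bool:
    """Check 'rand id' (bound_types is None) or 'rand id, low, high'."""
    if target.is_const:
        return False                      # both forms need a non-const variable
    if bound_types is None:
        return target.type_name in ("bool", "float")
    return target.type_name == "int" and all(t == "int" for t in bound_types)


def apply_rand(target: Symbol, bound_symbols: Iterable[Symbol] = ()) -> None:
    """Record the effects of a valid rand statement."""
    target.initialized = True             # id is valued (no longer UNINIT)
    for symbol in bound_symbols:
        symbol.used = True                # identifiers in low/high are USED


# rand n, lo, 10;   with int variables n and lo
n, lo = Symbol("int"), Symbol("int", initialized=True)
if rand_is_valid(n, ["int", "int"]):
    apply_rand(n, [lo])
print(n.initialized, lo.used)
```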
domain
Handling expressions containing the domain operator (which syntactically looks like a function call), eg: 3 + domain(2*4.5) is detailed in the ZOBOS main write-up.
zlang Quirks
While the project's language has many similarities to C, there are some quirky syntax-related nuances (1). I don't believe any of these would prevent you from scanning a source listing and understanding the intent of the code, but if you want to test or experiment with your own .src files, you will want to know about these.
The language has some strict type rules, so explicit casts are required in many places (just look at the CONV rules, if conversion semantic tests are part of this semester's project). Boolean comparisons (BOOLS) require arithmetic expressions on either side, so to test equality between bool values you must do something like:
if( int(bool_var)==0 ) {...}
if( int(bool_var)==int(true) ) {...}
- You can't disregard the return value of a function (and we don't have a "void" type), so you must do something like:
silly_language = f();
instead of just f();
- To incorporate a function call into an arithmetic expression, you must wrap it in parentheses:
float x = 3.14 + (sin(4));
To emit an expression or a non-string variable value, you must enclose it in parentheses:
emit (4+3*v);
emit ((fibonnaci(8)));
In the second emit statement, the inner set of () around the function call comes from the previous nuance, as in 3.14 + (sin(4)); the outermost () is there because the arithmetic expression argument of emit must be enclosed in ().
It would be nice to have a VALUE → FUNCALL rule in the language, but this generates SLR table conflicts.
(1) Why? In part to keep the language SDT straightforward and to maintain an SLR language without table conflicts.