![]() |
LRSTAR: LR(*) parser generator for C++ | A.M.D.G. |
About
Feedback
Downloads, Setup LRSTAR DFA Release Notes Contact |
LRSTAR Options Type "lrstar" and you get this: LRSTAR 20.0.001 32b Copyright LRTEC. | | LR(*) PARSER GENERATOR | | lrstar <grammar> [/<option>...] | | OPTION DEFAULT DESCRIPTION | ast 1 AST construction in parser | d 0 Debug option for the parser | k 1 k-lookaheads for LR(*) parsing | m 0 Minimize parser-table size | o 0 Optimize parser speed | st 0 State-machine listing | v 2 Verbose mode (1,2,3) | w 0 Print warnings on screen |_ LRSTAR Grammars LRSTAR reads a grammar. A grammar is a "list of rules" specifying (1) the pattern of an input file, or (2) the syntax of a language. LRSTAR reads pure grammar notation, mostly rules, action operators and function names. By using AST operators, LRSTAR parsers can construct an abstract-syntax tree (AST). Simple, concise and powerful. Lexical Tokens
<identifier>,
<integer>,
<string>,
<eof>. <error> => error(); <identifier> => lookup(); // Call symbol-table function. <integer> => lookup(); // Call symbol-table function. { '+' '-' } << // Left associative, lowest priority. { '*' '/' } << // Left associative, highest priority. Start -> Program+ <eof> *> goal_ Program -> 'program' <identifier> '{' Stmt* '}' *> program_(2) Stmt -> Target '=' Exp ';' *> store_~ Target -> <identifier> *> target_(1) Exp -> <integer> *> integer_(1) -> <identifier> *> identifier_(1) -> Exp '+' Exp *> add_ -> Exp '-' Exp *> sub_ -> Exp '*' Exp *> mul_ -> Exp '/' Exp *> div_ -> '(' Exp ')' Lexical Tokens As Defined Constants
IDENTIFIER,
INTEGER,
STRING,
EOF. However, you may have a YACC grammar which uses defined constants in the rules. In this case, you will need to put the defined constants at the top of your grammar (easier than changing the rules). These defined constants will also be available to use in your source code. Here is an example: ERROR <error> => error(); IDENTIFIER <identifier> => lookup(); // Call symbol table function. INTEGER <integer> => lookup(); // Call symbol table function. EOF <eof>; PLUS '+'; MINUS '-'; MUL '*'; DIV '/'; LBRACE '{'; RBRACE '}'; EQUALS '='; SEMICOLON ';'; LPAREN '('; RPAREN ')'; PROGRAM 'program'; Parser-Defined Terminals
{typename},
{headsymbol},
{function_name},
{array_name}. <identifier> => lookup(); // Call symbol-table function. Goal : Declaration* <eof> ; Declaration : TypeSpec+ Identifier ';' | typedef TypeSpec+ Typename ';' ; Identifier : <identifier> ; Typename : <identifier> => typedef_(1,{typename}) ; TypeSpec : char | int | short | {typename} ; Two Special Symbols
<eof> End-of-file symbol.
<eof> Start -> CompilationUnit <eof>
<error> EBNF Notation
Basic operators: Extended operators: Here are some useful examples: Start -> ExternalDef* <eof> // Zero or more ExternalDef PrimaryExp -> <string>+ // One or more <string> -> <identifier> Default -> (default ':' Stmt*)? // Optional group PostfixExp -> PrimaryExp -> {function} '(' [Args] ')' // Optional Args Args -> <identifier>/','+ // One or more separated by ','s Five Action Operators
=> Call a Token Action or a Parse Action. Here's where they fit in to the parser system:
(1) Get token => Call a Token Action Here is a simple grammar, with action operators in pink: <error> => error(); <identifier> => lookup(); // Call symbol table function. <integer> => lookup(); // Call symbol table function. { '+' '-' } << // Left associative, lowest priority. { '*' '/' } << // Left associative, highest priority. Start -> Program+ <eof> *> goal_ Program -> 'program' <identifier> '{' Stmt* '}' *> program_(2) Stmt -> Target '=' Exp ';' *> store_~ Target -> <identifier> *> target_(1) Exp -> <integer> *> integer_(1) -> <identifier> *> identifier_(1) -> Exp '+' Exp *> add_ -> Exp '-' Exp *> sub_ -> Exp '*' Exp *> mul_ -> Exp '/' Exp *> div_ -> '(' Exp ')' Reverse Operator ~ Reverse the child nodes. In the AST, some subtrees work much better when the order is "reversed" from the original input order. In the following case, we want to evaluate the Exp before we store the result in the Target, so we use the ~ after the Node Action name. See the Calc project. Stmt -> Target '=' Exp ';' *> store_~ Three Argument Arrays
arga[], argx[], argy[]. Token Actions: arga[] Grammar: <alpha> => headsym ({headsymbol}); Function: int LRSTAR_TokenActions::headsym (int& t) { int original_t = t; char* p = token.start-1; while (*p == ' ' || *p == '\t') p--; if (*p == '\n') // First symbol on this line? { t = arga[t]; // Set to {headsymbol} } return lookup (original_t); } Parse Actions: 1st argument = argx[], 2nd argument = argy[] Grammar: TypedefIdentifier -> <identifier> => typedef_ (1,{typename}) Function: int Typedef_ParseActions::typedef_ (int p) { int x = argx[p]; int sti = PS[x].sti; // Get sti from parse stack. symbol[sti].term = argy[p]; // Set term to {typename}. return 1; } |
(c) Copyright LRTEC 2020. All rights reserved. |