LRSTAR - Parser Generator for C++ A.M.D.G.

DFA vs flex

DFA has been creating compressed-matrix lexers since 1987. These are TWO TIMES the speed of "flex"-created lexers and about the same size. Now, 33 years later, people are still recommending "flex" for speed. I guess people stop learning after they are taught ancient UNIX stuff at the university.

DFA vs re2c

The fastest lexers I have ever seen are created by "re2c". They are up to 10% faster than DFA lexers. Once you add parsing and symbol-table creation, a 10% increase in lexer speed is not very noticeable. The big disadvantage of "re2c" is the compile time for the lexer, which can be a bit of a pain for languages with many keywords. In these cases, DFA has shorter compile times and smaller lexers.

Comments

DFA creates lexers that read comments (/* Created by John Doe */) at the same time as reading everything else. No separate code is needed to handle comments.
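To illustrate the idea, here is a minimal hand-written sketch of a single automaton that reads comments alongside identifiers, numbers and whitespace. It is not the code DFA generates: the state names, the `Tok` kinds and the `lex` function are all assumptions made up for this example, and a generated lexer would use a compressed table rather than a `switch`.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Token kinds for this sketch (hypothetical names, not LRSTAR's).
enum class Tok { Ident, Number, Comment, Slash, Space, Error };

struct Token { Tok kind; std::string text; };

// DFA states. The comment states live in the same automaton as the rest.
enum State { S0, IDENT, NUM, SPACE, SLASH, CMT, CMT_STAR, CMT_END, DEAD };

// One transition function for the whole lexer.
static State step(State s, unsigned char c) {
    switch (s) {
    case S0:
        if (std::isalpha(c) || c == '_') return IDENT;
        if (std::isdigit(c)) return NUM;
        if (std::isspace(c)) return SPACE;
        if (c == '/') return SLASH;
        return DEAD;
    case IDENT:    return (std::isalnum(c) || c == '_') ? IDENT : DEAD;
    case NUM:      return std::isdigit(c) ? NUM : DEAD;
    case SPACE:    return std::isspace(c) ? SPACE : DEAD;
    case SLASH:    return c == '*' ? CMT : DEAD;       // "/*" opens a comment
    case CMT:      return c == '*' ? CMT_STAR : CMT;   // inside the comment
    case CMT_STAR: return c == '/' ? CMT_END : (c == '*' ? CMT_STAR : CMT);
    default:       return DEAD;  // CMT_END and DEAD have no outgoing edges
    }
}

// Maps an accepting state to the token kind it yields.
static bool accepting(State s, Tok& kind) {
    switch (s) {
    case IDENT:   kind = Tok::Ident;   return true;
    case NUM:     kind = Tok::Number;  return true;
    case SPACE:   kind = Tok::Space;   return true;
    case SLASH:   kind = Tok::Slash;   return true;
    case CMT_END: kind = Tok::Comment; return true;
    default:      return false;
    }
}

// Longest-match scan: run the automaton and remember the last accepting point.
std::vector<Token> lex(const std::string& src) {
    std::vector<Token> out;
    std::size_t pos = 0;
    while (pos < src.size()) {
        State s = S0;
        std::size_t i = pos, lastEnd = pos;
        Tok lastKind = Tok::Error, k = Tok::Error;
        while (i < src.size() &&
               (s = step(s, static_cast<unsigned char>(src[i]))) != DEAD) {
            ++i;
            if (accepting(s, k)) { lastEnd = i; lastKind = k; }
        }
        if (lastEnd == pos) lastEnd = pos + 1;  // no match: one-char Error token
        assert(lastEnd > pos);                  // the scan always makes progress
        out.push_back({lastKind, src.substr(pos, lastEnd - pos)});
        pos = lastEnd;
    }
    return out;
}
```

Calling `lex("x /* Created by John Doe */ 42")` yields five tokens: an identifier, a space, the whole comment as one `Comment` token, a space, and a number. The comment is recognized by the same loop and the same transition function as everything else.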

Keyword Recognition

DFA creates lexers that read keywords (int, char, float) at the same time as reading <identifier>s (x, n_of_files, return_code). There is no faster way to deal with keywords. No extra time is required for the keywords.
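A rough sketch of how keywords and identifiers can share one scan: the keyword spellings become paths through the automaton, and any identifier character that leaves a keyword path falls into a generic identifier state. When the scan stops, the final state already says whether the token is a keyword, so no lookup is needed afterwards. The `KeywordDfa` class, its members and the uncompressed 128-column table are assumptions for illustration only, not the tables LRSTAR or DFA actually emit.

```cpp
#include <array>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

constexpr int kIdent = 0;      // generic "plain identifier" state
constexpr int kStart = 1;      // start state
constexpr int kFirstTrie = 2;  // states >= 2 lie on a keyword path

class KeywordDfa {
public:
    explicit KeywordDfa(const std::vector<std::string>& keywords) {
        addState(false);  // state 0: kIdent
        addState(true);   // state 1: kStart
        for (std::size_t k = 0; k < keywords.size(); ++k) {
            int s = kStart;
            for (unsigned char c : keywords[k]) {
                assert(c < 128 && isIdentChar(c));  // sketch: ASCII keywords only
                int n = next_[s][c];
                if (n < kFirstTrie) {               // no keyword path here yet
                    n = addState(false);
                    next_[s][c] = n;
                }
                s = n;
            }
            accept_[s] = static_cast<int>(k);       // this state ends keyword k
        }
    }

    // Scans one identifier-like token starting at pos.
    // Returns the keyword index, or -1 for a plain identifier.
    // len receives the token length (0 if no identifier starts at pos).
    int scan(const std::string& src, std::size_t pos, std::size_t& len) const {
        int s = kStart;
        std::size_t i = pos;
        while (i < src.size()) {
            unsigned char c = static_cast<unsigned char>(src[i]);
            int n = c < 128 ? next_[s][c] : -1;
            if (n < 0) break;                       // character ends the token
            s = n;
            ++i;
        }
        len = i - pos;
        return accept_[s];
    }

private:
    static bool isIdentChar(int c) { return std::isalnum(c) || c == '_'; }

    // New state whose identifier characters all lead to kIdent by default.
    int addState(bool isStart) {
        std::array<int, 128> row;
        for (int c = 0; c < 128; ++c) {
            bool ok = isIdentChar(c) && !(isStart && std::isdigit(c));
            row[c] = ok ? kIdent : -1;
        }
        next_.push_back(row);
        accept_.push_back(-1);
        return static_cast<int>(next_.size()) - 1;
    }

    std::vector<std::array<int, 128>> next_;  // transition table, one row per state
    std::vector<int> accept_;                 // per state: keyword index or -1
};
```

With the keywords `int`, `char` and `float`, scanning `int x` from position 0 returns keyword index 0 with length 3, while scanning `integer` returns -1 with length 7, because the `e` after `int` leaves the keyword path and drops into the identifier state. Both cases cost exactly one table lookup per character.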

Keyword List

Don't make a keyword list. Your keywords are already in the syntactical grammar. The keyword list is generated automatically by LRSTAR and put in the .lex file. Do you really want to manually list the 550 keywords of the DB2 language?

Matching Parentheses

DFA lexers cannot handle matching parentheses (x(8),y(9)). A PDA (push-down automaton) is needed to handle this. A PDA is FIVE TIMES slower than a DFA. I tested it. 99% of the time, the parser should be handling matching parentheses. So, put that in the syntactical grammar, instead.
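The reason is that a finite automaton has a fixed number of states, so it cannot count an arbitrary nesting depth. The small sketch below shows the extra memory that matching needs. With a single kind of bracket, the push-down stack shrinks to a depth counter. The function name `parensMatch` is made up for this example and is not part of LRSTAR; in practice the parser's own stack does this work when the parentheses appear in the grammar rules.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns true if every '(' in the text is closed by a later ')'.
// The depth counter stands in for the stack of a push-down automaton.
bool parensMatch(const std::string& text) {
    std::size_t depth = 0;  // unbounded memory: this is what a DFA lacks
    for (char c : text) {
        if (c == '(') {
            ++depth;                        // "push"
        } else if (c == ')') {
            if (depth == 0) return false;   // ')' with nothing open
            assert(depth > 0);
            --depth;                        // "pop"
        }
    }
    return depth == 0;                      // nothing left open at the end
}
```

For example, `parensMatch("x(8),y(9)")` returns true, while `parensMatch("(()")` and `parensMatch(")(")` return false.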

(c) Copyright Paul B Mann 2023.  All rights reserved.