Update to ply 2.3
ext/ply/ply/lex.py: ext/ply/ply/yacc.py: ext/ply/CHANGES: ext/ply/README: ext/ply/TODO: ext/ply/doc/ply.html: ext/ply/example/ansic/clex.py: ext/ply/example/ansic/cparse.py: ext/ply/example/calc/calc.py: ext/ply/example/hedit/hedit.py: ext/ply/example/optcalc/calc.py: ext/ply/test/README: ext/ply/test/calclex.py: ext/ply/test/lex_doc1.exp: ext/ply/test/lex_doc1.py: ext/ply/test/lex_dup1.exp: ext/ply/test/lex_dup1.py: ext/ply/test/lex_dup2.exp: ext/ply/test/lex_dup2.py: ext/ply/test/lex_dup3.exp: ext/ply/test/lex_dup3.py: ext/ply/test/lex_empty.py: ext/ply/test/lex_error1.py: ext/ply/test/lex_error2.py: ext/ply/test/lex_error3.exp: ext/ply/test/lex_error3.py: ext/ply/test/lex_error4.exp: ext/ply/test/lex_error4.py: ext/ply/test/lex_hedit.exp: ext/ply/test/lex_hedit.py: ext/ply/test/lex_ignore.exp: ext/ply/test/lex_ignore.py: ext/ply/test/lex_re1.exp: ext/ply/test/lex_re1.py: ext/ply/test/lex_rule1.py: ext/ply/test/lex_token1.py: ext/ply/test/lex_token2.py: ext/ply/test/lex_token3.py: ext/ply/test/lex_token4.py: ext/ply/test/lex_token5.exp: ext/ply/test/lex_token5.py: ext/ply/test/yacc_badargs.exp: ext/ply/test/yacc_badargs.py: ext/ply/test/yacc_badprec.exp: ext/ply/test/yacc_badprec.py: ext/ply/test/yacc_badprec2.exp: ext/ply/test/yacc_badprec2.py: ext/ply/test/yacc_badrule.exp: ext/ply/test/yacc_badrule.py: ext/ply/test/yacc_badtok.exp: ext/ply/test/yacc_badtok.py: ext/ply/test/yacc_dup.exp: ext/ply/test/yacc_dup.py: ext/ply/test/yacc_error1.exp: ext/ply/test/yacc_error1.py: ext/ply/test/yacc_error2.exp: ext/ply/test/yacc_error2.py: ext/ply/test/yacc_error3.exp: ext/ply/test/yacc_error3.py: ext/ply/test/yacc_inf.exp: ext/ply/test/yacc_inf.py: ext/ply/test/yacc_missing1.exp: ext/ply/test/yacc_missing1.py: ext/ply/test/yacc_nodoc.exp: ext/ply/test/yacc_nodoc.py: ext/ply/test/yacc_noerror.exp: ext/ply/test/yacc_noerror.py: ext/ply/test/yacc_nop.exp: ext/ply/test/yacc_nop.py: ext/ply/test/yacc_notfunc.exp: ext/ply/test/yacc_notfunc.py: ext/ply/test/yacc_notok.exp: 
ext/ply/test/yacc_notok.py: ext/ply/test/yacc_rr.exp: ext/ply/test/yacc_rr.py: ext/ply/test/yacc_simple.exp: ext/ply/test/yacc_simple.py: ext/ply/test/yacc_sr.exp: ext/ply/test/yacc_sr.py: ext/ply/test/yacc_term1.exp: ext/ply/test/yacc_term1.py: ext/ply/test/yacc_unused.exp: ext/ply/test/yacc_unused.py: ext/ply/test/yacc_uprec.exp: ext/ply/test/yacc_uprec.py: Import patch ply.diff src/arch/isa_parser.py: everything is now within the ply package --HG-- rename : ext/ply/lex.py => ext/ply/ply/lex.py rename : ext/ply/yacc.py => ext/ply/ply/yacc.py extra : convert_revision : fca8deabd5c095bdeabd52a1f236ae1404ef106e
parent 9f1c104ccd
commit 44ebb8d3e2
145 changed files with 7739 additions and 1534 deletions
48
ext/ply/ANNOUNCE
Normal file
|
@ -0,0 +1,48 @@
|
||||||
|
February 19, 2007
|
||||||
|
|
||||||
|
Announcing : PLY-2.3 (Python Lex-Yacc)
|
||||||
|
|
||||||
|
http://www.dabeaz.com/ply
|
||||||
|
|
||||||
|
I'm pleased to announce a significant new update to PLY---a 100% Python
|
||||||
|
implementation of the common parsing tools lex and yacc. PLY-2.3 is
|
||||||
|
a minor bug fix release, but also features improved performance.
|
||||||
|
|
||||||
|
If you are new to PLY, here are a few highlights:
|
||||||
|
|
||||||
|
- PLY is closely modeled after traditional lex/yacc. If you know how
|
||||||
|
to use these or similar tools in other languages, you will find
|
||||||
|
PLY to be comparable.
|
||||||
|
|
||||||
|
- PLY provides very extensive error reporting and diagnostic
|
||||||
|
information to assist in parser construction. The original
|
||||||
|
implementation was developed for instructional purposes. As
|
||||||
|
a result, the system tries to identify the most common types
|
||||||
|
of errors made by novice users.
|
||||||
|
|
||||||
|
- PLY provides full support for empty productions, error recovery,
|
||||||
|
precedence rules, and ambiguous grammars.
|
||||||
|
|
||||||
|
- Parsing is based on LR-parsing which is fast, memory efficient,
|
||||||
|
better suited to large grammars, and which has a number of nice
|
||||||
|
properties when dealing with syntax errors and other parsing
|
||||||
|
problems. Currently, PLY can build its parsing tables using
|
||||||
|
either SLR or LALR(1) algorithms.
|
||||||
|
|
||||||
|
- PLY can be used to build parsers for large programming languages.
|
||||||
|
Although it is not ultra-fast due to its Python implementation,
|
||||||
|
PLY can be used to parse grammars consisting of several hundred
|
||||||
|
rules (as might be found for a language like C). The lexer and LR
|
||||||
|
parser are also reasonably efficient when parsing normal
|
||||||
|
sized programs.
|
||||||
|
|
||||||
|
More information about PLY can be obtained on the PLY webpage at:
|
||||||
|
|
||||||
|
http://www.dabeaz.com/ply
|
||||||
|
|
||||||
|
PLY is freely available and is licensed under the terms of the GNU
|
||||||
|
Lesser General Public License (LGPL).
|
||||||
|
|
||||||
|
Cheers,
|
||||||
|
|
||||||
|
David Beazley (http://www.dabeaz.com)
|
579
ext/ply/CHANGES
|
@ -1,3 +1,582 @@
|
||||||
|
Version 2.3
|
||||||
|
-----------------------------
|
||||||
|
02/20/07: beazley
|
||||||
|
Fixed a bug with character literals if the literal '.' appeared as the
|
||||||
|
last symbol of a grammar rule. Reported by Ales Smrcka.
|
||||||
|
|
||||||
|
02/19/07: beazley
|
||||||
|
Warning messages are now redirected to stderr instead of being printed
|
||||||
|
to standard output.
|
||||||
|
|
||||||
|
02/19/07: beazley
|
||||||
|
Added a warning message to lex.py if it detects a literal backslash
|
||||||
|
character inside the t_ignore declaration. This is to help catch
|
||||||
|
problems that might occur if someone accidentally defines t_ignore
|
||||||
|
as a Python raw string. For example:
|
||||||
|
|
||||||
|
t_ignore = r' \t'
|
||||||
|
|
||||||
|
The idea for this is from an email I received from David Cimimi who
|
||||||
|
reported bizarre behavior in lexing as a result of defining t_ignore
|
||||||
|
as a raw string by accident.
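To see why a raw string causes trouble here, it helps to compare the characters each spelling actually contains. The following is a minimal sketch using plain Python strings only; no PLY calls are made and the variable names are illustrative.

```python
# Illustrative only: compare a normal string with a raw string.
normal_ignore = ' \t'    # space and a real TAB character
raw_ignore = r' \t'      # space, backslash, and the letter 't'

print(len(normal_ignore), [c for c in normal_ignore])
print(len(raw_ignore), [c for c in raw_ignore])

# With the raw string, a real TAB is not among the characters,
# while a literal backslash and the letter 't' are.
assert '\t' in normal_ignore
assert '\t' not in raw_ignore
assert '\\' in raw_ignore and 't' in raw_ignore
```

Because t_ignore is treated as a set of characters to skip, the raw-string form would skip the letter 't' and stop skipping tabs, which matches the kind of bizarre behavior described above.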
|
||||||
|
|
||||||
|
02/18/07: beazley
|
||||||
|
Performance improvements. Made some changes to the internal
|
||||||
|
table organization and LR parser to improve parsing performance.
|
||||||
|
|
||||||
|
02/18/07: beazley
|
||||||
|
Automatic tracking of line number and position information must now be
|
||||||
|
enabled by a special flag to parse(). For example:
|
||||||
|
|
||||||
|
yacc.parse(data,tracking=True)
|
||||||
|
|
||||||
|
In many applications, it's just not that important to have the
|
||||||
|
parser automatically track all line numbers. By making this an
|
||||||
|
optional feature, it allows the parser to run significantly faster
|
||||||
|
(more than a 20% speed increase in many cases). Note: positional
|
||||||
|
information is always available for raw tokens---this change only
|
||||||
|
applies to positional information associated with nonterminal
|
||||||
|
grammar symbols.
|
||||||
|
*** POTENTIAL INCOMPATIBILITY ***
|
||||||
|
|
||||||
|
02/18/07: beazley
|
||||||
|
Yacc no longer supports extended slices of grammar productions.
|
||||||
|
However, it does support regular slices. For example:
|
||||||
|
|
||||||
|
def p_foo(p):
|
||||||
|
'''foo : a b c d e'''
|
||||||
|
p[0] = p[1:3]
|
||||||
|
|
||||||
|
This change is a performance improvement to the parser--it streamlines
|
||||||
|
normal access to the grammar values since slices are now handled in
|
||||||
|
a __getslice__() method as opposed to __getitem__().
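The difference between a regular slice and an extended slice can be sketched as follows. This is only a model of the distinction, not yacc.py's code: the class name `Values` is hypothetical, and in current Python 3 there is no __getslice__(), so every slice reaches __getitem__() as a slice object.

```python
class Values:
    """Toy container that accepts p[i] and p[i:j] but rejects p[i:j:k]."""

    def __init__(self, items):
        self.items = list(items)

    def __getitem__(self, n):
        # An extended slice is one that carries a step value.
        if isinstance(n, slice) and n.step is not None:
            raise TypeError("extended slices are not supported")
        return self.items[n]


p = Values(['foo', 'a', 'b', 'c', 'd', 'e'])
print(p[1:3])          # regular slice
try:
    p[1:6:2]           # extended slice (has a step)
except TypeError as exc:
    print("rejected:", exc)
```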
|
||||||
|
|
||||||
|
02/12/07: beazley
|
||||||
|
Fixed a bug in the handling of token names when combined with
|
||||||
|
start conditions. Bug reported by Todd O'Bryan.
|
||||||
|
|
||||||
|
Version 2.2
|
||||||
|
------------------------------
|
||||||
|
11/01/06: beazley
|
||||||
|
Added lexpos() and lexspan() methods to grammar symbols. These
|
||||||
|
mirror the same functionality of lineno() and linespan(). For
|
||||||
|
example:
|
||||||
|
|
||||||
|
def p_expr(p):
|
||||||
|
'expr : expr PLUS expr'
|
||||||
|
p.lexpos(1) # Lexing position of left-hand-expression
|
||||||
|
p.lexpos(2) # Lexing position of PLUS
|
||||||
|
start,end = p.lexspan(3) # Lexing range of right hand expression
|
||||||
|
|
||||||
|
11/01/06: beazley
|
||||||
|
Minor change to error handling. The recommended way to skip characters
|
||||||
|
in the input is to use t.lexer.skip() as shown here:
|
||||||
|
|
||||||
|
def t_error(t):
|
||||||
|
print "Illegal character '%s'" % t.value[0]
|
||||||
|
t.lexer.skip(1)
|
||||||
|
|
||||||
|
The old approach of just using t.skip(1) will still work, but won't
|
||||||
|
be documented.
|
||||||
|
|
||||||
|
10/31/06: beazley
|
||||||
|
Discarded tokens can now be specified as simple strings instead of
|
||||||
|
functions. To do this, simply include the text "ignore_" in the
|
||||||
|
token declaration. For example:
|
||||||
|
|
||||||
|
t_ignore_cppcomment = r'//.*'
|
||||||
|
|
||||||
|
Previously, this had to be done with a function. For example:
|
||||||
|
|
||||||
|
def t_ignore_cppcomment(t):
|
||||||
|
r'//.*'
|
||||||
|
pass
|
||||||
|
|
||||||
|
If start conditions/states are being used, state names should appear
|
||||||
|
before the "ignore_" text.
|
||||||
|
|
||||||
|
10/19/06: beazley
|
||||||
|
The Lex module now provides support for flex-style start conditions
|
||||||
|
as described at http://www.gnu.org/software/flex/manual/html_chapter/flex_11.html.
|
||||||
|
Please refer to this document to understand this change note. Refer to
|
||||||
|
the PLY documentation for PLY-specific explanation of how this works.
|
||||||
|
|
||||||
|
To use start conditions, you first need to declare a set of states in
|
||||||
|
your lexer file:
|
||||||
|
|
||||||
|
states = (
|
||||||
|
('foo','exclusive'),
|
||||||
|
('bar','inclusive')
|
||||||
|
)
|
||||||
|
|
||||||
|
This serves the same role as the %s and %x specifiers in flex.
|
||||||
|
|
||||||
|
Once a state has been declared, tokens for that state can be
|
||||||
|
declared by defining rules of the form t_state_TOK. For example:
|
||||||
|
|
||||||
|
t_PLUS = r'\+' # Rule defined in INITIAL state
|
||||||
|
t_foo_NUM = r'\d+' # Rule defined in foo state
|
||||||
|
t_bar_NUM = r'\d+' # Rule defined in bar state
|
||||||
|
|
||||||
|
t_foo_bar_NUM = r'\d+' # Rule defined in both foo and bar
|
||||||
|
t_ANY_NUM = r'\d+' # Rule defined in all states
|
||||||
|
|
||||||
|
In addition to defining tokens for each state, the t_ignore and t_error
|
||||||
|
specifications can be customized for specific states. For example:
|
||||||
|
|
||||||
|
t_foo_ignore = " " # Ignored characters for foo state
|
||||||
|
def t_bar_error(t):
|
||||||
|
# Handle errors in bar state
|
||||||
|
|
||||||
|
Within token rules, the following methods can be used to change states:
|
||||||
|
|
||||||
|
def t_TOKNAME(t):
|
||||||
|
t.lexer.begin('foo') # Begin state 'foo'
|
||||||
|
t.lexer.push_state('foo') # Begin state 'foo', push old state
|
||||||
|
# onto a stack
|
||||||
|
t.lexer.pop_state() # Restore previous state
|
||||||
|
t.lexer.current_state() # Returns name of current state
|
||||||
|
|
||||||
|
These methods mirror the BEGIN(), yy_push_state(), yy_pop_state(), and
|
||||||
|
yy_top_state() functions in flex.
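The stack behavior of these methods can be modeled in a few lines. This is a toy sketch, not PLY's Lexer class: `StateStack` is a hypothetical name, and only the state bookkeeping is shown.

```python
class StateStack:
    """Toy model of a lexer's start-condition stack (not PLY's Lexer class)."""

    def __init__(self):
        self.state = 'INITIAL'   # current state name
        self._stack = []         # previously active states

    def begin(self, state):
        # Switch state without remembering the old one.
        self.state = state

    def push_state(self, state):
        # Remember the current state, then switch.
        self._stack.append(self.state)
        self.begin(state)

    def pop_state(self):
        # Return to the most recently pushed state.
        self.begin(self._stack.pop())

    def current_state(self):
        return self.state


lexer = StateStack()
lexer.push_state('foo')
lexer.push_state('bar')
lexer.pop_state()
print(lexer.current_state())   # foo
lexer.pop_state()
print(lexer.current_state())   # INITIAL
```

Note that begin() discards the old state, so it suits a one-way switch, while push_state() and pop_state() suit nested constructs that must return to whatever state was active before.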
|
||||||
|
|
||||||
|
Start states can be used as one way to write sub-lexers.
|
||||||
|
For example, the lexer or parser might instruct the lexer to start
|
||||||
|
generating a different set of tokens depending on the context.
|
||||||
|
|
||||||
|
example/yply/ylex.py shows the use of start states to grab C/C++
|
||||||
|
code fragments out of traditional yacc specification files.
|
||||||
|
|
||||||
|
*** NEW FEATURE *** Suggested by Daniel Larraz with whom I also
|
||||||
|
discussed various aspects of the design.
|
||||||
|
|
||||||
|
10/19/06: beazley
|
||||||
|
Minor change to the way in which yacc.py was reporting shift/reduce
|
||||||
|
conflicts. Although the underlying LALR(1) algorithm was correct,
|
||||||
|
PLY was under-reporting the number of conflicts compared to yacc/bison
|
||||||
|
when precedence rules were in effect. This change should make PLY
|
||||||
|
report the same number of conflicts as yacc.
|
||||||
|
|
||||||
|
10/19/06: beazley
|
||||||
|
Modified yacc so that grammar rules could also include the '-'
|
||||||
|
character. For example:
|
||||||
|
|
||||||
|
def p_expr_list(p):
|
||||||
|
'expression-list : expression-list expression'
|
||||||
|
|
||||||
|
Suggested by Oldrich Jedlicka.
|
||||||
|
|
||||||
|
10/18/06: beazley
|
||||||
|
Attribute lexer.lexmatch added so that token rules can access the re
|
||||||
|
match object that was generated. For example:
|
||||||
|
|
||||||
|
def t_FOO(t):
|
||||||
|
r'some regex'
|
||||||
|
m = t.lexer.lexmatch
|
||||||
|
# Do something with m
|
||||||
|
|
||||||
|
|
||||||
|
This may be useful if you want to access named groups specified within
|
||||||
|
the regex for a specific token. Suggested by Oldrich Jedlicka.
|
||||||
|
|
||||||
|
10/16/06: beazley
|
||||||
|
Changed the error message that results if an illegal character
|
||||||
|
is encountered and no default error function is defined in lex.
|
||||||
|
The exception is now more informative about the actual cause of
|
||||||
|
the error.
|
||||||
|
|
||||||
|
Version 2.1
|
||||||
|
------------------------------
|
||||||
|
10/02/06: beazley
|
||||||
|
The last Lexer object built by lex() can be found in lex.lexer.
|
||||||
|
The last Parser object built by yacc() can be found in yacc.parser.
|
||||||
|
|
||||||
|
10/02/06: beazley
|
||||||
|
New example added: examples/yply
|
||||||
|
|
||||||
|
This example uses PLY to convert Unix-yacc specification files to
|
||||||
|
PLY programs with the same grammar. This may be useful if you
|
||||||
|
want to convert a grammar from bison/yacc to use with PLY.
|
||||||
|
|
||||||
|
10/02/06: beazley
|
||||||
|
Added support for a start symbol to be specified in the yacc
|
||||||
|
input file itself. Just do this:
|
||||||
|
|
||||||
|
start = 'name'
|
||||||
|
|
||||||
|
where 'name' matches some grammar rule. For example:
|
||||||
|
|
||||||
|
def p_name(p):
|
||||||
|
'name : A B C'
|
||||||
|
...
|
||||||
|
|
||||||
|
This mirrors the functionality of the yacc %start specifier.
|
||||||
|
|
||||||
|
09/30/06: beazley
|
||||||
|
Some new examples added:
|
||||||
|
|
||||||
|
examples/GardenSnake : A simple indentation based language similar
|
||||||
|
to Python. Shows how you might handle
|
||||||
|
whitespace. Contributed by Andrew Dalke.
|
||||||
|
|
||||||
|
examples/BASIC : An implementation of 1964 Dartmouth BASIC.
|
||||||
|
Contributed by Dave against his better
|
||||||
|
judgement.
|
||||||
|
|
||||||
|
09/28/06: beazley
|
||||||
|
Minor patch to allow named groups to be used in lex regular
|
||||||
|
expression rules. For example:
|
||||||
|
|
||||||
|
t_QSTRING = r'''(?P<quote>['"]).*?(?P=quote)'''
|
||||||
|
|
||||||
|
Patch submitted by Adam Ring.
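The pattern above can be tried outside of lex with the standard re module. A small sketch, with a made-up input string:

```python
import re

# The pattern from the change note, compiled directly with the re module.
qstring = re.compile(r'''(?P<quote>['"]).*?(?P=quote)''')

m = qstring.match('''"it's quoted" trailing''')
print(m.group(0))        # the whole quoted string
print(m.group('quote'))  # which quote character opened it
```

The backreference (?P=quote) forces the closing quote to be the same character as the opening one, which is why the apostrophe inside the string does not end the match.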
|
||||||
|
|
||||||
|
09/28/06: beazley
|
||||||
|
LALR(1) is now the default parsing method. To use SLR, use
|
||||||
|
yacc.yacc(method="SLR"). Note: there is no performance impact
|
||||||
|
on parsing when using LALR(1) instead of SLR. However, constructing
|
||||||
|
the parsing tables will take a little longer.
|
||||||
|
|
||||||
|
09/26/06: beazley
|
||||||
|
Change to line number tracking. To modify line numbers, modify
|
||||||
|
the line number of the lexer itself. For example:
|
||||||
|
|
||||||
|
def t_NEWLINE(t):
|
||||||
|
r'\n'
|
||||||
|
t.lexer.lineno += 1
|
||||||
|
|
||||||
|
This modification is both cleanup and a performance optimization.
|
||||||
|
In past versions, lex was monitoring every token for changes in
|
||||||
|
the line number. This extra processing is unnecessary for a vast
|
||||||
|
majority of tokens. Thus, this new approach cleans it up a bit.
|
||||||
|
|
||||||
|
*** POTENTIAL INCOMPATIBILITY ***
|
||||||
|
You will need to change code in your lexer that updates the line
|
||||||
|
number. For example, "t.lineno += 1" becomes "t.lexer.lineno += 1"
|
||||||
|
|
||||||
|
09/26/06: beazley
|
||||||
|
Added the lexing position to tokens as an attribute lexpos. This
|
||||||
|
is the raw index into the input text at which a token appears.
|
||||||
|
This information can be used to compute column numbers and other
|
||||||
|
details (e.g., scan backwards from lexpos to the first newline
|
||||||
|
to get a column position).
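The backwards scan can be sketched as a small helper. The function name `find_column` and the sample input are illustrative; the helper only needs the input text and a token's lexpos value.

```python
def find_column(text, lexpos):
    """Return the 1-based column of the character at index lexpos."""
    # rfind returns -1 when there is no earlier newline, which makes
    # the first line work without a special case.
    last_newline = text.rfind('\n', 0, lexpos)
    return lexpos - last_newline


data = "x = 1\ny = 22\n"
print(find_column(data, 0))   # 'x' on line 1
print(find_column(data, 10))  # first '2' on line 2
```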
|
||||||
|
|
||||||
|
09/25/06: beazley
|
||||||
|
Changed the name of the __copy__() method on the Lexer class
|
||||||
|
to clone(). This is used to clone a Lexer object (e.g., if
|
||||||
|
you're running different lexers at the same time).
|
||||||
|
|
||||||
|
09/21/06: beazley
|
||||||
|
Limitations related to the use of the re module have been eliminated.
|
||||||
|
Several users reported problems with regular expressions containing
|
||||||
|
more than 100 named groups. To solve this, lex.py is now capable
|
||||||
|
of automatically splitting its master regular expression into
|
||||||
|
smaller expressions as needed. This should, in theory, make it
|
||||||
|
possible to specify an arbitrarily large number of tokens.
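One way to picture the splitting is shown below. This is a sketch under stated assumptions, not lex.py's code: the helper names and the fixed chunk size are hypothetical, and lex.py's own strategy for deciding where to split may differ.

```python
import re

def build_master_regexes(rules, max_groups=100):
    """Compile (name, pattern) rules into one or more combined regexes.

    Each combined regex holds at most max_groups named groups.
    """
    compiled = []
    for start in range(0, len(rules), max_groups):
        chunk = rules[start:start + max_groups]
        combined = '|'.join('(?P<%s>%s)' % (name, pat) for name, pat in chunk)
        compiled.append(re.compile(combined, re.VERBOSE))
    return compiled


def match_token(regexes, text, pos=0):
    """Try each combined regex in order; return (rule name, lexeme) or None."""
    for regex in regexes:
        m = regex.match(text, pos)
        if m:
            return m.lastgroup, m.group(0)
    return None


# 250 hypothetical keyword rules, more than a single 100-group regex holds.
rules = [('KW%d' % i, 'kw%d\\b' % i) for i in range(250)]
regexes = build_master_regexes(rules)
print(len(regexes))
print(match_token(regexes, 'kw123 rest'))
```

Trying the smaller expressions in order preserves rule priority, since a rule in an earlier chunk still wins over a rule in a later one.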
|
||||||
|
|
||||||
|
09/21/06: beazley
|
||||||
|
Improved error checking in lex.py. Rules that match the empty string
|
||||||
|
are now rejected (otherwise they cause the lexer to enter an infinite
|
||||||
|
loop). An extra check for rules containing '#' has also been added.
|
||||||
|
Since lex compiles regular expressions in verbose mode, '#' is interpreted
|
||||||
|
as a regex comment, so it is critical to use '\#' instead.
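The effect of verbose mode on '#' can be checked directly with the re module. A minimal sketch with made-up patterns:

```python
import re

# Unescaped '#': everything from '#' to the end of the line is a comment,
# so the pattern is effectively just 'a'.
unescaped = re.compile(r'a#b', re.VERBOSE)
# Escaped '\#': matches a literal '#' character.
escaped = re.compile(r'a\#b', re.VERBOSE)

print(unescaped.match('a#b').group(0))
print(escaped.match('a#b').group(0))
```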
|
||||||
|
|
||||||
|
09/18/06: beazley
|
||||||
|
Added a @TOKEN decorator function to lex.py that can be used to
|
||||||
|
define token rules where the documentation string might be computed
|
||||||
|
in some way.
|
||||||
|
|
||||||
|
digit = r'([0-9])'
|
||||||
|
nondigit = r'([_A-Za-z])'
|
||||||
|
identifier = r'(' + nondigit + r'(' + digit + r'|' + nondigit + r')*)'
|
||||||
|
|
||||||
|
from ply.lex import TOKEN
|
||||||
|
|
||||||
|
@TOKEN(identifier)
|
||||||
|
def t_ID(t):
|
||||||
|
# Do whatever
|
||||||
|
|
||||||
|
The @TOKEN decorator merely sets the documentation string of the
|
||||||
|
associated token function as needed for lex to work.
|
||||||
|
|
||||||
|
Note: An alternative solution is the following:
|
||||||
|
|
||||||
|
def t_ID(t):
|
||||||
|
# Do whatever
|
||||||
|
|
||||||
|
t_ID.__doc__ = identifier
|
||||||
|
|
||||||
|
Note: Decorators require the use of Python 2.4 or later. If compatibility
|
||||||
|
with old versions is needed, use the latter solution.
|
||||||
|
|
||||||
|
The need for this feature was suggested by Cem Karan.
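A decorator with this behavior can be written in a few lines. The sketch below defines its own stand-in named TOKEN rather than importing ply.lex.TOKEN, so it shows the idea only and should not be read as PLY's implementation.

```python
def TOKEN(regex):
    """Return a decorator that stores regex as the function's docstring."""
    def set_doc(func):
        func.__doc__ = regex
        return func
    return set_doc


digit = r'([0-9])'
nondigit = r'([_A-Za-z])'
identifier = r'(' + nondigit + r'(' + digit + r'|' + nondigit + r')*)'

@TOKEN(identifier)
def t_ID(t):
    return t

print(t_ID.__doc__ == identifier)
```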
|
||||||
|
|
||||||
|
09/14/06: beazley
|
||||||
|
Support for single-character literal tokens has been added to yacc.
|
||||||
|
These literals must be enclosed in quotes. For example:
|
||||||
|
|
||||||
|
def p_expr(p):
|
||||||
|
"expr : expr '+' expr"
|
||||||
|
...
|
||||||
|
|
||||||
|
def p_expr(p):
|
||||||
|
'expr : expr "-" expr'
|
||||||
|
...
|
||||||
|
|
||||||
|
In addition to this, it is necessary to tell the lexer module about
|
||||||
|
literal characters. This is done by defining the variable 'literals'
|
||||||
|
as a list of characters. This should be defined in the module that
|
||||||
|
invokes the lex.lex() function. For example:
|
||||||
|
|
||||||
|
literals = ['+','-','*','/','(',')','=']
|
||||||
|
|
||||||
|
or simply
|
||||||
|
|
||||||
|
literals = '+-*/()='
|
||||||
|
|
||||||
|
It is important to note that literals can only be a single character.
|
||||||
|
When the lexer fails to match a token using its normal regular expression
|
||||||
|
rules, it will check the current character against the literal list.
|
||||||
|
If found, it will be returned with a token type set to match the literal
|
||||||
|
character. Otherwise, an illegal character will be signalled.
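The fallback order described above can be sketched with a toy tokenizer. This is not lex.py's code: the `tokenize` function, the two rules, and the error type are assumptions chosen for illustration.

```python
import re

literals = '+-*/()='
rules = [('NUMBER', re.compile(r'\d+')), ('NAME', re.compile(r'[A-Za-z_]\w*'))]

def tokenize(text):
    """Yield (type, value) pairs; a toy model of the fallback described above."""
    pos = 0
    while pos < len(text):
        if text[pos] in ' \t':          # skip blanks
            pos += 1
            continue
        for name, regex in rules:       # 1. normal regular expression rules
            m = regex.match(text, pos)
            if m:
                yield name, m.group(0)
                pos = m.end()
                break
        else:
            ch = text[pos]
            if ch in literals:          # 2. single-character literals
                yield ch, ch            # token type is the character itself
                pos += 1
            else:                       # 3. otherwise an illegal character
                raise SyntaxError("Illegal character %r" % ch)

print(list(tokenize('x = 3 + 42')))
```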
|
||||||
|
|
||||||
|
|
||||||
|
09/14/06: beazley
|
||||||
|
Modified PLY to install itself as a proper Python package called 'ply'.
|
||||||
|
This will make it a little more friendly to other modules. This
|
||||||
|
changes the usage of PLY only slightly. Just do this to import the
|
||||||
|
modules
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
import ply.yacc as yacc
|
||||||
|
|
||||||
|
Alternatively, you can do this:
|
||||||
|
|
||||||
|
from ply import *
|
||||||
|
|
||||||
|
Which imports both the lex and yacc modules.
|
||||||
|
Change suggested by Lee June.
|
||||||
|
|
||||||
|
09/13/06: beazley
|
||||||
|
Changed the handling of negative indices when used in production rules.
|
||||||
|
A negative production index now accesses already parsed symbols on the
|
||||||
|
parsing stack. For example,
|
||||||
|
|
||||||
|
def p_foo(p):
|
||||||
|
"foo : A B C D"
|
||||||
|
print p[1] # Value of 'A' symbol
|
||||||
|
print p[2] # Value of 'B' symbol
|
||||||
|
print p[-1] # Value of whatever symbol appears before A
|
||||||
|
# on the parsing stack.
|
||||||
|
|
||||||
|
p[0] = some_val # Sets the value of the 'foo' grammar symbol
|
||||||
|
|
||||||
|
This behavior makes it easier to work with embedded actions within the
|
||||||
|
parsing rules. For example, in C-yacc, it is possible to write code like
|
||||||
|
this:
|
||||||
|
|
||||||
|
bar: A { printf("seen an A = %d\n", $1); } B { do_stuff; }
|
||||||
|
|
||||||
|
In this example, the printf() code executes immediately after A has been
|
||||||
|
parsed. Within the embedded action code, $1 refers to the A symbol on
|
||||||
|
the stack.
|
||||||
|
|
||||||
|
To perform this equivalent action in PLY, you need to write a pair
|
||||||
|
of rules like this:
|
||||||
|
|
||||||
|
def p_bar(p):
|
||||||
|
"bar : A seen_A B"
|
||||||
|
do_stuff
|
||||||
|
|
||||||
|
def p_seen_A(p):
|
||||||
|
"seen_A :"
|
||||||
|
print "seen an A =", p[-1]
|
||||||
|
|
||||||
|
The second rule "seen_A" is merely an empty production which should be
|
||||||
|
reduced as soon as A is parsed in the "bar" rule above. The
|
||||||
|
negative index p[-1] is used to access whatever symbol appeared
|
||||||
|
before the seen_A symbol.
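The indexing rule can be modeled with a small class. This is a simplified sketch, not yacc.py's production object: the class name `Production` and its two attributes are hypothetical.

```python
class Production:
    """Toy stand-in for the object passed to a grammar rule function."""

    def __init__(self, symbols, stack):
        self.symbols = symbols   # [result slot, then right-hand-side values]
        self.stack = stack       # values already on the parsing stack

    def __getitem__(self, n):
        if n >= 0:
            return self.symbols[n]
        return self.stack[n]     # negative: look back into the stack

    def __setitem__(self, n, value):
        self.symbols[n] = value


# While reducing the empty rule "seen_A :", the stack already holds A's value.
p = Production(symbols=[None], stack=['<start>', 42])
print("seen an A =", p[-1])
```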
|
||||||
|
|
||||||
|
This feature also makes it possible to support inherited attributes.
|
||||||
|
For example:
|
||||||
|
|
||||||
|
def p_decl(p):
|
||||||
|
"decl : scope name"
|
||||||
|
|
||||||
|
def p_scope(p):
|
||||||
|
"""scope : GLOBAL
|
||||||
|
| LOCAL"""
|
||||||
|
p[0] = p[1]
|
||||||
|
|
||||||
|
def p_name(p):
|
||||||
|
"name : ID"
|
||||||
|
if p[-1] == "GLOBAL":
|
||||||
|
# ...
|
||||||
|
elif p[-1] == "LOCAL":
|
||||||
|
#...
|
||||||
|
|
||||||
|
In this case, the name rule is inheriting an attribute from the
|
||||||
|
scope declaration that precedes it.
|
||||||
|
|
||||||
|
*** POTENTIAL INCOMPATIBILITY ***
|
||||||
|
If you are currently using negative indices within existing grammar rules,
|
||||||
|
your code will break. This should be extremely rare, if not non-existent, in
|
||||||
|
most cases. The argument to a grammar rule is not usually
|
||||||
|
processed in the same way as a list of items.
|
||||||
|
|
||||||
|
Version 2.0
|
||||||
|
------------------------------
|
||||||
|
09/07/06: beazley
|
||||||
|
Major cleanup and refactoring of the LR table generation code. Both SLR
|
||||||
|
and LALR(1) table generation is now performed by the same code base with
|
||||||
|
only minor extensions for extra LALR(1) processing.
|
||||||
|
|
||||||
|
09/07/06: beazley
|
||||||
|
Completely reimplemented the entire LALR(1) parsing engine to use the
|
||||||
|
DeRemer and Pennello algorithm for calculating lookahead sets. This
|
||||||
|
significantly improves the performance of generating LALR(1) tables
|
||||||
|
and has the added feature of actually working correctly! If you
|
||||||
|
experienced weird behavior with LALR(1) in prior releases, this should
|
||||||
|
hopefully resolve all of those problems. Many thanks to
|
||||||
|
Andrew Waters and Markus Schoepflin for submitting bug reports
|
||||||
|
and helping me test out the revised LALR(1) support.
|
||||||
|
|
||||||
|
Version 1.8
|
||||||
|
------------------------------
|
||||||
|
08/02/06: beazley
|
||||||
|
Fixed a problem related to the handling of default actions in LALR(1)
|
||||||
|
parsing. If you experienced subtle and/or bizarre behavior when trying
|
||||||
|
to use the LALR(1) engine, this may correct those problems. Patch
|
||||||
|
contributed by Russ Cox. Note: This patch has been superseded by
|
||||||
|
revisions for LALR(1) parsing in Ply-2.0.
|
||||||
|
|
||||||
|
08/02/06: beazley
|
||||||
|
Added support for slicing of productions in yacc.
|
||||||
|
Patch contributed by Patrick Mezard.
|
||||||
|
|
||||||
|
Version 1.7
|
||||||
|
------------------------------
|
||||||
|
03/02/06: beazley
|
||||||
|
Fixed an infinite recursion problem in the ReduceToTerminals() function that
|
||||||
|
would sometimes come up in LALR(1) table generation. Reported by
|
||||||
|
Markus Schoepflin.
|
||||||
|
|
||||||
|
03/01/06: beazley
|
||||||
|
Added "reflags" argument to lex(). For example:
|
||||||
|
|
||||||
|
lex.lex(reflags=re.UNICODE)
|
||||||
|
|
||||||
|
This can be used to specify optional flags to the re.compile() function
|
||||||
|
used inside the lexer. This may be necessary for special situations such
|
||||||
|
as processing Unicode (e.g., if you want escapes like \w and \b to consult
|
||||||
|
the Unicode character property database). The need for this was suggested by
|
||||||
|
Andreas Jung.
|
||||||
|
|
||||||
|
03/01/06: beazley
|
||||||
|
Fixed a bug with an uninitialized variable on repeated instantiations of parser
|
||||||
|
objects when the write_tables=0 argument was used. Reported by Michael Brown.
|
||||||
|
|
||||||
|
03/01/06: beazley
|
||||||
|
Modified lex.py to accept Unicode strings both as the regular expressions for
|
||||||
|
tokens and as input. Hopefully this is the only change needed for Unicode support.
|
||||||
|
Patch contributed by Johan Dahl.
|
||||||
|
|
||||||
|
03/01/06: beazley
|
||||||
|
Modified the class-based interface to work with new-style or old-style classes.
|
||||||
|
Patch contributed by Michael Brown (although I tweaked it slightly so it would work
|
||||||
|
with older versions of Python).
|
||||||
|
|
||||||
|
Version 1.6
|
||||||
|
------------------------------
|
||||||
|
05/27/05: beazley
|
||||||
|
Incorporated patch contributed by Christopher Stawarz to fix an extremely
|
||||||
|
devious bug in LALR(1) parser generation. This patch should fix problems
|
||||||
|
numerous people reported with LALR parsing.
|
||||||
|
|
||||||
|
05/27/05: beazley
|
||||||
|
Fixed problem with lex.py copy constructor. Reported by Dave Aitel, Aaron Lav,
|
||||||
|
and Thad Austin.
|
||||||
|
|
||||||
|
05/27/05: beazley
|
||||||
|
Added outputdir option to yacc() to control output directory. Contributed
|
||||||
|
by Christopher Stawarz.
|
||||||
|
|
||||||
|
05/27/05: beazley
|
||||||
|
Added rununit.py test script to run tests using the Python unittest module.
|
||||||
|
Contributed by Miki Tebeka.
|
||||||
|
|
||||||
|
Version 1.5
|
||||||
|
------------------------------
|
||||||
|
05/26/04: beazley
|
||||||
|
Major enhancement. LALR(1) parsing support is now working.
|
||||||
|
This feature was implemented by Elias Ioup (ezioup@alumni.uchicago.edu)
|
||||||
|
and optimized by David Beazley. To use LALR(1) parsing do
|
||||||
|
the following:
|
||||||
|
|
||||||
|
yacc.yacc(method="LALR")
|
||||||
|
|
||||||
|
Computing LALR(1) parsing tables takes about twice as long as
|
||||||
|
the default SLR method. However, LALR(1) allows you to handle
|
||||||
|
more complex grammars. For example, the ANSI C grammar
|
||||||
|
(in example/ansic) has 13 shift-reduce conflicts with SLR, but
|
||||||
|
only has 1 shift-reduce conflict with LALR(1).
|
||||||
|
|
||||||
|
05/20/04: beazley
|
||||||
|
Added a __len__ method to parser production lists. Can
|
||||||
|
be used in parser rules like this:
|
||||||
|
|
||||||
|
def p_somerule(p):
|
||||||
|
"""a : B C D
|
||||||
|
| E F"""
|
||||||
|
if (len(p) == 4):
|
||||||
|
# Must have been first rule
|
||||||
|
elif (len(p) == 3):
|
||||||
|
# Must be second rule
|
||||||
|
|
||||||
|
Suggested by Joshua Gerth and others.
|
||||||
|
|
||||||
|
Version 1.4
|
||||||
|
------------------------------
|
||||||
|
04/23/04: beazley
|
||||||
|
Incorporated a variety of patches contributed by Eric Raymond.
|
||||||
|
These include:
|
||||||
|
|
||||||
|
0. Cleans up some comments so they don't wrap on an 80-column display.
|
||||||
|
1. Directs compiler errors to stderr where they belong.
|
||||||
|
2. Implements and documents automatic line counting when \n is ignored.
|
||||||
|
3. Changes the way progress messages are dumped when debugging is on.
|
||||||
|
The new format is both less verbose and conveys more information than
|
||||||
|
the old, including shift and reduce actions.
|
||||||
|
|
||||||
|
04/23/04: beazley
|
||||||
|
Added a Python setup.py file to simplify installation. Contributed
|
||||||
|
by Adam Kerrison.
|
||||||
|
|
||||||
|
04/23/04: beazley
|
||||||
|
Added patches contributed by Adam Kerrison.
|
||||||
|
|
||||||
|
- Some output is now only shown when debugging is enabled. This
|
||||||
|
means that PLY will be completely silent when not in debugging mode.
|
||||||
|
|
||||||
|
- An optional parameter "write_tables" can be passed to yacc() to
|
||||||
|
control whether or not parsing tables are written. By default,
|
||||||
|
it is true, but it can be turned off if you don't want the yacc
|
||||||
|
table file. Note: disabling this will cause yacc() to regenerate
|
||||||
|
the parsing table each time.
|
||||||
|
|
||||||
|
04/23/04: beazley
|
||||||
|
Added patches contributed by David McNab. This patch adds two
|
||||||
|
features:
|
||||||
|
|
||||||
|
- The parser can be supplied as a class instead of a module.
|
||||||
|
For an example of this, see the example/classcalc directory.
|
||||||
|
|
||||||
|
- Debugging output can be directed to a filename of the user's
|
||||||
|
choice. Use
|
||||||
|
|
||||||
|
yacc(debugfile="somefile.out")
|
||||||
|
|
||||||
|
|
||||||
Version 1.3
|
Version 1.3
|
||||||
------------------------------
|
------------------------------
|
||||||
12/10/02: jmdyck
|
12/10/02: jmdyck
|
||||||
|
|
149
ext/ply/README
|
@ -1,14 +1,8 @@
|
||||||
PLY (Python Lex-Yacc) Version 1.2 (November 27, 2002)
|
PLY (Python Lex-Yacc) Version 2.3 (February 18, 2007)
|
||||||
|
|
||||||
David M. Beazley
|
David M. Beazley (dave@dabeaz.com)
|
||||||
Department of Computer Science
|
|
||||||
University of Chicago
|
|
||||||
Chicago, IL 60637
|
|
||||||
beazley@cs.uchicago.edu
|
|
||||||
|
|
||||||
Copyright (C) 2001 David M. Beazley
|
Copyright (C) 2001-2007 David M. Beazley
|
||||||
|
|
||||||
$Header: /home/stever/bk/newmem2/ext/ply/README 1.1 03/06/06 14:53:34-00:00 stever@ $
|
|
||||||
|
|
||||||
This library is free software; you can redistribute it and/or
|
This library is free software; you can redistribute it and/or
|
||||||
modify it under the terms of the GNU Lesser General Public
|
modify it under the terms of the GNU Lesser General Public
|
||||||
|
@ -52,11 +46,10 @@ Python, there are several reasons why you might want to consider PLY:
|
||||||
Currently, PLY builds its parsing tables using the SLR algorithm which
|
Currently, PLY builds its parsing tables using the SLR algorithm which
|
||||||
is slightly weaker than LALR(1) used in traditional yacc.
|
is slightly weaker than LALR(1) used in traditional yacc.
|
||||||
|
|
||||||
- Like John Aycock's excellent SPARK toolkit, PLY uses Python
|
- PLY uses Python introspection features to build lexers and parsers.
|
||||||
reflection to build lexers and parsers. This greatly simplifies
|
This greatly simplifies the task of parser construction since it reduces
|
||||||
the task of parser construction since it reduces the number of files
|
the number of files and eliminates the need to run a separate lex/yacc
|
||||||
and eliminates the need to run a separate lex/yacc tool before
|
tool before running your program.
|
||||||
running your program.
|
|
||||||
|
|
||||||
- PLY can be used to build parsers for "real" programming languages.
|
- PLY can be used to build parsers for "real" programming languages.
|
||||||
Although it is not ultra-fast due to its Python implementation,
|
Although it is not ultra-fast due to its Python implementation,
|
||||||
|
@ -77,51 +70,79 @@ common usability problems.
|
||||||
How to Use
|
How to Use
|
||||||
==========
|
==========
|
||||||
|
|
||||||
PLY consists of two files : lex.py and yacc.py. To use the system,
|
PLY consists of two files : lex.py and yacc.py. These are contained
|
||||||
simply copy these files to your project and import them like standard
|
within the 'ply' directory which may also be used as a Python package.
|
||||||
Python modules.
|
To use PLY, simply copy the 'ply' directory to your project and import
|
||||||
|
lex and yacc from the associated 'ply' package. For example:
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
import ply.yacc as yacc
|
||||||
|
|
||||||
|
Alternatively, you can copy just the files lex.py and yacc.py
|
||||||
|
individually and use them as modules. For example:
|
||||||
|
|
||||||
|
import lex
|
||||||
|
import yacc
|
||||||
|
|
||||||
|
The file setup.py can be used to install ply using distutils.
|
||||||
|
|
||||||
The file doc/ply.html contains complete documentation on how to use
|
The file doc/ply.html contains complete documentation on how to use
|
||||||
the system.
|
the system.
|
||||||
|
|
||||||
The example directory contains several different examples including a
|
The example directory contains several different examples including a
|
||||||
PLY specification for ANSI C as given in K&R 2nd Ed. Note: To use
|
PLY specification for ANSI C as given in K&R 2nd Ed.
|
||||||
the examples, you will need to copy the lex.py and yacc.py files to
|
|
||||||
the example directory.
|
|
||||||
|
|
||||||
A simple example is found at the end of this document
|
A simple example is found at the end of this document
|
||||||
|
|
||||||
Requirements
|
Requirements
|
||||||
============
|
============
|
||||||
PLY requires the use of Python 2.0 or greater. It should work on
|
PLY requires the use of Python 2.1 or greater. However, you should
|
||||||
just about any platform.
|
use the latest Python release if possible. It should work on just
|
||||||
|
about any platform. PLY has been tested with both CPython and Jython.
|
||||||
|
However, it does not seem to work with IronPython.
|
||||||
|
|
||||||
Resources
|
Resources
|
||||||
=========
|
=========
|
||||||
|
|
||||||
More information about PLY can be obtained on the PLY webpage at:
|
More information about PLY can be obtained on the PLY webpage at:
|
||||||
|
|
||||||
http://systems.cs.uchicago.edu/ply
|
http://www.dabeaz.com/ply
|
||||||
|
|
||||||
For a detailed overview of parsing theory, consult the excellent
|
For a detailed overview of parsing theory, consult the excellent
|
||||||
book "Compilers : Principles, Techniques, and Tools" by Aho, Sethi, and
|
book "Compilers : Principles, Techniques, and Tools" by Aho, Sethi, and
|
||||||
Ullman. The topics found in "Lex & Yacc" by Levine, Mason, and Brown
|
Ullman. The topics found in "Lex & Yacc" by Levine, Mason, and Brown
|
||||||
may also be useful.
|
may also be useful.
|
||||||
|
|
||||||
Given that this is the first release, I welcome your comments on how
|
A Google group for PLY can be found at
|
||||||
to improve the current implementation. See the TODO file for things that
|
|
||||||
still need to be done.
|
http://groups.google.com/group/ply-hack
|
||||||
|
|
||||||
Acknowledgments
|
Acknowledgments
|
||||||
===============
|
===============
|
||||||
|
|
||||||
A special thanks is in order for all of the students in CS326 who
|
A special thanks is in order for all of the students in CS326 who
|
||||||
suffered through about 25 different versions of these tools :-).
|
suffered through about 25 different versions of these tools :-).
|
||||||
|
|
||||||
|
The CHANGES file acknowledges those who have contributed patches.
|
||||||
|
|
||||||
|
Elias Ioup did the first implementation of LALR(1) parsing in PLY-1.x.
|
||||||
|
Andrew Waters and Markus Schoepflin were instrumental in reporting bugs
|
||||||
|
and testing a revised LALR(1) implementation for PLY-2.0.
|
||||||
|
|
||||||
|
Special Note for PLY-2.x
|
||||||
|
========================
|
||||||
|
PLY-2.0 is the first in a series of PLY releases that will be adding a
|
||||||
|
variety of significant new features. The first release in this series
|
||||||
|
(Ply-2.0) should be 100% compatible with all previous Ply-1.x releases
|
||||||
|
except for the fact that Ply-2.0 features a correct implementation of
|
||||||
|
LALR(1) table generation.
|
||||||
|
|
||||||
|
If you have suggestions for improving PLY in future 2.x releases, please
|
||||||
|
contact me. - Dave
|
||||||
|
|
||||||
Example
|
Example
|
||||||
=======
|
=======
|
||||||
|
|
||||||
Here is a simple example showing a PLY implementation of a calculator with variables.
|
Here is a simple example showing a PLY implementation of a calculator
|
||||||
|
with variables.
|
||||||
|
|
||||||
# -----------------------------------------------------------------------------
|
# -----------------------------------------------------------------------------
|
||||||
# calc.py
|
# calc.py
|
||||||
|
@ -160,14 +181,14 @@ t_ignore = " \t"
|
||||||
|
|
||||||
def t_newline(t):
|
def t_newline(t):
|
||||||
r'\n+'
|
r'\n+'
|
||||||
t.lineno += t.value.count("\n")
|
t.lexer.lineno += t.value.count("\n")
|
||||||
|
|
||||||
def t_error(t):
|
def t_error(t):
|
||||||
print "Illegal character '%s'" % t.value[0]
|
print "Illegal character '%s'" % t.value[0]
|
||||||
t.skip(1)
|
t.lexer.skip(1)
|
||||||
|
|
||||||
# Build the lexer
|
# Build the lexer
|
||||||
import lex
|
import ply.lex as lex
|
||||||
lex.lex()
|
lex.lex()
|
||||||
|
|
||||||
# Precedence rules for the arithmetic operators
|
# Precedence rules for the arithmetic operators
|
||||||
|
@ -180,48 +201,48 @@ precedence = (
|
||||||
# dictionary of names (for storing variables)
|
# dictionary of names (for storing variables)
|
||||||
names = { }
|
names = { }
|
||||||
|
|
||||||
def p_statement_assign(t):
|
def p_statement_assign(p):
|
||||||
'statement : NAME EQUALS expression'
|
'statement : NAME EQUALS expression'
|
||||||
names[t[1]] = t[3]
|
names[p[1]] = p[3]
|
||||||
|
|
||||||
def p_statement_expr(t):
|
def p_statement_expr(p):
|
||||||
'statement : expression'
|
'statement : expression'
|
||||||
print t[1]
|
print p[1]
|
||||||
|
|
||||||
def p_expression_binop(t):
|
def p_expression_binop(p):
|
||||||
'''expression : expression PLUS expression
|
'''expression : expression PLUS expression
|
||||||
| expression MINUS expression
|
| expression MINUS expression
|
||||||
| expression TIMES expression
|
| expression TIMES expression
|
||||||
| expression DIVIDE expression'''
|
| expression DIVIDE expression'''
|
||||||
if t[2] == '+' : t[0] = t[1] + t[3]
|
if p[2] == '+' : p[0] = p[1] + p[3]
|
||||||
elif t[2] == '-': t[0] = t[1] - t[3]
|
elif p[2] == '-': p[0] = p[1] - p[3]
|
||||||
elif t[2] == '*': t[0] = t[1] * t[3]
|
elif p[2] == '*': p[0] = p[1] * p[3]
|
||||||
elif t[2] == '/': t[0] = t[1] / t[3]
|
elif p[2] == '/': p[0] = p[1] / p[3]
|
||||||
|
|
||||||
def p_expression_uminus(t):
|
def p_expression_uminus(p):
|
||||||
'expression : MINUS expression %prec UMINUS'
|
'expression : MINUS expression %prec UMINUS'
|
||||||
t[0] = -t[2]
|
p[0] = -p[2]
|
||||||
|
|
||||||
def p_expression_group(t):
|
def p_expression_group(p):
|
||||||
'expression : LPAREN expression RPAREN'
|
'expression : LPAREN expression RPAREN'
|
||||||
t[0] = t[2]
|
p[0] = p[2]
|
||||||
|
|
||||||
def p_expression_number(t):
|
def p_expression_number(p):
|
||||||
'expression : NUMBER'
|
'expression : NUMBER'
|
||||||
t[0] = t[1]
|
p[0] = p[1]
|
||||||
|
|
||||||
def p_expression_name(t):
|
def p_expression_name(p):
|
||||||
'expression : NAME'
|
'expression : NAME'
|
||||||
try:
|
try:
|
||||||
t[0] = names[t[1]]
|
p[0] = names[p[1]]
|
||||||
except LookupError:
|
except LookupError:
|
||||||
print "Undefined name '%s'" % t[1]
|
print "Undefined name '%s'" % p[1]
|
||||||
t[0] = 0
|
p[0] = 0
|
||||||
|
|
||||||
def p_error(t):
|
def p_error(p):
|
||||||
print "Syntax error at '%s'" % t.value
|
print "Syntax error at '%s'" % p.value
|
||||||
|
|
||||||
import yacc
|
import ply.yacc as yacc
|
||||||
yacc.yacc()
|
yacc.yacc()
|
||||||
|
|
||||||
while 1:
|
while 1:
|
||||||
|
@ -232,12 +253,20 @@ while 1:
|
||||||
yacc.parse(s)
|
yacc.parse(s)
|
||||||
|
|
||||||
|
|
||||||
|
Bug Reports and Patches
|
||||||
|
=======================
|
||||||
|
Because of the extremely specialized and advanced nature of PLY, I
|
||||||
|
rarely spend much time working on it unless I receive very specific
|
||||||
|
bug-reports and/or patches to fix problems. I also try to incorporate
|
||||||
|
submitted feature requests and enhancements into each new version. To
|
||||||
|
contact me about bugs and/or new features, please send email to
|
||||||
|
dave@dabeaz.com.
|
||||||
|
|
||||||
|
In addition there is a Google group for discussing PLY related issues at
|
||||||
|
|
||||||
|
http://groups.google.com/group/ply-hack
|
||||||
|
|
||||||
|
-- Dave
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
18
ext/ply/TODO
|
@ -1,22 +1,14 @@
|
||||||
The PLY to-do list:
|
The PLY to-do list:
|
||||||
|
|
||||||
$Header: /home/stever/bk/newmem2/ext/ply/TODO 1.1 03/06/06 14:53:34-00:00 stever@ $
|
1. More interesting parsing examples.
|
||||||
|
|
||||||
1. Create a Python package using distutils
|
2. Work on the ANSI C grammar so that it can actually parse C programs. To do this,
|
||||||
|
|
||||||
2. More interesting parsing examples.
|
|
||||||
|
|
||||||
3. Work on the ANSI C grammar so that it can actually parse C programs. To do this,
|
|
||||||
some extra code needs to be added to the lexer to deal with typedef names and enumeration
|
some extra code needs to be added to the lexer to deal with typedef names and enumeration
|
||||||
constants.
|
constants.
|
||||||
|
|
||||||
4. Get LALR(1) to work. Hard, but not impossible.
|
3. More tests in the test directory.
|
||||||
|
|
||||||
5. More tests in the test directory.
|
4. Performance improvements and cleanup in yacc.py.
|
||||||
|
|
||||||
6. Performance improvements and cleanup in yacc.py.
|
5. More documentation (?).
|
||||||
|
|
||||||
7. More documentation.
|
|
||||||
|
|
||||||
8. Lots and lots of cleanup.
|
|
||||||
|
|
||||||
|
|
194
ext/ply/doc/makedoc.py
Normal file
|
@ -0,0 +1,194 @@
|
||||||
|
#!/usr/local/bin/python
|
||||||
|
|
||||||
|
###############################################################################
|
||||||
|
# Takes a chapter as input and adds internal links and numbering to all
|
||||||
|
# of the H1, H2, H3, H4 and H5 sections.
|
||||||
|
#
|
||||||
|
# Every heading HTML tag (H1, H2 etc) is given an autogenerated name to link
|
||||||
|
# to. However, if the name is not an autogenerated name from a previous run,
|
||||||
|
# it will be kept. If it is autogenerated, it might change on subsequent runs
|
||||||
|
# of this program. Thus if you want to create links to one of the headings,
|
||||||
|
# then change the heading link name to something that does not look like an
|
||||||
|
# autogenerated link name.
|
||||||
|
###############################################################################
|
||||||
|
|
||||||
|
import sys
|
||||||
|
import re
|
||||||
|
import string
|
||||||
|
|
||||||
|
###############################################################################
|
||||||
|
# Functions
|
||||||
|
###############################################################################
|
||||||
|
|
||||||
|
# Regexs for <a name="..."></a>
|
||||||
|
alink = re.compile(r"<a *name *= *\"(.*)\"></a>", re.IGNORECASE)
|
||||||
|
heading = re.compile(r"(_nn\d)", re.IGNORECASE)
|
||||||
|
|
||||||
|
def getheadingname(m):
|
||||||
|
autogeneratedheading = True;
|
||||||
|
if m.group(1) != None:
|
||||||
|
amatch = alink.match(m.group(1))
|
||||||
|
if amatch:
|
||||||
|
# A non-autogenerated heading - keep it
|
||||||
|
headingname = amatch.group(1)
|
||||||
|
autogeneratedheading = heading.match(headingname)
|
||||||
|
if autogeneratedheading:
|
||||||
|
# The heading name was either non-existent or autogenerated,
|
||||||
|
# We can create a new heading / change the existing heading
|
||||||
|
headingname = "%s_nn%d" % (filenamebase, nameindex)
|
||||||
|
return headingname
|
||||||
|
|
||||||
|
###############################################################################
|
||||||
|
# Main program
|
||||||
|
###############################################################################
|
||||||
|
|
||||||
|
if len(sys.argv) != 2:
|
||||||
|
print "usage: makedoc.py filename"
|
||||||
|
sys.exit(1)
|
||||||
|
|
||||||
|
filename = sys.argv[1]
|
||||||
|
filenamebase = string.split(filename,".")[0]
|
||||||
|
|
||||||
|
section = 0
|
||||||
|
subsection = 0
|
||||||
|
subsubsection = 0
|
||||||
|
subsubsubsection = 0
|
||||||
|
nameindex = 0
|
||||||
|
|
||||||
|
name = ""
|
||||||
|
|
||||||
|
# Regexs for <h1>,... <h5> sections
|
||||||
|
|
||||||
|
h1 = re.compile(r".*?<H1>(<a.*a>)*[\d\.\s]*(.*?)</H1>", re.IGNORECASE)
|
||||||
|
h2 = re.compile(r".*?<H2>(<a.*a>)*[\d\.\s]*(.*?)</H2>", re.IGNORECASE)
|
||||||
|
h3 = re.compile(r".*?<H3>(<a.*a>)*[\d\.\s]*(.*?)</H3>", re.IGNORECASE)
|
||||||
|
h4 = re.compile(r".*?<H4>(<a.*a>)*[\d\.\s]*(.*?)</H4>", re.IGNORECASE)
|
||||||
|
h5 = re.compile(r".*?<H5>(<a.*a>)*[\d\.\s]*(.*?)</H5>", re.IGNORECASE)
|
||||||
|
|
||||||
|
data = open(filename).read() # Read data
|
||||||
|
open(filename+".bak","w").write(data) # Make backup
|
||||||
|
|
||||||
|
lines = data.splitlines()
|
||||||
|
result = [ ] # This is the result of postprocessing the file
|
||||||
|
index = "<!-- INDEX -->\n<div class=\"sectiontoc\">\n" # index contains the index for adding at the top of the file. Also printed to stdout.
|
||||||
|
|
||||||
|
skip = 0
|
||||||
|
skipspace = 0
|
||||||
|
|
||||||
|
for s in lines:
|
||||||
|
if s == "<!-- INDEX -->":
|
||||||
|
if not skip:
|
||||||
|
result.append("@INDEX@")
|
||||||
|
skip = 1
|
||||||
|
else:
|
||||||
|
skip = 0
|
||||||
|
continue;
|
||||||
|
if skip:
|
||||||
|
continue
|
||||||
|
|
||||||
|
if not s and skipspace:
|
||||||
|
continue
|
||||||
|
|
||||||
|
if skipspace:
|
||||||
|
result.append("")
|
||||||
|
result.append("")
|
||||||
|
skipspace = 0
|
||||||
|
|
||||||
|
m = h2.match(s)
|
||||||
|
if m:
|
||||||
|
prevheadingtext = m.group(2)
|
||||||
|
nameindex += 1
|
||||||
|
section += 1
|
||||||
|
headingname = getheadingname(m)
|
||||||
|
result.append("""<H2><a name="%s"></a>%d. %s</H2>""" % (headingname,section, prevheadingtext))
|
||||||
|
|
||||||
|
if subsubsubsection:
|
||||||
|
index += "</ul>\n"
|
||||||
|
if subsubsection:
|
||||||
|
index += "</ul>\n"
|
||||||
|
if subsection:
|
||||||
|
index += "</ul>\n"
|
||||||
|
if section == 1:
|
||||||
|
index += "<ul>\n"
|
||||||
|
|
||||||
|
index += """<li><a href="#%s">%s</a>\n""" % (headingname,prevheadingtext)
|
||||||
|
subsection = 0
|
||||||
|
subsubsection = 0
|
||||||
|
subsubsubsection = 0
|
||||||
|
skipspace = 1
|
||||||
|
continue
|
||||||
|
m = h3.match(s)
|
||||||
|
if m:
|
||||||
|
prevheadingtext = m.group(2)
|
||||||
|
nameindex += 1
|
||||||
|
subsection += 1
|
||||||
|
headingname = getheadingname(m)
|
||||||
|
result.append("""<H3><a name="%s"></a>%d.%d %s</H3>""" % (headingname,section, subsection, prevheadingtext))
|
||||||
|
|
||||||
|
if subsubsubsection:
|
||||||
|
index += "</ul>\n"
|
||||||
|
if subsubsection:
|
||||||
|
index += "</ul>\n"
|
||||||
|
if subsection == 1:
|
||||||
|
index += "<ul>\n"
|
||||||
|
|
||||||
|
index += """<li><a href="#%s">%s</a>\n""" % (headingname,prevheadingtext)
|
||||||
|
subsubsection = 0
|
||||||
|
skipspace = 1
|
||||||
|
continue
|
||||||
|
m = h4.match(s)
|
||||||
|
if m:
|
||||||
|
prevheadingtext = m.group(2)
|
||||||
|
nameindex += 1
|
||||||
|
subsubsection += 1
|
||||||
|
subsubsubsection = 0
|
||||||
|
headingname = getheadingname(m)
|
||||||
|
result.append("""<H4><a name="%s"></a>%d.%d.%d %s</H4>""" % (headingname,section, subsection, subsubsection, prevheadingtext))
|
||||||
|
|
||||||
|
if subsubsubsection:
|
||||||
|
index += "</ul>\n"
|
||||||
|
if subsubsection == 1:
|
||||||
|
index += "<ul>\n"
|
||||||
|
|
||||||
|
index += """<li><a href="#%s">%s</a>\n""" % (headingname,prevheadingtext)
|
||||||
|
skipspace = 1
|
||||||
|
continue
|
||||||
|
m = h5.match(s)
|
||||||
|
if m:
|
||||||
|
prevheadingtext = m.group(2)
|
||||||
|
nameindex += 1
|
||||||
|
subsubsubsection += 1
|
||||||
|
headingname = getheadingname(m)
|
||||||
|
result.append("""<H5><a name="%s"></a>%d.%d.%d.%d %s</H5>""" % (headingname,section, subsection, subsubsection, subsubsubsection, prevheadingtext))
|
||||||
|
|
||||||
|
if subsubsubsection == 1:
|
||||||
|
index += "<ul>\n"
|
||||||
|
|
||||||
|
index += """<li><a href="#%s">%s</a>\n""" % (headingname,prevheadingtext)
|
||||||
|
skipspace = 1
|
||||||
|
continue
|
||||||
|
|
||||||
|
result.append(s)
|
||||||
|
|
||||||
|
if subsubsubsection:
|
||||||
|
index += "</ul>\n"
|
||||||
|
|
||||||
|
if subsubsection:
|
||||||
|
index += "</ul>\n"
|
||||||
|
|
||||||
|
if subsection:
|
||||||
|
index += "</ul>\n"
|
||||||
|
|
||||||
|
if section:
|
||||||
|
index += "</ul>\n"
|
||||||
|
|
||||||
|
index += "</div>\n<!-- INDEX -->\n"
|
||||||
|
|
||||||
|
data = "\n".join(result)
|
||||||
|
|
||||||
|
data = data.replace("@INDEX@",index) + "\n";
|
||||||
|
|
||||||
|
# Write the file back out
|
||||||
|
open(filename,"w").write(data)
|
||||||
|
|
||||||
|
|
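The heading patterns in makedoc.py can be tried in isolation. The sketch below reuses the `h2` expression from the script; the helper `renumber_h2` and the anchor name `ply_nn1` are illustrative assumptions, not part of makedoc.py. It shows that a heading numbered on an earlier run is matched again with its old anchor and number stripped, so rerunning does not stack numbers.

```python
import re

# Same pattern as h2 in makedoc.py: an optional <a ...></a> anchor, any old
# numbering (digits, dots, spaces), then the heading text itself.
h2 = re.compile(r".*?<H2>(<a.*a>)*[\d\.\s]*(.*?)</H2>", re.IGNORECASE)

def renumber_h2(line, section, name):
    """Hypothetical helper: rewrite one H2 line, or return it unchanged."""
    m = h2.match(line)
    if not m:
        return line
    # group(2) is the bare heading text, without anchor or number
    return '<H2><a name="%s"></a>%d. %s</H2>' % (name, section, m.group(2))

first = renumber_h2("<H2>Introduction</H2>", 1, "ply_nn1")
again = renumber_h2(first, 1, "ply_nn1")   # second pass over its own output
untouched = renumber_h2("<p>plain text</p>", 1, "ply_nn1")
```

Here `again` equals `first`, which is the property the script relies on when it is run repeatedly over the same file.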
1874
ext/ply/doc/ply.html
File diff suppressed because it is too large
79
ext/ply/example/BASIC/README
Normal file
|
@ -0,0 +1,79 @@
|
||||||
|
Inspired by a September 14, 2006 Salon article "Why Johnny Can't Code" by
|
||||||
|
David Brin (http://www.salon.com/tech/feature/2006/09/14/basic/index.html),
|
||||||
|
I thought that a fully working BASIC interpreter might be an interesting,
|
||||||
|
if not questionable, PLY example. Uh, okay, so maybe it's just a bad idea,
|
||||||
|
but in any case, here it is.
|
||||||
|
|
||||||
|
In this example, you'll find a rough implementation of 1964 Dartmouth BASIC
|
||||||
|
as described in the manual at:
|
||||||
|
|
||||||
|
http://www.bitsavers.org/pdf/dartmouth/BASIC_Oct64.pdf
|
||||||
|
|
||||||
|
See also:
|
||||||
|
|
||||||
|
http://en.wikipedia.org/wiki/Dartmouth_BASIC
|
||||||
|
|
||||||
|
This dialect is downright primitive---there are no string variables
|
||||||
|
and no facilities for interactive input. Moreover, subroutines and functions
|
||||||
|
are even more brain-dead than they usually are for BASIC. Of course,
|
||||||
|
the GOTO statement is provided.
|
||||||
|
|
||||||
|
Nevertheless, there are a few interesting aspects of this example:
|
||||||
|
|
||||||
|
- It illustrates a fully working interpreter including lexing, parsing,
|
||||||
|
and interpretation of instructions.
|
||||||
|
|
||||||
|
- The parser shows how to catch and report various kinds of parsing
|
||||||
|
errors in a more graceful way.
|
||||||
|
|
||||||
|
- The example parses both files (supplied on the command line) and
|
||||||
|
interactive input entered line by line.
|
||||||
|
|
||||||
|
- It shows how you might represent parsed information. In this case,
|
||||||
|
each BASIC statement is encoded into a Python tuple containing the
|
||||||
|
statement type and parameters. These tuples are then stored in
|
||||||
|
a dictionary indexed by program line numbers.
|
||||||
|
|
||||||
|
- Even though it's just BASIC, the parser contains more than 80
|
||||||
|
rules and 150 parsing states. Thus, it's a little more meaty than
|
||||||
|
the calculator example.
|
||||||
|
|
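The tuple-per-statement layout described in the list above might look roughly like this. The exact tuple shapes are produced by basparse.py, which is not shown here, so the fields below are assumptions chosen to agree with how basinterp.py reads a PRINT statement (a list of (label, value) pairs followed by an optional trailing separator). The helpers `listing_order` and `delete_line` are hypothetical.

```python
# A two-line program keyed by BASIC line number. Insertion order is
# irrelevant: the line numbers themselves define execution order.
program = {
    20: ('END',),
    10: ('PRINT', [('HELLO WORLD', None)], None),
}

def listing_order(prog):
    """Return the line numbers in the order they would run or be listed."""
    return sorted(prog)

def delete_line(prog, lineno):
    """Drop a line if present, as entering a bare line number would."""
    prog.pop(lineno, None)

order = listing_order(program)
ops = [program[n][0] for n in order]    # statement type is always field 0

delete_line(program, 10)
remaining = listing_order(program)
```

Keying on the line number makes replacing or deleting a line a single dictionary operation, which suits the interactive mode.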
||||||
|
To use the example, run it as follows:
|
||||||
|
|
||||||
|
% python basic.py hello.bas
|
||||||
|
HELLO WORLD
|
||||||
|
%
|
||||||
|
|
||||||
|
or use it interactively:
|
||||||
|
|
||||||
|
% python basic.py
|
||||||
|
[BASIC] 10 PRINT "HELLO WORLD"
|
||||||
|
[BASIC] 20 END
|
||||||
|
[BASIC] RUN
|
||||||
|
HELLO WORLD
|
||||||
|
[BASIC]
|
||||||
|
|
||||||
|
The following files are defined:
|
||||||
|
|
||||||
|
basic.py - High level script that controls everything
|
||||||
|
basiclex.py - BASIC tokenizer
|
||||||
|
basparse.py - BASIC parser
|
||||||
|
basinterp.py - BASIC interpreter that runs parsed programs.
|
||||||
|
|
||||||
|
In addition, a number of sample BASIC programs (.bas suffix) are
|
||||||
|
provided. These were taken out of the Dartmouth manual.
|
||||||
|
|
||||||
|
Disclaimer: I haven't spent a ton of time testing this and it's likely that
|
||||||
|
I've skimped here and there on a few finer details (e.g., strictly enforcing
|
||||||
|
variable naming rules). However, the interpreter seems to be able to run
|
||||||
|
the examples in the BASIC manual.
|
||||||
|
|
||||||
|
Have fun!
|
||||||
|
|
||||||
|
-Dave
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
68
ext/ply/example/BASIC/basic.py
Normal file
|
@ -0,0 +1,68 @@
|
||||||
|
# An implementation of Dartmouth BASIC (1964)
|
||||||
|
#
|
||||||
|
|
||||||
|
import sys
|
||||||
|
sys.path.insert(0,"../..")
|
||||||
|
|
||||||
|
import basiclex
|
||||||
|
import basparse
|
||||||
|
import basinterp
|
||||||
|
|
||||||
|
# If a filename has been specified, we try to run it.
|
||||||
|
# If a runtime error occurs, we bail out and enter
|
||||||
|
# interactive mode below
|
||||||
|
if len(sys.argv) == 2:
|
||||||
|
data = open(sys.argv[1]).read()
|
||||||
|
prog = basparse.parse(data)
|
||||||
|
if not prog: raise SystemExit
|
||||||
|
b = basinterp.BasicInterpreter(prog)
|
||||||
|
try:
|
||||||
|
b.run()
|
||||||
|
raise SystemExit
|
||||||
|
except RuntimeError:
|
||||||
|
pass
|
||||||
|
|
||||||
|
else:
|
||||||
|
b = basinterp.BasicInterpreter({})
|
||||||
|
|
||||||
|
# Interactive mode. This incrementally adds/deletes statements
|
||||||
|
# from the program stored in the BasicInterpreter object. In
|
||||||
|
# addition, special commands 'NEW','LIST',and 'RUN' are added.
|
||||||
|
# Specifying a line number with no code deletes that line from
|
||||||
|
# the program.
|
||||||
|
|
||||||
|
while 1:
|
||||||
|
try:
|
||||||
|
line = raw_input("[BASIC] ")
|
||||||
|
except EOFError:
|
||||||
|
raise SystemExit
|
||||||
|
if not line: continue
|
||||||
|
line += "\n"
|
||||||
|
prog = basparse.parse(line)
|
||||||
|
if not prog: continue
|
||||||
|
|
||||||
|
keys = prog.keys()
|
||||||
|
if keys[0] > 0:
|
||||||
|
b.add_statements(prog)
|
||||||
|
else:
|
||||||
|
stat = prog[keys[0]]
|
||||||
|
if stat[0] == 'RUN':
|
||||||
|
try:
|
||||||
|
b.run()
|
||||||
|
except RuntimeError:
|
||||||
|
pass
|
||||||
|
elif stat[0] == 'LIST':
|
||||||
|
b.list()
|
||||||
|
elif stat[0] == 'BLANK':
|
||||||
|
b.del_line(stat[1])
|
||||||
|
elif stat[0] == 'NEW':
|
||||||
|
b.new()
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
74
ext/ply/example/BASIC/basiclex.py
Normal file
|
@ -0,0 +1,74 @@
|
||||||
|
# An implementation of Dartmouth BASIC (1964)
|
||||||
|
|
||||||
|
from ply import *
|
||||||
|
|
||||||
|
keywords = (
|
||||||
|
'LET','READ','DATA','PRINT','GOTO','IF','THEN','FOR','NEXT','TO','STEP',
|
||||||
|
'END','STOP','DEF','GOSUB','DIM','REM','RETURN','RUN','LIST','NEW',
|
||||||
|
)
|
||||||
|
|
||||||
|
tokens = keywords + (
|
||||||
|
'EQUALS','PLUS','MINUS','TIMES','DIVIDE','POWER',
|
||||||
|
'LPAREN','RPAREN','LT','LE','GT','GE','NE',
|
||||||
|
'COMMA','SEMI', 'INTEGER','FLOAT', 'STRING',
|
||||||
|
'ID','NEWLINE'
|
||||||
|
)
|
||||||
|
|
||||||
|
t_ignore = ' \t'
|
||||||
|
|
||||||
|
def t_REM(t):
|
||||||
|
r'REM .*'
|
||||||
|
return t
|
||||||
|
|
||||||
|
def t_ID(t):
|
||||||
|
r'[A-Z][A-Z0-9]*'
|
||||||
|
if t.value in keywords:
|
||||||
|
t.type = t.value
|
||||||
|
return t
|
||||||
|
|
||||||
|
t_EQUALS = r'='
|
||||||
|
t_PLUS = r'\+'
|
||||||
|
t_MINUS = r'-'
|
||||||
|
t_TIMES = r'\*'
|
||||||
|
t_POWER = r'\^'
|
||||||
|
t_DIVIDE = r'/'
|
||||||
|
t_LPAREN = r'\('
|
||||||
|
t_RPAREN = r'\)'
|
||||||
|
t_LT = r'<'
|
||||||
|
t_LE = r'<='
|
||||||
|
t_GT = r'>'
|
||||||
|
t_GE = r'>='
|
||||||
|
t_NE = r'<>'
|
||||||
|
t_COMMA = r'\,'
|
||||||
|
t_SEMI = r';'
|
||||||
|
t_INTEGER = r'\d+'
|
||||||
|
t_FLOAT = r'((\d*\.\d+)(E[\+-]?\d+)?|([1-9]\d*E[\+-]?\d+))'
|
||||||
|
t_STRING = r'\".*?\"'
|
||||||
|
|
||||||
|
def t_NEWLINE(t):
|
||||||
|
r'\n'
|
||||||
|
t.lexer.lineno += 1
|
||||||
|
return t
|
||||||
|
|
||||||
|
def t_error(t):
|
||||||
|
print "Illegal character", t.value[0]
|
||||||
|
t.lexer.skip(1)
|
||||||
|
|
||||||
|
lex.lex()
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
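The `t_ID` rule above matches every identifier-shaped word and then promotes reserved words by overwriting the token type. The sketch below shows the same idea without PLY, using only `re`, so it runs without the package installed. The function `classify`, its return shape and the shortened keyword tuple are assumptions for illustration.

```python
import re

# A shortened keyword table; basiclex.py lists the full set.
keywords = ('LET', 'PRINT', 'GOTO', 'IF', 'THEN', 'END')

# Same identifier shape as the t_ID rule, anchored to the whole word.
ident = re.compile(r'[A-Z][A-Z0-9]*\Z')

def classify(word):
    """Return (token_type, value), or None if word is not identifier-shaped."""
    if not ident.match(word):
        return None
    # Reserved words use their own name as the token type; everything
    # else is a plain identifier.
    if word in keywords:
        return (word, word)
    return ('ID', word)

print_tok = classify('PRINT')
var_tok = classify('X1')
bad_tok = classify('1X')
```

One general rule plus a table lookup keeps the pattern set small, compared with writing a separate regular expression for each keyword.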
440
ext/ply/example/BASIC/basinterp.py
Normal file
|
@ -0,0 +1,440 @@
|
||||||
|
# This file provides the runtime support for running a basic program
|
||||||
|
# Assumes the program has been parsed using basparse.py
|
||||||
|
|
||||||
|
import sys
|
||||||
|
import math
|
||||||
|
import random
|
||||||
|
|
||||||
|
class BasicInterpreter:
|
||||||
|
|
||||||
|
# Initialize the interpreter. prog is a dictionary
|
||||||
|
# containing (line,statement) mappings
|
||||||
|
def __init__(self,prog):
|
||||||
|
self.prog = prog
|
||||||
|
|
||||||
|
self.functions = { # Built-in function table
|
||||||
|
'SIN' : lambda z: math.sin(self.eval(z)),
|
||||||
|
'COS' : lambda z: math.cos(self.eval(z)),
|
||||||
|
'TAN' : lambda z: math.tan(self.eval(z)),
|
||||||
|
'ATN' : lambda z: math.atan(self.eval(z)),
|
||||||
|
'EXP' : lambda z: math.exp(self.eval(z)),
|
||||||
|
'ABS' : lambda z: abs(self.eval(z)),
|
||||||
|
'LOG' : lambda z: math.log(self.eval(z)),
|
||||||
|
'SQR' : lambda z: math.sqrt(self.eval(z)),
|
||||||
|
'INT' : lambda z: int(self.eval(z)),
|
||||||
|
'RND' : lambda z: random.random()
|
||||||
|
}
|
||||||
|
|
||||||
|
# Collect all data statements
|
||||||
|
def collect_data(self):
|
||||||
|
self.data = []
|
||||||
|
for lineno in self.stat:
|
||||||
|
if self.prog[lineno][0] == 'DATA':
|
||||||
|
self.data = self.data + self.prog[lineno][1]
|
||||||
|
self.dc = 0 # Initialize the data counter
|
||||||
|
|
||||||
|
# Check for end statements
|
||||||
|
def check_end(self):
|
||||||
|
has_end = 0
|
||||||
|
for lineno in self.stat:
|
||||||
|
if self.prog[lineno][0] == 'END' and not has_end:
|
||||||
|
has_end = lineno
|
||||||
|
if not has_end:
|
||||||
|
print "NO END INSTRUCTION"
|
||||||
|
self.error = 1
|
||||||
|
if has_end != lineno:
|
||||||
|
print "END IS NOT LAST"
|
||||||
|
self.error = 1
|
||||||
|
|
||||||
|
# Check loops
|
||||||
|
def check_loops(self):
|
||||||
|
for pc in range(len(self.stat)):
|
||||||
|
lineno = self.stat[pc]
|
||||||
|
if self.prog[lineno][0] == 'FOR':
|
||||||
|
forinst = self.prog[lineno]
|
||||||
|
loopvar = forinst[1]
|
||||||
|
for i in range(pc+1,len(self.stat)):
|
||||||
|
if self.prog[self.stat[i]][0] == 'NEXT':
|
||||||
|
nextvar = self.prog[self.stat[i]][1]
|
||||||
|
if nextvar != loopvar: continue
|
||||||
|
self.loopend[pc] = i
|
||||||
|
break
|
||||||
|
else:
|
||||||
|
print "FOR WITHOUT NEXT AT LINE %s" % self.stat[pc]
|
||||||
|
self.error = 1
|
||||||
|
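The FOR/NEXT pairing done by `check_loops` can be sketched as a standalone function. The statement tuples here are simplified assumptions (only the statement type and loop variable are kept), and `match_loops` is a hypothetical name; the real method stores its results on the interpreter object.

```python
def match_loops(prog):
    """Pair each FOR with the first later NEXT on the same variable.

    Returns (loopend, unmatched): loopend maps the position of a FOR in
    the sorted line list to the position of its NEXT; unmatched lists the
    line numbers of FOR statements with no NEXT.
    """
    stat = sorted(prog)                  # ordered line numbers
    loopend = {}
    unmatched = []
    for pc, lineno in enumerate(stat):
        if prog[lineno][0] != 'FOR':
            continue
        loopvar = prog[lineno][1]
        for i in range(pc + 1, len(stat)):
            instr = prog[stat[i]]
            if instr[0] == 'NEXT' and instr[1] == loopvar:
                loopend[pc] = i
                break
        else:
            # Inner loop finished without a break: no matching NEXT
            unmatched.append(lineno)
    return loopend, unmatched

sample = {
    10: ('FOR', 'I'), 20: ('FOR', 'J'),
    30: ('NEXT', 'J'), 40: ('NEXT', 'I'),
    50: ('FOR', 'K'), 60: ('END',),
}
loopend, unmatched = match_loops(sample)
```

The `for ... else` form runs the `else` branch only when no `break` occurred, which is what makes the missing-NEXT case easy to detect.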
|
||||||
|
# Evaluate an expression
|
||||||
|
def eval(self,expr):
|
||||||
|
etype = expr[0]
|
||||||
|
if etype == 'NUM': return expr[1]
|
||||||
|
elif etype == 'GROUP': return self.eval(expr[1])
|
||||||
|
elif etype == 'UNARY':
|
||||||
|
if expr[1] == '-': return -self.eval(expr[2])
|
||||||
|
elif etype == 'BINOP':
|
||||||
|
if expr[1] == '+': return self.eval(expr[2])+self.eval(expr[3])
|
||||||
|
elif expr[1] == '-': return self.eval(expr[2])-self.eval(expr[3])
|
||||||
|
elif expr[1] == '*': return self.eval(expr[2])*self.eval(expr[3])
|
||||||
|
elif expr[1] == '/': return float(self.eval(expr[2]))/self.eval(expr[3])
|
||||||
|
elif expr[1] == '^': return abs(self.eval(expr[2]))**self.eval(expr[3])
|
||||||
|
elif etype == 'VAR':
|
||||||
|
var,dim1,dim2 = expr[1]
|
||||||
|
if not dim1 and not dim2:
|
||||||
|
if self.vars.has_key(var):
|
||||||
|
return self.vars[var]
|
||||||
|
else:
|
||||||
|
print "UNDEFINED VARIABLE", var, "AT LINE", self.stat[self.pc]
|
||||||
|
raise RuntimeError
|
||||||
|
# May be a list lookup or a function evaluation
|
||||||
|
if dim1 and not dim2:
|
||||||
|
if self.functions.has_key(var):
|
||||||
|
# A function
|
||||||
|
return self.functions[var](dim1)
|
||||||
|
else:
|
||||||
|
# A list evaluation
|
||||||
|
if self.lists.has_key(var):
|
||||||
|
dim1val = self.eval(dim1)
|
||||||
|
if dim1val < 1 or dim1val > len(self.lists[var]):
|
||||||
|
print "LIST INDEX OUT OF BOUNDS AT LINE", self.stat[self.pc]
|
||||||
|
raise RuntimeError
|
||||||
|
return self.lists[var][dim1val-1]
|
||||||
|
if dim1 and dim2:
|
||||||
|
if self.tables.has_key(var):
|
||||||
|
dim1val = self.eval(dim1)
|
||||||
|
dim2val = self.eval(dim2)
|
||||||
|
if dim1val < 1 or dim1val > len(self.tables[var]) or dim2val < 1 or dim2val > len(self.tables[var][0]):
|
||||||
|
print "TABLE INDEX OUT OF BOUNDS AT LINE", self.stat[self.pc]
|
||||||
|
raise RuntimeError
|
||||||
|
return self.tables[var][dim1val-1][dim2val-1]
|
||||||
|
print "UNDEFINED VARIABLE", var, "AT LINE", self.stat[self.pc]
|
||||||
|
raise RuntimeError
|
||||||
|
|
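The `eval` method above walks nested tuples whose first field names the node type. The sketch below is a reduced standalone version, assuming the node shapes that method reads ('NUM', 'GROUP', 'UNARY', 'BINOP', and 'VAR' with a (name, dim1, dim2) triple). It handles scalar variables and four operators only; the function name `evaluate` and the `variables` argument are assumptions.

```python
import operator

# Division goes through float(), as in the method above.
BINOPS = {
    '+': operator.add,
    '-': operator.sub,
    '*': operator.mul,
    '/': lambda a, b: float(a) / b,
}

def evaluate(expr, variables):
    """Recursively evaluate a tuple-encoded expression (scalars only)."""
    etype = expr[0]
    if etype == 'NUM':
        return expr[1]
    if etype == 'GROUP':
        return evaluate(expr[1], variables)
    if etype == 'UNARY' and expr[1] == '-':
        return -evaluate(expr[2], variables)
    if etype == 'BINOP':
        lhs = evaluate(expr[2], variables)
        rhs = evaluate(expr[3], variables)
        return BINOPS[expr[1]](lhs, rhs)
    if etype == 'VAR':
        name, dim1, dim2 = expr[1]
        if dim1 is None and dim2 is None:
            return variables[name]
    raise RuntimeError("unsupported expression: %r" % (expr,))

# 2 * (X + 3) with X = 4
tree = ('BINOP', '*', ('NUM', 2),
        ('GROUP', ('BINOP', '+', ('VAR', ('X', None, None)), ('NUM', 3))))
product = evaluate(tree, {'X': 4})
half = evaluate(('BINOP', '/', ('NUM', 1), ('NUM', 2)), {})
```

List and table lookups, built-in functions and the `^` operator are left out to keep the sketch short.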
||||||
|
# Evaluate a relational expression
|
||||||
|
def releval(self,expr):
|
||||||
|
etype = expr[1]
|
||||||
|
lhs = self.eval(expr[2])
|
||||||
|
rhs = self.eval(expr[3])
|
||||||
|
if etype == '<':
|
||||||
|
if lhs < rhs: return 1
|
||||||
|
else: return 0
|
||||||
|
|
||||||
|
elif etype == '<=':
|
||||||
|
if lhs <= rhs: return 1
|
||||||
|
else: return 0
|
||||||
|
|
||||||
|
elif etype == '>':
|
||||||
|
if lhs > rhs: return 1
|
||||||
|
else: return 0
|
||||||
|
|
||||||
|
elif etype == '>=':
|
||||||
|
if lhs >= rhs: return 1
|
||||||
|
else: return 0
|
||||||
|
|
||||||
|
elif etype == '=':
|
||||||
|
if lhs == rhs: return 1
|
||||||
|
else: return 0
|
||||||
|
|
||||||
|
elif etype == '<>':
|
||||||
|
if lhs != rhs: return 1
|
||||||
|
else: return 0
|
||||||
|
|
||||||
|
# Assignment
|
||||||
|
def assign(self,target,value):
|
||||||
|
var, dim1, dim2 = target
|
||||||
|
if not dim1 and not dim2:
|
||||||
|
self.vars[var] = self.eval(value)
|
||||||
|
elif dim1 and not dim2:
|
||||||
|
# List assignment
|
||||||
|
dim1val = self.eval(dim1)
|
||||||
|
if not self.lists.has_key(var):
|
||||||
|
self.lists[var] = [0]*10
|
||||||
|
|
||||||
|
if dim1val > len(self.lists[var]):
|
||||||
|
print "DIMENSION TOO LARGE AT LINE", self.stat[self.pc]
|
||||||
|
raise RuntimeError
|
||||||
|
self.lists[var][dim1val-1] = self.eval(value)
|
||||||
|
elif dim1 and dim2:
|
||||||
|
dim1val = self.eval(dim1)
|
||||||
|
dim2val = self.eval(dim2)
|
||||||
|
if not self.tables.has_key(var):
|
||||||
|
temp = [0]*10
|
||||||
|
v = []
|
||||||
|
for i in range(10): v.append(temp[:])
|
||||||
|
self.tables[var] = v
|
||||||
|
# Variable already exists
|
||||||
|
if dim1val > len(self.tables[var]) or dim2val > len(self.tables[var][0]):
|
||||||
|
print "DIMENSION TOO LARGE AT LINE", self.stat[self.pc]
|
||||||
|
raise RuntimeError
|
||||||
|
self.tables[var][dim1val-1][dim2val-1] = self.eval(value)
|
||||||
|
|
||||||
|
    # Change the current line number
    def goto(self,linenum):
        if not self.prog.has_key(linenum):
            print "UNDEFINED LINE NUMBER %d AT LINE %d" % (linenum, self.stat[self.pc])
            raise RuntimeError
        self.pc = self.stat.index(linenum)

    # Run it
    def run(self):
        self.vars = { }  # All variables
        self.lists = { }  # List variables
        self.tables = { }  # Tables
        self.loops = [ ]  # Currently active loops
        self.loopend= { }  # Mapping saying where loops end
        self.gosub = None  # Gosub return point (if any)
        self.error = 0  # Indicates program error

        self.stat = self.prog.keys()  # Ordered list of all line numbers
        self.stat.sort()
        self.pc = 0  # Current program counter

        # Processing prior to running

        self.collect_data()  # Collect all of the data statements
        self.check_end()
        self.check_loops()

        if self.error: raise RuntimeError

        while 1:
            line = self.stat[self.pc]
            instr = self.prog[line]

            op = instr[0]

            # END and STOP statements
            if op == 'END' or op == 'STOP':
                break  # We're done

            # GOTO statement
            elif op == 'GOTO':
                newline = instr[1]
                self.goto(newline)
                continue

            # PRINT statement
            elif op == 'PRINT':
                plist = instr[1]
                out = ""
                for label,val in plist:
                    if out:
                        out += ' '*(15 - (len(out) % 15))
                    out += label
                    if val:
                        if label: out += " "
                        eval = self.eval(val)
                        out += str(eval)
                sys.stdout.write(out)
                end = instr[2]
                if not (end == ',' or end == ';'):
                    sys.stdout.write("\n")
                if end == ',': sys.stdout.write(" "*(15-(len(out) % 15)))
                if end == ';': sys.stdout.write(" "*(3-(len(out) % 3)))

            # LET statement
            elif op == 'LET':
                target = instr[1]
                value = instr[2]
                self.assign(target,value)

            # READ statement
            elif op == 'READ':
                for target in instr[1]:
                    if self.dc < len(self.data):
                        value = ('NUM',self.data[self.dc])
                        self.assign(target,value)
                        self.dc += 1
                    else:
                        # No more data. Program ends
                        return
            elif op == 'IF':
                relop = instr[1]
                newline = instr[2]
                if (self.releval(relop)):
                    self.goto(newline)
                    continue

            elif op == 'FOR':
                loopvar = instr[1]
                initval = instr[2]
                finval = instr[3]
                stepval = instr[4]

                # Check to see if this is a new loop
                if not self.loops or self.loops[-1][0] != self.pc:
                    # Looks like a new loop. Make the initial assignment
                    newvalue = initval
                    self.assign((loopvar,None,None),initval)
                    if not stepval: stepval = ('NUM',1)
                    stepval = self.eval(stepval)  # Evaluate step here
                    self.loops.append((self.pc,stepval))
                else:
                    # It's a repeat of the previous loop
                    # Update the value of the loop variable according to the step
                    stepval = ('NUM',self.loops[-1][1])
                    newvalue = ('BINOP','+',('VAR',(loopvar,None,None)),stepval)

                if self.loops[-1][1] < 0: relop = '>='
                else: relop = '<='
                if not self.releval(('RELOP',relop,newvalue,finval)):
                    # Loop is done. Jump to the NEXT
                    self.pc = self.loopend[self.pc]
                    self.loops.pop()
                else:
                    self.assign((loopvar,None,None),newvalue)

            elif op == 'NEXT':
                if not self.loops:
                    print "NEXT WITHOUT FOR AT LINE",line
                    return

                nextvar = instr[1]
                self.pc = self.loops[-1][0]
                loopinst = self.prog[self.stat[self.pc]]
                forvar = loopinst[1]
                if nextvar != forvar:
                    print "NEXT DOESN'T MATCH FOR AT LINE", line
                    return
                continue
            elif op == 'GOSUB':
                newline = instr[1]
                if self.gosub:
                    print "ALREADY IN A SUBROUTINE AT LINE", line
                    return
                self.gosub = self.stat[self.pc]
                self.goto(newline)
                continue

            elif op == 'RETURN':
                if not self.gosub:
                    print "RETURN WITHOUT A GOSUB AT LINE",line
                    return
                self.goto(self.gosub)
                self.gosub = None

            elif op == 'FUNC':
                fname = instr[1]
                pname = instr[2]
                expr = instr[3]
                def eval_func(pvalue,name=pname,self=self,expr=expr):
                    self.assign((pname,None,None),pvalue)
                    return self.eval(expr)
                self.functions[fname] = eval_func

            elif op == 'DIM':
                for vname,x,y in instr[1]:
                    if y == 0:
                        # Single dimension variable
                        self.lists[vname] = [0]*x
                    else:
                        # Double dimension variable
                        temp = [0]*y
                        v = []
                        for i in range(x):
                            v.append(temp[:])
                        self.tables[vname] = v

            self.pc += 1

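Review note (illustrative only, not part of the commit): the PRINT branch of `run` lays items out in 15-column zones by padding `out` before each new item. A minimal Python 3 sketch of just that string-building step; `format_print` and `ZONE` are hypothetical names, and each item is assumed to be a `(label, value)` pair whose value is already evaluated (`None` when the item has no expression).

```python
ZONE = 15   # column width used by the PRINT branch

def format_print(items):
    """Join (label, value) pairs the way the PRINT branch builds `out`."""
    out = ""
    for label, value in items:
        if out:
            # Pad to the start of the next 15-column zone.
            out += ' ' * (ZONE - (len(out) % ZONE))
        out += label
        if value is not None:
            if label:
                out += " "
            out += str(value)
    return out
```

Because the padding is `15 - (len(out) % 15)`, text that already ends exactly on a zone boundary still receives a full 15 spaces, so one zone is left empty in that case.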
    # Utility functions for program listing
    def expr_str(self,expr):
        etype = expr[0]
        if etype == 'NUM': return str(expr[1])
        elif etype == 'GROUP': return "(%s)" % self.expr_str(expr[1])
        elif etype == 'UNARY':
            # expr[2] is an expression tuple, so format it recursively
            if expr[1] == '-': return "-"+self.expr_str(expr[2])
        elif etype == 'BINOP':
            return "%s %s %s" % (self.expr_str(expr[2]),expr[1],self.expr_str(expr[3]))
        elif etype == 'VAR':
            return self.var_str(expr[1])

    def relexpr_str(self,expr):
        return "%s %s %s" % (self.expr_str(expr[2]),expr[1],self.expr_str(expr[3]))

    def var_str(self,var):
        varname,dim1,dim2 = var
        if not dim1 and not dim2: return varname
        if dim1 and not dim2: return "%s(%s)" % (varname, self.expr_str(dim1))
        return "%s(%s,%s)" % (varname, self.expr_str(dim1),self.expr_str(dim2))

    # Create a program listing
    def list(self):
        stat = self.prog.keys()  # Ordered list of all line numbers
        stat.sort()
        for line in stat:
            instr = self.prog[line]
            op = instr[0]
            if op in ['END','STOP','RETURN']:
                print line, op
                continue
            elif op == 'REM':
                print line, instr[1]
            elif op == 'PRINT':
                print line, op,
                first = 1
                for p in instr[1]:
                    if not first: print ",",
                    if p[0] and p[1]: print '"%s"%s' % (p[0],self.expr_str(p[1])),
                    elif p[1]: print self.expr_str(p[1]),
                    else: print '"%s"' % (p[0],),
                    first = 0
                if instr[2]: print instr[2]
                else: print
            elif op == 'LET':
                print line,"LET",self.var_str(instr[1]),"=",self.expr_str(instr[2])
            elif op == 'READ':
                print line,"READ",
                first = 1
                for r in instr[1]:
                    if not first: print ",",
                    print self.var_str(r),
                    first = 0
                print ""
            elif op == 'IF':
                print line,"IF %s THEN %d" % (self.relexpr_str(instr[1]),instr[2])
            elif op == 'GOTO' or op == 'GOSUB':
                print line, op, instr[1]
            elif op == 'FOR':
                print line,"FOR %s = %s TO %s" % (instr[1],self.expr_str(instr[2]),self.expr_str(instr[3])),
                if instr[4]: print "STEP %s" % (self.expr_str(instr[4])),
                print
            elif op == 'NEXT':
                print line,"NEXT", instr[1]
            elif op == 'FUNC':
                print line,"DEF %s(%s) = %s" % (instr[1],instr[2],self.expr_str(instr[3]))
            elif op == 'DIM':
                print line,"DIM",
                first = 1
                for vname,x,y in instr[1]:
                    if not first: print ",",
                    first = 0
                    if y == 0:
                        print "%s(%d)" % (vname,x),
                    else:
                        print "%s(%d,%d)" % (vname,x,y),

                print
            elif op == 'DATA':
                print line,"DATA",
                first = 1
                for v in instr[1]:
                    if not first: print ",",
                    first = 0
                    print v,
                print

    # Erase the current program
    def new(self):
        self.prog = {}

    # Insert statements
    def add_statements(self,prog):
        for line,stat in prog.items():
            self.prog[line] = stat

    # Delete a statement
    def del_line(self,lineno):
        try:
            del self.prog[lineno]
        except KeyError:
            pass

424
ext/ply/example/BASIC/basparse.py
Normal file
@@ -0,0 +1,424 @@
# An implementation of Dartmouth BASIC (1964)
#

from ply import *
import basiclex

tokens = basiclex.tokens

precedence = (
    ('left', 'PLUS','MINUS'),
    ('left', 'TIMES','DIVIDE'),
    ('left', 'POWER'),
    ('right','UMINUS')
)

#### A BASIC program is a series of statements. We represent the program as a
#### dictionary of tuples indexed by line number.

def p_program(p):
    '''program : program statement
               | statement'''

    if len(p) == 2 and p[1]:
        p[0] = { }
        line,stat = p[1]
        p[0][line] = stat
    elif len(p) ==3:
        p[0] = p[1]
        if not p[0]: p[0] = { }
        if p[2]:
            line,stat = p[2]
            p[0][line] = stat

#### This catch-all rule is used for any catastrophic errors. In this case,
#### we simply return nothing

def p_program_error(p):
    '''program : error'''
    p[0] = None
    p.parser.error = 1

#### Format of all BASIC statements.

def p_statement(p):
    '''statement : INTEGER command NEWLINE'''
    if isinstance(p[2],str):
        print p[2],"AT LINE", p[1]
        p[0] = None
        p.parser.error = 1
    else:
        lineno = int(p[1])
        p[0] = (lineno,p[2])

#### Interactive statements.

def p_statement_interactive(p):
    '''statement : RUN NEWLINE
                 | LIST NEWLINE
                 | NEW NEWLINE'''
    p[0] = (0, (p[1],0))

#### Blank line number
def p_statement_blank(p):
    '''statement : INTEGER NEWLINE'''
    p[0] = (0,('BLANK',int(p[1])))

#### Error handling for malformed statements

def p_statement_bad(p):
    '''statement : INTEGER error NEWLINE'''
    print "MALFORMED STATEMENT AT LINE", p[1]
    p[0] = None
    p.parser.error = 1

#### Blank line

def p_statement_newline(p):
    '''statement : NEWLINE'''
    p[0] = None

#### LET statement

def p_command_let(p):
    '''command : LET variable EQUALS expr'''
    p[0] = ('LET',p[2],p[4])

def p_command_let_bad(p):
    '''command : LET variable EQUALS error'''
    p[0] = "BAD EXPRESSION IN LET"

#### READ statement

def p_command_read(p):
    '''command : READ varlist'''
    p[0] = ('READ',p[2])

def p_command_read_bad(p):
    '''command : READ error'''
    p[0] = "MALFORMED VARIABLE LIST IN READ"

#### DATA statement

def p_command_data(p):
    '''command : DATA numlist'''
    p[0] = ('DATA',p[2])

def p_command_data_bad(p):
    '''command : DATA error'''
    p[0] = "MALFORMED NUMBER LIST IN DATA"

#### PRINT statement

def p_command_print(p):
    '''command : PRINT plist optend'''
    p[0] = ('PRINT',p[2],p[3])

def p_command_print_bad(p):
    '''command : PRINT error'''
    p[0] = "MALFORMED PRINT STATEMENT"

#### Optional ending on PRINT. Either a comma (,) or semicolon (;)

def p_optend(p):
    '''optend : COMMA
              | SEMI
              |'''
    if len(p) == 2:
        p[0] = p[1]
    else:
        p[0] = None

#### PRINT statement with no arguments

def p_command_print_empty(p):
    '''command : PRINT'''
    p[0] = ('PRINT',[],None)

#### GOTO statement

def p_command_goto(p):
    '''command : GOTO INTEGER'''
    p[0] = ('GOTO',int(p[2]))

def p_command_goto_bad(p):
    '''command : GOTO error'''
    p[0] = "INVALID LINE NUMBER IN GOTO"

#### IF-THEN statement

def p_command_if(p):
    '''command : IF relexpr THEN INTEGER'''
    p[0] = ('IF',p[2],int(p[4]))

def p_command_if_bad(p):
    '''command : IF error THEN INTEGER'''
    p[0] = "BAD RELATIONAL EXPRESSION"

def p_command_if_bad2(p):
    '''command : IF relexpr THEN error'''
    p[0] = "INVALID LINE NUMBER IN THEN"

#### FOR statement

def p_command_for(p):
    '''command : FOR ID EQUALS expr TO expr optstep'''
    p[0] = ('FOR',p[2],p[4],p[6],p[7])

def p_command_for_bad_initial(p):
    '''command : FOR ID EQUALS error TO expr optstep'''
    p[0] = "BAD INITIAL VALUE IN FOR STATEMENT"

def p_command_for_bad_final(p):
    '''command : FOR ID EQUALS expr TO error optstep'''
    p[0] = "BAD FINAL VALUE IN FOR STATEMENT"

def p_command_for_bad_step(p):
    '''command : FOR ID EQUALS expr TO expr STEP error'''
    p[0] = "MALFORMED STEP IN FOR STATEMENT"

#### Optional STEP qualifier on FOR statement

def p_optstep(p):
    '''optstep : STEP expr
               | empty'''
    if len(p) == 3:
        p[0] = p[2]
    else:
        p[0] = None

#### NEXT statement

def p_command_next(p):
    '''command : NEXT ID'''

    p[0] = ('NEXT',p[2])

def p_command_next_bad(p):
    '''command : NEXT error'''
    p[0] = "MALFORMED NEXT"

#### END statement

def p_command_end(p):
    '''command : END'''
    p[0] = ('END',)

#### REM statement

def p_command_rem(p):
    '''command : REM'''
    p[0] = ('REM',p[1])

#### STOP statement

def p_command_stop(p):
    '''command : STOP'''
    p[0] = ('STOP',)

#### DEF statement

def p_command_def(p):
    '''command : DEF ID LPAREN ID RPAREN EQUALS expr'''
    p[0] = ('FUNC',p[2],p[4],p[7])

def p_command_def_bad_rhs(p):
    '''command : DEF ID LPAREN ID RPAREN EQUALS error'''
    p[0] = "BAD EXPRESSION IN DEF STATEMENT"

def p_command_def_bad_arg(p):
    '''command : DEF ID LPAREN error RPAREN EQUALS expr'''
    p[0] = "BAD ARGUMENT IN DEF STATEMENT"

#### GOSUB statement

def p_command_gosub(p):
    '''command : GOSUB INTEGER'''
    p[0] = ('GOSUB',int(p[2]))

def p_command_gosub_bad(p):
    '''command : GOSUB error'''
    p[0] = "INVALID LINE NUMBER IN GOSUB"

#### RETURN statement

def p_command_return(p):
    '''command : RETURN'''
    p[0] = ('RETURN',)

#### DIM statement

def p_command_dim(p):
    '''command : DIM dimlist'''
    p[0] = ('DIM',p[2])

def p_command_dim_bad(p):
    '''command : DIM error'''
    p[0] = "MALFORMED VARIABLE LIST IN DIM"

#### List of variables supplied to DIM statement

def p_dimlist(p):
    '''dimlist : dimlist COMMA dimitem
               | dimitem'''
    if len(p) == 4:
        p[0] = p[1]
        p[0].append(p[3])
    else:
        p[0] = [p[1]]

#### DIM items

def p_dimitem_single(p):
    '''dimitem : ID LPAREN INTEGER RPAREN'''
    p[0] = (p[1],eval(p[3]),0)

def p_dimitem_double(p):
    '''dimitem : ID LPAREN INTEGER COMMA INTEGER RPAREN'''
    p[0] = (p[1],eval(p[3]),eval(p[5]))

#### Arithmetic expressions

def p_expr_binary(p):
    '''expr : expr PLUS expr
            | expr MINUS expr
            | expr TIMES expr
            | expr DIVIDE expr
            | expr POWER expr'''

    p[0] = ('BINOP',p[2],p[1],p[3])

def p_expr_number(p):
    '''expr : INTEGER
            | FLOAT'''
    p[0] = ('NUM',eval(p[1]))

def p_expr_variable(p):
    '''expr : variable'''
    p[0] = ('VAR',p[1])

def p_expr_group(p):
    '''expr : LPAREN expr RPAREN'''
    p[0] = ('GROUP',p[2])

def p_expr_unary(p):
    '''expr : MINUS expr %prec UMINUS'''
    p[0] = ('UNARY','-',p[2])

#### Relational expressions

def p_relexpr(p):
    '''relexpr : expr LT expr
               | expr LE expr
               | expr GT expr
               | expr GE expr
               | expr EQUALS expr
               | expr NE expr'''
    p[0] = ('RELOP',p[2],p[1],p[3])

#### Variables

def p_variable(p):
    '''variable : ID
                | ID LPAREN expr RPAREN
                | ID LPAREN expr COMMA expr RPAREN'''
    if len(p) == 2:
        p[0] = (p[1],None,None)
    elif len(p) == 5:
        p[0] = (p[1],p[3],None)
    else:
        p[0] = (p[1],p[3],p[5])

#### Builds a list of variable targets as a Python list

def p_varlist(p):
    '''varlist : varlist COMMA variable
               | variable'''
    if len(p) > 2:
        p[0] = p[1]
        p[0].append(p[3])
    else:
        p[0] = [p[1]]


#### Builds a list of numbers as a Python list

def p_numlist(p):
    '''numlist : numlist COMMA number
               | number'''

    if len(p) > 2:
        p[0] = p[1]
        p[0].append(p[3])
    else:
        p[0] = [p[1]]

#### A number. May be an integer or a float

def p_number(p):
    '''number : INTEGER
              | FLOAT'''
    p[0] = eval(p[1])

#### A signed number.

def p_number_signed(p):
    '''number : MINUS INTEGER
              | MINUS FLOAT'''
    p[0] = eval("-"+p[2])

#### List of targets for a print statement
#### Returns a list of tuples (label,expr)

def p_plist(p):
    '''plist : plist COMMA pitem
             | pitem'''
    if len(p) > 3:
        p[0] = p[1]
        p[0].append(p[3])
    else:
        p[0] = [p[1]]

def p_item_string(p):
    '''pitem : STRING'''
    p[0] = (p[1][1:-1],None)

def p_item_string_expr(p):
    '''pitem : STRING expr'''
    p[0] = (p[1][1:-1],p[2])

def p_item_expr(p):
    '''pitem : expr'''
    p[0] = ("",p[1])

#### Empty

def p_empty(p):
    '''empty : '''

#### Catastrophic error handler
def p_error(p):
    if not p:
        print "SYNTAX ERROR AT EOF"

bparser = yacc.yacc()

def parse(data):
    bparser.error = 0
    p = bparser.parse(data)
    if bparser.error: return None
    return p

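Review note (illustrative only, not part of the commit): the grammar actions in `basparse.py` build nested tuples such as `('BINOP', op, lhs, rhs)` and `('VAR', (name, dim1, dim2))`, and `p_program` collects statements into a dictionary keyed by line number. The sketch below hand-builds the value one would expect for a single statement and walks it with a small evaluator. It is Python 3 and does not import PLY; `eval_expr`, `BINOPS`, and `program` are hypothetical names, the operator strings (in particular `'^'` for POWER) are assumed to match the token text produced by `basiclex`, and subscripted variables are left out.

```python
import operator

# Assumed operator spellings; they stand in for the lexer's token text.
BINOPS = {'+': operator.add, '-': operator.sub, '*': operator.mul,
          '/': operator.truediv, '^': operator.pow}

def eval_expr(expr, variables):
    """Evaluate an expression tuple shaped like the ones basparse.py builds."""
    etype = expr[0]
    if etype == 'NUM':
        return expr[1]
    if etype == 'GROUP':
        return eval_expr(expr[1], variables)
    if etype == 'UNARY':
        return -eval_expr(expr[2], variables)      # only unary minus exists
    if etype == 'BINOP':
        return BINOPS[expr[1]](eval_expr(expr[2], variables),
                               eval_expr(expr[3], variables))
    if etype == 'VAR':
        name, dim1, dim2 = expr[1]
        if dim1 is not None or dim2 is not None:
            raise NotImplementedError("subscripted variables are out of scope")
        return variables[name]
    raise ValueError("unknown expression type %r" % (etype,))

# Hand-built result for the single statement:  10 LET X = (1 + 2) * Y
program = {
    10: ('LET', ('X', None, None),
         ('BINOP', '*',
          ('GROUP', ('BINOP', '+', ('NUM', 1), ('NUM', 2))),
          ('VAR', ('Y', None, None)))),
}
```

For example, unpacking `program[10]` gives the command name, the assignment target, and the expression tree, and `eval_expr(value, {'Y': 4})` reduces that tree to a number.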
14
ext/ply/example/BASIC/dim.bas
Normal file
@@ -0,0 +1,14 @@
5 DIM A(50,15)
10 FOR I = 1 TO 50
20 FOR J = 1 TO 15
30 LET A(I,J) = I + J
35 REM PRINT I,J, A(I,J)
40 NEXT J
50 NEXT I
100 FOR I = 1 TO 50
110 FOR J = 1 TO 15
120 PRINT A(I,J),
130 NEXT J
140 PRINT
150 NEXT I
999 END
5
ext/ply/example/BASIC/func.bas
Normal file
@@ -0,0 +1,5 @@
10 DEF FDX(X) = 2*X
20 FOR I = 0 TO 100
30 PRINT FDX(I)
40 NEXT I
50 END
22
ext/ply/example/BASIC/gcd.bas
Normal file
@@ -0,0 +1,22 @@
10 PRINT "A","B","C","GCD"
20 READ A,B,C
30 LET X = A
40 LET Y = B
50 GOSUB 200
60 LET X = G
70 LET Y = C
80 GOSUB 200
90 PRINT A, B, C, G
100 GOTO 20
110 DATA 60, 90, 120
120 DATA 38456, 64872, 98765
130 DATA 32, 384, 72
200 LET Q = INT(X/Y)
210 LET R = X - Q*Y
220 IF R = 0 THEN 300
230 LET X = Y
240 LET Y = R
250 GOTO 200
300 LET G = Y
310 RETURN
999 END
13
ext/ply/example/BASIC/gosub.bas
Normal file
@@ -0,0 +1,13 @@
100 LET X = 3
110 GOSUB 400
120 PRINT U, V, W
200 LET X = 5
210 GOSUB 400
220 LET Z = U + 2*V + 3*W
230 PRINT Z
240 GOTO 999
400 LET U = X*X
410 LET V = X*X*X
420 LET W = X*X*X*X + X*X*X + X*X + X
430 RETURN
999 END
4
ext/ply/example/BASIC/hello.bas
Normal file
@@ -0,0 +1,4 @@
5 REM HELLO WORLD PROGRAM
10 PRINT "HELLO WORLD"
99 END

17
ext/ply/example/BASIC/linear.bas
Normal file
@@ -0,0 +1,17 @@
1 REM ::: SOLVE A SYSTEM OF LINEAR EQUATIONS
2 REM ::: A1*X1 + A2*X2 = B1
3 REM ::: A3*X1 + A4*X2 = B2
4 REM --------------------------------------
10 READ A1, A2, A3, A4
15 LET D = A1 * A4 - A3 * A2
20 IF D = 0 THEN 65
30 READ B1, B2
37 LET X1 = (B1*A4 - B2*A2) / D
42 LET X2 = (A1*B2 - A3*B1) / D
55 PRINT X1, X2
60 GOTO 30
65 PRINT "NO UNIQUE SOLUTION"
70 DATA 1, 2, 4
80 DATA 2, -7, 5
85 DATA 1, 3, 4, -7
90 END
12
ext/ply/example/BASIC/maxsin.bas
Normal file
@@ -0,0 +1,12 @@
5 PRINT "X VALUE", "SINE", "RESOLUTION"
10 READ D
20 LET M = -1
30 FOR X = 0 TO 3 STEP D
40 IF SIN(X) <= M THEN 80
50 LET X0 = X
60 LET M = SIN(X)
80 NEXT X
85 PRINT X0, M, D
90 GOTO 10
100 DATA .1, .01, .001
110 END
13
ext/ply/example/BASIC/powers.bas
Normal file
@@ -0,0 +1,13 @@
5 PRINT "THIS PROGRAM COMPUTES AND PRINTS THE NTH POWERS"
6 PRINT "OF THE NUMBERS LESS THAN OR EQUAL TO N FOR VARIOUS"
7 PRINT "N FROM 1 THROUGH 7"
8 PRINT
10 FOR N = 1 TO 7
15 PRINT "N = "N
20 FOR I = 1 TO N
30 PRINT I^N,
40 NEXT I
50 PRINT
60 PRINT
70 NEXT N
80 END
4
ext/ply/example/BASIC/rand.bas
Normal file
@@ -0,0 +1,4 @@
10 FOR I = 1 TO 20
20 PRINT INT(10*RND(0))
30 NEXT I
40 END
20
ext/ply/example/BASIC/sales.bas
Normal file
@@ -0,0 +1,20 @@
10 FOR I = 1 TO 3
20 READ P(I)
30 NEXT I
40 FOR I = 1 TO 3
50 FOR J = 1 TO 5
60 READ S(I,J)
70 NEXT J
80 NEXT I
90 FOR J = 1 TO 5
100 LET S = 0
110 FOR I = 1 TO 3
120 LET S = S + P(I) * S(I,J)
130 NEXT I
140 PRINT "TOTAL SALES FOR SALESMAN"J, "$"S
150 NEXT J
200 DATA 1.25, 4.30, 2.50
210 DATA 40, 20, 37, 29, 42
220 DATA 10, 16, 3, 21, 8
230 DATA 35, 47, 29, 16, 33
300 END
18
ext/ply/example/BASIC/sears.bas
Normal file
@@ -0,0 +1,18 @@
1 REM :: THIS PROGRAM COMPUTES HOW MANY TIMES YOU HAVE TO FOLD
2 REM :: A PIECE OF PAPER SO THAT IT IS TALLER THAN THE
3 REM :: SEARS TOWER.
4 REM :: S = HEIGHT OF TOWER (METERS)
5 REM :: T = THICKNESS OF PAPER (MILLIMETERS)
10 LET S = 442
20 LET T = 0.1
30 REM CONVERT T TO METERS
40 LET T = T * .001
50 LET F = 1
60 LET H = T
100 IF H > S THEN 200
120 LET H = 2 * H
125 LET F = F + 1
130 GOTO 100
200 PRINT "NUMBER OF FOLDS ="F
220 PRINT "FINAL HEIGHT ="H
999 END
5
ext/ply/example/BASIC/sqrt1.bas
Normal file
@@ -0,0 +1,5 @@
10 LET X = 0
20 LET X = X + 1
30 PRINT X, SQR(X)
40 IF X < 100 THEN 20
50 END
4
ext/ply/example/BASIC/sqrt2.bas
Normal file
@@ -0,0 +1,4 @@
10 FOR X = 1 TO 100
20 PRINT X, SQR(X)
30 NEXT X
40 END
709
ext/ply/example/GardenSnake/GardenSnake.py
Normal file
@@ -0,0 +1,709 @@
# GardenSnake - a parser generator demonstration program
#
# This implements a modified version of a subset of Python:
#  - only 'def', 'return' and 'if' statements
#  - 'if' only has 'then' clause (no elif nor else)
#  - single-quoted strings only, content in raw format
#  - numbers are decimal.Decimal instances (not integers or floats)
#  - no print statement; use the built-in 'print' function
#  - only < > == + - / * implemented (and unary + -)
#  - assignment and tuple assignment work
#  - no generators of any sort
#  - no ... well, no quite a lot

# Why? I'm thinking about a new indentation-based configuration
# language for a project and wanted to figure out how to do it. Once
# I got that working I needed a way to test it out. My original AST
# was dumb so I decided to target Python's AST and compile it into
# Python code. Plus, it's pretty cool that it only took a day or so
# from sitting down with Ply to having working code.

# This uses David Beazley's Ply from http://www.dabeaz.com/ply/

# This work is hereby released into the Public Domain. To view a copy of
# the public domain dedication, visit
# http://creativecommons.org/licenses/publicdomain/ or send a letter to
# Creative Commons, 543 Howard Street, 5th Floor, San Francisco,
# California, 94105, USA.
#
# Portions of this work are derived from Python's Grammar definition
# and may be covered under the Python copyright and license
#
# Andrew Dalke / Dalke Scientific Software, LLC
# 30 August 2006 / Cape Town, South Africa

# Changelog:
# 30 August - added link to CC license; removed the "swapcase" encoding

# Modifications for inclusion in PLY distribution
import sys
sys.path.insert(0,"../..")
from ply import *

##### Lexer ######
#import lex
import decimal

tokens = (
    'DEF',
    'IF',
    'NAME',
    'NUMBER',  # Python decimals
    'STRING',  # single quoted strings only; syntax of raw strings
    'LPAR',
    'RPAR',
    'COLON',
    'EQ',
    'ASSIGN',
    'LT',
    'GT',
    'PLUS',
    'MINUS',
    'MULT',
    'DIV',
    'RETURN',
    'WS',
    'NEWLINE',
    'COMMA',
    'SEMICOLON',
    'INDENT',
    'DEDENT',
    'ENDMARKER',
    )

#t_NUMBER = r'\d+'
# taken from decimal.py but without the leading sign
def t_NUMBER(t):
    r"""(\d+(\.\d*)?|\.\d+)([eE][-+]? \d+)?"""
    t.value = decimal.Decimal(t.value)
    return t

def t_STRING(t):
    r"'([^\\']+|\\'|\\\\)*'"  # I think this is right ...
    t.value=t.value[1:-1].decode("string-escape") # .swapcase() # for fun
    return t

t_COLON = r':'
t_EQ = r'=='
t_ASSIGN = r'='
t_LT = r'<'
t_GT = r'>'
t_PLUS = r'\+'
t_MINUS = r'-'
t_MULT = r'\*'
t_DIV = r'/'
t_COMMA = r','
t_SEMICOLON = r';'

# Ply nicely documented how to do this.

RESERVED = {
    "def": "DEF",
    "if": "IF",
    "return": "RETURN",
    }

def t_NAME(t):
    r'[a-zA-Z_][a-zA-Z0-9_]*'
    t.type = RESERVED.get(t.value, "NAME")
    return t

# Putting this before t_WS lets it consume lines with only comments in
# them so the latter code never sees the WS part. Not consuming the
# newline. Needed for "if 1: #comment"
def t_comment(t):
    r"[ ]*\043[^\n]*"  # \043 is '#'
    pass


# Whitespace
|
||||||
|
def t_WS(t):
|
||||||
|
r' [ ]+ '
|
||||||
|
if t.lexer.at_line_start and t.lexer.paren_count == 0:
|
||||||
|
return t
|
||||||
|
|
||||||
|
# Don't generate newline tokens when inside parentheses, e.g.
|
||||||
|
# a = (1,
|
||||||
|
# 2, 3)
|
||||||
|
def t_newline(t):
|
||||||
|
r'\n+'
|
||||||
|
t.lexer.lineno += len(t.value)
|
||||||
|
t.type = "NEWLINE"
|
||||||
|
if t.lexer.paren_count == 0:
|
||||||
|
return t
|
||||||
|
|
||||||
|
def t_LPAR(t):
|
||||||
|
r'\('
|
||||||
|
t.lexer.paren_count += 1
|
||||||
|
return t
|
||||||
|
|
||||||
|
def t_RPAR(t):
|
||||||
|
r'\)'
|
||||||
|
# check for underflow? should be the job of the parser
|
||||||
|
t.lexer.paren_count -= 1
|
||||||
|
return t
|
||||||
|
|
||||||
|
|
||||||
|
def t_error(t):
|
||||||
|
raise SyntaxError("Unknown symbol %r" % (t.value[0],))
|
||||||
|
print "Skipping", repr(t.value[0])
|
||||||
|
t.lexer.skip(1)
|
||||||
|
|
||||||
|
## I implemented INDENT / DEDENT generation as a post-processing filter
|
||||||
|
|
||||||
|
# The original lex token stream contains WS and NEWLINE tokens.
|
||||||
|
# WS will only occur before any other tokens on a line.
|
||||||
|
|
||||||
|
# I have three filters. One tags tokens by adding two attributes.
|
||||||
|
# "must_indent" is True if the token must be indented from the
|
||||||
|
# previous code. The other is "at_line_start" which is True for WS
|
||||||
|
# and the first non-WS/non-NEWLINE on a line. It flags the check to
|
||||||
|
# see if the new line has changed indentation level.
|
||||||
|
|
||||||
|
# Python's syntax has three INDENT states
|
||||||
|
# 0) no colon hence no need to indent
|
||||||
|
# 1) "if 1: go()" - simple statements have a COLON but no need for an indent
|
||||||
|
# 2) "if 1:\n go()" - complex statements have a COLON NEWLINE and must indent
|
||||||
|
NO_INDENT = 0
|
||||||
|
MAY_INDENT = 1
|
||||||
|
MUST_INDENT = 2
|
||||||
|
|
||||||
|
# only care about whitespace at the start of a line
|
||||||
|
def track_tokens_filter(lexer, tokens):
|
||||||
|
lexer.at_line_start = at_line_start = True
|
||||||
|
indent = NO_INDENT
|
||||||
|
saw_colon = False
|
||||||
|
for token in tokens:
|
||||||
|
token.at_line_start = at_line_start
|
||||||
|
|
||||||
|
if token.type == "COLON":
|
||||||
|
at_line_start = False
|
||||||
|
indent = MAY_INDENT
|
||||||
|
token.must_indent = False
|
||||||
|
|
||||||
|
elif token.type == "NEWLINE":
|
||||||
|
at_line_start = True
|
||||||
|
if indent == MAY_INDENT:
|
||||||
|
indent = MUST_INDENT
|
||||||
|
token.must_indent = False
|
||||||
|
|
||||||
|
elif token.type == "WS":
|
||||||
|
assert token.at_line_start == True
|
||||||
|
at_line_start = True
|
||||||
|
token.must_indent = False
|
||||||
|
|
||||||
|
else:
|
||||||
|
# A real token; only indent after COLON NEWLINE
|
||||||
|
if indent == MUST_INDENT:
|
||||||
|
token.must_indent = True
|
||||||
|
else:
|
||||||
|
token.must_indent = False
|
||||||
|
at_line_start = False
|
||||||
|
indent = NO_INDENT
|
||||||
|
|
||||||
|
yield token
|
||||||
|
lexer.at_line_start = at_line_start
|
||||||
|
|
||||||
|
def _new_token(type, lineno):
|
||||||
|
tok = lex.LexToken()
|
||||||
|
tok.type = type
|
||||||
|
tok.value = None
|
||||||
|
tok.lineno = lineno
|
||||||
|
return tok
|
||||||
|
|
||||||
|
# Synthesize a DEDENT tag
|
||||||
|
def DEDENT(lineno):
|
||||||
|
return _new_token("DEDENT", lineno)
|
||||||
|
|
||||||
|
# Synthesize an INDENT tag
|
||||||
|
def INDENT(lineno):
|
||||||
|
return _new_token("INDENT", lineno)
|
||||||
|
|
||||||
|
|
||||||
|
# Track the indentation level and emit the right INDENT / DEDENT events.
|
||||||
|
def indentation_filter(tokens):
|
||||||
|
# A stack of indentation levels; will never pop item 0
|
||||||
|
levels = [0]
|
||||||
|
token = None
|
||||||
|
depth = 0
|
||||||
|
prev_was_ws = False
|
||||||
|
for token in tokens:
|
||||||
|
## if 1:
|
||||||
|
## print "Process", token,
|
||||||
|
## if token.at_line_start:
|
||||||
|
## print "at_line_start",
|
||||||
|
## if token.must_indent:
|
||||||
|
## print "must_indent",
|
||||||
|
## print
|
||||||
|
|
||||||
|
# WS only occurs at the start of the line
|
||||||
|
# There may be WS followed by NEWLINE so
|
||||||
|
# only track the depth here. Don't indent/dedent
|
||||||
|
# until there's something real.
|
||||||
|
if token.type == "WS":
|
||||||
|
assert depth == 0
|
||||||
|
depth = len(token.value)
|
||||||
|
prev_was_ws = True
|
||||||
|
# WS tokens are never passed to the parser
|
||||||
|
continue
|
||||||
|
|
||||||
|
if token.type == "NEWLINE":
|
||||||
|
depth = 0
|
||||||
|
if prev_was_ws or token.at_line_start:
|
||||||
|
# ignore blank lines
|
||||||
|
continue
|
||||||
|
# pass the other cases on through
|
||||||
|
yield token
|
||||||
|
continue
|
||||||
|
|
||||||
|
# then it must be a real token (not WS, not NEWLINE)
|
||||||
|
# which can affect the indentation level
|
||||||
|
|
||||||
|
prev_was_ws = False
|
||||||
|
if token.must_indent:
|
||||||
|
# The current depth must be larger than the previous level
|
||||||
|
if not (depth > levels[-1]):
|
||||||
|
raise IndentationError("expected an indented block")
|
||||||
|
|
||||||
|
levels.append(depth)
|
||||||
|
yield INDENT(token.lineno)
|
||||||
|
|
||||||
|
elif token.at_line_start:
|
||||||
|
# Must be on the same level or one of the previous levels
|
||||||
|
if depth == levels[-1]:
|
||||||
|
# At the same level
|
||||||
|
pass
|
||||||
|
elif depth > levels[-1]:
|
||||||
|
raise IndentationError("indentation increase but not in new block")
|
||||||
|
else:
|
||||||
|
# Back up; but only if it matches a previous level
|
||||||
|
try:
|
||||||
|
i = levels.index(depth)
|
||||||
|
except ValueError:
|
||||||
|
raise IndentationError("inconsistent indentation")
|
||||||
|
for _ in range(i+1, len(levels)):
|
||||||
|
yield DEDENT(token.lineno)
|
||||||
|
levels.pop()
|
||||||
|
|
||||||
|
yield token
|
||||||
|
|
||||||
|
### Finished processing ###
|
||||||
|
|
||||||
|
# Must dedent any remaining levels
|
||||||
|
if len(levels) > 1:
|
||||||
|
assert token is not None
|
||||||
|
for _ in range(1, len(levels)):
|
||||||
|
yield DEDENT(token.lineno)
|
||||||
|
|
||||||
|
|
||||||
|
# The top-level filter adds an ENDMARKER, if requested.
|
||||||
|
# Python's grammar uses it.
|
||||||
|
def filter(lexer, add_endmarker = True):
|
||||||
|
token = None
|
||||||
|
tokens = iter(lexer.token, None)
|
||||||
|
tokens = track_tokens_filter(lexer, tokens)
|
||||||
|
for token in indentation_filter(tokens):
|
||||||
|
yield token
|
||||||
|
|
||||||
|
if add_endmarker:
|
||||||
|
lineno = 1
|
||||||
|
if token is not None:
|
||||||
|
lineno = token.lineno
|
||||||
|
yield _new_token("ENDMARKER", lineno)
|
||||||
|
|
||||||
|
# Combine Ply and my filters into a new lexer
|
||||||
|
|
||||||
|
class IndentLexer(object):
|
||||||
|
def __init__(self, debug=0, optimize=0, lextab='lextab', reflags=0):
|
||||||
|
self.lexer = lex.lex(debug=debug, optimize=optimize, lextab=lextab, reflags=reflags)
|
||||||
|
self.token_stream = None
|
||||||
|
def input(self, s, add_endmarker=True):
|
||||||
|
self.lexer.paren_count = 0
|
||||||
|
self.lexer.input(s)
|
||||||
|
self.token_stream = filter(self.lexer, add_endmarker)
|
||||||
|
def token(self):
|
||||||
|
try:
|
||||||
|
return self.token_stream.next()
|
||||||
|
except StopIteration:
|
||||||
|
return None
|
||||||
|
|
||||||
|
########## Parser (tokens -> AST) ######
|
||||||
|
|
||||||
|
# also part of Ply
|
||||||
|
#import yacc
|
||||||
|
|
||||||
|
# I use the Python AST
|
||||||
|
from compiler import ast
|
||||||
|
|
||||||
|
# Helper function
|
||||||
|
def Assign(left, right):
|
||||||
|
names = []
|
||||||
|
if isinstance(left, ast.Name):
|
||||||
|
# Single assignment on left
|
||||||
|
return ast.Assign([ast.AssName(left.name, 'OP_ASSIGN')], right)
|
||||||
|
elif isinstance(left, ast.Tuple):
|
||||||
|
# List of things - make sure they are Name nodes
|
||||||
|
names = []
|
||||||
|
for child in left.getChildren():
|
||||||
|
if not isinstance(child, ast.Name):
|
||||||
|
raise SyntaxError("that assignment not supported")
|
||||||
|
names.append(child.name)
|
||||||
|
ass_list = [ast.AssName(name, 'OP_ASSIGN') for name in names]
|
||||||
|
return ast.Assign([ast.AssTuple(ass_list)], right)
|
||||||
|
else:
|
||||||
|
raise SyntaxError("Can't do that yet")
|
||||||
|
|
||||||
|
|
||||||
|
# The grammar comments come from Python's Grammar/Grammar file
|
||||||
|
|
||||||
|
## NB: compound_stmt in single_input is followed by extra NEWLINE!
|
||||||
|
# file_input: (NEWLINE | stmt)* ENDMARKER
|
||||||
|
def p_file_input_end(p):
|
||||||
|
"""file_input_end : file_input ENDMARKER"""
|
||||||
|
p[0] = ast.Stmt(p[1])
|
||||||
|
def p_file_input(p):
|
||||||
|
"""file_input : file_input NEWLINE
|
||||||
|
| file_input stmt
|
||||||
|
| NEWLINE
|
||||||
|
| stmt"""
|
||||||
|
if isinstance(p[len(p)-1], basestring):
|
||||||
|
if len(p) == 3:
|
||||||
|
p[0] = p[1]
|
||||||
|
else:
|
||||||
|
p[0] = [] # len(p) == 2 --> only a blank line
|
||||||
|
else:
|
||||||
|
if len(p) == 3:
|
||||||
|
p[0] = p[1] + p[2]
|
||||||
|
else:
|
||||||
|
p[0] = p[1]
|
||||||
|
|
||||||
|
|
||||||
|
# funcdef: [decorators] 'def' NAME parameters ':' suite
|
||||||
|
# ignoring decorators
|
||||||
|
def p_funcdef(p):
|
||||||
|
"funcdef : DEF NAME parameters COLON suite"
|
||||||
|
p[0] = ast.Function(None, p[2], tuple(p[3]), (), 0, None, p[5])
|
||||||
|
|
||||||
|
# parameters: '(' [varargslist] ')'
|
||||||
|
def p_parameters(p):
|
||||||
|
"""parameters : LPAR RPAR
|
||||||
|
| LPAR varargslist RPAR"""
|
||||||
|
if len(p) == 3:
|
||||||
|
p[0] = []
|
||||||
|
else:
|
||||||
|
p[0] = p[2]
|
||||||
|
|
||||||
|
|
||||||
|
# varargslist: (fpdef ['=' test] ',')* ('*' NAME [',' '**' NAME] | '**' NAME) |
|
||||||
|
# highly simplified
|
||||||
|
def p_varargslist(p):
|
||||||
|
"""varargslist : varargslist COMMA NAME
|
||||||
|
| NAME"""
|
||||||
|
if len(p) == 4:
|
||||||
|
p[0] = p[1] + [p[3]]  # p[3] is a single NAME string; wrap it before appending to the list
|
||||||
|
else:
|
||||||
|
p[0] = [p[1]]
|
||||||
|
|
||||||
|
# stmt: simple_stmt | compound_stmt
|
||||||
|
def p_stmt_simple(p):
|
||||||
|
"""stmt : simple_stmt"""
|
||||||
|
# simple_stmt is a list
|
||||||
|
p[0] = p[1]
|
||||||
|
|
||||||
|
def p_stmt_compound(p):
|
||||||
|
"""stmt : compound_stmt"""
|
||||||
|
p[0] = [p[1]]
|
||||||
|
|
||||||
|
# simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE
|
||||||
|
def p_simple_stmt(p):
|
||||||
|
"""simple_stmt : small_stmts NEWLINE
|
||||||
|
| small_stmts SEMICOLON NEWLINE"""
|
||||||
|
p[0] = p[1]
|
||||||
|
|
||||||
|
def p_small_stmts(p):
|
||||||
|
"""small_stmts : small_stmts SEMICOLON small_stmt
|
||||||
|
| small_stmt"""
|
||||||
|
if len(p) == 4:
|
||||||
|
p[0] = p[1] + [p[3]]
|
||||||
|
else:
|
||||||
|
p[0] = [p[1]]
|
||||||
|
|
||||||
|
# small_stmt: expr_stmt | print_stmt | del_stmt | pass_stmt | flow_stmt |
|
||||||
|
# import_stmt | global_stmt | exec_stmt | assert_stmt
|
||||||
|
def p_small_stmt(p):
|
||||||
|
"""small_stmt : flow_stmt
|
||||||
|
| expr_stmt"""
|
||||||
|
p[0] = p[1]
|
||||||
|
|
||||||
|
# expr_stmt: testlist (augassign (yield_expr|testlist) |
|
||||||
|
# ('=' (yield_expr|testlist))*)
|
||||||
|
# augassign: ('+=' | '-=' | '*=' | '/=' | '%=' | '&=' | '|=' | '^=' |
|
||||||
|
# '<<=' | '>>=' | '**=' | '//=')
|
||||||
|
def p_expr_stmt(p):
|
||||||
|
"""expr_stmt : testlist ASSIGN testlist
|
||||||
|
| testlist """
|
||||||
|
if len(p) == 2:
|
||||||
|
# a list of expressions
|
||||||
|
p[0] = ast.Discard(p[1])
|
||||||
|
else:
|
||||||
|
p[0] = Assign(p[1], p[3])
|
||||||
|
|
||||||
|
def p_flow_stmt(p):
|
||||||
|
"flow_stmt : return_stmt"
|
||||||
|
p[0] = p[1]
|
||||||
|
|
||||||
|
# return_stmt: 'return' [testlist]
|
||||||
|
def p_return_stmt(p):
|
||||||
|
"return_stmt : RETURN testlist"
|
||||||
|
p[0] = ast.Return(p[2])
|
||||||
|
|
||||||
|
|
||||||
|
def p_compound_stmt(p):
|
||||||
|
"""compound_stmt : if_stmt
|
||||||
|
| funcdef"""
|
||||||
|
p[0] = p[1]
|
||||||
|
|
||||||
|
def p_if_stmt(p):
|
||||||
|
'if_stmt : IF test COLON suite'
|
||||||
|
p[0] = ast.If([(p[2], p[4])], None)
|
||||||
|
|
||||||
|
def p_suite(p):
|
||||||
|
"""suite : simple_stmt
|
||||||
|
| NEWLINE INDENT stmts DEDENT"""
|
||||||
|
if len(p) == 2:
|
||||||
|
p[0] = ast.Stmt(p[1])
|
||||||
|
else:
|
||||||
|
p[0] = ast.Stmt(p[3])
|
||||||
|
|
||||||
|
|
||||||
|
def p_stmts(p):
|
||||||
|
"""stmts : stmts stmt
|
||||||
|
| stmt"""
|
||||||
|
if len(p) == 3:
|
||||||
|
p[0] = p[1] + p[2]
|
||||||
|
else:
|
||||||
|
p[0] = p[1]
|
||||||
|
|
||||||
|
## Not using Python's approach because Ply supports precedence
|
||||||
|
|
||||||
|
# comparison: expr (comp_op expr)*
|
||||||
|
# arith_expr: term (('+'|'-') term)*
|
||||||
|
# term: factor (('*'|'/'|'%'|'//') factor)*
|
||||||
|
# factor: ('+'|'-'|'~') factor | power
|
||||||
|
# comp_op: '<'|'>'|'=='|'>='|'<='|'<>'|'!='|'in'|'not' 'in'|'is'|'is' 'not'
|
||||||
|
|
||||||
|
def make_lt_compare((left, right)):
|
||||||
|
return ast.Compare(left, [('<', right),])
|
||||||
|
def make_gt_compare((left, right)):
|
||||||
|
return ast.Compare(left, [('>', right),])
|
||||||
|
def make_eq_compare((left, right)):
|
||||||
|
return ast.Compare(left, [('==', right),])
|
||||||
|
|
||||||
|
|
||||||
|
binary_ops = {
|
||||||
|
"+": ast.Add,
|
||||||
|
"-": ast.Sub,
|
||||||
|
"*": ast.Mul,
|
||||||
|
"/": ast.Div,
|
||||||
|
"<": make_lt_compare,
|
||||||
|
">": make_gt_compare,
|
||||||
|
"==": make_eq_compare,
|
||||||
|
}
|
||||||
|
unary_ops = {
|
||||||
|
"+": ast.UnaryAdd,
|
||||||
|
"-": ast.UnarySub,
|
||||||
|
}
|
||||||
|
precedence = (
|
||||||
|
("left", "EQ", "GT", "LT"),
|
||||||
|
("left", "PLUS", "MINUS"),
|
||||||
|
("left", "MULT", "DIV"),
|
||||||
|
)
|
||||||
|
|
||||||
|
def p_comparison(p):
|
||||||
|
"""comparison : comparison PLUS comparison
|
||||||
|
| comparison MINUS comparison
|
||||||
|
| comparison MULT comparison
|
||||||
|
| comparison DIV comparison
|
||||||
|
| comparison LT comparison
|
||||||
|
| comparison EQ comparison
|
||||||
|
| comparison GT comparison
|
||||||
|
| PLUS comparison
|
||||||
|
| MINUS comparison
|
||||||
|
| power"""
|
||||||
|
if len(p) == 4:
|
||||||
|
p[0] = binary_ops[p[2]]((p[1], p[3]))
|
||||||
|
elif len(p) == 3:
|
||||||
|
p[0] = unary_ops[p[1]](p[2])
|
||||||
|
else:
|
||||||
|
p[0] = p[1]
|
||||||
|
|
||||||
|
# power: atom trailer* ['**' factor]
|
||||||
|
# trailers enables function calls. I only allow one level of calls
|
||||||
|
# so this is 'trailer'
|
||||||
|
def p_power(p):
|
||||||
|
"""power : atom
|
||||||
|
| atom trailer"""
|
||||||
|
if len(p) == 2:
|
||||||
|
p[0] = p[1]
|
||||||
|
else:
|
||||||
|
if p[2][0] == "CALL":
|
||||||
|
p[0] = ast.CallFunc(p[1], p[2][1], None, None)
|
||||||
|
else:
|
||||||
|
raise AssertionError("not implemented")
|
||||||
|
|
||||||
|
def p_atom_name(p):
|
||||||
|
"""atom : NAME"""
|
||||||
|
p[0] = ast.Name(p[1])
|
||||||
|
|
||||||
|
def p_atom_number(p):
|
||||||
|
"""atom : NUMBER
|
||||||
|
| STRING"""
|
||||||
|
p[0] = ast.Const(p[1])
|
||||||
|
|
||||||
|
def p_atom_tuple(p):
|
||||||
|
"""atom : LPAR testlist RPAR"""
|
||||||
|
p[0] = p[2]
|
||||||
|
|
||||||
|
# trailer: '(' [arglist] ')' | '[' subscriptlist ']' | '.' NAME
|
||||||
|
def p_trailer(p):
|
||||||
|
"trailer : LPAR arglist RPAR"
|
||||||
|
p[0] = ("CALL", p[2])
|
||||||
|
|
||||||
|
# testlist: test (',' test)* [',']
|
||||||
|
# Contains shift/reduce error
|
||||||
|
def p_testlist(p):
|
||||||
|
"""testlist : testlist_multi COMMA
|
||||||
|
| testlist_multi """
|
||||||
|
if len(p) == 2:
|
||||||
|
p[0] = p[1]
|
||||||
|
else:
|
||||||
|
# May need to promote singleton to tuple
|
||||||
|
if isinstance(p[1], list):
|
||||||
|
p[0] = p[1]
|
||||||
|
else:
|
||||||
|
p[0] = [p[1]]
|
||||||
|
# Convert into a tuple?
|
||||||
|
if isinstance(p[0], list):
|
||||||
|
p[0] = ast.Tuple(p[0])
|
||||||
|
|
||||||
|
def p_testlist_multi(p):
|
||||||
|
"""testlist_multi : testlist_multi COMMA test
|
||||||
|
| test"""
|
||||||
|
if len(p) == 2:
|
||||||
|
# singleton
|
||||||
|
p[0] = p[1]
|
||||||
|
else:
|
||||||
|
if isinstance(p[1], list):
|
||||||
|
p[0] = p[1] + [p[3]]
|
||||||
|
else:
|
||||||
|
# singleton -> tuple
|
||||||
|
p[0] = [p[1], p[3]]
|
||||||
|
|
||||||
|
|
||||||
|
# test: or_test ['if' or_test 'else' test] | lambdef
|
||||||
|
# as I don't support 'and', 'or', and 'not' this works down to 'comparison'
|
||||||
|
def p_test(p):
|
||||||
|
"test : comparison"
|
||||||
|
p[0] = p[1]
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
# arglist: (argument ',')* (argument [',']| '*' test [',' '**' test] | '**' test)
|
||||||
|
# XXX INCOMPLETE: this doesn't allow the trailing comma
|
||||||
|
def p_arglist(p):
|
||||||
|
"""arglist : arglist COMMA argument
|
||||||
|
| argument"""
|
||||||
|
if len(p) == 4:
|
||||||
|
p[0] = p[1] + [p[3]]
|
||||||
|
else:
|
||||||
|
p[0] = [p[1]]
|
||||||
|
|
||||||
|
# argument: test [gen_for] | test '=' test # Really [keyword '='] test
|
||||||
|
def p_argument(p):
|
||||||
|
"argument : test"
|
||||||
|
p[0] = p[1]
|
||||||
|
|
||||||
|
def p_error(p):
|
||||||
|
#print "Error!", repr(p)
|
||||||
|
raise SyntaxError(p)
|
||||||
|
|
||||||
|
|
||||||
|
class GardenSnakeParser(object):
|
||||||
|
def __init__(self, lexer = None):
|
||||||
|
if lexer is None:
|
||||||
|
lexer = IndentLexer()
|
||||||
|
self.lexer = lexer
|
||||||
|
self.parser = yacc.yacc(start="file_input_end")
|
||||||
|
|
||||||
|
def parse(self, code):
|
||||||
|
self.lexer.input(code)
|
||||||
|
result = self.parser.parse(lexer = self.lexer)
|
||||||
|
return ast.Module(None, result)
|
||||||
|
|
||||||
|
|
||||||
|
###### Code generation ######
|
||||||
|
|
||||||
|
from compiler import misc, syntax, pycodegen
|
||||||
|
|
||||||
|
class GardenSnakeCompiler(object):
|
||||||
|
def __init__(self):
|
||||||
|
self.parser = GardenSnakeParser()
|
||||||
|
def compile(self, code, filename="<string>"):
|
||||||
|
tree = self.parser.parse(code)
|
||||||
|
#print tree
|
||||||
|
misc.set_filename(filename, tree)
|
||||||
|
syntax.check(tree)
|
||||||
|
gen = pycodegen.ModuleCodeGenerator(tree)
|
||||||
|
code = gen.getCode()
|
||||||
|
return code
|
||||||
|
|
||||||
|
####### Test code #######
|
||||||
|
|
||||||
|
compile = GardenSnakeCompiler().compile
|
||||||
|
|
||||||
|
code = r"""
|
||||||
|
|
||||||
|
print('LET\'S TRY THIS \\OUT')
|
||||||
|
|
||||||
|
#Comment here
|
||||||
|
def x(a):
|
||||||
|
print('called with',a)
|
||||||
|
if a == 1:
|
||||||
|
return 2
|
||||||
|
if a*2 > 10: return 999 / 4
|
||||||
|
# Another comment here
|
||||||
|
|
||||||
|
return a+2*3
|
||||||
|
|
||||||
|
ints = (1, 2,
|
||||||
|
3, 4,
|
||||||
|
5)
|
||||||
|
print('multiline-expression', ints)
|
||||||
|
|
||||||
|
t = 4+1/3*2+6*(9-5+1)
|
||||||
|
print('precedence test; should be 34+2/3:', t, t==(34+2/3))
|
||||||
|
|
||||||
|
print('numbers', 1,2,3,4,5)
|
||||||
|
if 1:
|
||||||
|
8
|
||||||
|
a=9
|
||||||
|
print(x(a))
|
||||||
|
|
||||||
|
print(x(1))
|
||||||
|
print(x(2))
|
||||||
|
print(x(8),'3')
|
||||||
|
print('this is decimal', 1/5)
|
||||||
|
print('BIG DECIMAL', 1.234567891234567e12345)
|
||||||
|
|
||||||
|
"""
|
||||||
|
|
||||||
|
# Set up the GardenSnake run-time environment
|
||||||
|
def print_(*args):
|
||||||
|
print "-->", " ".join(map(str,args))
|
||||||
|
|
||||||
|
globals()["print"] = print_
|
||||||
|
|
||||||
|
compiled_code = compile(code)
|
||||||
|
|
||||||
|
exec compiled_code in globals()
|
||||||
|
print "Done"
|
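The indentation filter in GardenSnake.py keeps a stack of indentation widths and emits INDENT / DEDENT tokens as the width of each logical line changes. The following is a minimal standalone sketch of that idea, not the file's actual code: it does not use PLY, it assumes indentation is made of spaces only, and the name `indent_events` and the `("LINE", text)` event shape are hypothetical. Unlike the real filter, it does not check that an indent only follows a COLON NEWLINE.

```python
def indent_events(lines):
    """Yield (kind, text) events: INDENT, DEDENT, or LINE (hypothetical helper)."""
    levels = [0]  # stack of indentation widths; item 0 is never popped
    for line in lines:
        text = line.lstrip(" ")
        if not text or text.startswith("#"):
            continue  # blank and comment-only lines never change the level
        depth = len(line) - len(text)
        if depth > levels[-1]:
            # deeper than the current block: open a new one
            levels.append(depth)
            yield ("INDENT", None)
        elif depth < levels[-1]:
            # shallower: must land exactly on an enclosing level
            if depth not in levels:
                raise IndentationError("inconsistent indentation")
            while levels[-1] != depth:
                levels.pop()
                yield ("DEDENT", None)
        yield ("LINE", text)
    # close any blocks still open at end of input
    for _ in levels[1:]:
        yield ("DEDENT", None)


source = ["if a:", "    b", "    if c:", "        d", "e"]
kinds = [kind for kind, _ in indent_events(source)]
```

Here `kinds` alternates LINE events with one INDENT per new block and one DEDENT per block closed, which is the stream shape the `suite : NEWLINE INDENT stmts DEDENT` rule expects.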
5
ext/ply/example/GardenSnake/README
Normal file
|
@ -0,0 +1,5 @@
|
||||||
|
This example is Andrew Dalke's GardenSnake language. It shows how to process an
|
||||||
|
indentation-sensitive language like Python. Further details can be found here:
|
||||||
|
|
||||||
|
http://dalkescientific.com/writings/diary/archive/2006/08/30/gardensnake_language.html
|
||||||
|
|
10
ext/ply/example/README
Normal file
|
@ -0,0 +1,10 @@
|
||||||
|
Simple examples:
|
||||||
|
calc - Simple calculator
|
||||||
|
classcalc - Simple calculator defined as a class
|
||||||
|
|
||||||
|
Complex examples:
|
||||||
|
ansic - ANSI C grammar from K&R
|
||||||
|
BASIC - A small BASIC interpreter
|
||||||
|
GardenSnake - A simple Python-like language
|
||||||
|
yply - Converts Unix yacc files to PLY programs.
|
||||||
|
|
|
@ -4,7 +4,10 @@
|
||||||
# A lexer for ANSI C.
|
# A lexer for ANSI C.
|
||||||
# ----------------------------------------------------------------------
|
# ----------------------------------------------------------------------
|
||||||
|
|
||||||
import lex
|
import sys
|
||||||
|
sys.path.insert(0,"../..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
# Reserved words
|
# Reserved words
|
||||||
reserved = (
|
reserved = (
|
||||||
|
@ -53,7 +56,7 @@ t_ignore = ' \t\x0c'
|
||||||
# Newlines
|
# Newlines
|
||||||
def t_NEWLINE(t):
|
def t_NEWLINE(t):
|
||||||
r'\n+'
|
r'\n+'
|
||||||
t.lineno += t.value.count("\n")
|
t.lexer.lineno += t.value.count("\n")
|
||||||
|
|
||||||
# Operators
|
# Operators
|
||||||
t_PLUS = r'\+'
|
t_PLUS = r'\+'
|
||||||
|
@ -64,7 +67,7 @@ t_MOD = r'%'
|
||||||
t_OR = r'\|'
|
t_OR = r'\|'
|
||||||
t_AND = r'&'
|
t_AND = r'&'
|
||||||
t_NOT = r'~'
|
t_NOT = r'~'
|
||||||
t_XOR = r'^'
|
t_XOR = r'\^'
|
||||||
t_LSHIFT = r'<<'
|
t_LSHIFT = r'<<'
|
||||||
t_RSHIFT = r'>>'
|
t_RSHIFT = r'>>'
|
||||||
t_LOR = r'\|\|'
|
t_LOR = r'\|\|'
|
||||||
|
@ -149,7 +152,7 @@ def t_preprocessor(t):
|
||||||
|
|
||||||
def t_error(t):
|
def t_error(t):
|
||||||
print "Illegal character %s" % repr(t.value[0])
|
print "Illegal character %s" % repr(t.value[0])
|
||||||
t.skip(1)
|
t.lexer.skip(1)
|
||||||
|
|
||||||
lexer = lex.lex(optimize=1)
|
lexer = lex.lex(optimize=1)
|
||||||
if __name__ == "__main__":
|
if __name__ == "__main__":
|
||||||
|
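The clex.py hunk above changes `t_XOR` from `r'^'` to `r'\^'`. The reason can be checked with the standard `re` module alone: an unescaped `^` is the start-of-string anchor and matches the empty string, while `\^` matches a literal caret. This is a small sketch using plain `re`, not PLY itself; the variable names are only for illustration.

```python
import re

unescaped = re.compile(r'^')   # start-of-string anchor, consumes nothing
escaped = re.compile(r'\^')    # literal '^' character

# The anchor matches at position 0 but with zero width, even when the
# input really does start with a caret.
anchor_text = unescaped.match("^ b").group(0)
anchor_end = unescaped.match("^ b").end()

# The escaped pattern finds the caret itself.
caret_span = escaped.search("a ^ b").span()
caret_text = escaped.search("a ^ b").group(0)
```

With the old pattern a lexer rule could never consume the `^` character, so the escaped form is the one that actually recognizes the XOR operator.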
|
|
@ -4,8 +4,9 @@
|
||||||
# Simple parser for ANSI C. Based on the grammar in K&R, 2nd Ed.
|
# Simple parser for ANSI C. Based on the grammar in K&R, 2nd Ed.
|
||||||
# -----------------------------------------------------------------------------
|
# -----------------------------------------------------------------------------
|
||||||
|
|
||||||
import yacc
|
import sys
|
||||||
import clex
|
import clex
|
||||||
|
import ply.yacc as yacc
|
||||||
|
|
||||||
# Get the token map
|
# Get the token map
|
||||||
tokens = clex.tokens
|
tokens = clex.tokens
|
||||||
|
@ -852,7 +853,10 @@ def p_error(t):
|
||||||
|
|
||||||
import profile
|
import profile
|
||||||
# Build the grammar
|
# Build the grammar
|
||||||
profile.run("yacc.yacc()")
|
|
||||||
|
yacc.yacc(method='LALR')
|
||||||
|
|
||||||
|
#profile.run("yacc.yacc(method='LALR')")
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
|
@ -5,21 +5,17 @@
|
||||||
# "Lex and Yacc", p. 63.
|
# "Lex and Yacc", p. 63.
|
||||||
# -----------------------------------------------------------------------------
|
# -----------------------------------------------------------------------------
|
||||||
|
|
||||||
|
import sys
|
||||||
|
sys.path.insert(0,"../..")
|
||||||
|
|
||||||
tokens = (
|
tokens = (
|
||||||
'NAME','NUMBER',
|
'NAME','NUMBER',
|
||||||
'PLUS','MINUS','TIMES','DIVIDE','EQUALS',
|
|
||||||
'LPAREN','RPAREN',
|
|
||||||
)
|
)
|
||||||
|
|
||||||
|
literals = ['=','+','-','*','/', '(',')']
|
||||||
|
|
||||||
# Tokens
|
# Tokens
|
||||||
|
|
||||||
t_PLUS = r'\+'
|
|
||||||
t_MINUS = r'-'
|
|
||||||
t_TIMES = r'\*'
|
|
||||||
t_DIVIDE = r'/'
|
|
||||||
t_EQUALS = r'='
|
|
||||||
t_LPAREN = r'\('
|
|
||||||
t_RPAREN = r'\)'
|
|
||||||
t_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*'
|
t_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*'
|
||||||
|
|
||||||
def t_NUMBER(t):
|
def t_NUMBER(t):
|
||||||
|
@ -35,69 +31,69 @@ t_ignore = " \t"
|
||||||
|
|
||||||
def t_newline(t):
|
def t_newline(t):
|
||||||
r'\n+'
|
r'\n+'
|
||||||
t.lineno += t.value.count("\n")
|
t.lexer.lineno += t.value.count("\n")
|
||||||
|
|
||||||
def t_error(t):
|
def t_error(t):
|
||||||
print "Illegal character '%s'" % t.value[0]
|
print "Illegal character '%s'" % t.value[0]
|
||||||
t.skip(1)
|
t.lexer.skip(1)
|
||||||
|
|
||||||
# Build the lexer
|
# Build the lexer
|
||||||
import lex
|
import ply.lex as lex
|
||||||
lex.lex()
|
lex.lex()
|
||||||
|
|
||||||
# Parsing rules
|
# Parsing rules
|
||||||
|
|
||||||
precedence = (
|
precedence = (
|
||||||
('left','PLUS','MINUS'),
|
('left','+','-'),
|
||||||
('left','TIMES','DIVIDE'),
|
('left','*','/'),
|
||||||
('right','UMINUS'),
|
('right','UMINUS'),
|
||||||
)
|
)
|
||||||
|
|
||||||
# dictionary of names
|
# dictionary of names
|
||||||
names = { }
|
names = { }
|
||||||
|
|
||||||
def p_statement_assign(t):
|
def p_statement_assign(p):
|
||||||
'statement : NAME EQUALS expression'
|
'statement : NAME "=" expression'
|
||||||
names[t[1]] = t[3]
|
names[p[1]] = p[3]
|
||||||
|
|
||||||
def p_statement_expr(t):
|
def p_statement_expr(p):
|
||||||
'statement : expression'
|
'statement : expression'
|
||||||
print t[1]
|
print p[1]
|
||||||
|
|
||||||
def p_expression_binop(t):
|
def p_expression_binop(p):
|
||||||
'''expression : expression PLUS expression
|
'''expression : expression '+' expression
|
||||||
| expression MINUS expression
|
| expression '-' expression
|
||||||
| expression TIMES expression
|
| expression '*' expression
|
||||||
| expression DIVIDE expression'''
|
| expression '/' expression'''
|
||||||
if t[2] == '+' : t[0] = t[1] + t[3]
|
if p[2] == '+' : p[0] = p[1] + p[3]
|
||||||
elif t[2] == '-': t[0] = t[1] - t[3]
|
elif p[2] == '-': p[0] = p[1] - p[3]
|
||||||
elif t[2] == '*': t[0] = t[1] * t[3]
|
elif p[2] == '*': p[0] = p[1] * p[3]
|
||||||
elif t[2] == '/': t[0] = t[1] / t[3]
|
elif p[2] == '/': p[0] = p[1] / p[3]
|
||||||
|
|
||||||
def p_expression_uminus(t):
|
def p_expression_uminus(p):
|
||||||
'expression : MINUS expression %prec UMINUS'
|
"expression : '-' expression %prec UMINUS"
|
||||||
t[0] = -t[2]
|
p[0] = -p[2]
|
||||||
|
|
||||||
def p_expression_group(t):
|
def p_expression_group(p):
|
||||||
'expression : LPAREN expression RPAREN'
|
"expression : '(' expression ')'"
|
||||||
t[0] = t[2]
|
p[0] = p[2]
|
||||||
|
|
||||||
def p_expression_number(t):
|
def p_expression_number(p):
|
||||||
'expression : NUMBER'
|
"expression : NUMBER"
|
||||||
t[0] = t[1]
|
p[0] = p[1]
|
||||||
|
|
||||||
def p_expression_name(t):
|
def p_expression_name(p):
|
||||||
'expression : NAME'
|
"expression : NAME"
|
||||||
try:
|
try:
|
||||||
t[0] = names[t[1]]
|
p[0] = names[p[1]]
|
||||||
except LookupError:
|
except LookupError:
|
||||||
print "Undefined name '%s'" % t[1]
|
print "Undefined name '%s'" % p[1]
|
||||||
t[0] = 0
|
p[0] = 0
|
||||||
|
|
||||||
def p_error(t):
|
def p_error(p):
|
||||||
print "Syntax error at '%s'" % t.value
|
print "Syntax error at '%s'" % p.value
|
||||||
|
|
||||||
import yacc
|
import ply.yacc as yacc
|
||||||
yacc.yacc()
|
yacc.yacc()
|
||||||
|
|
||||||
while 1:
|
while 1:
|
||||||
|
@ -105,4 +101,5 @@ while 1:
|
||||||
s = raw_input('calc > ')
|
s = raw_input('calc > ')
|
||||||
except EOFError:
|
except EOFError:
|
||||||
break
|
break
|
||||||
|
if not s: continue
|
||||||
yacc.parse(s)
|
yacc.parse(s)
|
||||||
|
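Several hunks in this commit replace `t.lineno += ...` with `t.lexer.lineno += ...` and `t.skip(1)` with `t.lexer.skip(1)`. The sketch below models why the spelling matters, under the assumption (suggested by the new spelling, not verified here against lex.py) that the running line counter lives on the lexer and each token only carries a copy. `SketchLexer` and `SketchToken` are hypothetical stand-ins, not PLY classes.

```python
class SketchToken(object):
    """Hypothetical token: a snapshot of one match."""
    def __init__(self, value, lineno, lexer):
        self.value = value
        self.lineno = lineno   # copy of the lexer's counter at match time
        self.lexer = lexer     # back-reference to the owning lexer


class SketchLexer(object):
    """Hypothetical lexer that owns the running line count."""
    def __init__(self):
        self.lineno = 1

    def make_token(self, value):
        return SketchToken(value, self.lineno, self)


def newline_rule_old(t):
    t.lineno += t.value.count("\n")        # only changes this one token


def newline_rule_new(t):
    t.lexer.lineno += t.value.count("\n")  # changes the shared counter


lexer = SketchLexer()
newline_rule_old(lexer.make_token("\n\n"))
after_old = lexer.make_token("x").lineno   # counter was never advanced

newline_rule_new(lexer.make_token("\n\n"))
after_new = lexer.make_token("x").lineno   # counter advanced by two lines
```

In this model the old rule leaves later tokens on line 1, while the new rule moves them to line 3, which is the behaviour the updated examples rely on.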
|
152
ext/ply/example/classcalc/calc.py
Normal file
|
@ -0,0 +1,152 @@
|
||||||
|
#!/usr/bin/env python
|
||||||
|
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
# calc.py
|
||||||
|
#
|
||||||
|
# A simple calculator with variables. This is from O'Reilly's
|
||||||
|
# "Lex and Yacc", p. 63.
|
||||||
|
#
|
||||||
|
# Class-based example contributed to PLY by David McNab
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
|
||||||
|
import sys
|
||||||
|
sys.path.insert(0,"../..")
|
||||||
|
|
||||||
|
import readline
|
||||||
|
import ply.lex as lex
|
||||||
|
import ply.yacc as yacc
|
||||||
|
import os
|
||||||
|
|
||||||
|
class Parser:
|
||||||
|
"""
|
||||||
|
Base class for a lexer/parser that has the rules defined as methods
|
||||||
|
"""
|
||||||
|
tokens = ()
|
||||||
|
precedence = ()
|
||||||
|
|
||||||
|
def __init__(self, **kw):
|
||||||
|
self.debug = kw.get('debug', 0)
|
||||||
|
self.names = { }
|
||||||
|
try:
|
||||||
|
modname = os.path.split(os.path.splitext(__file__)[0])[1] + "_" + self.__class__.__name__
|
||||||
|
except:
|
||||||
|
modname = "parser"+"_"+self.__class__.__name__
|
||||||
|
self.debugfile = modname + ".dbg"
|
||||||
|
self.tabmodule = modname + "_" + "parsetab"
|
||||||
|
#print self.debugfile, self.tabmodule
|
||||||
|
|
||||||
|
# Build the lexer and parser
|
||||||
|
lex.lex(module=self, debug=self.debug)
|
||||||
|
yacc.yacc(module=self,
|
||||||
|
debug=self.debug,
|
||||||
|
debugfile=self.debugfile,
|
||||||
|
tabmodule=self.tabmodule)
|
||||||
|
|
||||||
|
def run(self):
|
||||||
|
while 1:
|
||||||
|
try:
|
||||||
|
s = raw_input('calc > ')
|
||||||
|
except EOFError:
|
||||||
|
break
|
||||||
|
if not s: continue
|
||||||
|
yacc.parse(s)
|
||||||
|
|
||||||
|
|
||||||
|
class Calc(Parser):
|
||||||
|
|
||||||
|
tokens = (
|
||||||
|
'NAME','NUMBER',
|
||||||
|
'PLUS','MINUS','EXP', 'TIMES','DIVIDE','EQUALS',
|
||||||
|
'LPAREN','RPAREN',
|
||||||
|
)
|
||||||
|
|
||||||
|
# Tokens
|
||||||
|
|
||||||
|
t_PLUS = r'\+'
|
||||||
|
t_MINUS = r'-'
|
||||||
|
t_EXP = r'\*\*'
|
||||||
|
t_TIMES = r'\*'
|
||||||
|
t_DIVIDE = r'/'
|
||||||
|
t_EQUALS = r'='
|
||||||
|
t_LPAREN = r'\('
|
||||||
|
t_RPAREN = r'\)'
|
||||||
|
t_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*'
|
||||||
|
|
||||||
|
def t_NUMBER(self, t):
|
||||||
|
r'\d+'
|
||||||
|
try:
|
||||||
|
t.value = int(t.value)
|
||||||
|
except ValueError:
|
||||||
|
print "Integer value too large", t.value
|
||||||
|
t.value = 0
|
||||||
|
#print "parsed number %s" % repr(t.value)
|
||||||
|
return t
|
||||||
|
|
||||||
|
t_ignore = " \t"
|
||||||
|
|
||||||
|
def t_newline(self, t):
|
||||||
|
r'\n+'
|
||||||
|
t.lexer.lineno += t.value.count("\n")
|
||||||
|
|
||||||
|
def t_error(self, t):
|
||||||
|
print "Illegal character '%s'" % t.value[0]
|
||||||
|
t.lexer.skip(1)
|
||||||
|
|
||||||
|
# Parsing rules
|
||||||
|
|
||||||
|
precedence = (
|
||||||
|
('left','PLUS','MINUS'),
|
||||||
|
('left','TIMES','DIVIDE'),
|
||||||
|
('left', 'EXP'),
|
||||||
|
('right','UMINUS'),
|
||||||
|
)
|
||||||
|
|
||||||
|
def p_statement_assign(self, p):
|
||||||
|
'statement : NAME EQUALS expression'
|
||||||
|
self.names[p[1]] = p[3]
|
||||||
|
|
||||||
|
def p_statement_expr(self, p):
|
||||||
|
'statement : expression'
|
||||||
|
print p[1]
|
||||||
|
|
||||||
|
def p_expression_binop(self, p):
|
||||||
|
"""
|
||||||
|
expression : expression PLUS expression
|
||||||
|
| expression MINUS expression
|
||||||
|
| expression TIMES expression
|
||||||
|
| expression DIVIDE expression
|
||||||
|
| expression EXP expression
|
||||||
|
"""
|
||||||
|
#print [repr(p[i]) for i in range(0,4)]
|
||||||
|
if p[2] == '+' : p[0] = p[1] + p[3]
|
||||||
|
elif p[2] == '-': p[0] = p[1] - p[3]
|
||||||
|
elif p[2] == '*': p[0] = p[1] * p[3]
|
||||||
|
elif p[2] == '/': p[0] = p[1] / p[3]
|
||||||
|
elif p[2] == '**': p[0] = p[1] ** p[3]
|
||||||
|
|
||||||
|
def p_expression_uminus(self, p):
|
||||||
|
'expression : MINUS expression %prec UMINUS'
|
||||||
|
p[0] = -p[2]
|
||||||
|
|
||||||
|
def p_expression_group(self, p):
|
||||||
|
'expression : LPAREN expression RPAREN'
|
||||||
|
p[0] = p[2]
|
||||||
|
|
||||||
|
def p_expression_number(self, p):
|
||||||
|
'expression : NUMBER'
|
||||||
|
p[0] = p[1]
|
||||||
|
|
||||||
|
def p_expression_name(self, p):
|
||||||
|
'expression : NAME'
|
||||||
|
try:
|
||||||
|
p[0] = self.names[p[1]]
|
||||||
|
except LookupError:
|
||||||
|
print "Undefined name '%s'" % p[1]
|
||||||
|
p[0] = 0
|
||||||
|
|
||||||
|
def p_error(self, p):
|
||||||
|
print "Syntax error at '%s'" % p.value
|
||||||
|
|
||||||
|
if __name__ == '__main__':
|
||||||
|
calc = Calc()
|
||||||
|
calc.run()
|
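The precedence table in `Calc` lists `EXP` as `'left'`, so a chain such as `2 ** 3 ** 2` would presumably be grouped from the left, unlike Python's own right-associative `**`. The sketch below only illustrates the arithmetic difference between the two groupings; it does not run the PLY grammar, and `fold_left` / `fold_right` are hypothetical helper names.

```python
from functools import reduce


def fold_left(op, operands):
    # ((a op b) op c): the grouping a 'left' precedence entry asks for
    return reduce(op, operands)


def fold_right(op, operands):
    # (a op (b op c)): the grouping a 'right' entry would ask for
    result = operands[-1]
    for value in reversed(operands[:-1]):
        result = op(value, result)
    return result


def power(a, b):
    return a ** b


left_grouped = fold_left(power, [2, 3, 2])    # (2 ** 3) ** 2
right_grouped = fold_right(power, [2, 3, 2])  # 2 ** (3 ** 2)
python_result = 2 ** 3 ** 2                   # Python groups from the right
```

If the calculator is meant to agree with Python on chained exponents, declaring `EXP` as `'right'` would be the change to try; as committed, the example keeps `'left'`.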
2
ext/ply/example/cleanup.sh
Normal file
|
@ -0,0 +1,2 @@
|
||||||
|
#!/bin/sh
|
||||||
|
rm -f */*.pyc */parsetab.py */parser.out */*~ */*.class
|
|
@ -14,6 +14,10 @@
|
||||||
# such tokens
|
# such tokens
|
||||||
# -----------------------------------------------------------------------------
|
# -----------------------------------------------------------------------------
|
||||||
|
|
||||||
|
import sys
|
||||||
|
sys.path.insert(0,"../..")
|
||||||
|
|
||||||
|
|
||||||
tokens = (
|
tokens = (
|
||||||
'H_EDIT_DESCRIPTOR',
|
'H_EDIT_DESCRIPTOR',
|
||||||
)
|
)
|
||||||
|
@ -34,10 +38,10 @@ def t_H_EDIT_DESCRIPTOR(t):
|
||||||
|
|
||||||
def t_error(t):
|
def t_error(t):
|
||||||
print "Illegal character '%s'" % t.value[0]
|
print "Illegal character '%s'" % t.value[0]
|
||||||
t.skip(1)
|
t.lexer.skip(1)
|
||||||
|
|
||||||
# Build the lexer
|
# Build the lexer
|
||||||
import lex
|
import ply.lex as lex
|
||||||
lex.lex()
|
lex.lex()
|
||||||
lex.runmain()
|
lex.runmain()
|
||||||
|
|
||||||
|
|
155
ext/ply/example/newclasscalc/calc.py
Normal file
|
@ -0,0 +1,155 @@
|
||||||
|
#!/usr/bin/env python
|
||||||
|
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
# calc.py
|
||||||
|
#
|
||||||
|
# A simple calculator with variables. This is from O'Reilly's
|
||||||
|
# "Lex and Yacc", p. 63.
|
||||||
|
#
|
||||||
|
# Class-based example contributed to PLY by David McNab.
|
||||||
|
#
|
||||||
|
# Modified to use new-style classes. Test case.
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
|
||||||
|
import sys
|
||||||
|
sys.path.insert(0,"../..")
|
||||||
|
|
||||||
|
import readline
|
||||||
|
import ply.lex as lex
|
||||||
|
import ply.yacc as yacc
|
||||||
|
import os
|
||||||
|
|
||||||
|
class Parser(object):
|
||||||
|
"""
|
||||||
|
Base class for a lexer/parser that has the rules defined as methods
|
||||||
|
"""
|
||||||
|
tokens = ()
|
||||||
|
precedence = ()
|
||||||
|
|
||||||
|
|
||||||
|
def __init__(self, **kw):
|
||||||
|
self.debug = kw.get('debug', 0)
|
||||||
|
self.names = { }
|
||||||
|
try:
|
||||||
|
modname = os.path.split(os.path.splitext(__file__)[0])[1] + "_" + self.__class__.__name__
|
||||||
|
except:
|
||||||
|
modname = "parser"+"_"+self.__class__.__name__
|
||||||
|
self.debugfile = modname + ".dbg"
|
||||||
|
self.tabmodule = modname + "_" + "parsetab"
|
||||||
|
#print self.debugfile, self.tabmodule
|
||||||
|
|
||||||
|
# Build the lexer and parser
|
||||||
|
lex.lex(module=self, debug=self.debug)
|
||||||
|
yacc.yacc(module=self,
|
||||||
|
debug=self.debug,
|
||||||
|
debugfile=self.debugfile,
|
||||||
|
tabmodule=self.tabmodule)
|
||||||
|
|
||||||
|
def run(self):
|
||||||
|
while 1:
|
||||||
|
try:
|
||||||
|
s = raw_input('calc > ')
|
||||||
|
except EOFError:
|
||||||
|
break
|
||||||
|
if not s: continue
|
||||||
|
yacc.parse(s)
|
||||||
|
|
||||||
|
|
||||||
|
class Calc(Parser):
|
||||||
|
|
||||||
|
tokens = (
|
||||||
|
'NAME','NUMBER',
|
||||||
|
'PLUS','MINUS','EXP', 'TIMES','DIVIDE','EQUALS',
|
||||||
|
'LPAREN','RPAREN',
|
||||||
|
)
|
||||||
|
|
||||||
|
# Tokens
|
||||||
|
|
||||||
|
t_PLUS = r'\+'
|
||||||
|
t_MINUS = r'-'
|
||||||
|
t_EXP = r'\*\*'
|
||||||
|
t_TIMES = r'\*'
|
||||||
|
t_DIVIDE = r'/'
|
||||||
|
t_EQUALS = r'='
|
||||||
|
t_LPAREN = r'\('
|
||||||
|
t_RPAREN = r'\)'
|
||||||
|
t_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*'
|
||||||
|
|
||||||
|
def t_NUMBER(self, t):
|
||||||
|
r'\d+'
|
||||||
|
try:
|
||||||
|
t.value = int(t.value)
|
||||||
|
except ValueError:
|
||||||
|
print "Integer value too large", t.value
|
||||||
|
t.value = 0
|
||||||
|
#print "parsed number %s" % repr(t.value)
|
||||||
|
return t
|
||||||
|
|
||||||
|
t_ignore = " \t"
|
||||||
|
|
||||||
|
def t_newline(self, t):
|
||||||
|
r'\n+'
|
||||||
|
t.lexer.lineno += t.value.count("\n")
|
||||||
|
|
||||||
|
def t_error(self, t):
|
||||||
|
print "Illegal character '%s'" % t.value[0]
|
||||||
|
t.lexer.skip(1)
|
||||||
|
|
||||||
|
# Parsing rules
|
||||||
|
|
||||||
|
precedence = (
|
||||||
|
('left','PLUS','MINUS'),
|
||||||
|
('left','TIMES','DIVIDE'),
|
||||||
|
('left', 'EXP'),
|
||||||
|
('right','UMINUS'),
|
||||||
|
)
|
||||||
|
|
||||||
|
def p_statement_assign(self, p):
|
||||||
|
'statement : NAME EQUALS expression'
|
||||||
|
self.names[p[1]] = p[3]
|
||||||
|
|
||||||
|
def p_statement_expr(self, p):
|
||||||
|
'statement : expression'
|
||||||
|
print p[1]
|
||||||
|
|
||||||
|
def p_expression_binop(self, p):
|
||||||
|
"""
|
||||||
|
expression : expression PLUS expression
|
||||||
|
| expression MINUS expression
|
||||||
|
| expression TIMES expression
|
||||||
|
| expression DIVIDE expression
|
||||||
|
| expression EXP expression
|
||||||
|
"""
|
||||||
|
#print [repr(p[i]) for i in range(0,4)]
|
||||||
|
if p[2] == '+' : p[0] = p[1] + p[3]
|
||||||
|
elif p[2] == '-': p[0] = p[1] - p[3]
|
||||||
|
elif p[2] == '*': p[0] = p[1] * p[3]
|
||||||
|
elif p[2] == '/': p[0] = p[1] / p[3]
|
||||||
|
elif p[2] == '**': p[0] = p[1] ** p[3]
|
||||||
|
|
||||||
|
def p_expression_uminus(self, p):
|
||||||
|
'expression : MINUS expression %prec UMINUS'
|
||||||
|
p[0] = -p[2]
|
||||||
|
|
||||||
|
def p_expression_group(self, p):
|
||||||
|
'expression : LPAREN expression RPAREN'
|
||||||
|
p[0] = p[2]
|
||||||
|
|
||||||
|
def p_expression_number(self, p):
|
||||||
|
'expression : NUMBER'
|
||||||
|
p[0] = p[1]
|
||||||
|
|
||||||
|
def p_expression_name(self, p):
|
||||||
|
'expression : NAME'
|
||||||
|
try:
|
||||||
|
p[0] = self.names[p[1]]
|
||||||
|
except LookupError:
|
||||||
|
print "Undefined name '%s'" % p[1]
|
||||||
|
p[0] = 0
|
||||||
|
|
||||||
|
def p_error(self, p):
|
||||||
|
print "Syntax error at '%s'" % p.value
|
||||||
|
|
||||||
|
if __name__ == '__main__':
|
||||||
|
calc = Calc()
|
||||||
|
calc.run()
|
|
@ -5,6 +5,9 @@
|
||||||
# "Lex and Yacc", p. 63.
|
# "Lex and Yacc", p. 63.
|
||||||
# -----------------------------------------------------------------------------
|
# -----------------------------------------------------------------------------
|
||||||
|
|
||||||
|
import sys
|
||||||
|
sys.path.insert(0,"../..")
|
||||||
|
|
||||||
tokens = (
|
tokens = (
|
||||||
'NAME','NUMBER',
|
'NAME','NUMBER',
|
||||||
'PLUS','MINUS','TIMES','DIVIDE','EQUALS',
|
'PLUS','MINUS','TIMES','DIVIDE','EQUALS',
|
||||||
|
@ -35,14 +38,14 @@ t_ignore = " \t"
|
||||||
|
|
||||||
def t_newline(t):
|
def t_newline(t):
|
||||||
r'\n+'
|
r'\n+'
|
||||||
t.lineno += t.value.count("\n")
|
t.lexer.lineno += t.value.count("\n")
|
||||||
|
|
||||||
def t_error(t):
|
def t_error(t):
|
||||||
print "Illegal character '%s'" % t.value[0]
|
print "Illegal character '%s'" % t.value[0]
|
||||||
t.skip(1)
|
t.lexer.skip(1)
|
||||||
|
|
||||||
# Build the lexer
|
# Build the lexer
|
||||||
import lex
|
import ply.lex as lex
|
||||||
lex.lex(optimize=1)
|
lex.lex(optimize=1)
|
||||||
|
|
||||||
# Parsing rules
|
# Parsing rules
|
||||||
|
@ -98,7 +101,7 @@ def p_expression_name(t):
|
||||||
def p_error(t):
|
def p_error(t):
|
||||||
print "Syntax error at '%s'" % t.value
|
print "Syntax error at '%s'" % t.value
|
||||||
|
|
||||||
import yacc
|
import ply.yacc as yacc
|
||||||
yacc.yacc(optimize=1)
|
yacc.yacc(optimize=1)
|
||||||
|
|
||||||
while 1:
|
while 1:
|
||||||
|
|
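The hunks above move `skip()` and `lineno` from the token (`t.skip(1)`, `t.lineno`) to the lexer (`t.lexer.skip(1)`, `t.lexer.lineno`). The sketch below is not PLY code; it is a minimal stdlib-only scanner, written for Python 3, that illustrates why that state fits naturally on the lexer: the scan position and line counter outlive any single token. The class name `MiniLexer` and its `errors` list are hypothetical names chosen for this illustration.

```python
import re


class MiniLexer:
    """Toy scanner: position and line number live on the lexer, not the token."""

    # One master pattern with a named group per token kind (an assumption of
    # this sketch; it is not a description of PLY's internals).
    TOKEN_RE = re.compile(
        r"(?P<NUMBER>\d+)|(?P<NAME>[a-zA-Z_][a-zA-Z0-9_]*)|(?P<NEWLINE>\n+)"
    )
    IGNORE = " \t"  # plays the role of t_ignore

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.lineno = 1
        self.errors = []

    def skip(self, n):
        # Counterpart of t.lexer.skip(n): advance past unmatched input.
        self.pos += n

    def tokens(self):
        while self.pos < len(self.text):
            if self.text[self.pos] in self.IGNORE:
                self.pos += 1
                continue
            m = self.TOKEN_RE.match(self.text, self.pos)
            if m is None:
                # Counterpart of t_error: record the character and move on.
                self.errors.append((self.lineno, self.text[self.pos]))
                self.skip(1)
                continue
            self.pos = m.end()
            if m.lastgroup == "NEWLINE":
                # Counterpart of t_newline: update the lexer, emit nothing.
                self.lineno += m.group().count("\n")
                continue
            yield (m.lastgroup, m.group(), self.lineno)
```

Iterating `MiniLexer("a 1\n$ b").tokens()` yields the `NAME` and `NUMBER` tokens with their line numbers, while the stray `$` is recorded in `errors` and skipped.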
114
ext/ply/example/unicalc/calc.py
Normal file
|
@ -0,0 +1,114 @@
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
# calc.py
|
||||||
|
#
|
||||||
|
# A simple calculator with variables. This is from O'Reilly's
|
||||||
|
# "Lex and Yacc", p. 63.
|
||||||
|
#
|
||||||
|
# This example uses unicode strings for tokens, docstrings, and input.
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
|
||||||
|
import sys
|
||||||
|
sys.path.insert(0,"../..")
|
||||||
|
|
||||||
|
tokens = (
|
||||||
|
'NAME','NUMBER',
|
||||||
|
'PLUS','MINUS','TIMES','DIVIDE','EQUALS',
|
||||||
|
'LPAREN','RPAREN',
|
||||||
|
)
|
||||||
|
|
||||||
|
# Tokens
|
||||||
|
|
||||||
|
t_PLUS = ur'\+'
|
||||||
|
t_MINUS = ur'-'
|
||||||
|
t_TIMES = ur'\*'
|
||||||
|
t_DIVIDE = ur'/'
|
||||||
|
t_EQUALS = ur'='
|
||||||
|
t_LPAREN = ur'\('
|
||||||
|
t_RPAREN = ur'\)'
|
||||||
|
t_NAME = ur'[a-zA-Z_][a-zA-Z0-9_]*'
|
||||||
|
|
||||||
|
def t_NUMBER(t):
|
||||||
|
ur'\d+'
|
||||||
|
try:
|
||||||
|
t.value = int(t.value)
|
||||||
|
except ValueError:
|
||||||
|
print "Integer value too large", t.value
|
||||||
|
t.value = 0
|
||||||
|
return t
|
||||||
|
|
||||||
|
t_ignore = u" \t"
|
||||||
|
|
||||||
|
def t_newline(t):
|
||||||
|
ur'\n+'
|
||||||
|
t.lexer.lineno += t.value.count("\n")
|
||||||
|
|
||||||
|
def t_error(t):
|
||||||
|
print "Illegal character '%s'" % t.value[0]
|
||||||
|
t.lexer.skip(1)
|
||||||
|
|
||||||
|
# Build the lexer
|
||||||
|
import ply.lex as lex
|
||||||
|
lex.lex()
|
||||||
|
|
||||||
|
# Parsing rules
|
||||||
|
|
||||||
|
precedence = (
|
||||||
|
('left','PLUS','MINUS'),
|
||||||
|
('left','TIMES','DIVIDE'),
|
||||||
|
('right','UMINUS'),
|
||||||
|
)
|
||||||
|
|
||||||
|
# dictionary of names
|
||||||
|
names = { }
|
||||||
|
|
||||||
|
def p_statement_assign(p):
|
||||||
|
'statement : NAME EQUALS expression'
|
||||||
|
names[p[1]] = p[3]
|
||||||
|
|
||||||
|
def p_statement_expr(p):
|
||||||
|
'statement : expression'
|
||||||
|
print p[1]
|
||||||
|
|
||||||
|
def p_expression_binop(p):
|
||||||
|
'''expression : expression PLUS expression
|
||||||
|
| expression MINUS expression
|
||||||
|
| expression TIMES expression
|
||||||
|
| expression DIVIDE expression'''
|
||||||
|
if p[2] == u'+' : p[0] = p[1] + p[3]
|
||||||
|
elif p[2] == u'-': p[0] = p[1] - p[3]
|
||||||
|
elif p[2] == u'*': p[0] = p[1] * p[3]
|
||||||
|
elif p[2] == u'/': p[0] = p[1] / p[3]
|
||||||
|
|
||||||
|
def p_expression_uminus(p):
|
||||||
|
'expression : MINUS expression %prec UMINUS'
|
||||||
|
p[0] = -p[2]
|
||||||
|
|
||||||
|
def p_expression_group(p):
|
||||||
|
'expression : LPAREN expression RPAREN'
|
||||||
|
p[0] = p[2]
|
||||||
|
|
||||||
|
def p_expression_number(p):
|
||||||
|
'expression : NUMBER'
|
||||||
|
p[0] = p[1]
|
||||||
|
|
||||||
|
def p_expression_name(p):
|
||||||
|
'expression : NAME'
|
||||||
|
try:
|
||||||
|
p[0] = names[p[1]]
|
||||||
|
except LookupError:
|
||||||
|
print "Undefined name '%s'" % p[1]
|
||||||
|
p[0] = 0
|
||||||
|
|
||||||
|
def p_error(p):
|
||||||
|
print "Syntax error at '%s'" % p.value
|
||||||
|
|
||||||
|
import ply.yacc as yacc
|
||||||
|
yacc.yacc()
|
||||||
|
|
||||||
|
while 1:
|
||||||
|
try:
|
||||||
|
s = raw_input('calc > ')
|
||||||
|
except EOFError:
|
||||||
|
break
|
||||||
|
if not s: continue
|
||||||
|
yacc.parse(unicode(s))
|
41
ext/ply/example/yply/README
Normal file
|
@ -0,0 +1,41 @@
|
||||||
|
yply.py
|
||||||
|
|
||||||
|
This example implements a program yply.py that converts a UNIX-yacc
|
||||||
|
specification file into a PLY-compatible program. To use, simply
|
||||||
|
run it like this:
|
||||||
|
|
||||||
|
% python yply.py [-nocode] inputfile.y >myparser.py
|
||||||
|
|
||||||
|
The output of this program is Python code. In the output,
|
||||||
|
any C code in the original file is included, but is commented out.
|
||||||
|
If you use the -nocode option, then all of the C code in the
|
||||||
|
original file is just discarded.
|
||||||
|
|
||||||
|
To use the resulting grammar with PLY, you'll need to edit the
|
||||||
|
myparser.py file. Within this file, some stub code is included that
|
||||||
|
can be used to test the construction of the parsing tables. However,
|
||||||
|
you'll need to do more editing to make a workable parser.
|
||||||
|
|
||||||
|
Disclaimer: This is just an example I threw together in an afternoon.
|
||||||
|
It might have some bugs. However, it worked when I tried it on
|
||||||
|
a yacc-specified C++ parser containing 442 rules and 855 parsing
|
||||||
|
states.
|
||||||
|
|
||||||
|
Comments:
|
||||||
|
|
||||||
|
1. This example does not parse specification files meant for lex/flex.
|
||||||
|
You'll need to specify the tokenizer on your own.
|
||||||
|
|
||||||
|
2. This example shows a number of interesting PLY features including
|
||||||
|
|
||||||
|
- Parsing of literal text delimited by nested parentheses
|
||||||
|
- Some interaction between the parser and the lexer.
|
||||||
|
- Use of literals in the grammar specification
|
||||||
|
- One pass compilation. The program just emits the result,
|
||||||
|
there is no intermediate parse tree.
|
||||||
|
|
||||||
|
3. This program could probably be cleaned up and enhanced a lot.
|
||||||
|
It would be great if someone wanted to work on this (hint).
|
||||||
|
|
||||||
|
-Dave
|
||||||
|
|
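The README above lists "parsing of literal text delimited by nested" delimiters as one of the features the example demonstrates. In ylex.py this is done with an exclusive `code` lexer state and a nesting counter. As a rough, PLY-free illustration of the same idea, here is a minimal Python 3 sketch that tracks brace depth by hand. The function name `extract_code_block` is hypothetical, and unlike ylex.py this sketch skips quoted literals only; it does not skip C comments.

```python
def extract_code_block(text, start):
    """Return (block, end) for the brace-delimited block starting at text[start].

    Braces inside quoted string or character literals are not counted.
    """
    if text[start] != "{":
        raise ValueError("expected '{' at position %d" % start)
    depth = 0
    i = start
    while i < len(text):
        ch = text[i]
        if ch in "\"'":
            # Skip a quoted literal, stepping over backslash escapes.
            quote = ch
            i += 1
            while i < len(text) and text[i] != quote:
                i += 2 if text[i] == "\\" else 1
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Matching close brace found: return the block and the
                # position just past it.
                return text[start:i + 1], i + 1
        i += 1
    raise ValueError("unbalanced braces")
```

Given a yacc rule such as `expr : expr PLUS expr { ... } ;`, calling the function at the index of the first `{` returns the whole action, including any nested blocks, plus the offset where scanning of the rule can resume.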
112
ext/ply/example/yply/ylex.py
Normal file
|
@ -0,0 +1,112 @@
|
||||||
|
# lexer for yacc-grammars
|
||||||
|
#
|
||||||
|
# Author: David Beazley (dave@dabeaz.com)
|
||||||
|
# Date : October 2, 2006
|
||||||
|
|
||||||
|
import sys
|
||||||
|
sys.path.append("../..")
|
||||||
|
|
||||||
|
from ply import *
|
||||||
|
|
||||||
|
tokens = (
|
||||||
|
'LITERAL','SECTION','TOKEN','LEFT','RIGHT','PREC','START','TYPE','NONASSOC','UNION','CODE',
|
||||||
|
'ID','QLITERAL','NUMBER',
|
||||||
|
)
|
||||||
|
|
||||||
|
states = (('code','exclusive'),)
|
||||||
|
|
||||||
|
literals = [ ';', ',', '<', '>', '|',':' ]
|
||||||
|
t_ignore = ' \t'
|
||||||
|
|
||||||
|
t_TOKEN = r'%token'
|
||||||
|
t_LEFT = r'%left'
|
||||||
|
t_RIGHT = r'%right'
|
||||||
|
t_NONASSOC = r'%nonassoc'
|
||||||
|
t_PREC = r'%prec'
|
||||||
|
t_START = r'%start'
|
||||||
|
t_TYPE = r'%type'
|
||||||
|
t_UNION = r'%union'
|
||||||
|
t_ID = r'[a-zA-Z_][a-zA-Z_0-9]*'
|
||||||
|
t_QLITERAL = r'''(?P<quote>['"]).*?(?P=quote)'''
|
||||||
|
t_NUMBER = r'\d+'
|
||||||
|
|
||||||
|
def t_SECTION(t):
|
||||||
|
r'%%'
|
||||||
|
if getattr(t.lexer,"lastsection",0):
|
||||||
|
t.value = t.lexer.lexdata[t.lexpos+2:]
|
||||||
|
t.lexer.lexpos = len(t.lexer.lexdata)
|
||||||
|
else:
|
||||||
|
t.lexer.lastsection = 0
|
||||||
|
return t
|
||||||
|
|
||||||
|
# Comments
|
||||||
|
def t_ccomment(t):
|
||||||
|
r'/\*(.|\n)*?\*/'
|
||||||
|
t.lexer.lineno += t.value.count('\n')
|
||||||
|
|
||||||
|
t_ignore_cppcomment = r'//.*'
|
||||||
|
|
||||||
|
def t_LITERAL(t):
|
||||||
|
r'%\{(.|\n)*?%\}'
|
||||||
|
t.lexer.lineno += t.value.count("\n")
|
||||||
|
return t
|
||||||
|
|
||||||
|
def t_NEWLINE(t):
|
||||||
|
r'\n'
|
||||||
|
t.lexer.lineno += 1
|
||||||
|
|
||||||
|
def t_code(t):
|
||||||
|
r'\{'
|
||||||
|
t.lexer.codestart = t.lexpos
|
||||||
|
t.lexer.level = 1
|
||||||
|
t.lexer.begin('code')
|
||||||
|
|
||||||
|
def t_code_ignore_string(t):
|
||||||
|
r'\"([^\\\n]|(\\.))*?\"'
|
||||||
|
|
||||||
|
def t_code_ignore_char(t):
|
||||||
|
r'\'([^\\\n]|(\\.))*?\''
|
||||||
|
|
||||||
|
def t_code_ignore_comment(t):
|
||||||
|
r'/\*(.|\n)*?\*/'
|
||||||
|
|
||||||
|
def t_code_ignore_cppcom(t):
|
||||||
|
r'//.*'
|
||||||
|
|
||||||
|
def t_code_lbrace(t):
|
||||||
|
r'\{'
|
||||||
|
t.lexer.level += 1
|
||||||
|
|
||||||
|
def t_code_rbrace(t):
|
||||||
|
r'\}'
|
||||||
|
t.lexer.level -= 1
|
||||||
|
if t.lexer.level == 0:
|
||||||
|
t.type = 'CODE'
|
||||||
|
t.value = t.lexer.lexdata[t.lexer.codestart:t.lexpos+1]
|
||||||
|
t.lexer.begin('INITIAL')
|
||||||
|
t.lexer.lineno += t.value.count('\n')
|
||||||
|
return t
|
||||||
|
|
||||||
|
t_code_ignore_nonspace = r'[^\s\}\'\"\{]+'
|
||||||
|
t_code_ignore_whitespace = r'\s+'
|
||||||
|
t_code_ignore = ""
|
||||||
|
|
||||||
|
def t_code_error(t):
|
||||||
|
raise RuntimeError
|
||||||
|
|
||||||
|
def t_error(t):
|
||||||
|
print "%d: Illegal character '%s'" % (t.lineno, t.value[0])
|
||||||
|
print t.value
|
||||||
|
t.lexer.skip(1)
|
||||||
|
|
||||||
|
lex.lex()
|
||||||
|
|
||||||
|
if __name__ == '__main__':
|
||||||
|
lex.runmain()
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
217
ext/ply/example/yply/yparse.py
Normal file
|
@ -0,0 +1,217 @@
|
||||||
|
# parser for Unix yacc-based grammars
|
||||||
|
#
|
||||||
|
# Author: David Beazley (dave@dabeaz.com)
|
||||||
|
# Date : October 2, 2006
|
||||||
|
|
||||||
|
import ylex
|
||||||
|
tokens = ylex.tokens
|
||||||
|
|
||||||
|
from ply import *
|
||||||
|
|
||||||
|
tokenlist = []
|
||||||
|
preclist = []
|
||||||
|
|
||||||
|
emit_code = 1
|
||||||
|
|
||||||
|
def p_yacc(p):
|
||||||
|
'''yacc : defsection rulesection'''
|
||||||
|
|
||||||
|
def p_defsection(p):
|
||||||
|
'''defsection : definitions SECTION
|
||||||
|
| SECTION'''
|
||||||
|
p.lexer.lastsection = 1
|
||||||
|
print "tokens = ", repr(tokenlist)
|
||||||
|
print
|
||||||
|
print "precedence = ", repr(preclist)
|
||||||
|
print
|
||||||
|
print "# -------------- RULES ----------------"
|
||||||
|
print
|
||||||
|
|
||||||
|
def p_rulesection(p):
|
||||||
|
'''rulesection : rules SECTION'''
|
||||||
|
|
||||||
|
print "# -------------- RULES END ----------------"
|
||||||
|
print_code(p[2],0)
|
||||||
|
|
||||||
|
def p_definitions(p):
|
||||||
|
'''definitions : definitions definition
|
||||||
|
| definition'''
|
||||||
|
|
||||||
|
def p_definition_literal(p):
|
||||||
|
'''definition : LITERAL'''
|
||||||
|
print_code(p[1],0)
|
||||||
|
|
||||||
|
def p_definition_start(p):
|
||||||
|
'''definition : START ID'''
|
||||||
|
print "start = '%s'" % p[2]
|
||||||
|
|
||||||
|
def p_definition_token(p):
|
||||||
|
'''definition : toktype opttype idlist optsemi '''
|
||||||
|
for i in p[3]:
|
||||||
|
if i[0] not in "'\"":
|
||||||
|
tokenlist.append(i)
|
||||||
|
if p[1] == '%left':
|
||||||
|
preclist.append(('left',) + tuple(p[3]))
|
||||||
|
elif p[1] == '%right':
|
||||||
|
preclist.append(('right',) + tuple(p[3]))
|
||||||
|
elif p[1] == '%nonassoc':
|
||||||
|
preclist.append(('nonassoc',)+ tuple(p[3]))
|
||||||
|
|
||||||
|
def p_toktype(p):
|
||||||
|
'''toktype : TOKEN
|
||||||
|
| LEFT
|
||||||
|
| RIGHT
|
||||||
|
| NONASSOC'''
|
||||||
|
p[0] = p[1]
|
||||||
|
|
||||||
|
def p_opttype(p):
|
||||||
|
'''opttype : '<' ID '>'
|
||||||
|
| empty'''
|
||||||
|
|
||||||
|
def p_idlist(p):
|
||||||
|
'''idlist : idlist optcomma tokenid
|
||||||
|
| tokenid'''
|
||||||
|
if len(p) == 2:
|
||||||
|
p[0] = [p[1]]
|
||||||
|
else:
|
||||||
|
p[0] = p[1]
|
||||||
|
p[1].append(p[3])
|
||||||
|
|
||||||
|
def p_tokenid(p):
|
||||||
|
'''tokenid : ID
|
||||||
|
| ID NUMBER
|
||||||
|
| QLITERAL
|
||||||
|
| QLITERAL NUMBER'''
|
||||||
|
p[0] = p[1]
|
||||||
|
|
||||||
|
def p_optsemi(p):
|
||||||
|
'''optsemi : ';'
|
||||||
|
| empty'''
|
||||||
|
|
||||||
|
def p_optcomma(p):
|
||||||
|
'''optcomma : ','
|
||||||
|
| empty'''
|
||||||
|
|
||||||
|
def p_definition_type(p):
|
||||||
|
'''definition : TYPE '<' ID '>' namelist optsemi'''
|
||||||
|
# type declarations are ignored
|
||||||
|
|
||||||
|
def p_namelist(p):
|
||||||
|
'''namelist : namelist optcomma ID
|
||||||
|
| ID'''
|
||||||
|
|
||||||
|
def p_definition_union(p):
|
||||||
|
'''definition : UNION CODE optsemi'''
|
||||||
|
# Union declarations are ignored
|
||||||
|
|
||||||
|
def p_rules(p):
|
||||||
|
'''rules : rules rule
|
||||||
|
| rule'''
|
||||||
|
if len(p) == 2:
|
||||||
|
rule = p[1]
|
||||||
|
else:
|
||||||
|
rule = p[2]
|
||||||
|
|
||||||
|
# Print out a Python equivalent of this rule
|
||||||
|
|
||||||
|
embedded = [ ] # Embedded actions (a mess)
|
||||||
|
embed_count = 0
|
||||||
|
|
||||||
|
rulename = rule[0]
|
||||||
|
rulecount = 1
|
||||||
|
for r in rule[1]:
|
||||||
|
# r contains one of the rule possibilities
|
||||||
|
print "def p_%s_%d(p):" % (rulename,rulecount)
|
||||||
|
prod = []
|
||||||
|
prodcode = ""
|
||||||
|
for i in range(len(r)):
|
||||||
|
item = r[i]
|
||||||
|
if item[0] == '{': # A code block
|
||||||
|
if i == len(r) - 1:
|
||||||
|
prodcode = item
|
||||||
|
break
|
||||||
|
else:
|
||||||
|
# an embedded action
|
||||||
|
embed_name = "_embed%d_%s" % (embed_count,rulename)
|
||||||
|
prod.append(embed_name)
|
||||||
|
embedded.append((embed_name,item))
|
||||||
|
embed_count += 1
|
||||||
|
else:
|
||||||
|
prod.append(item)
|
||||||
|
print " '''%s : %s'''" % (rulename, " ".join(prod))
|
||||||
|
# Emit code
|
||||||
|
print_code(prodcode,4)
|
||||||
|
print
|
||||||
|
rulecount += 1
|
||||||
|
|
||||||
|
for e,code in embedded:
|
||||||
|
print "def p_%s(p):" % e
|
||||||
|
print " '''%s : '''" % e
|
||||||
|
print_code(code,4)
|
||||||
|
print
|
||||||
|
|
||||||
|
def p_rule(p):
|
||||||
|
'''rule : ID ':' rulelist ';' '''
|
||||||
|
p[0] = (p[1],[p[3]])
|
||||||
|
|
||||||
|
def p_rule2(p):
|
||||||
|
'''rule : ID ':' rulelist morerules ';' '''
|
||||||
|
p[4].insert(0,p[3])
|
||||||
|
p[0] = (p[1],p[4])
|
||||||
|
|
||||||
|
def p_rule_empty(p):
|
||||||
|
'''rule : ID ':' ';' '''
|
||||||
|
p[0] = (p[1],[[]])
|
||||||
|
|
||||||
|
def p_rule_empty2(p):
|
||||||
|
'''rule : ID ':' morerules ';' '''
|
||||||
|
|
||||||
|
p[3].insert(0,[])
|
||||||
|
p[0] = (p[1],p[3])
|
||||||
|
|
||||||
|
def p_morerules(p):
|
||||||
|
'''morerules : morerules '|' rulelist
|
||||||
|
| '|' rulelist
|
||||||
|
| '|' '''
|
||||||
|
|
||||||
|
if len(p) == 2:
|
||||||
|
p[0] = [[]]
|
||||||
|
elif len(p) == 3:
|
||||||
|
p[0] = [p[2]]
|
||||||
|
else:
|
||||||
|
p[0] = p[1]
|
||||||
|
p[0].append(p[3])
|
||||||
|
|
||||||
|
# print "morerules", len(p), p[0]
|
||||||
|
|
||||||
|
def p_rulelist(p):
|
||||||
|
'''rulelist : rulelist ruleitem
|
||||||
|
| ruleitem'''
|
||||||
|
|
||||||
|
if len(p) == 2:
|
||||||
|
p[0] = [p[1]]
|
||||||
|
else:
|
||||||
|
p[0] = p[1]
|
||||||
|
p[1].append(p[2])
|
||||||
|
|
||||||
|
def p_ruleitem(p):
|
||||||
|
'''ruleitem : ID
|
||||||
|
| QLITERAL
|
||||||
|
| CODE
|
||||||
|
| PREC'''
|
||||||
|
p[0] = p[1]
|
||||||
|
|
||||||
|
def p_empty(p):
|
||||||
|
'''empty : '''
|
||||||
|
|
||||||
|
def p_error(p):
|
||||||
|
pass
|
||||||
|
|
||||||
|
yacc.yacc(debug=0)
|
||||||
|
|
||||||
|
def print_code(code,indent):
|
||||||
|
if not emit_code: return
|
||||||
|
codelines = code.splitlines()
|
||||||
|
for c in codelines:
|
||||||
|
print "%s# %s" % (" "*indent,c)
|
||||||
|
|
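The `p_rules` and `print_code` functions above turn each yacc alternative into a `p_<rule>_<n>` function whose docstring carries the production and whose C action is kept only as comments. The sketch below restates that output step in Python 3, returning text instead of printing it. The names `comment_out` and `render_rule` are hypothetical, and embedded (mid-rule) actions are left out for brevity.

```python
def comment_out(code, indent):
    """Prefix every line of C code with an indented '# ', as print_code does."""
    return ["%s# %s" % (" " * indent, line) for line in code.splitlines()]


def render_rule(name, alternatives):
    """Render one grammar rule as PLY-style function stubs.

    alternatives is a list of (symbols, c_action) pairs, one per yacc
    alternative; c_action may be an empty string.
    """
    out = []
    for count, (symbols, action) in enumerate(alternatives, 1):
        out.append("def p_%s_%d(p):" % (name, count))
        out.append("    '''%s : %s'''" % (name, " ".join(symbols)))
        out.extend(comment_out(action, 4))
        out.append("")
    return "\n".join(out)
```

Because each stub still contains its docstring, the generated text remains valid Python even when the action is empty or fully commented out, which is what lets the converted file be used to test table construction before any actions are rewritten.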
53
ext/ply/example/yply/yply.py
Normal file
|
@ -0,0 +1,53 @@
|
||||||
|
#!/usr/local/bin/python
|
||||||
|
# yply.py
|
||||||
|
#
|
||||||
|
# Author: David Beazley (dave@dabeaz.com)
|
||||||
|
# Date : October 2, 2006
|
||||||
|
#
|
||||||
|
# Converts a UNIX-yacc specification file into a PLY-compatible
|
||||||
|
# specification. To use, simply do this:
|
||||||
|
#
|
||||||
|
# % python yply.py [-nocode] inputfile.y >myparser.py
|
||||||
|
#
|
||||||
|
# The output of this program is Python code. In the output,
|
||||||
|
# any C code in the original file is included, but is commented out.
|
||||||
|
# If you use the -nocode option, then all of the C code in the
|
||||||
|
# original file is discarded.
|
||||||
|
#
|
||||||
|
# Disclaimer: This is just an example I threw together in an afternoon.
|
||||||
|
# It might have some bugs. However, it worked when I tried it on
|
||||||
|
# a yacc-specified C++ parser containing 442 rules and 855 parsing
|
||||||
|
# states.
|
||||||
|
#
|
||||||
|
|
||||||
|
import sys
|
||||||
|
sys.path.insert(0,"../..")
|
||||||
|
|
||||||
|
import ylex
|
||||||
|
import yparse
|
||||||
|
|
||||||
|
from ply import *
|
||||||
|
|
||||||
|
if len(sys.argv) == 1:
|
||||||
|
print "usage : yply.py [-nocode] inputfile"
|
||||||
|
raise SystemExit
|
||||||
|
|
||||||
|
if len(sys.argv) == 3:
|
||||||
|
if sys.argv[1] == '-nocode':
|
||||||
|
yparse.emit_code = 0
|
||||||
|
else:
|
||||||
|
print "Unknown option '%s'" % sys.argv[1]
|
||||||
|
raise SystemExit
|
||||||
|
filename = sys.argv[2]
|
||||||
|
else:
|
||||||
|
filename = sys.argv[1]
|
||||||
|
|
||||||
|
yacc.parse(open(filename).read())
|
||||||
|
|
||||||
|
print """
|
||||||
|
if __name__ == '__main__':
|
||||||
|
from ply import *
|
||||||
|
yacc.yacc()
|
||||||
|
"""
|
||||||
|
|
||||||
|
|
681
ext/ply/lex.py
|
@ -1,681 +0,0 @@
|
||||||
#-----------------------------------------------------------------------------
|
|
||||||
# ply: lex.py
|
|
||||||
#
|
|
||||||
# Author: David M. Beazley (beazley@cs.uchicago.edu)
|
|
||||||
# Department of Computer Science
|
|
||||||
# University of Chicago
|
|
||||||
# Chicago, IL 60637
|
|
||||||
#
|
|
||||||
# Copyright (C) 2001, David M. Beazley
|
|
||||||
#
|
|
||||||
# $Header: /home/stever/bk/newmem2/ext/ply/lex.py 1.1 03/06/06 14:53:34-00:00 stever@ $
|
|
||||||
#
|
|
||||||
# This library is free software; you can redistribute it and/or
|
|
||||||
# modify it under the terms of the GNU Lesser General Public
|
|
||||||
# License as published by the Free Software Foundation; either
|
|
||||||
# version 2.1 of the License, or (at your option) any later version.
|
|
||||||
#
|
|
||||||
# This library is distributed in the hope that it will be useful,
|
|
||||||
# but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
||||||
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
|
||||||
# Lesser General Public License for more details.
|
|
||||||
#
|
|
||||||
# You should have received a copy of the GNU Lesser General Public
|
|
||||||
# License along with this library; if not, write to the Free Software
|
|
||||||
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
|
||||||
#
|
|
||||||
# See the file COPYING for a complete copy of the LGPL.
|
|
||||||
#
|
|
||||||
#
|
|
||||||
# This module automatically constructs a lexical analysis module from regular
|
|
||||||
# expression rules defined in a user-defined module. The idea is essentially the same
|
|
||||||
# as that used in John Aycock's Spark framework, but the implementation works
|
|
||||||
# at the module level rather than requiring the use of classes.
|
|
||||||
#
|
|
||||||
# This module tries to provide an interface that is closely modeled after
|
|
||||||
# the traditional lex interface in Unix. It also differs from Spark
|
|
||||||
# in that:
|
|
||||||
#
|
|
||||||
# - It provides more extensive error checking and reporting if
|
|
||||||
# the user supplies a set of regular expressions that can't
|
|
||||||
# be compiled or if there is any other kind of a problem in
|
|
||||||
# the specification.
|
|
||||||
#
|
|
||||||
# - The interface is geared towards LALR(1) and LR(1) parser
|
|
||||||
# generators. That is, tokens are generated one at a time
|
|
||||||
# rather than being generated in advance all in one step.
|
|
||||||
#
|
|
||||||
# There are a few limitations of this module
|
|
||||||
#
|
|
||||||
# - The module interface makes it somewhat awkward to support more
|
|
||||||
# than one lexer at a time. Although somewhat inelegant from a
|
|
||||||
# design perspective, this is rarely a practical concern for
|
|
||||||
# most compiler projects.
|
|
||||||
#
|
|
||||||
# - The lexer requires that the entire input text be read into
|
|
||||||
# a string before scanning. I suppose that most machines have
|
|
||||||
# enough memory to make this a minor issue, but it makes
|
|
||||||
# the lexer somewhat difficult to use in interactive sessions
|
|
||||||
# or with streaming data.
|
|
||||||
#
|
|
||||||
#-----------------------------------------------------------------------------
|
|
||||||
|
|
||||||
r"""
|
|
||||||
lex.py
|
|
||||||
|
|
||||||
This module builds lex-like scanners based on regular expression rules.
|
|
||||||
To use the module, simply write a collection of regular expression rules
|
|
||||||
and actions like this:
|
|
||||||
|
|
||||||
# lexer.py
|
|
||||||
import lex
|
|
||||||
|
|
||||||
# Define a list of valid tokens
|
|
||||||
tokens = (
|
|
||||||
'IDENTIFIER', 'NUMBER', 'PLUS', 'MINUS'
|
|
||||||
)
|
|
||||||
|
|
||||||
# Define tokens as functions
|
|
||||||
def t_IDENTIFIER(t):
|
|
||||||
r' [a-zA-Z_](\w|_)* '
|
|
||||||
return t
|
|
||||||
|
|
||||||
def t_NUMBER(t):
|
|
||||||
r' \d+ '
|
|
||||||
return t
|
|
||||||
|
|
||||||
# Some simple tokens with no actions
|
|
||||||
t_PLUS = r'\+'
|
|
||||||
t_MINUS = r'-'
|
|
||||||
|
|
||||||
# Initialize the lexer
|
|
||||||
lex.lex()
|
|
||||||
|
|
||||||
The tokens list is required and contains a complete list of all valid
|
|
||||||
token types that the lexer is allowed to produce. Token types are
|
|
||||||
restricted to be valid identifiers. This means that 'MINUS' is a valid
|
|
||||||
token type whereas '-' is not.
|
|
||||||
|
|
||||||
Rules are defined by writing a function with a name of the form
|
|
||||||
t_rulename. Each rule must accept a single argument which is
|
|
||||||
a token object generated by the lexer. This token has the following
|
|
||||||
attributes:
|
|
||||||
|
|
||||||
t.type = type string of the token. This is initially set to the
|
|
||||||
name of the rule without the leading t_
|
|
||||||
t.value = The value of the lexeme.
|
|
||||||
t.lineno = The value of the line number where the token was encountered
|
|
||||||
|
|
||||||
For example, the t_NUMBER() rule above might be called with the following:
|
|
||||||
|
|
||||||
t.type = 'NUMBER'
|
|
||||||
t.value = '42'
|
|
||||||
t.lineno = 3
|
|
||||||
|
|
||||||
Each rule returns the token object it would like to supply to the
|
|
||||||
parser. In most cases, the token t is returned with few, if any
|
|
||||||
modifications. To discard a token for things like whitespace or
|
|
||||||
comments, simply return nothing. For instance:
|
|
||||||
|
|
||||||
def t_whitespace(t):
|
|
||||||
r' \s+ '
|
|
||||||
pass
|
|
||||||
|
|
||||||
For faster lexing, you can also define this in terms of the ignore set like this:
|
|
||||||
|
|
||||||
t_ignore = ' \t'
|
|
||||||
|
|
||||||
The characters in this string are ignored by the lexer. Use of this feature can speed
|
|
||||||
up parsing significantly since scanning will immediately proceed to the next token.
|
|
||||||
|
|
||||||
lex requires that the token returned by each rule has an attribute
|
|
||||||
t.type. Other than this, rules are free to return any kind of token
|
|
||||||
object that they wish and may construct a new type of token object
|
|
||||||
from the attributes of t (provided the new object has the required
|
|
||||||
type attribute).
|
|
||||||
|
|
||||||
If illegal characters are encountered, the scanner executes the
|
|
||||||
function t_error(t) where t is a token representing the rest of the
|
|
||||||
string that hasn't been matched. If this function isn't defined, a
|
|
||||||
LexError exception is raised. The .text attribute of this exception
|
|
||||||
object contains the part of the string that wasn't matched.
|
|
||||||
|
|
||||||
The t.skip(n) method can be used to skip ahead n characters in the
|
|
||||||
input stream. This is usually only used in the error handling rule.
|
|
||||||
For instance, the following rule would print an error message and
|
|
||||||
continue:
|
|
||||||
|
|
||||||
def t_error(t):
|
|
||||||
print "Illegal character in input %s" % t.value[0]
|
|
||||||
t.skip(1)
|
|
||||||
|
|
||||||
Of course, a nice scanner might wish to skip more than one character
|
|
||||||
if the input looks very corrupted.
|
|
||||||
|
|
||||||
The lex module defines a t.lineno attribute on each token that can be used
|
|
||||||
to track the current line number in the input. The value of this
|
|
||||||
variable is not modified by lex so it is up to your lexer module
|
|
||||||
to correctly update its value depending on the lexical properties
|
|
||||||
of the input language. To do this, you might write rules such as
|
|
||||||
the following:
|
|
||||||
|
|
||||||
def t_newline(t):
|
|
||||||
r' \n+ '
|
|
||||||
t.lineno += t.value.count("\n")
|
|
||||||
|
|
||||||
To initialize your lexer so that it can be used, simply call the lex.lex()
function in your rule file. If there are any errors in your
specification, warning messages or an exception will be generated to
alert you to the problem.
|
|
||||||
|
|
||||||
(dave: this needs to be rewritten)
|
|
||||||
To use the newly constructed lexer from another module, simply do
|
|
||||||
this:
|
|
||||||
|
|
||||||
import lex
|
|
||||||
import lexer
|
|
||||||
plex.input("position = initial + rate*60")
|
|
||||||
|
|
||||||
while 1:
    token = plex.token() # Get a token
    if not token: break # No more tokens
    ... do whatever ...
|
|
||||||
|
|
||||||
Assuming that the module 'lexer' has initialized plex as shown
above, parsing modules can safely import 'plex' without having
to import the rule file or any additional information about the
scanner you have defined.
|
|
||||||
"""
|
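The behaviour described in the docstring above (ignored characters, an error rule that skips one character, and a newline rule that updates the line number) can be sketched without PLY itself. This is only an illustration written for Python 3: the names RULES, IGNORE and scan are hypothetical and are not part of lex.py.

```python
import re

# Hypothetical rule table for illustration; not PLY's actual data structures.
RULES = [
    ("NUMBER", re.compile(r"\d+")),
    ("NAME", re.compile(r"[A-Za-z_]\w*")),
    ("NEWLINE", re.compile(r"\n+")),
]
IGNORE = " \t"


def scan(text):
    """Return (tokens, errors); one character is skipped per error."""
    tokens, errors = [], []
    pos, lineno = 0, 1
    while pos < len(text):
        if text[pos] in IGNORE:
            # Short-circuit for ignored characters, as in the scanner loop.
            pos += 1
            continue
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            if m:
                break
        else:
            # Plays the role of t_error(t) followed by t.skip(1).
            errors.append((text[pos], lineno))
            pos += 1
            continue
        pos = m.end()
        if name == "NEWLINE":
            # Plays the role of t_newline: count lines, emit no token.
            lineno += m.group().count("\n")
            continue
        tokens.append((name, m.group(), lineno))
    return tokens, errors


tokens, errors = scan("x = 1\ny $ 22")
print(tokens)
print(errors)
```

In this sketch the caller, not the scanner, decides how far to skip after an error, which mirrors the docstring's point that a rule may skip more than one character when the input looks badly corrupted.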
|
||||||
|
|
||||||
# -----------------------------------------------------------------------------
|
|
||||||
|
|
||||||
|
|
||||||
__version__ = "1.3"
|
|
||||||
|
|
||||||
import re, types, sys, copy
|
|
||||||
|
|
||||||
# Exception thrown when invalid token encountered and no default
|
|
||||||
class LexError(Exception):
    def __init__(self,message,s):
        self.args = (message,)
        self.text = s
|
|
||||||
|
|
||||||
# Token class
|
|
||||||
class LexToken:
    def __str__(self):
        return "LexToken(%s,%r,%d)" % (self.type,self.value,self.lineno)
    def __repr__(self):
        return str(self)
    def skip(self,n):
        try:
            self._skipn += n
        except AttributeError:
            self._skipn = n
|
|
||||||
|
|
||||||
# -----------------------------------------------------------------------------
|
|
||||||
# Lexer class
|
|
||||||
#
|
|
||||||
# input() - Store a new string in the lexer
|
|
||||||
# token() - Get the next token
|
|
||||||
# -----------------------------------------------------------------------------
|
|
||||||
|
|
||||||
class Lexer:
|
|
||||||
def __init__(self):
|
|
||||||
self.lexre = None # Master regular expression
|
|
||||||
self.lexdata = None # Actual input data (as a string)
|
|
||||||
self.lexpos = 0 # Current position in input text
|
|
||||||
self.lexlen = 0 # Length of the input text
|
|
||||||
self.lexindexfunc = [ ] # Reverse mapping of groups to functions and types
|
|
||||||
self.lexerrorf = None # Error rule (if any)
|
|
||||||
self.lextokens = None # List of valid tokens
|
|
||||||
self.lexignore = None # Ignored characters
|
|
||||||
self.lineno = 1 # Current line number
|
|
||||||
self.debug = 0 # Debugging mode
|
|
||||||
self.optimize = 0 # Optimized mode
|
|
||||||
self.token = self.errtoken
|
|
||||||
|
|
||||||
def __copy__(self):
|
|
||||||
c = Lexer()
|
|
||||||
c.lexre = self.lexre
|
|
||||||
c.lexdata = self.lexdata
|
|
||||||
c.lexpos = self.lexpos
|
|
||||||
c.lexlen = self.lexlen
|
|
||||||
c.lenindexfunc = self.lexindexfunc
|
|
||||||
c.lexerrorf = self.lexerrorf
|
|
||||||
c.lextokens = self.lextokens
|
|
||||||
c.lexignore = self.lexignore
|
|
||||||
c.lineno = self.lineno
|
|
||||||
c.optimize = self.optimize
|
|
||||||
c.token = c.realtoken
|
|
||||||
|
|
||||||
# ------------------------------------------------------------
|
|
||||||
# input() - Push a new string into the lexer
|
|
||||||
# ------------------------------------------------------------
|
|
||||||
def input(self,s):
|
|
||||||
if not isinstance(s,types.StringType):
|
|
||||||
raise ValueError, "Expected a string"
|
|
||||||
self.lexdata = s
|
|
||||||
self.lexpos = 0
|
|
||||||
self.lexlen = len(s)
|
|
||||||
self.token = self.realtoken
|
|
||||||
|
|
||||||
# Change the token routine to point to realtoken()
|
|
||||||
global token
|
|
||||||
if token == self.errtoken:
|
|
||||||
token = self.token
|
|
||||||
|
|
||||||
# ------------------------------------------------------------
|
|
||||||
# errtoken() - Return error if token is called with no data
|
|
||||||
# ------------------------------------------------------------
|
|
||||||
def errtoken(self):
|
|
||||||
raise RuntimeError, "No input string given with input()"
|
|
||||||
|
|
||||||
# ------------------------------------------------------------
|
|
||||||
# token() - Return the next token from the Lexer
|
|
||||||
#
|
|
||||||
# Note: This function has been carefully implemented to be as fast
|
|
||||||
# as possible. Don't make changes unless you really know what
|
|
||||||
# you are doing
|
|
||||||
# ------------------------------------------------------------
|
|
||||||
def realtoken(self):
|
|
||||||
# Make local copies of frequently referenced attributes
|
|
||||||
lexpos = self.lexpos
|
|
||||||
lexlen = self.lexlen
|
|
||||||
lexignore = self.lexignore
|
|
||||||
lexdata = self.lexdata
|
|
||||||
|
|
||||||
while lexpos < lexlen:
|
|
||||||
# This code provides some short-circuit code for whitespace, tabs, and other ignored characters
|
|
||||||
if lexdata[lexpos] in lexignore:
|
|
||||||
lexpos += 1
|
|
||||||
continue
|
|
||||||
|
|
||||||
# Look for a regular expression match
|
|
||||||
m = self.lexre.match(lexdata,lexpos)
|
|
||||||
if m:
|
|
||||||
i = m.lastindex
|
|
||||||
lexpos = m.end()
|
|
||||||
tok = LexToken()
|
|
||||||
tok.value = m.group()
|
|
||||||
tok.lineno = self.lineno
|
|
||||||
tok.lexer = self
|
|
||||||
func,tok.type = self.lexindexfunc[i]
|
|
||||||
if not func:
|
|
||||||
self.lexpos = lexpos
|
|
||||||
return tok
|
|
||||||
|
|
||||||
# If token is processed by a function, call it
|
|
||||||
self.lexpos = lexpos
|
|
||||||
newtok = func(tok)
|
|
||||||
self.lineno = tok.lineno # Update line number
|
|
||||||
|
|
||||||
# Every function must return a token, if nothing, we just move to next token
|
|
||||||
if not newtok: continue
|
|
||||||
|
|
||||||
# Verify type of the token. If not in the token map, raise an error
|
|
||||||
if not self.optimize:
|
|
||||||
if not self.lextokens.has_key(newtok.type):
|
|
||||||
raise LexError, ("%s:%d: Rule '%s' returned an unknown token type '%s'" % (
|
|
||||||
func.func_code.co_filename, func.func_code.co_firstlineno,
|
|
||||||
func.__name__, newtok.type),lexdata[lexpos:])
|
|
||||||
|
|
||||||
return newtok
|
|
||||||
|
|
||||||
# No match. Call t_error() if defined.
|
|
||||||
if self.lexerrorf:
|
|
||||||
tok = LexToken()
|
|
||||||
tok.value = self.lexdata[lexpos:]
|
|
||||||
tok.lineno = self.lineno
|
|
||||||
tok.type = "error"
|
|
||||||
tok.lexer = self
|
|
||||||
oldpos = lexpos
|
|
||||||
newtok = self.lexerrorf(tok)
|
|
||||||
lexpos += getattr(tok,"_skipn",0)
|
|
||||||
if oldpos == lexpos:
|
|
||||||
# Error method didn't change text position at all. This is an error.
|
|
||||||
self.lexpos = lexpos
|
|
||||||
raise LexError, ("Scanning error. Illegal character '%s'" % (lexdata[lexpos]), lexdata[lexpos:])
|
|
||||||
if not newtok: continue
|
|
||||||
self.lexpos = lexpos
|
|
||||||
return newtok
|
|
||||||
|
|
||||||
self.lexpos = lexpos
|
|
||||||
raise LexError, ("No match found", lexdata[lexpos:])
|
|
||||||
|
|
||||||
# No more input data
|
|
||||||
self.lexpos = lexpos + 1
|
|
||||||
return None
|
|
||||||
|
|
||||||
|
|
||||||
# -----------------------------------------------------------------------------
|
|
||||||
# validate_file()
|
|
||||||
#
|
|
||||||
# This checks to see if there are duplicated t_rulename() functions or strings
|
|
||||||
# in the parser input file. This is done using a simple regular expression
|
|
||||||
# match on each line in the filename.
|
|
||||||
# -----------------------------------------------------------------------------
|
|
||||||
|
|
||||||
def validate_file(filename):
|
|
||||||
import os.path
|
|
||||||
base,ext = os.path.splitext(filename)
|
|
||||||
if ext != '.py': return 1 # No idea what the file is. Return OK
|
|
||||||
|
|
||||||
try:
|
|
||||||
f = open(filename)
|
|
||||||
lines = f.readlines()
|
|
||||||
f.close()
|
|
||||||
except IOError:
|
|
||||||
return 1 # Oh well
|
|
||||||
|
|
||||||
fre = re.compile(r'\s*def\s+(t_[a-zA-Z_0-9]*)\(')
|
|
||||||
sre = re.compile(r'\s*(t_[a-zA-Z_0-9]*)\s*=')
|
|
||||||
counthash = { }
|
|
||||||
linen = 1
|
|
||||||
noerror = 1
|
|
||||||
for l in lines:
|
|
||||||
m = fre.match(l)
|
|
||||||
if not m:
|
|
||||||
m = sre.match(l)
|
|
||||||
if m:
|
|
||||||
name = m.group(1)
|
|
||||||
prev = counthash.get(name)
|
|
||||||
if not prev:
|
|
||||||
counthash[name] = linen
|
|
||||||
else:
|
|
||||||
print "%s:%d: Rule %s redefined. Previously defined on line %d" % (filename,linen,name,prev)
|
|
||||||
noerror = 0
|
|
||||||
linen += 1
|
|
||||||
return noerror
|
|
||||||
|
|
||||||
# -----------------------------------------------------------------------------
|
|
||||||
# _read_lextab(module)
|
|
||||||
#
|
|
||||||
# Reads lexer table from a lextab file instead of using introspection.
|
|
||||||
# -----------------------------------------------------------------------------
|
|
||||||
|
|
||||||
def _read_lextab(lexer, fdict, module):
|
|
||||||
exec "import %s as lextab" % module
|
|
||||||
lexer.lexre = re.compile(lextab._lexre, re.VERBOSE)
|
|
||||||
lexer.lexindexfunc = lextab._lextab
|
|
||||||
for i in range(len(lextab._lextab)):
|
|
||||||
t = lexer.lexindexfunc[i]
|
|
||||||
if t:
|
|
||||||
if t[0]:
|
|
||||||
lexer.lexindexfunc[i] = (fdict[t[0]],t[1])
|
|
||||||
lexer.lextokens = lextab._lextokens
|
|
||||||
lexer.lexignore = lextab._lexignore
|
|
||||||
if lextab._lexerrorf:
|
|
||||||
lexer.lexerrorf = fdict[lextab._lexerrorf]
|
|
||||||
|
|
||||||
# -----------------------------------------------------------------------------
|
|
||||||
# lex(module)
|
|
||||||
#
|
|
||||||
# Build all of the regular expression rules from definitions in the supplied module
|
|
||||||
# -----------------------------------------------------------------------------
|
|
||||||
def lex(module=None,debug=0,optimize=0,lextab="lextab"):
|
|
||||||
ldict = None
|
|
||||||
regex = ""
|
|
||||||
error = 0
|
|
||||||
files = { }
|
|
||||||
lexer = Lexer()
|
|
||||||
lexer.debug = debug
|
|
||||||
lexer.optimize = optimize
|
|
||||||
global token,input
|
|
||||||
|
|
||||||
if module:
|
|
||||||
if not isinstance(module, types.ModuleType):
|
|
||||||
raise ValueError,"Expected a module"
|
|
||||||
|
|
||||||
ldict = module.__dict__
|
|
||||||
|
|
||||||
else:
|
|
||||||
# No module given. We might be able to get information from the caller.
|
|
||||||
try:
|
|
||||||
raise RuntimeError
|
|
||||||
except RuntimeError:
|
|
||||||
e,b,t = sys.exc_info()
|
|
||||||
f = t.tb_frame
|
|
||||||
f = f.f_back # Walk out to our calling function
|
|
||||||
ldict = f.f_globals # Grab its globals dictionary
|
|
||||||
|
|
||||||
if optimize and lextab:
|
|
||||||
try:
|
|
||||||
_read_lextab(lexer,ldict, lextab)
|
|
||||||
if not lexer.lexignore: lexer.lexignore = ""
|
|
||||||
token = lexer.token
|
|
||||||
input = lexer.input
|
|
||||||
return lexer
|
|
||||||
|
|
||||||
except ImportError:
|
|
||||||
pass
|
|
||||||
|
|
||||||
# Get the tokens map
|
|
||||||
tokens = ldict.get("tokens",None)
|
|
||||||
if not tokens:
|
|
||||||
raise SyntaxError,"lex: module does not define 'tokens'"
|
|
||||||
if not (isinstance(tokens,types.ListType) or isinstance(tokens,types.TupleType)):
|
|
||||||
raise SyntaxError,"lex: tokens must be a list or tuple."
|
|
||||||
|
|
||||||
# Build a dictionary of valid token names
|
|
||||||
lexer.lextokens = { }
|
|
||||||
if not optimize:
|
|
||||||
|
|
||||||
# Utility function for verifying tokens
|
|
||||||
def is_identifier(s):
|
|
||||||
for c in s:
|
|
||||||
if not (c.isalnum() or c == '_'): return 0
|
|
||||||
return 1
|
|
||||||
|
|
||||||
for n in tokens:
|
|
||||||
if not is_identifier(n):
|
|
||||||
print "lex: Bad token name '%s'" % n
|
|
||||||
error = 1
|
|
||||||
if lexer.lextokens.has_key(n):
|
|
||||||
print "lex: Warning. Token '%s' multiply defined." % n
|
|
||||||
lexer.lextokens[n] = None
|
|
||||||
else:
|
|
||||||
for n in tokens: lexer.lextokens[n] = None
|
|
||||||
|
|
||||||
|
|
||||||
if debug:
|
|
||||||
print "lex: tokens = '%s'" % lexer.lextokens.keys()
|
|
||||||
|
|
||||||
# Get a list of symbols with the t_ prefix
|
|
||||||
tsymbols = [f for f in ldict.keys() if f[:2] == 't_']
|
|
||||||
|
|
||||||
# Now build up a list of functions and a list of strings
|
|
||||||
fsymbols = [ ]
|
|
||||||
ssymbols = [ ]
|
|
||||||
for f in tsymbols:
|
|
||||||
if isinstance(ldict[f],types.FunctionType):
|
|
||||||
fsymbols.append(ldict[f])
|
|
||||||
elif isinstance(ldict[f],types.StringType):
|
|
||||||
ssymbols.append((f,ldict[f]))
|
|
||||||
else:
|
|
||||||
print "lex: %s not defined as a function or string" % f
|
|
||||||
error = 1
|
|
||||||
|
|
||||||
# Sort the functions by line number
|
|
||||||
fsymbols.sort(lambda x,y: cmp(x.func_code.co_firstlineno,y.func_code.co_firstlineno))
|
|
||||||
|
|
||||||
# Sort the strings by regular expression length
|
|
||||||
ssymbols.sort(lambda x,y: (len(x[1]) < len(y[1])) - (len(x[1]) > len(y[1])))
|
|
||||||
|
|
||||||
# Check for non-empty symbols
|
|
||||||
if len(fsymbols) == 0 and len(ssymbols) == 0:
|
|
||||||
raise SyntaxError,"lex: no rules of the form t_rulename are defined."
|
|
||||||
|
|
||||||
# Add all of the rules defined with actions first
|
|
||||||
for f in fsymbols:
|
|
||||||
|
|
||||||
line = f.func_code.co_firstlineno
|
|
||||||
file = f.func_code.co_filename
|
|
||||||
files[file] = None
|
|
||||||
|
|
||||||
if not optimize:
|
|
||||||
if f.func_code.co_argcount > 1:
|
|
||||||
print "%s:%d: Rule '%s' has too many arguments." % (file,line,f.__name__)
|
|
||||||
error = 1
|
|
||||||
continue
|
|
||||||
|
|
||||||
if f.func_code.co_argcount < 1:
|
|
||||||
print "%s:%d: Rule '%s' requires an argument." % (file,line,f.__name__)
|
|
||||||
error = 1
|
|
||||||
continue
|
|
||||||
|
|
||||||
if f.__name__ == 't_ignore':
|
|
||||||
print "%s:%d: Rule '%s' must be defined as a string." % (file,line,f.__name__)
|
|
||||||
error = 1
|
|
||||||
continue
|
|
||||||
|
|
||||||
if f.__name__ == 't_error':
|
|
||||||
lexer.lexerrorf = f
|
|
||||||
continue
|
|
||||||
|
|
||||||
if f.__doc__:
|
|
||||||
if not optimize:
|
|
||||||
try:
|
|
||||||
c = re.compile(f.__doc__, re.VERBOSE)
|
|
||||||
except re.error,e:
|
|
||||||
print "%s:%d: Invalid regular expression for rule '%s'. %s" % (file,line,f.__name__,e)
|
|
||||||
error = 1
|
|
||||||
continue
|
|
||||||
|
|
||||||
if debug:
|
|
||||||
print "lex: Adding rule %s -> '%s'" % (f.__name__,f.__doc__)
|
|
||||||
|
|
||||||
# Okay. The regular expression seemed okay. Let's append it to the master regular
|
|
||||||
# expression we're building
|
|
||||||
|
|
||||||
if (regex): regex += "|"
|
|
||||||
regex += "(?P<%s>%s)" % (f.__name__,f.__doc__)
|
|
||||||
else:
|
|
||||||
print "%s:%d: No regular expression defined for rule '%s'" % (file,line,f.__name__)
|
|
||||||
|
|
||||||
# Now add all of the simple rules
|
|
||||||
for name,r in ssymbols:
|
|
||||||
|
|
||||||
if name == 't_ignore':
|
|
||||||
lexer.lexignore = r
|
|
||||||
continue
|
|
||||||
|
|
||||||
if not optimize:
|
|
||||||
if name == 't_error':
|
|
||||||
raise SyntaxError,"lex: Rule 't_error' must be defined as a function"
|
|
||||||
error = 1
|
|
||||||
continue
|
|
||||||
|
|
||||||
if not lexer.lextokens.has_key(name[2:]):
|
|
||||||
print "lex: Rule '%s' defined for an unspecified token %s." % (name,name[2:])
|
|
||||||
error = 1
|
|
||||||
continue
|
|
||||||
try:
|
|
||||||
c = re.compile(r,re.VERBOSE)
|
|
||||||
except re.error,e:
|
|
||||||
print "lex: Invalid regular expression for rule '%s'. %s" % (name,e)
|
|
||||||
error = 1
|
|
||||||
continue
|
|
||||||
if debug:
|
|
||||||
print "lex: Adding rule %s -> '%s'" % (name,r)
|
|
||||||
|
|
||||||
if regex: regex += "|"
|
|
||||||
regex += "(?P<%s>%s)" % (name,r)
|
|
||||||
|
|
||||||
if not optimize:
|
|
||||||
for f in files.keys():
|
|
||||||
if not validate_file(f):
|
|
||||||
error = 1
|
|
||||||
try:
|
|
||||||
if debug:
|
|
||||||
print "lex: regex = '%s'" % regex
|
|
||||||
lexer.lexre = re.compile(regex, re.VERBOSE)
|
|
||||||
|
|
||||||
# Build the index to function map for the matching engine
|
|
||||||
lexer.lexindexfunc = [ None ] * (max(lexer.lexre.groupindex.values())+1)
|
|
||||||
for f,i in lexer.lexre.groupindex.items():
|
|
||||||
handle = ldict[f]
|
|
||||||
if isinstance(handle,types.FunctionType):
|
|
||||||
lexer.lexindexfunc[i] = (handle,handle.__name__[2:])
|
|
||||||
else:
|
|
||||||
# If rule was specified as a string, we build an anonymous
|
|
||||||
# callback function to carry out the action
|
|
||||||
lexer.lexindexfunc[i] = (None,f[2:])
|
|
||||||
|
|
||||||
# If a lextab was specified, we create a file containing the precomputed
|
|
||||||
# regular expression and index table
|
|
||||||
|
|
||||||
if lextab and optimize:
|
|
||||||
lt = open(lextab+".py","w")
|
|
||||||
lt.write("# %s.py. This file automatically created by PLY. Don't edit.\n" % lextab)
|
|
||||||
lt.write("_lexre = %s\n" % repr(regex))
|
|
||||||
lt.write("_lextab = [\n");
|
|
||||||
for i in range(0,len(lexer.lexindexfunc)):
|
|
||||||
t = lexer.lexindexfunc[i]
|
|
||||||
if t:
|
|
||||||
if t[0]:
|
|
||||||
lt.write(" ('%s',%s),\n"% (t[0].__name__, repr(t[1])))
|
|
||||||
else:
|
|
||||||
lt.write(" (None,%s),\n" % repr(t[1]))
|
|
||||||
else:
|
|
||||||
lt.write(" None,\n")
|
|
||||||
|
|
||||||
lt.write("]\n");
|
|
||||||
lt.write("_lextokens = %s\n" % repr(lexer.lextokens))
|
|
||||||
lt.write("_lexignore = %s\n" % repr(lexer.lexignore))
|
|
||||||
if (lexer.lexerrorf):
|
|
||||||
lt.write("_lexerrorf = %s\n" % repr(lexer.lexerrorf.__name__))
|
|
||||||
else:
|
|
||||||
lt.write("_lexerrorf = None\n")
|
|
||||||
lt.close()
|
|
||||||
|
|
||||||
except re.error,e:
|
|
||||||
print "lex: Fatal error. Unable to compile regular expression rules. %s" % e
|
|
||||||
error = 1
|
|
||||||
if error:
|
|
||||||
raise SyntaxError,"lex: Unable to build lexer."
|
|
||||||
if not lexer.lexerrorf:
|
|
||||||
print "lex: Warning. no t_error rule is defined."
|
|
||||||
|
|
||||||
if not lexer.lexignore: lexer.lexignore = ""
|
|
||||||
|
|
||||||
# Create global versions of the token() and input() functions
|
|
||||||
token = lexer.token
|
|
||||||
input = lexer.input
|
|
||||||
|
|
||||||
return lexer
|
|
||||||
|
|
||||||
# -----------------------------------------------------------------------------
|
|
||||||
# run()
|
|
||||||
#
|
|
||||||
# This runs the lexer as a main program
|
|
||||||
# -----------------------------------------------------------------------------
|
|
||||||
|
|
||||||
def runmain(lexer=None,data=None):
|
|
||||||
if not data:
|
|
||||||
try:
|
|
||||||
filename = sys.argv[1]
|
|
||||||
f = open(filename)
|
|
||||||
data = f.read()
|
|
||||||
f.close()
|
|
||||||
except IndexError:
|
|
||||||
print "Reading from standard input (type EOF to end):"
|
|
||||||
data = sys.stdin.read()
|
|
||||||
|
|
||||||
if lexer:
|
|
||||||
_input = lexer.input
|
|
||||||
else:
|
|
||||||
_input = input
|
|
||||||
_input(data)
|
|
||||||
if lexer:
|
|
||||||
_token = lexer.token
|
|
||||||
else:
|
|
||||||
_token = token
|
|
||||||
|
|
||||||
while 1:
|
|
||||||
tok = _token()
|
|
||||||
if not tok: break
|
|
||||||
print "(%s,'%s',%d)" % (tok.type, tok.value, tok.lineno)
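The file above joins every rule into one master regular expression made of named groups, then uses the match object's lastindex to look up the rule in an index table. A minimal sketch of that technique follows, in Python 3 and using only the standard re module. The rule set, the dict-based token, and the helper names build and tokenize are assumptions made for illustration; they are not PLY's API.

```python
import re


def t_NUMBER(tok):
    # Function rule: convert the matched text to an integer.
    tok["value"] = int(tok["value"])
    return tok


# Hypothetical rules: name -> (pattern, optional action function).
RULES = {
    "t_NUMBER": (r"\d+", t_NUMBER),
    "t_PLUS": (r"\+", None),
    "t_NAME": (r"[a-z]+", None),
}


def build(rules):
    """Compile one master regex and a group-index -> (func, type) table."""
    master = re.compile(
        "|".join("(?P<%s>%s)" % (name, pat) for name, (pat, _) in rules.items())
    )
    table = [None] * (max(master.groupindex.values()) + 1)
    for name, index in master.groupindex.items():
        table[index] = (rules[name][1], name[2:])  # strip the "t_" prefix
    return master, table


def tokenize(text, master, table):
    out, pos = [], 0
    while pos < len(text):
        if text[pos] == " ":
            pos += 1
            continue
        m = master.match(text, pos)
        if not m:
            raise ValueError("Illegal character %r" % text[pos])
        func, type_ = table[m.lastindex]  # which top-level group matched
        tok = {"type": type_, "value": m.group()}
        pos = m.end()
        if func:
            tok = func(tok)
        out.append((tok["type"], tok["value"]))
    return out


master, table = build(RULES)
print(tokenize("rate + 60", master, table))
```

The table lookup only works in this sketch because the rule patterns contain no capturing groups of their own, so lastindex always refers to a top-level rule group.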
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
4
ext/ply/ply/__init__.py
Normal file
|
@@ -0,0 +1,4 @@
|
||||||
|
# PLY package
|
||||||
|
# Author: David Beazley (dave@dabeaz.com)
|
||||||
|
|
||||||
|
__all__ = ['lex','yacc']
|
867
ext/ply/ply/lex.py
Normal file
|
@@ -0,0 +1,867 @@
|
||||||
|
#-----------------------------------------------------------------------------
|
||||||
|
# ply: lex.py
|
||||||
|
#
|
||||||
|
# Author: David M. Beazley (dave@dabeaz.com)
|
||||||
|
#
|
||||||
|
# Copyright (C) 2001-2007, David M. Beazley
|
||||||
|
#
|
||||||
|
# This library is free software; you can redistribute it and/or
|
||||||
|
# modify it under the terms of the GNU Lesser General Public
|
||||||
|
# License as published by the Free Software Foundation; either
|
||||||
|
# version 2.1 of the License, or (at your option) any later version.
|
||||||
|
#
|
||||||
|
# This library is distributed in the hope that it will be useful,
|
||||||
|
# but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||||
|
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||||
|
# Lesser General Public License for more details.
|
||||||
|
#
|
||||||
|
# You should have received a copy of the GNU Lesser General Public
|
||||||
|
# License along with this library; if not, write to the Free Software
|
||||||
|
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
||||||
|
#
|
||||||
|
# See the file COPYING for a complete copy of the LGPL.
|
||||||
|
#-----------------------------------------------------------------------------
|
||||||
|
|
||||||
|
__version__ = "2.3"
|
||||||
|
|
||||||
|
import re, sys, types
|
||||||
|
|
||||||
|
# Regular expression used to match valid token names
|
||||||
|
_is_identifier = re.compile(r'^[a-zA-Z0-9_]+$')
|
||||||
|
|
||||||
|
# Available instance types. This is used when lexers are defined by a class.
|
||||||
|
# It's a little funky because I want to preserve backwards compatibility
|
||||||
|
# with Python 2.0 where types.ObjectType is undefined.
|
||||||
|
|
||||||
|
try:
    _INSTANCETYPE = (types.InstanceType, types.ObjectType)
except AttributeError:
    _INSTANCETYPE = types.InstanceType
    class object: pass # Note: needed if no new-style classes present
|
||||||
|
|
||||||
|
# Exception thrown when invalid token encountered and no default error
|
||||||
|
# handler is defined.
|
||||||
|
class LexError(Exception):
    def __init__(self,message,s):
        self.args = (message,)
        self.text = s
|
||||||
|
|
||||||
|
# Token class
|
||||||
|
class LexToken(object):
    def __str__(self):
        return "LexToken(%s,%r,%d,%d)" % (self.type,self.value,self.lineno,self.lexpos)
    def __repr__(self):
        return str(self)
    def skip(self,n):
        self.lexer.skip(n)
|
||||||
|
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
# Lexer class
|
||||||
|
#
|
||||||
|
# This class encapsulates all of the methods and data associated with a lexer.
|
||||||
|
#
|
||||||
|
# input() - Store a new string in the lexer
|
||||||
|
# token() - Get the next token
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
|
||||||
|
class Lexer:
|
||||||
|
def __init__(self):
|
||||||
|
self.lexre = None # Master regular expression. This is a list of
|
||||||
|
# tuples (re,findex) where re is a compiled
|
||||||
|
# regular expression and findex is a list
|
||||||
|
# mapping regex group numbers to rules
|
||||||
|
self.lexretext = None # Current regular expression strings
|
||||||
|
self.lexstatere = {} # Dictionary mapping lexer states to master regexs
|
||||||
|
self.lexstateretext = {} # Dictionary mapping lexer states to regex strings
|
||||||
|
self.lexstate = "INITIAL" # Current lexer state
|
||||||
|
self.lexstatestack = [] # Stack of lexer states
|
||||||
|
self.lexstateinfo = None # State information
|
||||||
|
self.lexstateignore = {} # Dictionary of ignored characters for each state
|
||||||
|
self.lexstateerrorf = {} # Dictionary of error functions for each state
|
||||||
|
self.lexreflags = 0 # Optional re compile flags
|
||||||
|
self.lexdata = None # Actual input data (as a string)
|
||||||
|
self.lexpos = 0 # Current position in input text
|
||||||
|
self.lexlen = 0 # Length of the input text
|
||||||
|
self.lexerrorf = None # Error rule (if any)
|
||||||
|
self.lextokens = None # List of valid tokens
|
||||||
|
self.lexignore = "" # Ignored characters
|
||||||
|
self.lexliterals = "" # Literal characters that can be passed through
|
||||||
|
self.lexmodule = None # Module
|
||||||
|
self.lineno = 1 # Current line number
|
||||||
|
self.lexdebug = 0 # Debugging mode
|
||||||
|
self.lexoptimize = 0 # Optimized mode
|
||||||
|
|
||||||
|
def clone(self,object=None):
|
||||||
|
c = Lexer()
|
||||||
|
c.lexstatere = self.lexstatere
|
||||||
|
c.lexstateinfo = self.lexstateinfo
|
||||||
|
c.lexstateretext = self.lexstateretext
|
||||||
|
c.lexstate = self.lexstate
|
||||||
|
c.lexstatestack = self.lexstatestack
|
||||||
|
c.lexstateignore = self.lexstateignore
|
||||||
|
c.lexstateerrorf = self.lexstateerrorf
|
||||||
|
c.lexreflags = self.lexreflags
|
||||||
|
c.lexdata = self.lexdata
|
||||||
|
c.lexpos = self.lexpos
|
||||||
|
c.lexlen = self.lexlen
|
||||||
|
c.lextokens = self.lextokens
|
||||||
|
c.lexdebug = self.lexdebug
|
||||||
|
c.lineno = self.lineno
|
||||||
|
c.lexoptimize = self.lexoptimize
|
||||||
|
c.lexliterals = self.lexliterals
|
||||||
|
c.lexmodule = self.lexmodule
|
||||||
|
|
||||||
|
# If the object parameter has been supplied, it means we are attaching the
|
||||||
|
# lexer to a new object. In this case, we have to rebind all methods in
|
||||||
|
# the lexstatere and lexstateerrorf tables.
|
||||||
|
|
||||||
|
if object:
|
||||||
|
newtab = { }
|
||||||
|
for key, ritem in self.lexstatere.items():
|
||||||
|
newre = []
|
||||||
|
for cre, findex in ritem:
|
||||||
|
newfindex = []
|
||||||
|
for f in findex:
|
||||||
|
if not f or not f[0]:
|
||||||
|
newfindex.append(f)
|
||||||
|
continue
|
||||||
|
newfindex.append((getattr(object,f[0].__name__),f[1]))
|
||||||
|
newre.append((cre,newfindex))
|
||||||
|
newtab[key] = newre
|
||||||
|
c.lexstatere = newtab
|
||||||
|
c.lexstateerrorf = { }
|
||||||
|
for key, ef in self.lexstateerrorf.items():
|
||||||
|
c.lexstateerrorf[key] = getattr(object,ef.__name__)
|
||||||
|
c.lexmodule = object
|
||||||
|
|
||||||
|
# Set up other attributes
|
||||||
|
c.begin(c.lexstate)
|
||||||
|
return c
|
||||||
|
|
||||||
|
# ------------------------------------------------------------
|
||||||
|
# writetab() - Write lexer information to a table file
|
||||||
|
# ------------------------------------------------------------
|
||||||
|
def writetab(self,tabfile):
|
||||||
|
tf = open(tabfile+".py","w")
|
||||||
|
tf.write("# %s.py. This file automatically created by PLY (version %s). Don't edit!\n" % (tabfile,__version__))
|
||||||
|
tf.write("_lextokens = %s\n" % repr(self.lextokens))
|
||||||
|
tf.write("_lexreflags = %s\n" % repr(self.lexreflags))
|
||||||
|
tf.write("_lexliterals = %s\n" % repr(self.lexliterals))
|
||||||
|
tf.write("_lexstateinfo = %s\n" % repr(self.lexstateinfo))
|
||||||
|
|
||||||
|
tabre = { }
|
||||||
|
for key, lre in self.lexstatere.items():
|
||||||
|
titem = []
|
||||||
|
for i in range(len(lre)):
|
||||||
|
titem.append((self.lexstateretext[key][i],_funcs_to_names(lre[i][1])))
|
||||||
|
tabre[key] = titem
|
||||||
|
|
||||||
|
tf.write("_lexstatere = %s\n" % repr(tabre))
|
||||||
|
tf.write("_lexstateignore = %s\n" % repr(self.lexstateignore))
|
||||||
|
|
||||||
|
taberr = { }
|
||||||
|
for key, ef in self.lexstateerrorf.items():
|
||||||
|
if ef:
|
||||||
|
taberr[key] = ef.__name__
|
||||||
|
else:
|
||||||
|
taberr[key] = None
|
||||||
|
tf.write("_lexstateerrorf = %s\n" % repr(taberr))
|
||||||
|
tf.close()
|
||||||
|
|
||||||
|
# ------------------------------------------------------------
|
||||||
|
# readtab() - Read lexer information from a tab file
|
||||||
|
# ------------------------------------------------------------
|
||||||
|
def readtab(self,tabfile,fdict):
|
||||||
|
exec "import %s as lextab" % tabfile
|
||||||
|
self.lextokens = lextab._lextokens
|
||||||
|
self.lexreflags = lextab._lexreflags
|
||||||
|
self.lexliterals = lextab._lexliterals
|
||||||
|
self.lexstateinfo = lextab._lexstateinfo
|
||||||
|
self.lexstateignore = lextab._lexstateignore
|
||||||
|
self.lexstatere = { }
|
||||||
|
self.lexstateretext = { }
|
||||||
|
for key,lre in lextab._lexstatere.items():
|
||||||
|
titem = []
|
||||||
|
txtitem = []
|
||||||
|
for i in range(len(lre)):
|
||||||
|
titem.append((re.compile(lre[i][0],lextab._lexreflags),_names_to_funcs(lre[i][1],fdict)))
|
||||||
|
txtitem.append(lre[i][0])
|
||||||
|
self.lexstatere[key] = titem
|
||||||
|
self.lexstateretext[key] = txtitem
|
||||||
|
self.lexstateerrorf = { }
|
||||||
|
for key,ef in lextab._lexstateerrorf.items():
|
||||||
|
self.lexstateerrorf[key] = fdict[ef]
|
||||||
|
self.begin('INITIAL')
|
||||||
|
|
||||||
|
# ------------------------------------------------------------
|
||||||
|
# input() - Push a new string into the lexer
|
||||||
|
# ------------------------------------------------------------
|
||||||
|
def input(self,s):
|
||||||
|
if not (isinstance(s,types.StringType) or isinstance(s,types.UnicodeType)):
|
||||||
|
raise ValueError, "Expected a string"
|
||||||
|
self.lexdata = s
|
||||||
|
self.lexpos = 0
|
||||||
|
self.lexlen = len(s)
|
||||||
|
|
||||||
|
# ------------------------------------------------------------
|
||||||
|
# begin() - Changes the lexing state
|
||||||
|
# ------------------------------------------------------------
|
||||||
|
def begin(self,state):
|
||||||
|
if not self.lexstatere.has_key(state):
|
||||||
|
raise ValueError, "Undefined state"
|
||||||
|
self.lexre = self.lexstatere[state]
|
||||||
|
self.lexretext = self.lexstateretext[state]
|
||||||
|
self.lexignore = self.lexstateignore.get(state,"")
|
||||||
|
self.lexerrorf = self.lexstateerrorf.get(state,None)
|
||||||
|
self.lexstate = state
|
||||||
|
|
||||||
|
# ------------------------------------------------------------
|
||||||
|
# push_state() - Changes the lexing state and saves old on stack
|
||||||
|
# ------------------------------------------------------------
|
||||||
|
def push_state(self,state):
|
||||||
|
self.lexstatestack.append(self.lexstate)
|
||||||
|
self.begin(state)
|
||||||
|
|
||||||
|
# ------------------------------------------------------------
|
||||||
|
# pop_state() - Restores the previous state
|
||||||
|
# ------------------------------------------------------------
|
||||||
|
def pop_state(self):
|
||||||
|
self.begin(self.lexstatestack.pop())
|
||||||
|
|
||||||
|
# ------------------------------------------------------------
|
||||||
|
# current_state() - Returns the current lexing state
|
||||||
|
# ------------------------------------------------------------
|
||||||
|
def current_state(self):
|
||||||
|
return self.lexstate
|
||||||
|
|
||||||
|
# ------------------------------------------------------------
|
||||||
|
# skip() - Skip ahead n characters
|
||||||
|
# ------------------------------------------------------------
|
||||||
|
def skip(self,n):
|
||||||
|
self.lexpos += n
|
||||||
|
|
||||||
|
# ------------------------------------------------------------
|
||||||
|
# token() - Return the next token from the Lexer
|
||||||
|
#
|
||||||
|
# Note: This function has been carefully implemented to be as fast
|
||||||
|
# as possible. Don't make changes unless you really know what
|
||||||
|
# you are doing
|
||||||
|
# ------------------------------------------------------------
|
||||||
|
def token(self):
|
||||||
|
# Make local copies of frequently referenced attributes
|
||||||
|
lexpos = self.lexpos
|
||||||
|
lexlen = self.lexlen
|
||||||
|
lexignore = self.lexignore
|
||||||
|
lexdata = self.lexdata
|
||||||
|
|
||||||
|
while lexpos < lexlen:
|
||||||
|
# This code provides some short-circuit code for whitespace, tabs, and other ignored characters
|
||||||
|
if lexdata[lexpos] in lexignore:
|
||||||
|
lexpos += 1
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Look for a regular expression match
|
||||||
|
for lexre,lexindexfunc in self.lexre:
|
||||||
|
m = lexre.match(lexdata,lexpos)
|
||||||
|
if not m: continue
|
||||||
|
|
||||||
|
# Set last match in lexer so that rules can access it if they want
|
||||||
|
self.lexmatch = m
|
||||||
|
|
||||||
|
# Create a token for return
|
||||||
|
tok = LexToken()
|
||||||
|
tok.value = m.group()
|
||||||
|
tok.lineno = self.lineno
|
||||||
|
tok.lexpos = lexpos
|
||||||
|
tok.lexer = self
|
||||||
|
|
||||||
|
lexpos = m.end()
|
||||||
|
i = m.lastindex
|
||||||
|
func,tok.type = lexindexfunc[i]
|
||||||
|
self.lexpos = lexpos
|
||||||
|
|
||||||
|
if not func:
|
||||||
|
# If no token type was set, it's an ignored token
|
||||||
|
if tok.type: return tok
|
||||||
|
break
|
||||||
|
|
||||||
|
# if func not callable, it means it's an ignored token
|
||||||
|
if not callable(func):
|
||||||
|
break
|
||||||
|
|
||||||
|
# If token is processed by a function, call it
|
||||||
|
newtok = func(tok)
|
||||||
|
|
||||||
|
# Every function must return a token, if nothing, we just move to next token
|
||||||
|
if not newtok:
|
||||||
|
lexpos = self.lexpos # This is here in case user has updated lexpos.
|
||||||
|
break
|
||||||
|
|
||||||
|
# Verify type of the token. If not in the token map, raise an error
|
||||||
|
if not self.lexoptimize:
|
||||||
|
if not self.lextokens.has_key(newtok.type):
|
||||||
|
raise LexError, ("%s:%d: Rule '%s' returned an unknown token type '%s'" % (
|
||||||
|
func.func_code.co_filename, func.func_code.co_firstlineno,
|
||||||
|
func.__name__, newtok.type),lexdata[lexpos:])
|
||||||
|
|
||||||
|
return newtok
|
||||||
|
else:
|
||||||
|
# No match, see if in literals
|
||||||
|
if lexdata[lexpos] in self.lexliterals:
|
||||||
|
tok = LexToken()
|
||||||
|
tok.value = lexdata[lexpos]
|
||||||
|
tok.lineno = self.lineno
|
||||||
|
tok.lexer = self
|
||||||
|
tok.type = tok.value
|
||||||
|
tok.lexpos = lexpos
|
||||||
|
self.lexpos = lexpos + 1
|
||||||
|
return tok
|
||||||
|
|
||||||
|
# No match. Call t_error() if defined.
|
||||||
|
if self.lexerrorf:
|
||||||
|
tok = LexToken()
|
||||||
|
tok.value = self.lexdata[lexpos:]
|
||||||
|
tok.lineno = self.lineno
|
||||||
|
tok.type = "error"
|
||||||
|
tok.lexer = self
|
||||||
|
tok.lexpos = lexpos
|
||||||
|
self.lexpos = lexpos
|
||||||
|
newtok = self.lexerrorf(tok)
|
||||||
|
if lexpos == self.lexpos:
|
||||||
|
# Error method didn't change text position at all. This is an error.
|
||||||
|
raise LexError, ("Scanning error. Illegal character '%s'" % (lexdata[lexpos]), lexdata[lexpos:])
|
||||||
|
lexpos = self.lexpos
|
||||||
|
if not newtok: continue
|
||||||
|
return newtok
|
||||||
|
|
||||||
|
self.lexpos = lexpos
|
||||||
|
raise LexError, ("Illegal character '%s' at index %d" % (lexdata[lexpos],lexpos), lexdata[lexpos:])
|
||||||
|
|
||||||
|
self.lexpos = lexpos + 1
|
||||||
|
if self.lexdata is None:
|
||||||
|
raise RuntimeError, "No input string given with input()"
|
||||||
|
return None
|
||||||
|
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
# _validate_file()
|
||||||
|
#
|
||||||
|
# This checks to see if there are duplicated t_rulename() functions or strings
|
||||||
|
# in the parser input file. This is done using a simple regular expression
|
||||||
|
# match on each line in the filename.
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
|
||||||
|
def _validate_file(filename):
|
||||||
|
import os.path
|
||||||
|
base,ext = os.path.splitext(filename)
|
||||||
|
if ext != '.py': return 1 # No idea what the file is. Return OK
|
||||||
|
|
||||||
|
try:
|
||||||
|
f = open(filename)
|
||||||
|
lines = f.readlines()
|
||||||
|
f.close()
|
||||||
|
except IOError:
|
||||||
|
return 1 # Oh well
|
||||||
|
|
||||||
|
fre = re.compile(r'\s*def\s+(t_[a-zA-Z_0-9]*)\(')
|
||||||
|
sre = re.compile(r'\s*(t_[a-zA-Z_0-9]*)\s*=')
|
||||||
|
counthash = { }
|
||||||
|
linen = 1
|
||||||
|
noerror = 1
|
||||||
|
for l in lines:
|
||||||
|
m = fre.match(l)
|
||||||
|
if not m:
|
||||||
|
m = sre.match(l)
|
||||||
|
if m:
|
||||||
|
name = m.group(1)
|
||||||
|
prev = counthash.get(name)
|
||||||
|
if not prev:
|
||||||
|
counthash[name] = linen
|
||||||
|
else:
|
||||||
|
print >>sys.stderr, "%s:%d: Rule %s redefined. Previously defined on line %d" % (filename,linen,name,prev)
|
||||||
|
noerror = 0
|
||||||
|
linen += 1
|
||||||
|
return noerror
|
||||||
|
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
# _funcs_to_names()
|
||||||
|
#
|
||||||
|
# Given a list of regular expression functions, this converts it to a list
|
||||||
|
# suitable for output to a table file
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
|
||||||
|
def _funcs_to_names(funclist):
|
||||||
|
result = []
|
||||||
|
for f in funclist:
|
||||||
|
if f and f[0]:
|
||||||
|
result.append((f[0].__name__,f[1]))
|
||||||
|
else:
|
||||||
|
result.append(f)
|
||||||
|
return result
|
||||||
|
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
# _names_to_funcs()
|
||||||
|
#
|
||||||
|
# Given a list of regular expression function names, this converts it back to
|
||||||
|
# functions.
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
|
||||||
|
def _names_to_funcs(namelist,fdict):
|
||||||
|
result = []
|
||||||
|
for n in namelist:
|
||||||
|
if n and n[0]:
|
||||||
|
result.append((fdict[n[0]],n[1]))
|
||||||
|
else:
|
||||||
|
result.append(n)
|
||||||
|
return result
|
||||||
|
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
# _form_master_re()
|
||||||
|
#
|
||||||
|
# This function takes a list of all of the regex components and attempts to
|
||||||
|
# form the master regular expression. Given limitations in the Python re
|
||||||
|
# module, it may be necessary to break the master regex into separate expressions.
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
|
||||||
|
def _form_master_re(relist,reflags,ldict,toknames):
|
||||||
|
if not relist: return []
|
||||||
|
regex = "|".join(relist)
|
||||||
|
try:
|
||||||
|
lexre = re.compile(regex,re.VERBOSE | reflags)
|
||||||
|
|
||||||
|
# Build the index to function map for the matching engine
|
||||||
|
lexindexfunc = [ None ] * (max(lexre.groupindex.values())+1)
|
||||||
|
for f,i in lexre.groupindex.items():
|
||||||
|
handle = ldict.get(f,None)
|
||||||
|
if type(handle) in (types.FunctionType, types.MethodType):
|
||||||
|
lexindexfunc[i] = (handle,toknames[handle.__name__])
|
||||||
|
elif handle is not None:
|
||||||
|
# If rule was specified as a string, we build an anonymous
|
||||||
|
# callback function to carry out the action
|
||||||
|
if f.find("ignore_") > 0:
|
||||||
|
lexindexfunc[i] = (None,None)
|
||||||
|
else:
|
||||||
|
lexindexfunc[i] = (None, toknames[f])
|
||||||
|
|
||||||
|
return [(lexre,lexindexfunc)],[regex]
|
||||||
|
except Exception,e:
|
||||||
|
m = int(len(relist)/2)
|
||||||
|
if m == 0: m = 1
|
||||||
|
llist, lre = _form_master_re(relist[:m],reflags,ldict,toknames)
|
||||||
|
rlist, rre = _form_master_re(relist[m:],reflags,ldict,toknames)
|
||||||
|
return llist+rlist, lre+rre
|
||||||
|
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
# def _statetoken(s,names)
|
||||||
|
#
|
||||||
|
# Given a declaration name s of the form "t_" and a dictionary whose keys are
|
||||||
|
# state names, this function returns a tuple (states,tokenname) where states
|
||||||
|
# is a tuple of state names and tokenname is the name of the token. For example,
|
||||||
|
# calling this with s = "t_foo_bar_SPAM" might return (('foo','bar'),'SPAM')
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
|
||||||
|
def _statetoken(s,names):
|
||||||
|
nonstate = 1
|
||||||
|
parts = s.split("_")
|
||||||
|
for i in range(1,len(parts)):
|
||||||
|
if not names.has_key(parts[i]) and parts[i] != 'ANY': break
|
||||||
|
if i > 1:
|
||||||
|
states = tuple(parts[1:i])
|
||||||
|
else:
|
||||||
|
states = ('INITIAL',)
|
||||||
|
|
||||||
|
if 'ANY' in states:
|
||||||
|
states = tuple(names.keys())
|
||||||
|
|
||||||
|
tokenname = "_".join(parts[i:])
|
||||||
|
return (states,tokenname)
|
||||||
|
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
# lex(module)
|
||||||
|
#
|
||||||
|
# Build all of the regular expression rules from definitions in the supplied module
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
def lex(module=None,object=None,debug=0,optimize=0,lextab="lextab",reflags=0,nowarn=0):
|
||||||
|
global lexer
|
||||||
|
ldict = None
|
||||||
|
stateinfo = { 'INITIAL' : 'inclusive'}
|
||||||
|
error = 0
|
||||||
|
files = { }
|
||||||
|
lexobj = Lexer()
|
||||||
|
lexobj.lexdebug = debug
|
||||||
|
lexobj.lexoptimize = optimize
|
||||||
|
global token,input
|
||||||
|
|
||||||
|
if nowarn: warn = 0
|
||||||
|
else: warn = 1
|
||||||
|
|
||||||
|
if object: module = object
|
||||||
|
|
||||||
|
if module:
|
||||||
|
# User supplied a module object.
|
||||||
|
if isinstance(module, types.ModuleType):
|
||||||
|
ldict = module.__dict__
|
||||||
|
elif isinstance(module, _INSTANCETYPE):
|
||||||
|
_items = [(k,getattr(module,k)) for k in dir(module)]
|
||||||
|
ldict = { }
|
||||||
|
for (i,v) in _items:
|
||||||
|
ldict[i] = v
|
||||||
|
else:
|
||||||
|
raise ValueError,"Expected a module or instance"
|
||||||
|
lexobj.lexmodule = module
|
||||||
|
|
||||||
|
else:
|
||||||
|
# No module given. We might be able to get information from the caller.
|
||||||
|
try:
|
||||||
|
raise RuntimeError
|
||||||
|
except RuntimeError:
|
||||||
|
e,b,t = sys.exc_info()
|
||||||
|
f = t.tb_frame
|
||||||
|
f = f.f_back # Walk out to our calling function
|
||||||
|
ldict = f.f_globals # Grab its globals dictionary
|
||||||
|
|
||||||
|
if optimize and lextab:
|
||||||
|
try:
|
||||||
|
lexobj.readtab(lextab,ldict)
|
||||||
|
token = lexobj.token
|
||||||
|
input = lexobj.input
|
||||||
|
lexer = lexobj
|
||||||
|
return lexobj
|
||||||
|
|
||||||
|
except ImportError:
|
||||||
|
pass
|
||||||
|
|
||||||
|
# Get the tokens, states, and literals variables (if any)
|
||||||
|
if (module and isinstance(module,_INSTANCETYPE)):
|
||||||
|
tokens = getattr(module,"tokens",None)
|
||||||
|
states = getattr(module,"states",None)
|
||||||
|
literals = getattr(module,"literals","")
|
||||||
|
else:
|
||||||
|
tokens = ldict.get("tokens",None)
|
||||||
|
states = ldict.get("states",None)
|
||||||
|
literals = ldict.get("literals","")
|
||||||
|
|
||||||
|
if not tokens:
|
||||||
|
raise SyntaxError,"lex: module does not define 'tokens'"
|
||||||
|
if not (isinstance(tokens,types.ListType) or isinstance(tokens,types.TupleType)):
|
||||||
|
raise SyntaxError,"lex: tokens must be a list or tuple."
|
||||||
|
|
||||||
|
# Build a dictionary of valid token names
|
||||||
|
lexobj.lextokens = { }
|
||||||
|
if not optimize:
|
||||||
|
for n in tokens:
|
||||||
|
if not _is_identifier.match(n):
|
||||||
|
print >>sys.stderr, "lex: Bad token name '%s'" % n
|
||||||
|
error = 1
|
||||||
|
if warn and lexobj.lextokens.has_key(n):
|
||||||
|
print >>sys.stderr, "lex: Warning. Token '%s' multiply defined." % n
|
||||||
|
lexobj.lextokens[n] = None
|
||||||
|
else:
|
||||||
|
for n in tokens: lexobj.lextokens[n] = None
|
||||||
|
|
||||||
|
if debug:
|
||||||
|
print "lex: tokens = '%s'" % lexobj.lextokens.keys()
|
||||||
|
|
||||||
|
try:
|
||||||
|
for c in literals:
|
||||||
|
if not (isinstance(c,types.StringType) or isinstance(c,types.UnicodeType)) or len(c) > 1:
|
||||||
|
print >>sys.stderr, "lex: Invalid literal %s. Must be a single character" % repr(c)
|
||||||
|
error = 1
|
||||||
|
continue
|
||||||
|
|
||||||
|
except TypeError:
|
||||||
|
print >>sys.stderr, "lex: Invalid literals specification. literals must be a sequence of characters."
|
||||||
|
error = 1
|
||||||
|
|
||||||
|
lexobj.lexliterals = literals
|
||||||
|
|
||||||
|
# Build statemap
|
||||||
|
if states:
|
||||||
|
if not (isinstance(states,types.TupleType) or isinstance(states,types.ListType)):
|
||||||
|
print >>sys.stderr, "lex: states must be defined as a tuple or list."
|
||||||
|
error = 1
|
||||||
|
else:
|
||||||
|
for s in states:
|
||||||
|
if not isinstance(s,types.TupleType) or len(s) != 2:
|
||||||
|
print >>sys.stderr, "lex: invalid state specifier %s. Must be a tuple (statename,'exclusive|inclusive')" % repr(s)
|
||||||
|
error = 1
|
||||||
|
continue
|
||||||
|
name, statetype = s
|
||||||
|
if not isinstance(name,types.StringType):
|
||||||
|
print >>sys.stderr, "lex: state name %s must be a string" % repr(name)
|
||||||
|
error = 1
|
||||||
|
continue
|
||||||
|
if not (statetype == 'inclusive' or statetype == 'exclusive'):
|
||||||
|
print >>sys.stderr, "lex: state type for state %s must be 'inclusive' or 'exclusive'" % name
|
||||||
|
error = 1
|
||||||
|
continue
|
||||||
|
if stateinfo.has_key(name):
|
||||||
|
print >>sys.stderr, "lex: state '%s' already defined." % name
|
||||||
|
error = 1
|
||||||
|
continue
|
||||||
|
stateinfo[name] = statetype
|
||||||
|
|
||||||
|
# Get a list of symbols with the t_ or s_ prefix
|
||||||
|
tsymbols = [f for f in ldict.keys() if f[:2] == 't_' ]
|
||||||
|
|
||||||
|
# Now build up a list of functions and a list of strings
|
||||||
|
|
||||||
|
funcsym = { } # Symbols defined as functions
|
||||||
|
strsym = { } # Symbols defined as strings
|
||||||
|
toknames = { } # Mapping of symbols to token names
|
||||||
|
|
||||||
|
for s in stateinfo.keys():
|
||||||
|
funcsym[s] = []
|
||||||
|
strsym[s] = []
|
||||||
|
|
||||||
|
ignore = { } # Ignore strings by state
|
||||||
|
errorf = { } # Error functions by state
|
||||||
|
|
||||||
|
if len(tsymbols) == 0:
|
||||||
|
raise SyntaxError,"lex: no rules of the form t_rulename are defined."
|
||||||
|
|
||||||
|
for f in tsymbols:
|
||||||
|
t = ldict[f]
|
||||||
|
states, tokname = _statetoken(f,stateinfo)
|
||||||
|
toknames[f] = tokname
|
||||||
|
|
||||||
|
if callable(t):
|
||||||
|
for s in states: funcsym[s].append((f,t))
|
||||||
|
elif (isinstance(t, types.StringType) or isinstance(t,types.UnicodeType)):
|
||||||
|
for s in states: strsym[s].append((f,t))
|
||||||
|
else:
|
||||||
|
print >>sys.stderr, "lex: %s not defined as a function or string" % f
|
||||||
|
error = 1
|
||||||
|
|
||||||
|
# Sort the functions by line number
|
||||||
|
for f in funcsym.values():
|
||||||
|
f.sort(lambda x,y: cmp(x[1].func_code.co_firstlineno,y[1].func_code.co_firstlineno))
|
||||||
|
|
||||||
|
# Sort the strings by regular expression length
|
||||||
|
for s in strsym.values():
|
||||||
|
s.sort(lambda x,y: (len(x[1]) < len(y[1])) - (len(x[1]) > len(y[1])))
|
||||||
|
|
||||||
|
regexs = { }
|
||||||
|
|
||||||
|
# Build the master regular expressions
|
||||||
|
for state in stateinfo.keys():
|
||||||
|
regex_list = []
|
||||||
|
|
||||||
|
# Add rules defined by functions first
|
||||||
|
for fname, f in funcsym[state]:
|
||||||
|
line = f.func_code.co_firstlineno
|
||||||
|
file = f.func_code.co_filename
|
||||||
|
files[file] = None
|
||||||
|
tokname = toknames[fname]
|
||||||
|
|
||||||
|
ismethod = isinstance(f, types.MethodType)
|
||||||
|
|
||||||
|
if not optimize:
|
||||||
|
nargs = f.func_code.co_argcount
|
||||||
|
if ismethod:
|
||||||
|
reqargs = 2
|
||||||
|
else:
|
||||||
|
reqargs = 1
|
||||||
|
if nargs > reqargs:
|
||||||
|
print >>sys.stderr, "%s:%d: Rule '%s' has too many arguments." % (file,line,f.__name__)
|
||||||
|
error = 1
|
||||||
|
continue
|
||||||
|
|
||||||
|
if nargs < reqargs:
|
||||||
|
print >>sys.stderr, "%s:%d: Rule '%s' requires an argument." % (file,line,f.__name__)
|
||||||
|
error = 1
|
||||||
|
continue
|
||||||
|
|
||||||
|
if tokname == 'ignore':
|
||||||
|
print >>sys.stderr, "%s:%d: Rule '%s' must be defined as a string." % (file,line,f.__name__)
|
||||||
|
error = 1
|
||||||
|
continue
|
||||||
|
|
||||||
|
if tokname == 'error':
|
||||||
|
errorf[state] = f
|
||||||
|
continue
|
||||||
|
|
||||||
|
if f.__doc__:
|
||||||
|
if not optimize:
|
||||||
|
try:
|
||||||
|
c = re.compile("(?P<%s>%s)" % (f.__name__,f.__doc__), re.VERBOSE | reflags)
|
||||||
|
if c.match(""):
|
||||||
|
print >>sys.stderr, "%s:%d: Regular expression for rule '%s' matches empty string." % (file,line,f.__name__)
|
||||||
|
error = 1
|
||||||
|
continue
|
||||||
|
except re.error,e:
|
||||||
|
print >>sys.stderr, "%s:%d: Invalid regular expression for rule '%s'. %s" % (file,line,f.__name__,e)
|
||||||
|
if '#' in f.__doc__:
|
||||||
|
print >>sys.stderr, "%s:%d. Make sure '#' in rule '%s' is escaped with '\\#'." % (file,line, f.__name__)
|
||||||
|
error = 1
|
||||||
|
continue
|
||||||
|
|
||||||
|
if debug:
|
||||||
|
print "lex: Adding rule %s -> '%s' (state '%s')" % (f.__name__,f.__doc__, state)
|
||||||
|
|
||||||
|
# Okay. The regular expression seemed okay. Let's append it to the master regular
|
||||||
|
# expression we're building
|
||||||
|
|
||||||
|
regex_list.append("(?P<%s>%s)" % (f.__name__,f.__doc__))
|
||||||
|
else:
|
||||||
|
print >>sys.stderr, "%s:%d: No regular expression defined for rule '%s'" % (file,line,f.__name__)
|
||||||
|
|
||||||
|
# Now add all of the simple rules
|
||||||
|
for name,r in strsym[state]:
|
||||||
|
tokname = toknames[name]
|
||||||
|
|
||||||
|
if tokname == 'ignore':
|
||||||
|
if "\\" in r:
|
||||||
|
print >>sys.stderr, "lex: Warning. %s contains a literal backslash '\\'" % name
|
||||||
|
ignore[state] = r
|
||||||
|
continue
|
||||||
|
|
||||||
|
if not optimize:
|
||||||
|
if tokname == 'error':
|
||||||
|
raise SyntaxError,"lex: Rule '%s' must be defined as a function" % name
|
||||||
|
error = 1
|
||||||
|
continue
|
||||||
|
|
||||||
|
if not lexobj.lextokens.has_key(tokname) and tokname.find("ignore_") < 0:
|
||||||
|
print >>sys.stderr, "lex: Rule '%s' defined for an unspecified token %s." % (name,tokname)
|
||||||
|
error = 1
|
||||||
|
continue
|
||||||
|
try:
|
||||||
|
c = re.compile("(?P<%s>%s)" % (name,r),re.VERBOSE | reflags)
|
||||||
|
if (c.match("")):
|
||||||
|
print >>sys.stderr, "lex: Regular expression for rule '%s' matches empty string." % name
|
||||||
|
error = 1
|
||||||
|
continue
|
||||||
|
except re.error,e:
|
||||||
|
print >>sys.stderr, "lex: Invalid regular expression for rule '%s'. %s" % (name,e)
|
||||||
|
if '#' in r:
|
||||||
|
print >>sys.stderr, "lex: Make sure '#' in rule '%s' is escaped with '\\#'." % name
|
||||||
|
|
||||||
|
error = 1
|
||||||
|
continue
|
||||||
|
if debug:
|
||||||
|
print "lex: Adding rule %s -> '%s' (state '%s')" % (name,r,state)
|
||||||
|
|
||||||
|
regex_list.append("(?P<%s>%s)" % (name,r))
|
||||||
|
|
||||||
|
if not regex_list:
|
||||||
|
print >>sys.stderr, "lex: No rules defined for state '%s'" % state
|
||||||
|
error = 1
|
||||||
|
|
||||||
|
regexs[state] = regex_list
|
||||||
|
|
||||||
|
|
||||||
|
if not optimize:
|
||||||
|
for f in files.keys():
|
||||||
|
if not _validate_file(f):
|
||||||
|
error = 1
|
||||||
|
|
||||||
|
if error:
|
||||||
|
raise SyntaxError,"lex: Unable to build lexer."
|
||||||
|
|
||||||
|
# From this point forward, we're reasonably confident that we can build the lexer.
|
||||||
|
# No more errors will be generated, but there might be some warning messages.
|
||||||
|
|
||||||
|
# Build the master regular expressions
|
||||||
|
|
||||||
|
for state in regexs.keys():
|
||||||
|
lexre, re_text = _form_master_re(regexs[state],reflags,ldict,toknames)
|
||||||
|
lexobj.lexstatere[state] = lexre
|
||||||
|
lexobj.lexstateretext[state] = re_text
|
||||||
|
if debug:
|
||||||
|
for i in range(len(re_text)):
|
||||||
|
print "lex: state '%s'. regex[%d] = '%s'" % (state, i, re_text[i])
|
||||||
|
|
||||||
|
# For inclusive states, we need to add the INITIAL state
|
||||||
|
for state,type in stateinfo.items():
|
||||||
|
if state != "INITIAL" and type == 'inclusive':
|
||||||
|
lexobj.lexstatere[state].extend(lexobj.lexstatere['INITIAL'])
|
||||||
|
lexobj.lexstateretext[state].extend(lexobj.lexstateretext['INITIAL'])
|
||||||
|
|
||||||
|
lexobj.lexstateinfo = stateinfo
|
||||||
|
lexobj.lexre = lexobj.lexstatere["INITIAL"]
|
||||||
|
lexobj.lexretext = lexobj.lexstateretext["INITIAL"]
|
||||||
|
|
||||||
|
# Set up ignore variables
|
||||||
|
lexobj.lexstateignore = ignore
|
||||||
|
lexobj.lexignore = lexobj.lexstateignore.get("INITIAL","")
|
||||||
|
|
||||||
|
# Set up error functions
|
||||||
|
lexobj.lexstateerrorf = errorf
|
||||||
|
lexobj.lexerrorf = errorf.get("INITIAL",None)
|
||||||
|
if warn and not lexobj.lexerrorf:
|
||||||
|
print >>sys.stderr, "lex: Warning. no t_error rule is defined."
|
||||||
|
|
||||||
|
# Check state information for ignore and error rules
|
||||||
|
for s,stype in stateinfo.items():
|
||||||
|
if stype == 'exclusive':
|
||||||
|
if warn and not errorf.has_key(s):
|
||||||
|
print >>sys.stderr, "lex: Warning. no error rule is defined for exclusive state '%s'" % s
|
||||||
|
if warn and not ignore.has_key(s) and lexobj.lexignore:
|
||||||
|
print >>sys.stderr, "lex: Warning. no ignore rule is defined for exclusive state '%s'" % s
|
||||||
|
elif stype == 'inclusive':
|
||||||
|
if not errorf.has_key(s):
|
||||||
|
errorf[s] = errorf.get("INITIAL",None)
|
||||||
|
if not ignore.has_key(s):
|
||||||
|
ignore[s] = ignore.get("INITIAL","")
|
||||||
|
|
||||||
|
|
||||||
|
# Create global versions of the token() and input() functions
|
||||||
|
token = lexobj.token
|
||||||
|
input = lexobj.input
|
||||||
|
lexer = lexobj
|
||||||
|
|
||||||
|
# If in optimize mode, we write the lextab
|
||||||
|
if lextab and optimize:
|
||||||
|
lexobj.writetab(lextab)
|
||||||
|
|
||||||
|
return lexobj
|
||||||
|
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
# runmain()
|
||||||
|
#
|
||||||
|
# This runs the lexer as a main program
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
|
||||||
|
def runmain(lexer=None,data=None):
|
||||||
|
if not data:
|
||||||
|
try:
|
||||||
|
filename = sys.argv[1]
|
||||||
|
f = open(filename)
|
||||||
|
data = f.read()
|
||||||
|
f.close()
|
||||||
|
except IndexError:
|
||||||
|
print "Reading from standard input (type EOF to end):"
|
||||||
|
data = sys.stdin.read()
|
||||||
|
|
||||||
|
if lexer:
|
||||||
|
_input = lexer.input
|
||||||
|
else:
|
||||||
|
_input = input
|
||||||
|
_input(data)
|
||||||
|
if lexer:
|
||||||
|
_token = lexer.token
|
||||||
|
else:
|
||||||
|
_token = token
|
||||||
|
|
||||||
|
while 1:
|
||||||
|
tok = _token()
|
||||||
|
if not tok: break
|
||||||
|
print "(%s,%r,%d,%d)" % (tok.type, tok.value, tok.lineno,tok.lexpos)
|
||||||
|
|
||||||
|
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
# @TOKEN(regex)
|
||||||
|
#
|
||||||
|
# This decorator function can be used to set the regex expression on a function
|
||||||
|
# when its docstring might need to be set in an alternative way
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
|
||||||
|
def TOKEN(r):
|
||||||
|
def set_doc(f):
|
||||||
|
f.__doc__ = r
|
||||||
|
return f
|
||||||
|
return set_doc
|
||||||
|
|
||||||
|
# Alternative spelling of the TOKEN decorator
|
||||||
|
Token = TOKEN
|
||||||
|
|
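The `TOKEN` decorator added at the end of lex.py stores a regular expression as a rule function's docstring, which helps when the pattern is assembled from smaller pieces and cannot be written as a string literal. The following is a minimal sketch of that idea, not code from this commit: `TOKEN` is re-declared locally so the snippet runs without PLY installed, and the names `digit`, `identifier`, and `t_ID` are hypothetical.

```python
import re

def TOKEN(r):
    # Same shape as the decorator in the diff: attach the regex as the docstring.
    def set_doc(f):
        f.__doc__ = r
        return f
    return set_doc

# Hypothetical pattern fragments, combined at import time.
digit = r'[0-9]'
identifier = r'[A-Za-z_](?:' + digit + r'|[A-Za-z_])*'

@TOKEN(identifier)
def t_ID(t):
    # A rule function; lex() would read its pattern from t_ID.__doc__.
    return t

# The assembled pattern is now available where lex() looks for it.
match = re.match(t_ID.__doc__, "x1 + 2")
print(match.group())
```

A plain docstring has to be a literal, so a computed pattern like `identifier` above could not be used without a decorator of this kind (or an explicit assignment to `__doc__`).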
File diff suppressed because it is too large
27
ext/ply/setup.py
Normal file
|
@ -0,0 +1,27 @@
|
||||||
|
from distutils.core import setup
|
||||||
|
|
||||||
|
setup(name = "ply",
|
||||||
|
description="Python Lex & Yacc",
|
||||||
|
long_description = """
|
||||||
|
PLY is yet another implementation of lex and yacc for Python. Although several other
|
||||||
|
parsing tools are available for Python, there are several reasons why you might
|
||||||
|
want to take a look at PLY:
|
||||||
|
|
||||||
|
It's implemented entirely in Python.
|
||||||
|
|
||||||
|
It uses LR-parsing which is reasonably efficient and well suited for larger grammars.
|
||||||
|
|
||||||
|
PLY provides most of the standard lex/yacc features including support for empty
|
||||||
|
productions, precedence rules, error recovery, and support for ambiguous grammars.
|
||||||
|
|
||||||
|
PLY is extremely easy to use and provides very extensive error checking.
|
||||||
|
""",
|
||||||
|
license="""Lesser GPL (LGPL)""",
|
||||||
|
version = "2.3",
|
||||||
|
author = "David Beazley",
|
||||||
|
author_email = "dave@dabeaz.com",
|
||||||
|
maintainer = "David Beazley",
|
||||||
|
maintainer_email = "dave@dabeaz.com",
|
||||||
|
url = "http://www.dabeaz.com/ply/",
|
||||||
|
packages = ['ply'],
|
||||||
|
)
|
|
@ -4,6 +4,8 @@ conditions. To run:
|
||||||
$ python testlex.py .
|
$ python testlex.py .
|
||||||
$ python testyacc.py .
|
$ python testyacc.py .
|
||||||
|
|
||||||
(make sure lex.py and yacc.py exist in this directory before
|
The tests can also be run using the Python unittest module.
|
||||||
running the tests).
|
|
||||||
|
|
||||||
|
$ python rununit.py
|
||||||
|
|
||||||
|
The script 'cleanup.sh' cleans up this directory to its original state.
|
||||||
|
|
|
@ -1,6 +1,10 @@
|
||||||
# -----------------------------------------------------------------------------
|
# -----------------------------------------------------------------------------
|
||||||
# calclex.py
|
# calclex.py
|
||||||
# -----------------------------------------------------------------------------
|
# -----------------------------------------------------------------------------
|
||||||
|
import sys
|
||||||
|
|
||||||
|
sys.path.append("..")
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
tokens = (
|
tokens = (
|
||||||
'NAME','NUMBER',
|
'NAME','NUMBER',
|
||||||
|
@ -36,10 +40,9 @@ def t_newline(t):
|
||||||
|
|
||||||
def t_error(t):
|
def t_error(t):
|
||||||
print "Illegal character '%s'" % t.value[0]
|
print "Illegal character '%s'" % t.value[0]
|
||||||
t.skip(1)
|
t.lexer.skip(1)
|
||||||
|
|
||||||
# Build the lexer
|
# Build the lexer
|
||||||
import lex
|
|
||||||
lex.lex()
|
lex.lex()
|
||||||
|
|
||||||
|
|
||||||
|
|
4
ext/ply/test/cleanup.sh
Normal file
|
@ -0,0 +1,4 @@
|
||||||
|
#!/bin/sh
|
||||||
|
|
||||||
|
rm -f *~ *.pyc *.dif *.out
|
||||||
|
|
|
@ -1 +1 @@
|
||||||
./lex_doc1.py:15: No regular expression defined for rule 't_NUMBER'
|
./lex_doc1.py:18: No regular expression defined for rule 't_NUMBER'
|
||||||
|
|
|
@ -2,7 +2,10 @@
|
||||||
#
|
#
|
||||||
# Missing documentation string
|
# Missing documentation string
|
||||||
|
|
||||||
import lex
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
tokens = [
|
tokens = [
|
||||||
"PLUS",
|
"PLUS",
|
||||||
|
|
|
@ -1,2 +1,2 @@
|
||||||
./lex_dup1.py:17: Rule t_NUMBER redefined. Previously defined on line 15
|
./lex_dup1.py:20: Rule t_NUMBER redefined. Previously defined on line 18
|
||||||
SyntaxError: lex: Unable to build lexer.
|
SyntaxError: lex: Unable to build lexer.
|
||||||
|
|
|
@ -2,7 +2,10 @@
|
||||||
#
|
#
|
||||||
# Duplicated rule specifiers
|
# Duplicated rule specifiers
|
||||||
|
|
||||||
import lex
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
tokens = [
|
tokens = [
|
||||||
"PLUS",
|
"PLUS",
|
||||||
|
@ -19,7 +22,6 @@ t_NUMBER = r'\d+'
|
||||||
def t_error(t):
|
def t_error(t):
|
||||||
pass
|
pass
|
||||||
|
|
||||||
import sys
|
|
||||||
sys.tracebacklimit = 0
|
sys.tracebacklimit = 0
|
||||||
|
|
||||||
lex.lex()
|
lex.lex()
|
||||||
|
|
|
@ -1,2 +1,2 @@
|
||||||
./lex_dup2.py:19: Rule t_NUMBER redefined. Previously defined on line 15
|
./lex_dup2.py:22: Rule t_NUMBER redefined. Previously defined on line 18
|
||||||
SyntaxError: lex: Unable to build lexer.
|
SyntaxError: lex: Unable to build lexer.
|
||||||
|
|
|
@ -2,7 +2,10 @@
|
||||||
#
|
#
|
||||||
# Duplicated rule specifiers
|
# Duplicated rule specifiers
|
||||||
|
|
||||||
import lex
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
tokens = [
|
tokens = [
|
||||||
"PLUS",
|
"PLUS",
|
||||||
|
@ -23,7 +26,6 @@ def t_NUMBER(t):
|
||||||
def t_error(t):
|
def t_error(t):
|
||||||
pass
|
pass
|
||||||
|
|
||||||
import sys
|
|
||||||
sys.tracebacklimit = 0
|
sys.tracebacklimit = 0
|
||||||
|
|
||||||
lex.lex()
|
lex.lex()
|
||||||
|
|
|
@ -1,2 +1,2 @@
|
||||||
./lex_dup3.py:17: Rule t_NUMBER redefined. Previously defined on line 15
|
./lex_dup3.py:20: Rule t_NUMBER redefined. Previously defined on line 18
|
||||||
SyntaxError: lex: Unable to build lexer.
|
SyntaxError: lex: Unable to build lexer.
|
||||||
|
|
|
@ -2,7 +2,10 @@
|
||||||
#
|
#
|
||||||
# Duplicated rule specifiers
|
# Duplicated rule specifiers
|
||||||
|
|
||||||
import lex
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
tokens = [
|
tokens = [
|
||||||
"PLUS",
|
"PLUS",
|
||||||
|
@ -21,7 +24,6 @@ def t_NUMBER(t):
|
||||||
def t_error(t):
|
def t_error(t):
|
||||||
pass
|
pass
|
||||||
|
|
||||||
import sys
|
|
||||||
sys.tracebacklimit = 0
|
sys.tracebacklimit = 0
|
||||||
|
|
||||||
lex.lex()
|
lex.lex()
|
||||||
|
|
|
@ -2,7 +2,10 @@
|
||||||
#
|
#
|
||||||
# No rules defined
|
# No rules defined
|
||||||
|
|
||||||
import lex
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
tokens = [
|
tokens = [
|
||||||
"PLUS",
|
"PLUS",
|
||||||
|
@ -10,7 +13,6 @@ tokens = [
|
||||||
"NUMBER",
|
"NUMBER",
|
||||||
]
|
]
|
||||||
|
|
||||||
import sys
|
|
||||||
sys.tracebacklimit = 0
|
sys.tracebacklimit = 0
|
||||||
|
|
||||||
lex.lex()
|
lex.lex()
|
||||||
|
|
|
@ -2,7 +2,10 @@
|
||||||
#
|
#
|
||||||
# Missing t_error() rule
|
# Missing t_error() rule
|
||||||
|
|
||||||
import lex
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
tokens = [
|
tokens = [
|
||||||
"PLUS",
|
"PLUS",
|
||||||
|
@ -14,7 +17,6 @@ t_PLUS = r'\+'
|
||||||
t_MINUS = r'-'
|
t_MINUS = r'-'
|
||||||
t_NUMBER = r'\d+'
|
t_NUMBER = r'\d+'
|
||||||
|
|
||||||
import sys
|
|
||||||
sys.tracebacklimit = 0
|
sys.tracebacklimit = 0
|
||||||
|
|
||||||
lex.lex()
|
lex.lex()
|
||||||
|
|
|
@ -2,7 +2,10 @@
|
||||||
#
|
#
|
||||||
# t_error defined, but not function
|
# t_error defined, but not function
|
||||||
|
|
||||||
import lex
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
tokens = [
|
tokens = [
|
||||||
"PLUS",
|
"PLUS",
|
||||||
|
@ -16,7 +19,6 @@ t_NUMBER = r'\d+'
|
||||||
|
|
||||||
t_error = "foo"
|
t_error = "foo"
|
||||||
|
|
||||||
import sys
|
|
||||||
sys.tracebacklimit = 0
|
sys.tracebacklimit = 0
|
||||||
|
|
||||||
lex.lex()
|
lex.lex()
|
||||||
|
|
|
@ -1,2 +1,2 @@
|
||||||
./lex_error3.py:17: Rule 't_error' requires an argument.
|
./lex_error3.py:20: Rule 't_error' requires an argument.
|
||||||
SyntaxError: lex: Unable to build lexer.
|
SyntaxError: lex: Unable to build lexer.
|
||||||
|
|
|
@ -2,7 +2,10 @@
|
||||||
#
|
#
|
||||||
# t_error defined as function, but with wrong # args
|
# t_error defined as function, but with wrong # args
|
||||||
|
|
||||||
import lex
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
tokens = [
|
tokens = [
|
||||||
"PLUS",
|
"PLUS",
|
||||||
|
@ -17,7 +20,6 @@ t_NUMBER = r'\d+'
|
||||||
def t_error():
|
def t_error():
|
||||||
pass
|
pass
|
||||||
|
|
||||||
import sys
|
|
||||||
sys.tracebacklimit = 0
|
sys.tracebacklimit = 0
|
||||||
|
|
||||||
lex.lex()
|
lex.lex()
|
||||||
|
|
|
@ -1,2 +1,2 @@
|
||||||
./lex_error4.py:17: Rule 't_error' has too many arguments.
|
./lex_error4.py:20: Rule 't_error' has too many arguments.
|
||||||
SyntaxError: lex: Unable to build lexer.
|
SyntaxError: lex: Unable to build lexer.
|
||||||
|
|
|
@ -2,7 +2,10 @@
|
||||||
#
|
#
|
||||||
# t_error defined as function, but too many args
|
# t_error defined as function, but too many args
|
||||||
|
|
||||||
import lex
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
tokens = [
|
tokens = [
|
||||||
"PLUS",
|
"PLUS",
|
||||||
|
@ -17,7 +20,6 @@ t_NUMBER = r'\d+'
|
||||||
def t_error(t,s):
|
def t_error(t,s):
|
||||||
pass
|
pass
|
||||||
|
|
||||||
import sys
|
|
||||||
sys.tracebacklimit = 0
|
sys.tracebacklimit = 0
|
||||||
|
|
||||||
lex.lex()
|
lex.lex()
|
||||||
|
|
|
@ -1,3 +1,3 @@
|
||||||
(H_EDIT_DESCRIPTOR,'abc',1)
|
(H_EDIT_DESCRIPTOR,'abc',1,0)
|
||||||
(H_EDIT_DESCRIPTOR,'abcdefghij',1)
|
(H_EDIT_DESCRIPTOR,'abcdefghij',1,6)
|
||||||
(H_EDIT_DESCRIPTOR,'xy',1)
|
(H_EDIT_DESCRIPTOR,'xy',1,20)
|
||||||
|
|
|
@ -13,6 +13,10 @@
|
||||||
# This example shows how to modify the state of the lexer to parse
|
# This example shows how to modify the state of the lexer to parse
|
||||||
# such tokens
|
# such tokens
|
||||||
# -----------------------------------------------------------------------------
|
# -----------------------------------------------------------------------------
|
||||||
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
tokens = (
|
tokens = (
|
||||||
'H_EDIT_DESCRIPTOR',
|
'H_EDIT_DESCRIPTOR',
|
||||||
|
@ -33,10 +37,9 @@ def t_H_EDIT_DESCRIPTOR(t):
|
||||||
|
|
||||||
def t_error(t):
|
def t_error(t):
|
||||||
print "Illegal character '%s'" % t.value[0]
|
print "Illegal character '%s'" % t.value[0]
|
||||||
t.skip(1)
|
t.lexer.skip(1)
|
||||||
|
|
||||||
# Build the lexer
|
# Build the lexer
|
||||||
import lex
|
|
||||||
lex.lex()
|
lex.lex()
|
||||||
lex.runmain(data="3Habc 10Habcdefghij 2Hxy")
|
lex.runmain(data="3Habc 10Habcdefghij 2Hxy")
|
||||||
|
|
||||||
|
|
|
@ -1,2 +1,7 @@
|
||||||
./lex_ignore.py:17: Rule 't_ignore' must be defined as a string.
|
./lex_ignore.py:20: Rule 't_ignore' must be defined as a string.
|
||||||
|
Traceback (most recent call last):
|
||||||
|
File "./lex_ignore.py", line 29, in <module>
|
||||||
|
lex.lex()
|
||||||
|
File "../ply/lex.py", line 759, in lex
|
||||||
|
raise SyntaxError,"lex: Unable to build lexer."
|
||||||
SyntaxError: lex: Unable to build lexer.
|
SyntaxError: lex: Unable to build lexer.
|
||||||
|
|
|
@ -2,7 +2,10 @@
|
||||||
#
|
#
|
||||||
# Improperly specific ignore declaration
|
# Improperly specific ignore declaration
|
||||||
|
|
||||||
import lex
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
tokens = [
|
tokens = [
|
||||||
"PLUS",
|
"PLUS",
|
||||||
|
@ -22,7 +25,6 @@ def t_error(t):
|
||||||
pass
|
pass
|
||||||
|
|
||||||
import sys
|
import sys
|
||||||
sys.tracebacklimit = 0
|
|
||||||
|
|
||||||
lex.lex()
|
lex.lex()
|
||||||
|
|
||||||
|
|
1
ext/ply/test/lex_ignore2.exp
Normal file
|
@ -0,0 +1 @@
|
||||||
|
lex: Warning. t_ignore contains a literal backslash '\'
|
29
ext/ply/test/lex_ignore2.py
Normal file
|
@ -0,0 +1,29 @@
|
||||||
|
# lex_token.py
|
||||||
|
#
|
||||||
|
# ignore declaration as a raw string
|
||||||
|
|
||||||
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
|
tokens = [
|
||||||
|
"PLUS",
|
||||||
|
"MINUS",
|
||||||
|
"NUMBER",
|
||||||
|
]
|
||||||
|
|
||||||
|
t_PLUS = r'\+'
|
||||||
|
t_MINUS = r'-'
|
||||||
|
t_NUMBER = r'\d+'
|
||||||
|
|
||||||
|
t_ignore = r' \t'
|
||||||
|
|
||||||
|
def t_error(t):
|
||||||
|
pass
|
||||||
|
|
||||||
|
import sys
|
||||||
|
|
||||||
|
lex.lex()
|
||||||
|
|
||||||
|
|
30
ext/ply/test/lex_nowarn.py
Normal file
|
@ -0,0 +1,30 @@
|
||||||
|
# lex_token.py
|
||||||
|
#
|
||||||
|
# Missing t_error() rule
|
||||||
|
|
||||||
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
|
tokens = [
|
||||||
|
"PLUS",
|
||||||
|
"MINUS",
|
||||||
|
"NUMBER",
|
||||||
|
"NUMBER",
|
||||||
|
]
|
||||||
|
|
||||||
|
states = (('foo','exclusive'),)
|
||||||
|
|
||||||
|
t_ignore = ' \t'
|
||||||
|
t_PLUS = r'\+'
|
||||||
|
t_MINUS = r'-'
|
||||||
|
t_NUMBER = r'\d+'
|
||||||
|
|
||||||
|
t_foo_NUMBER = r'\d+'
|
||||||
|
|
||||||
|
sys.tracebacklimit = 0
|
||||||
|
|
||||||
|
lex.lex(nowarn=1)
|
||||||
|
|
||||||
|
|
|
@ -1,2 +1,7 @@
|
||||||
lex: Invalid regular expression for rule 't_NUMBER'. unbalanced parenthesis
|
lex: Invalid regular expression for rule 't_NUMBER'. unbalanced parenthesis
|
||||||
|
Traceback (most recent call last):
|
||||||
|
File "./lex_re1.py", line 25, in <module>
|
||||||
|
lex.lex()
|
||||||
|
File "../ply/lex.py", line 759, in lex
|
||||||
|
raise SyntaxError,"lex: Unable to build lexer."
|
||||||
SyntaxError: lex: Unable to build lexer.
|
SyntaxError: lex: Unable to build lexer.
|
||||||
|
|
|
@ -2,7 +2,10 @@
|
||||||
#
|
#
|
||||||
# Bad regular expression in a string
|
# Bad regular expression in a string
|
||||||
|
|
||||||
import lex
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
tokens = [
|
tokens = [
|
||||||
"PLUS",
|
"PLUS",
|
||||||
|
@ -18,7 +21,6 @@ def t_error(t):
|
||||||
pass
|
pass
|
||||||
|
|
||||||
import sys
|
import sys
|
||||||
sys.tracebacklimit = 0
|
|
||||||
|
|
||||||
lex.lex()
|
lex.lex()
|
||||||
|
|
||||||
|
|
7
ext/ply/test/lex_re2.exp
Normal file
|
@ -0,0 +1,7 @@
|
||||||
|
lex: Regular expression for rule 't_PLUS' matches empty string.
|
||||||
|
Traceback (most recent call last):
|
||||||
|
File "./lex_re2.py", line 25, in <module>
|
||||||
|
lex.lex()
|
||||||
|
File "../ply/lex.py", line 759, in lex
|
||||||
|
raise SyntaxError,"lex: Unable to build lexer."
|
||||||
|
SyntaxError: lex: Unable to build lexer.
|
27
ext/ply/test/lex_re2.py
Normal file
|
@ -0,0 +1,27 @@
|
||||||
|
# lex_token.py
|
||||||
|
#
|
||||||
|
# Regular expression rule matches empty string
|
||||||
|
|
||||||
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
|
tokens = [
|
||||||
|
"PLUS",
|
||||||
|
"MINUS",
|
||||||
|
"NUMBER",
|
||||||
|
]
|
||||||
|
|
||||||
|
t_PLUS = r'\+?'
|
||||||
|
t_MINUS = r'-'
|
||||||
|
t_NUMBER = r'(\d+)'
|
||||||
|
|
||||||
|
def t_error(t):
|
||||||
|
pass
|
||||||
|
|
||||||
|
import sys
|
||||||
|
|
||||||
|
lex.lex()
|
||||||
|
|
||||||
|
|
8
ext/ply/test/lex_re3.exp
Normal file
|
@ -0,0 +1,8 @@
|
||||||
|
lex: Invalid regular expression for rule 't_POUND'. unbalanced parenthesis
|
||||||
|
lex: Make sure '#' in rule 't_POUND' is escaped with '\#'.
|
||||||
|
Traceback (most recent call last):
|
||||||
|
File "./lex_re3.py", line 27, in <module>
|
||||||
|
lex.lex()
|
||||||
|
File "../ply/lex.py", line 759, in lex
|
||||||
|
raise SyntaxError,"lex: Unable to build lexer."
|
||||||
|
SyntaxError: lex: Unable to build lexer.
|
29
ext/ply/test/lex_re3.py
Normal file
|
@ -0,0 +1,29 @@
|
||||||
|
# lex_token.py
|
||||||
|
#
|
||||||
|
# Regular expression rule matches empty string
|
||||||
|
|
||||||
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
|
tokens = [
|
||||||
|
"PLUS",
|
||||||
|
"MINUS",
|
||||||
|
"NUMBER",
|
||||||
|
"POUND",
|
||||||
|
]
|
||||||
|
|
||||||
|
t_PLUS = r'\+'
|
||||||
|
t_MINUS = r'-'
|
||||||
|
t_NUMBER = r'(\d+)'
|
||||||
|
t_POUND = r'#'
|
||||||
|
|
||||||
|
def t_error(t):
|
||||||
|
pass
|
||||||
|
|
||||||
|
import sys
|
||||||
|
|
||||||
|
lex.lex()
|
||||||
|
|
||||||
|
|
|
@ -2,7 +2,10 @@
|
||||||
#
|
#
|
||||||
# Rule defined as some other type
|
# Rule defined as some other type
|
||||||
|
|
||||||
import lex
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
tokens = [
|
tokens = [
|
||||||
"PLUS",
|
"PLUS",
|
||||||
|
@ -17,7 +20,6 @@ t_NUMBER = 1
|
||||||
def t_error(t):
|
def t_error(t):
|
||||||
pass
|
pass
|
||||||
|
|
||||||
import sys
|
|
||||||
sys.tracebacklimit = 0
|
sys.tracebacklimit = 0
|
||||||
|
|
||||||
lex.lex()
|
lex.lex()
|
||||||
|
|
7
ext/ply/test/lex_state1.exp
Normal file
|
@ -0,0 +1,7 @@
|
||||||
|
lex: states must be defined as a tuple or list.
|
||||||
|
Traceback (most recent call last):
|
||||||
|
File "./lex_state1.py", line 38, in <module>
|
||||||
|
lex.lex()
|
||||||
|
File "../ply/lex.py", line 759, in lex
|
||||||
|
raise SyntaxError,"lex: Unable to build lexer."
|
||||||
|
SyntaxError: lex: Unable to build lexer.
|
40
ext/ply/test/lex_state1.py
Normal file
|
@ -0,0 +1,40 @@
|
||||||
|
# lex_state1.py
|
||||||
|
#
|
||||||
|
# Bad state declaration
|
||||||
|
|
||||||
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
|
tokens = [
|
||||||
|
"PLUS",
|
||||||
|
"MINUS",
|
||||||
|
"NUMBER",
|
||||||
|
]
|
||||||
|
|
||||||
|
states = 'comment'
|
||||||
|
|
||||||
|
t_PLUS = r'\+'
|
||||||
|
t_MINUS = r'-'
|
||||||
|
t_NUMBER = r'\d+'
|
||||||
|
|
||||||
|
# Comments
|
||||||
|
def t_comment(t):
|
||||||
|
r'/\*'
|
||||||
|
t.lexer.begin('comment')
|
||||||
|
print "Entering comment state"
|
||||||
|
|
||||||
|
def t_comment_body_part(t):
|
||||||
|
r'(.|\n)*\*/'
|
||||||
|
print "comment body", t
|
||||||
|
t.lexer.begin('INITIAL')
|
||||||
|
|
||||||
|
def t_error(t):
|
||||||
|
pass
|
||||||
|
|
||||||
|
import sys
|
||||||
|
|
||||||
|
lex.lex()
|
||||||
|
|
||||||
|
|
8
ext/ply/test/lex_state2.exp
Normal file
|
@ -0,0 +1,8 @@
|
||||||
|
lex: invalid state specifier 'comment'. Must be a tuple (statename,'exclusive|inclusive')
|
||||||
|
lex: invalid state specifier 'example'. Must be a tuple (statename,'exclusive|inclusive')
|
||||||
|
Traceback (most recent call last):
|
||||||
|
File "./lex_state2.py", line 38, in <module>
|
||||||
|
lex.lex()
|
||||||
|
File "../ply/lex.py", line 759, in lex
|
||||||
|
raise SyntaxError,"lex: Unable to build lexer."
|
||||||
|
SyntaxError: lex: Unable to build lexer.
|
40
ext/ply/test/lex_state2.py
Normal file
|
@ -0,0 +1,40 @@
|
||||||
|
# lex_state2.py
|
||||||
|
#
|
||||||
|
# Bad state declaration
|
||||||
|
|
||||||
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
|
tokens = [
|
||||||
|
"PLUS",
|
||||||
|
"MINUS",
|
||||||
|
"NUMBER",
|
||||||
|
]
|
||||||
|
|
||||||
|
states = ('comment','example')
|
||||||
|
|
||||||
|
t_PLUS = r'\+'
|
||||||
|
t_MINUS = r'-'
|
||||||
|
t_NUMBER = r'\d+'
|
||||||
|
|
||||||
|
# Comments
|
||||||
|
def t_comment(t):
|
||||||
|
r'/\*'
|
||||||
|
t.lexer.begin('comment')
|
||||||
|
print "Entering comment state"
|
||||||
|
|
||||||
|
def t_comment_body_part(t):
|
||||||
|
r'(.|\n)*\*/'
|
||||||
|
print "comment body", t
|
||||||
|
t.lexer.begin('INITIAL')
|
||||||
|
|
||||||
|
def t_error(t):
|
||||||
|
pass
|
||||||
|
|
||||||
|
import sys
|
||||||
|
|
||||||
|
lex.lex()
|
||||||
|
|
||||||
|
|
8
ext/ply/test/lex_state3.exp
Normal file
|
@ -0,0 +1,8 @@
|
||||||
|
lex: state name 1 must be a string
|
||||||
|
lex: No rules defined for state 'example'
|
||||||
|
Traceback (most recent call last):
|
||||||
|
File "./lex_state3.py", line 40, in <module>
|
||||||
|
lex.lex()
|
||||||
|
File "../ply/lex.py", line 759, in lex
|
||||||
|
raise SyntaxError,"lex: Unable to build lexer."
|
||||||
|
SyntaxError: lex: Unable to build lexer.
|
42
ext/ply/test/lex_state3.py
Normal file
|
@ -0,0 +1,42 @@
|
||||||
|
# lex_state2.py
|
||||||
|
#
|
||||||
|
# Bad state declaration
|
||||||
|
|
||||||
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
|
tokens = [
|
||||||
|
"PLUS",
|
||||||
|
"MINUS",
|
||||||
|
"NUMBER",
|
||||||
|
]
|
||||||
|
|
||||||
|
comment = 1
|
||||||
|
states = ((comment, 'inclusive'),
|
||||||
|
('example', 'exclusive'))
|
||||||
|
|
||||||
|
t_PLUS = r'\+'
|
||||||
|
t_MINUS = r'-'
|
||||||
|
t_NUMBER = r'\d+'
|
||||||
|
|
||||||
|
# Comments
|
||||||
|
def t_comment(t):
|
||||||
|
r'/\*'
|
||||||
|
t.lexer.begin('comment')
|
||||||
|
print "Entering comment state"
|
||||||
|
|
||||||
|
def t_comment_body_part(t):
|
||||||
|
r'(.|\n)*\*/'
|
||||||
|
print "comment body", t
|
||||||
|
t.lexer.begin('INITIAL')
|
||||||
|
|
||||||
|
def t_error(t):
|
||||||
|
pass
|
||||||
|
|
||||||
|
import sys
|
||||||
|
|
||||||
|
lex.lex()
|
||||||
|
|
||||||
|
|
7
ext/ply/test/lex_state4.exp
Normal file
|
@ -0,0 +1,7 @@
|
||||||
|
lex: state type for state comment must be 'inclusive' or 'exclusive'
|
||||||
|
Traceback (most recent call last):
|
||||||
|
File "./lex_state4.py", line 39, in <module>
|
||||||
|
lex.lex()
|
||||||
|
File "../ply/lex.py", line 759, in lex
|
||||||
|
raise SyntaxError,"lex: Unable to build lexer."
|
||||||
|
SyntaxError: lex: Unable to build lexer.
|
41
ext/ply/test/lex_state4.py
Normal file
|
@ -0,0 +1,41 @@
|
||||||
|
# lex_state2.py
|
||||||
|
#
|
||||||
|
# Bad state declaration
|
||||||
|
|
||||||
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
|
tokens = [
|
||||||
|
"PLUS",
|
||||||
|
"MINUS",
|
||||||
|
"NUMBER",
|
||||||
|
]
|
||||||
|
|
||||||
|
comment = 1
|
||||||
|
states = (('comment', 'exclsive'),)
|
||||||
|
|
||||||
|
t_PLUS = r'\+'
|
||||||
|
t_MINUS = r'-'
|
||||||
|
t_NUMBER = r'\d+'
|
||||||
|
|
||||||
|
# Comments
|
||||||
|
def t_comment(t):
|
||||||
|
r'/\*'
|
||||||
|
t.lexer.begin('comment')
|
||||||
|
print "Entering comment state"
|
||||||
|
|
||||||
|
def t_comment_body_part(t):
|
||||||
|
r'(.|\n)*\*/'
|
||||||
|
print "comment body", t
|
||||||
|
t.lexer.begin('INITIAL')
|
||||||
|
|
||||||
|
def t_error(t):
|
||||||
|
pass
|
||||||
|
|
||||||
|
import sys
|
||||||
|
|
||||||
|
lex.lex()
|
||||||
|
|
||||||
|
|
7
ext/ply/test/lex_state5.exp
Normal file
|
@ -0,0 +1,7 @@
|
||||||
|
lex: state 'comment' already defined.
|
||||||
|
Traceback (most recent call last):
|
||||||
|
File "./lex_state5.py", line 40, in <module>
|
||||||
|
lex.lex()
|
||||||
|
File "../ply/lex.py", line 759, in lex
|
||||||
|
raise SyntaxError,"lex: Unable to build lexer."
|
||||||
|
SyntaxError: lex: Unable to build lexer.
|
42
ext/ply/test/lex_state5.py
Normal file
|
@ -0,0 +1,42 @@
|
||||||
|
# lex_state2.py
|
||||||
|
#
|
||||||
|
# Bad state declaration
|
||||||
|
|
||||||
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
|
tokens = [
|
||||||
|
"PLUS",
|
||||||
|
"MINUS",
|
||||||
|
"NUMBER",
|
||||||
|
]
|
||||||
|
|
||||||
|
comment = 1
|
||||||
|
states = (('comment', 'exclusive'),
|
||||||
|
('comment', 'exclusive'))
|
||||||
|
|
||||||
|
t_PLUS = r'\+'
|
||||||
|
t_MINUS = r'-'
|
||||||
|
t_NUMBER = r'\d+'
|
||||||
|
|
||||||
|
# Comments
|
||||||
|
def t_comment(t):
|
||||||
|
r'/\*'
|
||||||
|
t.lexer.begin('comment')
|
||||||
|
print "Entering comment state"
|
||||||
|
|
||||||
|
def t_comment_body_part(t):
|
||||||
|
r'(.|\n)*\*/'
|
||||||
|
print "comment body", t
|
||||||
|
t.lexer.begin('INITIAL')
|
||||||
|
|
||||||
|
def t_error(t):
|
||||||
|
pass
|
||||||
|
|
||||||
|
import sys
|
||||||
|
|
||||||
|
lex.lex()
|
||||||
|
|
||||||
|
|
1
ext/ply/test/lex_state_noerror.exp
Normal file
|
@ -0,0 +1 @@
|
||||||
|
lex: Warning. no error rule is defined for exclusive state 'comment'
|
41
ext/ply/test/lex_state_noerror.py
Normal file
|
@ -0,0 +1,41 @@
|
||||||
|
# lex_state2.py
|
||||||
|
#
|
||||||
|
# Declaration of a state for which no rules are defined
|
||||||
|
|
||||||
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
|
tokens = [
|
||||||
|
"PLUS",
|
||||||
|
"MINUS",
|
||||||
|
"NUMBER",
|
||||||
|
]
|
||||||
|
|
||||||
|
comment = 1
|
||||||
|
states = (('comment', 'exclusive'),)
|
||||||
|
|
||||||
|
t_PLUS = r'\+'
|
||||||
|
t_MINUS = r'-'
|
||||||
|
t_NUMBER = r'\d+'
|
||||||
|
|
||||||
|
# Comments
|
||||||
|
def t_comment(t):
|
||||||
|
r'/\*'
|
||||||
|
t.lexer.begin('comment')
|
||||||
|
print "Entering comment state"
|
||||||
|
|
||||||
|
def t_comment_body_part(t):
|
||||||
|
r'(.|\n)*\*/'
|
||||||
|
print "comment body", t
|
||||||
|
t.lexer.begin('INITIAL')
|
||||||
|
|
||||||
|
def t_error(t):
|
||||||
|
pass
|
||||||
|
|
||||||
|
import sys
|
||||||
|
|
||||||
|
lex.lex()
|
||||||
|
|
||||||
|
|
7
ext/ply/test/lex_state_norule.exp
Normal file
|
@ -0,0 +1,7 @@
|
||||||
|
lex: No rules defined for state 'example'
|
||||||
|
Traceback (most recent call last):
|
||||||
|
File "./lex_state_norule.py", line 40, in <module>
|
||||||
|
lex.lex()
|
||||||
|
File "../ply/lex.py", line 759, in lex
|
||||||
|
raise SyntaxError,"lex: Unable to build lexer."
|
||||||
|
SyntaxError: lex: Unable to build lexer.
|
42
ext/ply/test/lex_state_norule.py
Normal file
|
@ -0,0 +1,42 @@
|
||||||
|
# lex_state2.py
|
||||||
|
#
|
||||||
|
# Declaration of a state for which no rules are defined
|
||||||
|
|
||||||
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
|
tokens = [
|
||||||
|
"PLUS",
|
||||||
|
"MINUS",
|
||||||
|
"NUMBER",
|
||||||
|
]
|
||||||
|
|
||||||
|
comment = 1
|
||||||
|
states = (('comment', 'exclusive'),
|
||||||
|
('example', 'exclusive'))
|
||||||
|
|
||||||
|
t_PLUS = r'\+'
|
||||||
|
t_MINUS = r'-'
|
||||||
|
t_NUMBER = r'\d+'
|
||||||
|
|
||||||
|
# Comments
|
||||||
|
def t_comment(t):
|
||||||
|
r'/\*'
|
||||||
|
t.lexer.begin('comment')
|
||||||
|
print "Entering comment state"
|
||||||
|
|
||||||
|
def t_comment_body_part(t):
|
||||||
|
r'(.|\n)*\*/'
|
||||||
|
print "comment body", t
|
||||||
|
t.lexer.begin('INITIAL')
|
||||||
|
|
||||||
|
def t_error(t):
|
||||||
|
pass
|
||||||
|
|
||||||
|
import sys
|
||||||
|
|
||||||
|
lex.lex()
|
||||||
|
|
||||||
|
|
7
ext/ply/test/lex_state_try.exp
Normal file
|
@ -0,0 +1,7 @@
|
||||||
|
(NUMBER,'3',1,0)
|
||||||
|
(PLUS,'+',1,2)
|
||||||
|
(NUMBER,'4',1,4)
|
||||||
|
Entering comment state
|
||||||
|
comment body LexToken(body_part,'This is a comment */',1,9)
|
||||||
|
(PLUS,'+',1,30)
|
||||||
|
(NUMBER,'10',1,32)
|
48
ext/ply/test/lex_state_try.py
Normal file
|
@ -0,0 +1,48 @@
|
||||||
|
# lex_state2.py
|
||||||
|
#
|
||||||
|
# Declaration of a state for which no rules are defined
|
||||||
|
|
||||||
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
|
tokens = [
|
||||||
|
"PLUS",
|
||||||
|
"MINUS",
|
||||||
|
"NUMBER",
|
||||||
|
]
|
||||||
|
|
||||||
|
comment = 1
|
||||||
|
states = (('comment', 'exclusive'),)
|
||||||
|
|
||||||
|
t_PLUS = r'\+'
|
||||||
|
t_MINUS = r'-'
|
||||||
|
t_NUMBER = r'\d+'
|
||||||
|
|
||||||
|
t_ignore = " \t"
|
||||||
|
|
||||||
|
# Comments
|
||||||
|
def t_comment(t):
|
||||||
|
r'/\*'
|
||||||
|
t.lexer.begin('comment')
|
||||||
|
print "Entering comment state"
|
||||||
|
|
||||||
|
def t_comment_body_part(t):
|
||||||
|
r'(.|\n)*\*/'
|
||||||
|
print "comment body", t
|
||||||
|
t.lexer.begin('INITIAL')
|
||||||
|
|
||||||
|
def t_error(t):
|
||||||
|
pass
|
||||||
|
|
||||||
|
t_comment_error = t_error
|
||||||
|
t_comment_ignore = t_ignore
|
||||||
|
|
||||||
|
import sys
|
||||||
|
|
||||||
|
lex.lex()
|
||||||
|
|
||||||
|
data = "3 + 4 /* This is a comment */ + 10"
|
||||||
|
|
||||||
|
lex.runmain(data=data)
|
|
@ -2,7 +2,10 @@
|
||||||
#
|
#
|
||||||
# Tests for absence of tokens variable
|
# Tests for absence of tokens variable
|
||||||
|
|
||||||
import lex
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
t_PLUS = r'\+'
|
t_PLUS = r'\+'
|
||||||
t_MINUS = r'-'
|
t_MINUS = r'-'
|
||||||
|
@ -11,7 +14,6 @@ t_NUMBER = r'\d+'
|
||||||
def t_error(t):
|
def t_error(t):
|
||||||
pass
|
pass
|
||||||
|
|
||||||
import sys
|
|
||||||
sys.tracebacklimit = 0
|
sys.tracebacklimit = 0
|
||||||
|
|
||||||
lex.lex()
|
lex.lex()
|
||||||
|
|
|
@ -2,7 +2,10 @@
|
||||||
#
|
#
|
||||||
# Tests for tokens of wrong type
|
# Tests for tokens of wrong type
|
||||||
|
|
||||||
import lex
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
tokens = "PLUS MINUS NUMBER"
|
tokens = "PLUS MINUS NUMBER"
|
||||||
|
|
||||||
|
@ -13,7 +16,6 @@ t_NUMBER = r'\d+'
|
||||||
def t_error(t):
|
def t_error(t):
|
||||||
pass
|
pass
|
||||||
|
|
||||||
import sys
|
|
||||||
sys.tracebacklimit = 0
|
sys.tracebacklimit = 0
|
||||||
|
|
||||||
lex.lex()
|
lex.lex()
|
||||||
|
|
|
@ -2,7 +2,10 @@
|
||||||
#
|
#
|
||||||
# tokens is right type, but is missing a token for one rule
|
# tokens is right type, but is missing a token for one rule
|
||||||
|
|
||||||
import lex
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
tokens = [
|
tokens = [
|
||||||
"PLUS",
|
"PLUS",
|
||||||
|
@ -16,7 +19,7 @@ t_NUMBER = r'\d+'
|
||||||
def t_error(t):
|
def t_error(t):
|
||||||
pass
|
pass
|
||||||
|
|
||||||
import sys
|
|
||||||
sys.tracebacklimit = 0
|
sys.tracebacklimit = 0
|
||||||
|
|
||||||
lex.lex()
|
lex.lex()
|
||||||
|
|
|
@ -2,7 +2,10 @@
|
||||||
#
|
#
|
||||||
# Bad token name
|
# Bad token name
|
||||||
|
|
||||||
import lex
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
tokens = [
|
tokens = [
|
||||||
"PLUS",
|
"PLUS",
|
||||||
|
@ -18,7 +21,6 @@ t_NUMBER = r'\d+'
|
||||||
def t_error(t):
|
def t_error(t):
|
||||||
pass
|
pass
|
||||||
|
|
||||||
import sys
|
|
||||||
sys.tracebacklimit = 0
|
sys.tracebacklimit = 0
|
||||||
|
|
||||||
lex.lex()
|
lex.lex()
|
||||||
|
|
|
@ -1 +1 @@
|
||||||
lex.LexError: ./lex_token5.py:16: Rule 't_NUMBER' returned an unknown token type 'NUM'
|
ply.lex.LexError: ./lex_token5.py:19: Rule 't_NUMBER' returned an unknown token type 'NUM'
|
||||||
|
|
|
@ -2,7 +2,10 @@
|
||||||
#
|
#
|
||||||
# Return a bad token name
|
# Return a bad token name
|
||||||
|
|
||||||
import lex
|
import sys
|
||||||
|
sys.path.insert(0,"..")
|
||||||
|
|
||||||
|
import ply.lex as lex
|
||||||
|
|
||||||
tokens = [
|
tokens = [
|
||||||
"PLUS",
|
"PLUS",
|
||||||
|
@ -21,7 +24,6 @@ def t_NUMBER(t):
|
||||||
def t_error(t):
|
def t_error(t):
|
||||||
pass
|
pass
|
||||||
|
|
||||||
import sys
|
|
||||||
sys.tracebacklimit = 0
|
sys.tracebacklimit = 0
|
||||||
|
|
||||||
lex.lex()
|
lex.lex()
|
||||||
|
|
62
ext/ply/test/rununit.py
Normal file
|
@ -0,0 +1,62 @@
|
||||||
|
#!/usr/bin/env python
|
||||||
|
'''Script to run all tests using python "unittest" module'''
|
||||||
|
|
||||||
|
__author__ = "Miki Tebeka <miki.tebeka@zoran.com>"
|
||||||
|
|
||||||
|
from unittest import TestCase, main, makeSuite, TestSuite
|
||||||
|
from os import popen, environ, remove
|
||||||
|
from glob import glob
|
||||||
|
from sys import executable, argv
|
||||||
|
from os.path import isfile, basename, splitext
|
||||||
|
|
||||||
|
# Add path to lex.py and yacc.py
|
||||||
|
environ["PYTHONPATH"] = ".."
|
||||||
|
|
||||||
|
class PLYTest(TestCase):
|
||||||
|
'''General test case for PLY test'''
|
||||||
|
def _runtest(self, filename):
|
||||||
|
'''Run a single test file an compare result'''
|
||||||
|
exp_file = filename.replace(".py", ".exp")
|
||||||
|
self.failUnless(isfile(exp_file), "can't find %s" % exp_file)
|
||||||
|
pipe = popen("%s %s 2>&1" % (executable, filename))
|
||||||
|
out = pipe.read().strip()
|
||||||
|
self.failUnlessEqual(out, open(exp_file).read().strip())
|
||||||
|
|
||||||
|
|
||||||
|
class LexText(PLYTest):
|
||||||
|
'''Testing Lex'''
|
||||||
|
pass
|
||||||
|
|
||||||
|
class YaccTest(PLYTest):
|
||||||
|
'''Testing Yacc'''
|
||||||
|
|
||||||
|
def tearDown(self):
|
||||||
|
'''Cleanup parsetab.py[c] file'''
|
||||||
|
for ext in (".py", ".pyc"):
|
||||||
|
fname = "parsetab%s" % ext
|
||||||
|
if isfile(fname):
|
||||||
|
remove(fname)
|
||||||
|
|
||||||
|
def add_test(klass, filename):
|
||||||
|
'''Add a test to TestCase class'''
|
||||||
|
def t(self):
|
||||||
|
self._runtest(filename)
|
||||||
|
# Test name is test_FILENAME without the ./ and without the .py
|
||||||
|
setattr(klass, "test_%s" % (splitext(basename(filename))[0]), t)
|
||||||
|
|
||||||
|
# Add lex tests
|
||||||
|
for file in glob("./lex_*.py"):
|
||||||
|
add_test(LexText, file)
|
||||||
|
lex_suite = makeSuite(LexText, "test_")
|
||||||
|
|
||||||
|
# Add yacc tests
|
||||||
|
for file in glob("./yacc_*.py"):
|
||||||
|
add_test(YaccTest, file)
|
||||||
|
yacc_suite = makeSuite(YaccTest, "test_")
|
||||||
|
|
||||||
|
# All tests suite
|
||||||
|
test_suite = TestSuite((lex_suite, yacc_suite))
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
|
|
Some files were not shown because too many files have changed in this diff.