ply: update PLY to version 3.2
parent bcaf93d182, commit e1270f81bd
166 changed files with 9107 additions and 3804 deletions
@@ -1,12 +1,13 @@
-February 19, 2007
+March 24, 2009

-                  Announcing :  PLY-2.3 (Python Lex-Yacc)
+                  Announcing :  PLY-3.2 (Python Lex-Yacc)

                        http://www.dabeaz.com/ply

I'm pleased to announce a significant new update to PLY---a 100% Python
-implementation of the common parsing tools lex and yacc.  PLY-2.3 is
-a minor bug fix release, but also features improved performance.
+implementation of the common parsing tools lex and yacc.  PLY-3.2 adds
+compatibility for Python 2.6 and 3.0, provides some new customization
+options, and cleans up a lot of internal implementation details.

If you are new to PLY, here are a few highlights:

@@ -29,19 +30,11 @@ If you are new to PLY, here are a few highlights:
   problems.  Currently, PLY can build its parsing tables using
   either SLR or LALR(1) algorithms.

 - PLY can be used to build parsers for large programming languages.
   Although it is not ultra-fast due to its Python implementation,
   PLY can be used to parse grammars consisting of several hundred
   rules (as might be found for a language like C).  The lexer and LR
   parser are also reasonably efficient when parsing normal
   sized programs.

More information about PLY can be obtained on the PLY webpage at:

                   http://www.dabeaz.com/ply

-PLY is freely available and is licensed under the terms of the Lesser
-GNU Public License (LGPL).
+PLY is freely available.

Cheers,
ext/ply/CHANGES | 332
@@ -1,3 +1,335 @@
Version 3.2
-----------------------------
03/24/09: beazley
          Added an extra check to not print duplicated warning messages
          about reduce/reduce conflicts.

03/24/09: beazley
          Switched PLY over to a BSD license.

03/23/09: beazley
          Performance optimization.  Discovered a few places to make
          speedups in LR table generation.

03/23/09: beazley
          New warning message.  PLY now warns about rules never
          reduced due to reduce/reduce conflicts.  Suggested by
          Bruce Frederiksen.

03/23/09: beazley
          Some clean-up of warning messages related to reduce/reduce errors.

03/23/09: beazley
          Added a new picklefile option to yacc() to write the parsing
          tables to a filename using the pickle module.  Here is how
          it works:

              yacc(picklefile="parsetab.p")

          This option can be used if the normal parsetab.py file is
          extremely large.  For example, on jython, it is impossible
          to read parsing tables if the parsetab.py exceeds a certain
          threshold.

          The filename supplied to the picklefile option is opened
          relative to the current working directory of the Python
          interpreter.  If you need to refer to the file elsewhere,
          you will need to supply an absolute or relative path.

          For maximum portability, the pickle file is written
          using protocol 0.
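
          As an illustration, here is one way the option might be used.
          The tiny grammar below is a made-up example, not part of PLY:

              import ply.lex as lex
              import ply.yacc as yacc

              tokens = ('NUMBER', 'PLUS')
              t_PLUS = r'\+'
              t_ignore = ' \t'

              def t_NUMBER(t):
                  r'\d+'
                  t.value = int(t.value)
                  return t

              def t_error(t):
                  t.lexer.skip(1)

              def p_expr_plus(p):
                  'expr : NUMBER PLUS expr'
                  p[0] = p[1] + p[3]

              def p_expr_number(p):
                  'expr : NUMBER'
                  p[0] = p[1]

              def p_error(p):
                  pass

              lexer = lex.lex()
              # Tables are pickled to parsetab.p (protocol 0) instead of
              # being written to a parsetab.py module.
              parser = yacc.yacc(picklefile="parsetab.p")
              print(parser.parse("1+2+3"))      # prints 6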

03/13/09: beazley
          Fixed a bug in parser.out generation where the rule numbers
          were off by one.

03/13/09: beazley
          Fixed a string formatting bug with one of the error messages.
          Reported by Richard Reitmeyer

Version 3.1
-----------------------------
02/28/09: beazley
          Fixed broken start argument to yacc().  PLY-3.0 broke this
          feature by accident.

02/28/09: beazley
          Fixed debugging output.  yacc() no longer reports shift/reduce
          or reduce/reduce conflicts if debugging is turned off.  This
          restores similar behavior in PLY-2.5.  Reported by Andrew Waters.

Version 3.0
-----------------------------
02/03/09: beazley
          Fixed missing lexer attribute on certain tokens when
          invoking the parser p_error() function.  Reported by
          Bart Whiteley.

02/02/09: beazley
          The lex() command now does all error-reporting and diagnostics
          using the logging module interface.  Pass in a Logger object
          using the errorlog parameter to specify a different logger.

02/02/09: beazley
          Refactored ply.lex to use a more object-oriented and organized
          approach to collecting lexer information.

02/01/09: beazley
          Removed the nowarn option from lex().  All output is controlled
          by passing in a logger object.  Just pass in a logger with a high
          level setting to suppress output.  This argument was never
          documented to begin with so hopefully no one was relying upon it.

02/01/09: beazley
          Discovered and removed a dead if-statement in the lexer.  This
          resulted in a 6-7% speedup in lexing when I tested it.

01/13/09: beazley
          Minor change to the procedure for signalling a syntax error in a
          production rule.  A normal SyntaxError exception should be raised
          instead of yacc.SyntaxError.
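
          For instance, a production that wants to reject otherwise
          well-formed input might now be written like this (hypothetical
          rule; the reserved_names set is assumed to be defined elsewhere):

              def p_declaration(p):
                  'declaration : TYPE NAME SEMI'
                  if p[2] in reserved_names:
                      raise SyntaxError     # puts the parser into error recovery
                  p[0] = ('decl', p[1], p[2])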

01/13/09: beazley
          Added a new method p.set_lineno(n,lineno) that can be used to set the
          line number of symbol n in grammar rules.  This simplifies manual
          tracking of line numbers.
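
          A sketch of the idea (hypothetical rule from a larger grammar;
          assumes the lexer counts lines):

              def p_statement_if(p):
                  'statement : IF expr THEN statement'
                  p[0] = ('if', p[2], p[4])
                  # Give the whole statement the line number of the IF token
                  p.set_lineno(0, p.lineno(1))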

01/11/09: beazley
          Vastly improved debugging support for yacc.parse().  Instead of passing
          debug as an integer, you can supply a Logging object (see the logging
          module).  Messages will be generated at the ERROR, INFO, and DEBUG
          logging levels, each level providing progressively more information.
          The debugging trace also shows states, grammar rule, values passed
          into grammar rules, and the result of each reduction.
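
          For example, assuming a parser already built with yacc.yacc()
          and input text in data:

              import logging
              logging.basicConfig(level=logging.DEBUG, filename="parse.log")
              log = logging.getLogger()
              result = parser.parse(data, debug=log)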

01/09/09: beazley
          The yacc() command now does all error-reporting and diagnostics using
          the interface of the logging module.  Use the errorlog parameter to
          specify a logging object for error messages.  Use the debuglog parameter
          to specify a logging object for the 'parser.out' output.
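
          For example (the logger names here are arbitrary; assumes
          import ply.yacc as yacc and a grammar in the calling module):

              import logging
              errlog = logging.getLogger("ply.yacc.error")
              dbglog = logging.getLogger("ply.yacc.debug")
              parser = yacc.yacc(errorlog=errlog, debug=True, debuglog=dbglog)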

01/09/09: beazley
          *HUGE* refactoring of the ply.yacc() implementation.  The high-level
          user interface is backwards compatible, but the internals are completely
          reorganized into classes.  No more global variables.  The internals
          are also more extensible.  For example, you can use the classes to
          construct a LALR(1) parser in an entirely different manner than
          what is currently the case.  Documentation is forthcoming.

01/07/09: beazley
          Various cleanup and refactoring of yacc internals.

01/06/09: beazley
          Fixed a bug with precedence assignment.  yacc was assigning the precedence
          of each rule based on the left-most token, when in fact, it should have been
          using the right-most token.  Reported by Bruce Frederiksen.

11/27/08: beazley
          Numerous changes to support Python 3.0 including removal of deprecated
          statements (e.g., has_key) and the addition of compatibility code
          to emulate features from Python 2 that have been removed, but which
          are needed.  Fixed the unit testing suite to work with Python 3.0.
          The code should be backwards compatible with Python 2.

11/26/08: beazley
          Loosened the rules on what kind of objects can be passed in as the
          "module" parameter to lex() and yacc().  Previously, you could only use
          a module or an instance.  Now, PLY just uses dir() to get a list of
          symbols on whatever the object is without regard for its type.
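
          For example, any object that supplies the right attributes can
          now serve as the specification.  A sketch (LexerSpec is a
          made-up name, not part of PLY):

              import ply.lex as lex

              class LexerSpec(object):
                  tokens = ('NUMBER',)
                  t_ignore = ' \t'
                  def t_NUMBER(self, t):
                      r'\d+'
                      t.value = int(t.value)
                      return t
                  def t_error(self, t):
                      t.lexer.skip(1)

              lexer = lex.lex(module=LexerSpec())
              lexer.input("1 2 3")
              while True:
                  tok = lexer.token()
                  if not tok:
                      break
                  print(tok.value)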

11/26/08: beazley
          Changed all except: statements to be compatible with Python2.x/3.x syntax.

11/26/08: beazley
          Changed all raise Exception, value statements to raise Exception(value) for
          forward compatibility.

11/26/08: beazley
          Removed all print statements from lex and yacc, using sys.stdout and sys.stderr
          directly.  Preparation for Python 3.0 support.

11/04/08: beazley
          Fixed a bug with referring to symbols on the parsing stack using negative
          indices.

05/29/08: beazley
          Completely revamped the testing system to use the unittest module for everything.
          Added additional tests to cover new errors/warnings.

Version 2.5
-----------------------------
05/28/08: beazley
          Fixed a bug with writing lex-tables in optimized mode and start states.
          Reported by Kevin Henry.

Version 2.4
-----------------------------
05/04/08: beazley
          A version number is now embedded in the table file signature so that
          yacc can more gracefully accommodate changes to the output format
          in the future.

05/04/08: beazley
          Removed undocumented .pushback() method on grammar productions.  I'm
          not sure this ever worked and can't recall ever using it.  Might have
          been an abandoned idea that never really got fleshed out.  This
          feature was never described or tested so removing it is hopefully
          harmless.

05/04/08: beazley
          Added extra error checking to yacc() to detect precedence rules defined
          for undefined terminal symbols.  This allows yacc() to detect a potential
          problem that can be really tricky to debug if no warning message or error
          message is generated about it.

05/04/08: beazley
          lex() now has an outputdir option that can specify the output directory for
          tables when running in optimize mode.  For example:

             lexer = lex.lex(optimize=True, lextab="ltab", outputdir="foo/bar")

          The behavior of specifying a table module and output directory is now
          more closely aligned with the behavior of yacc().

05/04/08: beazley
          [Issue 9]
          Fixed a filename bug when specifying the modulename in lex() and yacc().
          If you specified options such as the following:

             parser = yacc.yacc(tabmodule="foo.bar.parsetab",outputdir="foo/bar")

          yacc would create a file "foo.bar.parsetab.py" in the given directory.
          Now, it simply generates a file "parsetab.py" in that directory.
          Bug reported by cptbinho.

05/04/08: beazley
          Slight modification to lex() and yacc() to allow their table files
          to be loaded from a previously loaded module.  This might make
          it easier to load the parsing tables from a complicated package
          structure.  For example:

             import foo.bar.spam.parsetab as parsetab
             parser = yacc.yacc(tabmodule=parsetab)

          Note: lex and yacc will never regenerate the table file if used
          in this form---you will get a warning message instead.
          This idea suggested by Brian Clapper.

04/28/08: beazley
          Fixed a bug with p_error() functions not being picked up correctly
          when running in yacc(optimize=1) mode.  Patch contributed by
          Bart Whiteley.

02/28/08: beazley
          Fixed a bug with 'nonassoc' precedence rules.  Basically the
          non-precedence was being ignored and not producing the correct
          run-time behavior in the parser.

02/16/08: beazley
          Slight relaxation of what the input() method to a lexer will
          accept as a string.  Instead of testing the input to see
          if the input is a string or unicode string, it checks to see
          if the input object looks like it contains string data.
          This change makes it possible to pass string-like objects
          in as input.  For example, the object returned by mmap.

              import mmap, os
              data = mmap.mmap(os.open(filename,os.O_RDONLY),
                               os.path.getsize(filename),
                               access=mmap.ACCESS_READ)
              lexer.input(data)

11/29/07: beazley
          Modification of ply.lex to allow token functions to be aliased.
          This is subtle, but it makes it easier to create libraries and
          to reuse token specifications.  For example, suppose you defined
          a function like this:

              def number(t):
                  r'\d+'
                  t.value = int(t.value)
                  return t

          This change would allow you to define a token rule as follows:

              t_NUMBER = number

          In this case, the token type will be set to 'NUMBER' and use
          the associated number() function to process tokens.

11/28/07: beazley
          Slight modification to lex and yacc to grab symbols from both
          the local and global dictionaries of the caller.  This
          modification allows lexers and parsers to be defined using
          inner functions and closures.
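
          For example, an entire lexer can now be wrapped up inside a
          function (sketch):

              import ply.lex as lex

              def make_lexer():
                  tokens = ('WORD',)
                  t_ignore = ' \t'
                  def t_WORD(t):
                      r'[A-Za-z]+'
                      return t
                  def t_error(t):
                      t.lexer.skip(1)
                  # lex() picks up tokens, t_WORD, and t_error from the
                  # enclosing function's local scope
                  return lex.lex()

              lexer = make_lexer()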

11/28/07: beazley
          Performance optimization:  The lexer.lexmatch and t.lexer
          attributes are no longer set for lexer tokens that are not
          defined by functions.  The only normal use of these attributes
          would be in lexer rules that need to perform some kind of
          special processing.  Thus, it doesn't make any sense to set
          them on every token.

          *** POTENTIAL INCOMPATIBILITY ***  This might break code
          that is mucking around with internal lexer state in some
          sort of magical way.

11/27/07: beazley
          Added the ability to put the parser into error-handling mode
          from within a normal production.  To do this, simply raise
          a yacc.SyntaxError exception like this:

              def p_some_production(p):
                  'some_production : prod1 prod2'
                  ...
                  raise yacc.SyntaxError      # Signal an error

          A number of things happen after this occurs:

          - The last symbol shifted onto the symbol stack is discarded
            and parser state backed up to what it was before the
            rule reduction.

          - The current lookahead symbol is saved and replaced by
            the 'error' symbol.

          - The parser enters error recovery mode where it tries
            to either reduce the 'error' rule or it starts
            discarding items off of the stack until the parser
            resets.

          When an error is manually set, the parser does *not* call
          the p_error() function (if any is defined).

          *** NEW FEATURE ***  Suggested on the mailing list

11/27/07: beazley
          Fixed structure bug in examples/ansic.  Reported by Dion Blazakis.

11/27/07: beazley
          Fixed a bug in the lexer related to start conditions and ignored
          token rules.  If a rule was defined that changed state, but
          returned no token, the lexer could be left in an inconsistent
          state.  Reported by

11/27/07: beazley
          Modified setup.py to support Python Eggs.  Patch contributed by
          Simon Cross.

11/09/07: beazley
          Fixed a bug in error handling in yacc.  If a syntax error occurred and the
          parser rolled the entire parse stack back, the parser would be left in an
          inconsistent state that would cause it to trigger incorrect actions on
          subsequent input.  Reported by Ton Biegstraaten, Justin King, and others.

11/09/07: beazley
          Fixed a bug when passing empty input strings to yacc.parse().  This
          would result in an error message about "No input given".  Reported
          by Andrew Dalke.

Version 2.3
-----------------------------
02/20/07: beazley
ext/ply/COPYING | 504
@@ -1,504 +0,0 @@
-[Deleted: the complete text of the GNU Lesser General Public License, version 2.1.]
@@ -1,33 +1,41 @@
-PLY (Python Lex-Yacc)                   Version 2.3 (February 18, 2007)
+PLY (Python Lex-Yacc)                   Version 3.2

-David M. Beazley (dave@dabeaz.com)
+Copyright (C) 2001-2009,
+David M. Beazley (Dabeaz LLC)
+All rights reserved.

-Copyright (C) 2001-2007 David M. Beazley
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are
+met:

-This library is free software; you can redistribute it and/or
-modify it under the terms of the GNU Lesser General Public
-License as published by the Free Software Foundation; either
-version 2.1 of the License, or (at your option) any later version.
+* Redistributions of source code must retain the above copyright notice,
+  this list of conditions and the following disclaimer.
+* Redistributions in binary form must reproduce the above copyright notice,
+  this list of conditions and the following disclaimer in the documentation
+  and/or other materials provided with the distribution.
+* Neither the name of the David Beazley or Dabeaz LLC may be used to
+  endorse or promote products derived from this software without
+  specific prior written permission.

-This library is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-Lesser General Public License for more details.

-You should have received a copy of the GNU Lesser General Public
-License along with this library; if not, write to the Free Software
-Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

-See the file COPYING for a complete copy of the LGPL.
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Introduction
============

PLY is a 100% Python implementation of the common parsing tools lex
-and yacc.  Although several other parsing tools are available for
-Python, there are several reasons why you might want to consider PLY:
+and yacc.  Here are a few highlights:

-  - The tools are very closely modeled after traditional lex/yacc.
+  - PLY is very closely modeled after traditional lex/yacc.
     If you know how to use these tools in C, you will find PLY
     to be similar.

@@ -43,8 +51,8 @@ Python, there are several reasons why you might want to consider PLY:
  -  Parsing is based on LR-parsing which is fast, memory efficient,
     better suited to large grammars, and which has a number of nice
     properties when dealing with syntax errors and other parsing problems.
-    Currently, PLY builds its parsing tables using the SLR algorithm which
-    is slightly weaker than LALR(1) used in traditional yacc.
+    Currently, PLY builds its parsing tables using the LALR(1)
+    algorithm used in yacc.

  -  PLY uses Python introspection features to build lexers and parsers.
     This greatly simplifies the task of parser construction since it reduces

@@ -56,16 +64,8 @@ Python, there are several reasons why you might want to consider PLY:
     PLY can be used to parse grammars consisting of several hundred
     rules (as might be found for a language like C).  The lexer and LR
     parser are also reasonably efficient when parsing typically
-    sized programs.
-
-The original version of PLY was developed for an Introduction to
-Compilers course where students used it to build a compiler for a
-simple Pascal-like language.  Their compiler had to include lexical
-analysis, parsing, type checking, type inference, and generation of
-assembly code for the SPARC processor.  Because of this, the current
-implementation has been extensively tested and debugged.  In addition,
-most of the API and error checking steps have been adapted to address
-common usability problems.
+    sized programs.  People have used PLY to build parsers for
+    C, C++, ADA, and other real programming languages.

How to Use
==========

@@ -96,10 +96,10 @@ A simple example is found at the end of this document

Requirements
============
-PLY requires the use of Python 2.1 or greater.  However, you should
+PLY requires the use of Python 2.2 or greater.  However, you should
use the latest Python release if possible.  It should work on just
about any platform.  PLY has been tested with both CPython and Jython.
-However, it does not seem to work with IronPython.
+It also seems to work with IronPython.

Resources
=========

@@ -127,16 +127,13 @@ Elias Ioup did the first implementation of LALR(1) parsing in PLY-1.x.
Andrew Waters and Markus Schoepflin were instrumental in reporting bugs
and testing a revised LALR(1) implementation for PLY-2.0.

-Special Note for PLY-2.x
+Special Note for PLY-3.0
========================
-PLY-2.0 is the first in a series of PLY releases that will be adding a
-variety of significant new features.  The first release in this series
-(Ply-2.0) should be 100% compatible with all previous Ply-1.x releases
-except for the fact that Ply-2.0 features a correct implementation of
-LALR(1) table generation.
-
-If you have suggestions for improving PLY in future 2.x releases, please
-contact me. - Dave
+PLY-3.0 is the first PLY release to support Python 3.  However, backwards
+compatibility with Python 2.2 is still preserved.  PLY provides dual
+Python 2/3 compatibility by restricting its implementation to a common
+subset of basic language features.  You should not convert PLY using
+2to3--it is not necessary and may in fact break the implementation.

Example
=======

@@ -169,11 +166,7 @@ t_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*'

def t_NUMBER(t):
    r'\d+'
-    try:
-        t.value = int(t.value)
-    except ValueError:
-        print "Integer value too large", t.value
-        t.value = 0
+    t.value = int(t.value)
    return t

# Ignored characters

@@ -255,12 +248,12 @@ while 1:

Bug Reports and Patches
=======================
-Because of the extremely specialized and advanced nature of PLY, I
-rarely spend much time working on it unless I receive very specific
-bug-reports and/or patches to fix problems.  I also try to incorporate
-submitted feature requests and enhancements into each new version.  To
-contact me about bugs and/or new features, please send email to
-dave@dabeaz.com.
+My goal with PLY is to simply have a decent lex/yacc implementation
+for Python.  As a general rule, I don't spend huge amounts of time
+working on it unless I receive very specific bug reports and/or
+patches to fix problems.  I also try to incorporate submitted feature
+requests and enhancements into each new version.  To contact me about
+bugs and/or new features, please send email to dave@dabeaz.com.

In addition there is a Google group for discussing PLY related issues at
ext/ply/TODO | 16
@@ -1,14 +1,16 @@
The PLY to-do list:

-1. More interesting parsing examples.
+1. Finish writing the C Preprocessor module.  Started in the
+   file ply/cpp.py

-2. Work on the ANSI C grammar so that it can actually parse C programs.  To do this,
-   some extra code needs to be added to the lexer to deal with typedef names and enumeration
-   constants.
+2. Create and document libraries of useful tokens.

-3. More tests in the test directory.
+3. Expand the examples/yply tool that parses bison/yacc
+   files.

-4. Performance improvements and cleanup in yacc.py.
+4. Think of various diabolical things to do with the
+   new yacc internals.  For example, it is now possible
+   to specify grammars using completely different schemes
+   than the reflection approach used by PLY.

5. More documentation (?).
ext/ply/doc/internal.html | 874 (new file)
@@ -0,0 +1,874 @@
<html>
<head>
<title>PLY Internals</title>
</head>
<body bgcolor="#ffffff">

<h1>PLY Internals</h1>

<b>
David M. Beazley <br>
dave@dabeaz.com<br>
</b>

<p>
<b>PLY Version: 3.0</b>
<p>

<!-- INDEX -->
<div class="sectiontoc">
<ul>
<li><a href="#internal_nn1">Introduction</a>
<li><a href="#internal_nn2">Grammar Class</a>
<li><a href="#internal_nn3">Productions</a>
<li><a href="#internal_nn4">LRItems</a>
<li><a href="#internal_nn5">LRTable</a>
<li><a href="#internal_nn6">LRGeneratedTable</a>
<li><a href="#internal_nn7">LRParser</a>
<li><a href="#internal_nn8">ParserReflect</a>
<li><a href="#internal_nn9">High-level operation</a>
</ul>
</div>
<!-- INDEX -->

<H2><a name="internal_nn1"></a>1. Introduction</H2>

This document describes classes and functions that make up the internal
operation of PLY.  Using this programming interface, it is possible to
manually build a parser using a different interface specification
than what PLY normally uses.  For example, you could build a grammar
from information parsed in a completely different input format.  Some of
these objects may be useful for building more advanced parsing engines
such as GLR.

<p>
It should be stressed that using PLY at this level is not for the
faint of heart.  Generally, it's assumed that you know a bit of
the underlying compiler theory and how an LR parser is put together.

<H2><a name="internal_nn2"></a>2. Grammar Class</H2>

The file <tt>ply.yacc</tt> defines a class <tt>Grammar</tt> that
is used to hold and manipulate information about a grammar
specification.  It encapsulates the same basic information
about a grammar that is put into a YACC file including
the list of tokens, precedence rules, and grammar rules.
Various operations are provided to perform different validations
on the grammar.  In addition, there are operations to compute
the first and follow sets that are needed by the various table
generation algorithms.

<p>
<tt><b>Grammar(terminals)</b></tt>

<blockquote>
Creates a new grammar object.  <tt>terminals</tt> is a list of strings
specifying the terminals for the grammar.  An instance <tt>g</tt> of
<tt>Grammar</tt> has the following methods:
</blockquote>

<p>
<b><tt>g.set_precedence(term,assoc,level)</tt></b>
<blockquote>
Sets the precedence level and associativity for a given terminal <tt>term</tt>.
<tt>assoc</tt> is one of <tt>'right'</tt>,
<tt>'left'</tt>, or <tt>'nonassoc'</tt> and <tt>level</tt> is a positive integer.  The higher
the value of <tt>level</tt>, the higher the precedence.  Here is an example of typical
precedence settings:

<pre>
g.set_precedence('PLUS',  'left',1)
g.set_precedence('MINUS', 'left',1)
g.set_precedence('TIMES', 'left',2)
g.set_precedence('DIVIDE','left',2)
g.set_precedence('UMINUS','left',3)
</pre>

This method must be called prior to adding any productions to the
grammar with <tt>g.add_production()</tt>.  The precedence of individual grammar
rules is determined by the precedence of the right-most terminal.

</blockquote>
<p>
<b><tt>g.add_production(name,syms,func=None,file='',line=0)</tt></b>
<blockquote>
Adds a new grammar rule.  <tt>name</tt> is the name of the rule,
<tt>syms</tt> is a list of symbols making up the right hand
side of the rule, <tt>func</tt> is the function to call when
reducing the rule.  <tt>file</tt> and <tt>line</tt> specify
the filename and line number of the rule and are used for
generating error messages.

<p>
The list of symbols in <tt>syms</tt> may include character
literals and <tt>%prec</tt> specifiers.  Here are some
examples:

<pre>
g.add_production('expr',['expr','PLUS','term'],func,file,line)
g.add_production('expr',['expr','"+"','term'],func,file,line)
g.add_production('expr',['MINUS','expr','%prec','UMINUS'],func,file,line)
</pre>

<p>
If any kind of error is detected, a <tt>GrammarError</tt> exception
is raised with a message indicating the reason for the failure.
</blockquote>

<p>
<b><tt>g.set_start(start=None)</tt></b>
<blockquote>
Sets the starting rule for the grammar.  <tt>start</tt> is a string
specifying the name of the start rule.  If <tt>start</tt> is omitted,
the first grammar rule added with <tt>add_production()</tt> is taken to be
the starting rule.  This method must always be called after all
productions have been added.
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>g.find_unreachable()</tt></b>
|
||||
<blockquote>
|
||||
Diagnostic function. Returns a list of all unreachable non-terminals
|
||||
defined in the grammar. This is used to identify inactive parts of
|
||||
the grammar specification.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>g.infinite_cycle()</tt></b>
|
||||
<blockquote>
|
||||
Diagnostic function. Returns a list of all non-terminals in the
|
||||
grammar that result in an infinite cycle. This condition occurs if
|
||||
there is no way for a grammar rule to expand to a string containing
|
||||
only terminal symbols.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>g.undefined_symbols()</tt></b>
|
||||
<blockquote>
|
||||
Diagnostic function. Returns a list of tuples <tt>(name, prod)</tt>
|
||||
corresponding to undefined symbols in the grammar. <tt>name</tt> is the
|
||||
name of the undefined symbol and <tt>prod</tt> is an instance of
|
||||
<tt>Production</tt> which has information about the production rule
|
||||
where the undefined symbol was used.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>g.unused_terminals()</tt></b>
|
||||
<blockquote>
|
||||
Diagnostic function. Returns a list of terminals that were defined,
|
||||
but never used in the grammar.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>g.unused_rules()</tt></b>
|
||||
<blockquote>
|
||||
Diagnostic function. Returns a list of <tt>Production</tt> instances
|
||||
corresponding to production rules that were defined in the grammar,
|
||||
but never used anywhere. This is slightly different
|
||||
than <tt>find_unreachable()</tt>.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>g.unused_precedence()</tt></b>
|
||||
<blockquote>
|
||||
Diagnostic function. Returns a list of tuples <tt>(term, assoc)</tt>
|
||||
corresponding to precedence rules that were set, but never used the
|
||||
grammar. <tt>term</tt> is the terminal name and <tt>assoc</tt> is the
|
||||
precedence associativity (e.g., <tt>'left'</tt>, <tt>'right'</tt>,
|
||||
or <tt>'nonassoc'</tt>.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>g.compute_first()</tt></b>
|
||||
<blockquote>
|
||||
Compute all of the first sets for all symbols in the grammar. Returns a dictionary
|
||||
mapping symbol names to a list of all first symbols.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>g.compute_follow()</tt></b>
|
||||
<blockquote>
|
||||
Compute all of the follow sets for all non-terminals in the grammar.
|
||||
The follow set is the set of all possible symbols that might follow a
|
||||
given non-terminal. Returns a dictionary mapping non-terminal names
|
||||
to a list of symbols.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>g.build_lritems()</tt></b>
|
||||
<blockquote>
|
||||
Calculates all of the LR items for all productions in the grammar. This
|
||||
step is required before using the grammar for any kind of table generation.
|
||||
See the section on LR items below.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
The following attributes are set by the above methods and may be useful
|
||||
in code that works with the grammar. All of these attributes should be
|
||||
assumed to be read-only. Changing their values directly will likely
|
||||
break the grammar.
|
||||
|
||||
<p>
|
||||
<b><tt>g.Productions</tt></b>
|
||||
<blockquote>
|
||||
A list of all productions added. The first entry is reserved for
|
||||
a production representing the starting rule. The objects in this list
|
||||
are instances of the <tt>Production</tt> class, described shortly.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>g.Prodnames</tt></b>
|
||||
<blockquote>
|
||||
A dictionary mapping the names of nonterminals to a list of all
|
||||
productions of that nonterminal.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>g.Terminals</tt></b>
|
||||
<blockquote>
|
||||
A dictionary mapping the names of terminals to a list of the
|
||||
production numbers where they are used.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>g.Nonterminals</tt></b>
|
||||
<blockquote>
|
||||
A dictionary mapping the names of nonterminals to a list of the
|
||||
production numbers where they are used.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>g.First</tt></b>
|
||||
<blockquote>
|
||||
A dictionary representing the first sets for all grammar symbols. This is
|
||||
computed and returned by the <tt>compute_first()</tt> method.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>g.Follow</tt></b>
|
||||
<blockquote>
|
||||
A dictionary representing the follow sets for all grammar rules. This is
|
||||
computed and returned by the <tt>compute_follow()</tt> method.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>g.Start</tt></b>
|
||||
<blockquote>
|
||||
Starting symbol for the grammar. Set by the <tt>set_start()</tt> method.
|
||||
</blockquote>
|
||||
|
||||
For the purposes of debugging, a <tt>Grammar</tt> object supports the <tt>__len__()</tt> and
|
||||
<tt>__getitem__()</tt> special methods. Accessing <tt>g[n]</tt> returns the nth production
|
||||
from the grammar.
|
||||
|
||||
|
||||
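<p>
As an illustration of how these pieces fit together, here is a small,
hypothetical sketch (not part of PLY itself) that builds a grammar by hand
instead of through reflection.  The terminal names, filename, and line
numbers are made up for the example:

<blockquote>
<pre>
from ply.yacc import Grammar

g = Grammar(['NUMBER', 'PLUS', 'TIMES'])

# Precedence must be set before any productions are added
g.set_precedence('PLUS',  'left', 1)
g.set_precedence('TIMES', 'left', 2)

# The reduction-function argument is left as None to keep the sketch minimal
g.add_production('expr', ['expr', 'PLUS', 'expr'],  None, 'example.py', 1)
g.add_production('expr', ['expr', 'TIMES', 'expr'], None, 'example.py', 2)
g.add_production('expr', ['NUMBER'],                None, 'example.py', 3)
g.set_start()                    # defaults to the first rule added

# Run a few of the diagnostics and compute the standard sets
print(g.undefined_symbols())     # expected: []
print(g.unused_terminals())      # expected: []
g.compute_first()
g.compute_follow()
g.build_lritems()                # required before table generation
</pre>
</blockquote>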
<H2><a name="internal_nn3"></a>3. Productions</H2>
|
||||
|
||||
|
||||
<tt>Grammar</tt> objects store grammar rules as instances of a <tt>Production</tt> class. This
|
||||
class has no public constructor--you should only create productions by calling <tt>Grammar.add_production()</tt>.
|
||||
The following attributes are available on a <tt>Production</tt> instance <tt>p</tt>.
|
||||
|
||||
<p>
|
||||
<b><tt>p.name</tt></b>
|
||||
<blockquote>
|
||||
The name of the production. For a grammar rule such as <tt>A : B C D</tt>, this is <tt>'A'</tt>.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>p.prod</tt></b>
|
||||
<blockquote>
|
||||
A tuple of symbols making up the right-hand side of the production. For a grammar rule such as <tt>A : B C D</tt>, this is <tt>('B','C','D')</tt>.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>p.number</tt></b>
|
||||
<blockquote>
|
||||
Production number. An integer containing the index of the production in the grammar's <tt>Productions</tt> list.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>p.func</tt></b>
|
||||
<blockquote>
|
||||
The name of the reduction function associated with the production.
|
||||
This is the function that will execute when reducing the entire
|
||||
grammar rule during parsing.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>p.callable</tt></b>
|
||||
<blockquote>
|
||||
The callable object associated with the name in <tt>p.func</tt>. This is <tt>None</tt>
|
||||
unless the production has been bound using <tt>bind()</tt>.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>p.file</tt></b>
|
||||
<blockquote>
|
||||
Filename associated with the production. Typically this is the file where the production was defined. Used for error messages.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>p.lineno</tt></b>
|
||||
<blockquote>
|
||||
Line number associated with the production. Typically this is the line number in <tt>p.file</tt> where the production was defined. Used for error messages.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>p.prec</tt></b>
|
||||
<blockquote>
|
||||
Precedence and associativity associated with the production. This is a tuple <tt>(assoc,level)</tt> where
|
||||
<tt>assoc</tt> is one of <tt>'left'</tt>,<tt>'right'</tt>, or <tt>'nonassoc'</tt> and <tt>level</tt> is
|
||||
an integer. This value is determined by the precedence of the right-most terminal symbol in the production
|
||||
or by use of the <tt>%prec</tt> specifier when adding the production.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>p.usyms</tt></b>
|
||||
<blockquote>
|
||||
A list of all unique symbols found in the production.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>p.lr_items</tt></b>
|
||||
<blockquote>
|
||||
A list of all LR items for this production. This attribute only has a meaningful value if the
|
||||
<tt>Grammar.build_lritems()</tt> method has been called. The items in this list are
|
||||
instances of <tt>LRItem</tt> described below.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>p.lr_next</tt></b>
|
||||
<blockquote>
|
||||
The head of a linked-list representation of the LR items in <tt>p.lr_items</tt>.
|
||||
This attribute only has a meaningful value if the <tt>Grammar.build_lritems()</tt>
|
||||
method has been called. Each <tt>LRItem</tt> instance has a <tt>lr_next</tt> attribute
|
||||
to move to the next item. The list is terminated by <tt>None</tt>.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>p.bind(dict)</tt></b>
|
||||
<blockquote>
|
||||
Binds the production function name in <tt>p.func</tt> to a callable object in
|
||||
<tt>dict</tt>. This operation is typically carried out in the last step
|
||||
prior to running the parsing engine and is needed since parsing tables are typically
|
||||
read from files which only include the function names, not the functions themselves.
|
||||
</blockquote>
|
||||
|
||||
<P>
|
||||
<tt>Production</tt> objects support
|
||||
the <tt>__len__()</tt>, <tt>__getitem__()</tt>, and <tt>__str__()</tt>
|
||||
special methods.
|
||||
<tt>len(p)</tt> returns the number of symbols in <tt>p.prod</tt>
|
||||
and <tt>p[n]</tt> is the same as <tt>p.prod[n]</tt>.
|
||||
|
||||
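<p>
A short, hypothetical sketch (assuming a populated <tt>Grammar</tt> instance
<tt>g</tt> as above) of how productions are typically inspected and bound.
The dictionary and function name used with <tt>bind()</tt> are invented for
the example:

<blockquote>
<pre>
p = g.Productions[1]        # first real production (entry 0 is the start rule)
print(p.name, p.prod, p.number)
print(len(p), p[0])         # len(p) == len(p.prod), p[0] == p.prod[0]

# Binding maps the stored function name to a real callable.  This only
# has an effect if p.func is the name 'p_expr'.
def p_expr(p): pass
p.bind({'p_expr': p_expr})
</pre>
</blockquote>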
<H2><a name="internal_nn4"></a>4. LRItems</H2>
|
||||
|
||||
|
||||
The construction of parsing tables in an LR-based parser generator is primarily
|
||||
done over a set of "LR Items". An LR item represents a stage of parsing one
|
||||
of the grammar rules. To compute the LR items, it is first necessary to
|
||||
call <tt>Grammar.build_lritems()</tt>. Once this step, all of the productions
|
||||
in the grammar will have their LR items attached to them.
|
||||
|
||||
<p>
|
||||
Here is an interactive example that shows what LR items look like if you
|
||||
interactively experiment. In this example, <tt>g</tt> is a <tt>Grammar</tt>
|
||||
object.
|
||||
|
||||
<blockquote>
|
||||
<pre>
|
||||
>>> <b>g.build_lritems()</b>
|
||||
>>> <b>p = g[1]</b>
|
||||
>>> <b>p</b>
|
||||
Production(statement -> ID = expr)
|
||||
>>>
|
||||
</pre>
|
||||
</blockquote>
|
||||
|
||||
In the above code, <tt>p</tt> represents the first grammar rule. In
|
||||
this case, a rule <tt>'statement -> ID = expr'</tt>.
|
||||
|
||||
<p>
|
||||
Now, let's look at the LR items for <tt>p</tt>.
|
||||
|
||||
<blockquote>
|
||||
<pre>
|
||||
>>> <b>p.lr_items</b>
|
||||
[LRItem(statement -> . ID = expr),
|
||||
LRItem(statement -> ID . = expr),
|
||||
LRItem(statement -> ID = . expr),
|
||||
LRItem(statement -> ID = expr .)]
|
||||
>>>
|
||||
</pre>
|
||||
</blockquote>
|
||||
|
||||
In each LR item, the dot (.) represents a specific stage of parsing. In each LR item, the dot
|
||||
is advanced by one symbol. It is only when the dot reaches the very end that a production
|
||||
is successfully parsed.
|
||||
|
||||
<p>
|
||||
An instance <tt>lr</tt> of <tt>LRItem</tt> has the following
|
||||
attributes that hold information related to that specific stage of
|
||||
parsing.
|
||||
|
||||
<p>
|
||||
<b><tt>lr.name</tt></b>
|
||||
<blockquote>
|
||||
The name of the grammar rule. For example, <tt>'statement'</tt> in the above example.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>lr.prod</tt></b>
|
||||
<blockquote>
|
||||
A tuple of symbols representing the right-hand side of the production, including the
|
||||
special <tt>'.'</tt> character. For example, <tt>('ID','.','=','expr')</tt>.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>lr.number</tt></b>
|
||||
<blockquote>
|
||||
An integer representing the production number in the grammar.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>lr.usyms</tt></b>
|
||||
<blockquote>
|
||||
A set of unique symbols in the production. Inherited from the original <tt>Production</tt> instance.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>lr.lr_index</tt></b>
|
||||
<blockquote>
|
||||
An integer representing the position of the dot (.). You should never use <tt>lr.prod.index()</tt>
|
||||
to search for it--the result will be wrong if the grammar happens to also use (.) as a character
|
||||
literal.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>lr.lr_after</tt></b>
|
||||
<blockquote>
|
||||
A list of all productions that can legally appear immediately to the right of the
|
||||
dot (.). This list contains <tt>Production</tt> instances. This attribute
|
||||
represents all of the possible branches a parse can take from the current position.
|
||||
For example, suppose that <tt>lr</tt> represents a stage immediately before
|
||||
an expression like this:
|
||||
|
||||
<pre>
|
||||
>>> <b>lr</b>
|
||||
LRItem(statement -> ID = . expr)
|
||||
>>>
|
||||
</pre>
|
||||
|
||||
Then, the value of <tt>lr.lr_after</tt> might look like this, showing all productions that
|
||||
can legally appear next:
|
||||
|
||||
<pre>
|
||||
>>> <b>lr.lr_after</b>
|
||||
[Production(expr -> expr PLUS expr),
|
||||
Production(expr -> expr MINUS expr),
|
||||
Production(expr -> expr TIMES expr),
|
||||
Production(expr -> expr DIVIDE expr),
|
||||
Production(expr -> MINUS expr),
|
||||
Production(expr -> LPAREN expr RPAREN),
|
||||
Production(expr -> NUMBER),
|
||||
Production(expr -> ID)]
|
||||
>>>
|
||||
</pre>
|
||||
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>lr.lr_before</tt></b>
|
||||
<blockquote>
|
||||
The grammar symbol that appears immediately before the dot (.) or <tt>None</tt> if
|
||||
at the beginning of the parse.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>lr.lr_next</tt></b>
|
||||
<blockquote>
|
||||
A link to the next LR item, representing the next stage of the parse. <tt>None</tt> if <tt>lr</tt>
|
||||
is the last LR item.
|
||||
</blockquote>
|
||||
|
||||
<tt>LRItem</tt> instances also support the <tt>__len__()</tt> and <tt>__getitem__()</tt> special methods.
|
||||
<tt>len(lr)</tt> returns the number of items in <tt>lr.prod</tt> including the dot (.). <tt>lr[n]</tt>
|
||||
returns <tt>lr.prod[n]</tt>.
|
||||
|
||||
<p>
|
||||
It goes without saying that all of the attributes associated with LR
|
||||
items should be assumed to be read-only. Modifications will very
|
||||
likely create a small black-hole that will consume you and your code.
|
||||
|
||||
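<p>
As a hedged illustration (assuming the <tt>Grammar</tt> instance <tt>g</tt>
from the interactive session above), the linked-list form of the LR items
can be walked like this:

<blockquote>
<pre>
p = g[1]
item = p.lr_next                  # head of the LR item chain for this production
while item is not None:
    print(item, item.lr_index)    # the item and the position of its dot
    item = item.lr_next           # None terminates the chain
</pre>
</blockquote>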
<H2><a name="internal_nn5"></a>5. LRTable</H2>
|
||||
|
||||
|
||||
The <tt>LRTable</tt> class is used to represent LR parsing table data. This
|
||||
minimally includes the production list, action table, and goto table.
|
||||
|
||||
<p>
|
||||
<b><tt>LRTable()</tt></b>
|
||||
<blockquote>
|
||||
Create an empty LRTable object. This object contains only the information needed to
|
||||
run an LR parser.
|
||||
</blockquote>
|
||||
|
||||
An instance <tt>lrtab</tt> of <tt>LRTable</tt> has the following methods:
|
||||
|
||||
<p>
|
||||
<b><tt>lrtab.read_table(module)</tt></b>
|
||||
<blockquote>
|
||||
Populates the LR table with information from the module specified in <tt>module</tt>.
|
||||
<tt>module</tt> is either a module object already loaded with <tt>import</tt> or
|
||||
the name of a Python module. If it's a string containing a module name, it is
|
||||
loaded and parsing data is extracted. Returns the signature value that was used
|
||||
when initially writing the tables. Raises a <tt>VersionError</tt> exception if
|
||||
the module was created using an incompatible version of PLY.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>lrtab.bind_callables(dict)</tt></b>
|
||||
<blockquote>
|
||||
This binds all of the function names used in productions to callable objects
|
||||
found in the dictionary <tt>dict</tt>. During table generation and when reading
|
||||
LR tables from files, PLY only uses the names of action functions such as <tt>'p_expr'</tt>,
|
||||
<tt>'p_statement'</tt>, etc. In order to actually run the parser, these names
|
||||
have to be bound to callable objects. This method is always called prior to
|
||||
running a parser.
|
||||
</blockquote>
|
||||
|
||||
After <tt>lrtab</tt> has been populated, the following attributes are defined.
|
||||
|
||||
<p>
|
||||
<b><tt>lrtab.lr_method</tt></b>
|
||||
<blockquote>
|
||||
The LR parsing method used (e.g., <tt>'LALR'</tt>)
|
||||
</blockquote>
|
||||
|
||||
|
||||
<p>
|
||||
<b><tt>lrtab.lr_productions</tt></b>
|
||||
<blockquote>
|
||||
The production list. If the parsing tables have been newly
|
||||
constructed, this will be a list of <tt>Production</tt> instances. If
|
||||
the parsing tables have been read from a file, it's a list
|
||||
of <tt>MiniProduction</tt> instances. This, together
|
||||
with <tt>lr_action</tt> and <tt>lr_goto</tt> contain all of the
|
||||
information needed by the LR parsing engine.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>lrtab.lr_action</tt></b>
|
||||
<blockquote>
|
||||
The LR action dictionary that implements the underlying state machine.
|
||||
The keys of this dictionary are the LR states.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>lrtab.lr_goto</tt></b>
|
||||
<blockquote>
|
||||
The LR goto table that contains information about grammar rule reductions.
|
||||
</blockquote>
|
||||
|
||||
|
||||
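<p>
A hedged sketch of reloading previously written tables.  It assumes the
tables were earlier written to a module named <tt>parsetab</tt> and that the
calling module defines the <tt>p_...</tt> reduction functions:

<blockquote>
<pre>
from ply.yacc import LRTable

lrtab = LRTable()
signature = lrtab.read_table('parsetab')   # returns the stored signature
lrtab.bind_callables(globals())            # attach real callables to the names

print(lrtab.lr_method)
print(len(lrtab.lr_productions))
</pre>
</blockquote>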
<H2><a name="internal_nn6"></a>6. LRGeneratedTable</H2>
|
||||
|
||||
|
||||
The <tt>LRGeneratedTable</tt> class represents constructed LR parsing tables on a
|
||||
grammar. It is a subclass of <tt>LRTable</tt>.
|
||||
|
||||
<p>
|
||||
<b><tt>LRGeneratedTable(grammar, method='LALR',log=None)</tt></b>
|
||||
<blockquote>
|
||||
Create the LR parsing tables on a grammar. <tt>grammar</tt> is an instance of <tt>Grammar</tt>,
|
||||
<tt>method</tt> is a string with the parsing method (<tt>'SLR'</tt> or <tt>'LALR'</tt>), and
|
||||
<tt>log</tt> is a logger object used to write debugging information. The debugging information
|
||||
written to <tt>log</tt> is the same as what appears in the <tt>parser.out</tt> file created
|
||||
by yacc. By supplying a custom logger with a different message format, it is possible to get
|
||||
more information (e.g., the line number in <tt>yacc.py</tt> used for issuing each line of
|
||||
output in the log). The result is an instance of <tt>LRGeneratedTable</tt>.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
An instance <tt>lr</tt> of <tt>LRGeneratedTable</tt> has the following attributes.
|
||||
|
||||
<p>
|
||||
<b><tt>lr.grammar</tt></b>
|
||||
<blockquote>
|
||||
A link to the Grammar object used to construct the parsing tables.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>lr.lr_method</tt></b>
|
||||
<blockquote>
|
||||
The LR parsing method used (e.g., <tt>'LALR'</tt>)
|
||||
</blockquote>
|
||||
|
||||
|
||||
<p>
|
||||
<b><tt>lr.lr_productions</tt></b>
|
||||
<blockquote>
|
||||
A reference to <tt>grammar.Productions</tt>. This, together with <tt>lr_action</tt> and <tt>lr_goto</tt>
|
||||
contain all of the information needed by the LR parsing engine.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>lr.lr_action</tt></b>
|
||||
<blockquote>
|
||||
The LR action dictionary that implements the underlying state machine. The keys of this dictionary are
|
||||
the LR states.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>lr.lr_goto</tt></b>
|
||||
<blockquote>
|
||||
The LR goto table that contains information about grammar rule reductions.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>lr.sr_conflicts</tt></b>
|
||||
<blockquote>
|
||||
A list of tuples <tt>(state,token,resolution)</tt> identifying all shift/reduce conflicts. <tt>state</tt> is the LR state
|
||||
number where the conflict occurred, <tt>token</tt> is the token causing the conflict, and <tt>resolution</tt> is
|
||||
a string describing the resolution taken. <tt>resolution</tt> is either <tt>'shift'</tt> or <tt>'reduce'</tt>.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>lr.rr_conflicts</tt></b>
|
||||
<blockquote>
|
||||
A list of tuples <tt>(state,rule,rejected)</tt> identifying all reduce/reduce conflicts. <tt>state</tt> is the
|
||||
LR state number where the conflict occurred, <tt>rule</tt> is the production rule that was selected
|
||||
and <tt>rejected</tt> is the production rule that was rejected. Both <tt>rule</tt> and </tt>rejected</tt> are
|
||||
instances of <tt>Production</tt>. They can be inspected to provide the user with more information.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
There are two public methods of <tt>LRGeneratedTable</tt>.
|
||||
|
||||
<p>
|
||||
<b><tt>lr.write_table(modulename,outputdir="",signature="")</tt></b>
|
||||
<blockquote>
|
||||
Writes the LR parsing table information to a Python module. <tt>modulename</tt> is a string
|
||||
specifying the name of a module such as <tt>"parsetab"</tt>. <tt>outputdir</tt> is the name of a
|
||||
directory where the module should be created. <tt>signature</tt> is a string representing a
|
||||
grammar signature that's written into the output file. This can be used to detect when
|
||||
the data stored in a module file is out-of-sync with the the grammar specification (and that
|
||||
the tables need to be regenerated). If <tt>modulename</tt> is a string <tt>"parsetab"</tt>,
|
||||
this function creates a file called <tt>parsetab.py</tt>. If the module name represents a
|
||||
package such as <tt>"foo.bar.parsetab"</tt>, then only the last component, <tt>"parsetab"</tt> is
|
||||
used.
|
||||
</blockquote>
|
||||
|
||||
|
||||
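<p>
A minimal, hypothetical sketch of table generation, assuming <tt>g</tt> is a
fully populated <tt>Grammar</tt> on which <tt>build_lritems()</tt> has already
been called (the signature string is invented for the example):

<blockquote>
<pre>
from ply.yacc import LRGeneratedTable

lr = LRGeneratedTable(g, method='LALR')
print(len(lr.sr_conflicts), 'shift/reduce conflicts')
print(len(lr.rr_conflicts), 'reduce/reduce conflicts')

# Write the tables out so they can be reloaded later with LRTable.read_table()
lr.write_table('parsetab', outputdir='.', signature='example-signature')
</pre>
</blockquote>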
<H2><a name="internal_nn7"></a>7. LRParser</H2>
|
||||
|
||||
|
||||
The <tt>LRParser</tt> class implements the low-level LR parsing engine.
|
||||
|
||||
|
||||
<p>
|
||||
<b><tt>LRParser(lrtab, error_func)</tt></b>
|
||||
<blockquote>
|
||||
Create an LRParser. <tt>lrtab</tt> is an instance of <tt>LRTable</tt>
|
||||
containing the LR production and state tables. <tt>error_func</tt> is the
|
||||
error function to invoke in the event of a parsing error.
|
||||
</blockquote>
|
||||
|
||||
An instance <tt>p</tt> of <tt>LRParser</tt> has the following methods:
|
||||
|
||||
<p>
|
||||
<b><tt>p.parse(input=None,lexer=None,debug=0,tracking=0,tokenfunc=None)</tt></b>
|
||||
<blockquote>
|
||||
Run the parser. <tt>input</tt> is a string, which if supplied is fed into the
|
||||
lexer using its <tt>input()</tt> method. <tt>lexer</tt> is an instance of the
|
||||
<tt>Lexer</tt> class to use for tokenizing. If not supplied, the last lexer
|
||||
created with the <tt>lex</tt> module is used. <tt>debug</tt> is a boolean flag
|
||||
that enables debugging. <tt>tracking</tt> is a boolean flag that tells the
|
||||
parser to perform additional line number tracking. <tt>tokenfunc</tt> is a callable
|
||||
function that returns the next token. If supplied, the parser will use it to get
|
||||
all tokens.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>p.restart()</tt></b>
|
||||
<blockquote>
|
||||
Resets the parser state for a parse already in progress.
|
||||
</blockquote>
|
||||
|
||||
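<p>
A hedged usage sketch, assuming <tt>lrtab</tt> holds populated tables whose
production names have already been bound to callables, and that
<tt>p_error</tt> is a user-supplied error handler:

<blockquote>
<pre>
from ply.yacc import LRParser

parser = LRParser(lrtab, p_error)

# With no lexer argument, the most recently created lex lexer is used
result = parser.parse("some input text")
print(result)
</pre>
</blockquote>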
<H2><a name="internal_nn8"></a>8. ParserReflect</H2>
|
||||
|
||||
|
||||
<p>
|
||||
The <tt>ParserReflect</tt> class is used to collect parser specification data
|
||||
from a Python module or object. This class is what collects all of the
|
||||
<tt>p_rule()</tt> functions in a PLY file, performs basic error checking,
|
||||
and collects all of the needed information to build a grammar. Most of the
|
||||
high-level PLY interface as used by the <tt>yacc()</tt> function is actually
|
||||
implemented by this class.
|
||||
|
||||
<p>
|
||||
<b><tt>ParserReflect(pdict, log=None)</tt></b>
|
||||
<blockquote>
|
||||
Creates a <tt>ParserReflect</tt> instance. <tt>pdict</tt> is a dictionary
|
||||
containing parser specification data. This dictionary typically corresponds
|
||||
to the module or class dictionary of code that implements a PLY parser.
|
||||
<tt>log</tt> is a logger instance that will be used to report error
|
||||
messages.
|
||||
</blockquote>
|
||||
|
||||
An instance <tt>p</tt> of <tt>ParserReflect</tt> has the following methods:
|
||||
|
||||
<p>
|
||||
<b><tt>p.get_all()</tt></b>
|
||||
<blockquote>
|
||||
Collect and store all required parsing information.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>p.validate_all()</tt></b>
|
||||
<blockquote>
|
||||
Validate all of the collected parsing information. This is a seprate step
|
||||
from <tt>p.get_all()</tt> as a performance optimization. In order to
|
||||
increase parser start-up time, a parser can elect to only validate the
|
||||
parsing data when regenerating the parsing tables. The validation
|
||||
step tries to collect as much information as possible rather than
|
||||
raising an exception at the first sign of trouble. The attribute
|
||||
<tt>p.error</tt> is set if there are any validation errors. The
|
||||
value of this attribute is also returned.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>p.signature()</tt></b>
|
||||
<blockquote>
|
||||
Compute a signature representing the contents of the collected parsing
|
||||
data. The signature value should change if anything in the parser
|
||||
specification has changed in a way that would justify parser table
|
||||
regeneration. This method can be called after <tt>p.get_all()</tt>,
|
||||
but before <tt>p.validate_all()</tt>.
|
||||
</blockquote>
|
||||
|
||||
The following attributes are set in the process of collecting data:
|
||||
|
||||
<p>
|
||||
<b><tt>p.start</tt></b>
|
||||
<blockquote>
|
||||
The grammar start symbol, if any. Taken from <tt>pdict['start']</tt>.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>p.error_func</tt></b>
|
||||
<blockquote>
|
||||
The error handling function or <tt>None</tt>. Taken from <tt>pdict['p_error']</tt>.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>p.tokens</tt></b>
|
||||
<blockquote>
|
||||
The token list. Taken from <tt>pdict['tokens']</tt>.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>p.prec</tt></b>
|
||||
<blockquote>
|
||||
The precedence specifier. Taken from <tt>pdict['precedence']</tt>.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>p.preclist</tt></b>
|
||||
<blockquote>
|
||||
A parsed version of the precedence specified. A list of tuples of the form
|
||||
<tt>(token,assoc,level)</tt> where <tt>token</tt> is the terminal symbol,
|
||||
<tt>assoc</tt> is the associativity (e.g., <tt>'left'</tt>) and <tt>level</tt>
|
||||
is a numeric precedence level.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>p.grammar</tt></b>
|
||||
<blockquote>
|
||||
A list of tuples <tt>(name, rules)</tt> representing the grammar rules. <tt>name</tt> is the
|
||||
name of a Python function or method in <tt>pdict</tt> that starts with <tt>"p_"</tt>.
|
||||
<tt>rules</tt> is a list of tuples <tt>(filename,line,prodname,syms)</tt> representing
|
||||
the grammar rules found in the documentation string of that function. <tt>filename</tt> and <tt>line</tt> contain location
|
||||
information that can be used for debugging. <tt>prodname</tt> is the name of the
|
||||
production. <tt>syms</tt> is the right-hand side of the production. If you have a
|
||||
function like this
|
||||
|
||||
<pre>
|
||||
def p_expr(p):
|
||||
'''expr : expr PLUS expr
|
||||
| expr MINUS expr
|
||||
| expr TIMES expr
|
||||
| expr DIVIDE expr'''
|
||||
</pre>
|
||||
|
||||
then the corresponding entry in <tt>p.grammar</tt> might look like this:
|
||||
|
||||
<pre>
|
||||
('p_expr', [ ('calc.py',10,'expr', ['expr','PLUS','expr']),
|
||||
('calc.py',11,'expr', ['expr','MINUS','expr']),
|
||||
('calc.py',12,'expr', ['expr','TIMES','expr']),
|
||||
('calc.py',13,'expr', ['expr','DIVIDE','expr'])
|
||||
])
|
||||
</pre>
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>p.pfuncs</tt></b>
|
||||
<blockquote>
|
||||
A sorted list of tuples <tt>(line, file, name, doc)</tt> representing all of
|
||||
the <tt>p_</tt> functions found. <tt>line</tt> and <tt>file</tt> give location
|
||||
information. <tt>name</tt> is the name of the function. <tt>doc</tt> is the
|
||||
documentation string. This list is sorted in ascending order by line number.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>p.files</tt></b>
|
||||
<blockquote>
|
||||
A dictionary holding all of the source filenames that were encountered
|
||||
while collecting parser information. Only the keys of this dictionary have
|
||||
any meaning.
|
||||
</blockquote>
|
||||
|
||||
<p>
|
||||
<b><tt>p.error</tt></b>
|
||||
<blockquote>
|
||||
An attribute that indicates whether or not any critical errors
|
||||
occurred in validation. If this is set, it means that that some kind
|
||||
of problem was detected and that no further processing should be
|
||||
performed.
|
||||
</blockquote>
|
||||
|
||||
|
||||
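<p>
A hedged sketch of how <tt>ParserReflect</tt> is typically driven.  The
module name <tt>calcmodule</tt> is hypothetical; any module containing a PLY
parser specification would do:

<blockquote>
<pre>
import calcmodule
from ply.yacc import ParserReflect

pinfo = ParserReflect(calcmodule.__dict__)
pinfo.get_all()
sig = pinfo.signature()        # can be compared against a stored signature
if pinfo.validate_all():
    raise RuntimeError('invalid parser specification')

print(pinfo.tokens)            # token list from the module
print(pinfo.start)             # start symbol, if one was given
</pre>
</blockquote>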
<H2><a name="internal_nn9"></a>9. High-level operation</H2>
|
||||
|
||||
|
||||
Using all of the above classes requires some attention to detail. The <tt>yacc()</tt>
|
||||
function carries out a very specific sequence of operations to create a grammar.
|
||||
This same sequence should be emulated if you build an alternative PLY interface.
|
||||
|
||||
<ol>
|
||||
<li>A <tt>ParserReflect</tt> object is created and raw grammar specification data is
|
||||
collected.
|
||||
<li>A <tt>Grammar</tt> object is created and populated with information
|
||||
from the specification data.
|
||||
<li>A <tt>LRGenerator</tt> object is created to run the LALR algorithm over
|
||||
the <tt>Grammar</tt> object.
|
||||
<li>Productions in the LRGenerator and bound to callables using the <tt>bind_callables()</tt>
|
||||
method.
|
||||
<li>A <tt>LRParser</tt> object is created from from the information in the
|
||||
<tt>LRGenerator</tt> object.
|
||||
</ol>
|
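<p>
The following is a hedged, end-to-end sketch of that sequence.  It only
illustrates the flow; a real implementation would need the error handling,
signature checking, and table caching logic found in <tt>yacc()</tt> itself:

<blockquote>
<pre>
from ply.yacc import ParserReflect, Grammar, LRGeneratedTable, LRParser

def build_parser(pdict):
    # 1. Collect the raw specification data
    pinfo = ParserReflect(pdict)
    pinfo.get_all()
    if pinfo.validate_all():
        raise RuntimeError('invalid parser specification')

    # 2. Create and populate a Grammar object
    g = Grammar(pinfo.tokens)
    for term, assoc, level in pinfo.preclist:
        g.set_precedence(term, assoc, level)
    for funcname, rules in pinfo.grammar:
        for filename, line, prodname, syms in rules:
            g.add_production(prodname, syms, funcname, filename, line)
    g.set_start(pinfo.start)
    g.build_lritems()

    # 3. Run the table generation algorithm over the grammar
    lr = LRGeneratedTable(g, method='LALR')

    # 4. Bind production function names to the actual callables
    lr.bind_callables(pdict)

    # 5. Create the parsing engine
    return LRParser(lr, pinfo.error_func)
</pre>
</blockquote>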
</body>
</html>
1081	ext/ply/doc/ply.html
File diff suppressed because it is too large
@@ -4,6 +4,9 @@
 import sys
 sys.path.insert(0,"../..")
 
+if sys.version_info[0] >= 3:
+    raw_input = input
+
 import basiclex
 import basparse
 import basinterp
@@ -41,7 +44,7 @@ while 1:
     prog = basparse.parse(line)
     if not prog: continue
 
-    keys = prog.keys()
+    keys = list(prog)
     if keys[0] > 0:
         b.add_statements(prog)
     else:
@@ -51,10 +51,10 @@ def t_NEWLINE(t):
     return t
 
 def t_error(t):
-    print "Illegal character", t.value[0]
+    print("Illegal character %s" % t.value[0])
    t.lexer.skip(1)
 
-lex.lex()
+lex.lex(debug=0)
79	ext/ply/example/BASIC/basiclog.py	Normal file
@@ -0,0 +1,79 @@
# An implementation of Dartmouth BASIC (1964)
#

import sys
sys.path.insert(0,"../..")

if sys.version_info[0] >= 3:
    raw_input = input

import logging
logging.basicConfig(
    level = logging.INFO,
    filename = "parselog.txt",
    filemode = "w"
)
log = logging.getLogger()

import basiclex
import basparse
import basinterp

# If a filename has been specified, we try to run it.
# If a runtime error occurs, we bail out and enter
# interactive mode below
if len(sys.argv) == 2:
    data = open(sys.argv[1]).read()
    prog = basparse.parse(data,debug=log)
    if not prog: raise SystemExit
    b = basinterp.BasicInterpreter(prog)
    try:
        b.run()
        raise SystemExit
    except RuntimeError:
        pass

else:
    b = basinterp.BasicInterpreter({})

# Interactive mode.  This incrementally adds/deletes statements
# from the program stored in the BasicInterpreter object.  In
# addition, special commands 'NEW','LIST',and 'RUN' are added.
# Specifying a line number with no code deletes that line from
# the program.

while 1:
    try:
        line = raw_input("[BASIC] ")
    except EOFError:
        raise SystemExit
    if not line: continue
    line += "\n"
    prog = basparse.parse(line,debug=log)
    if not prog: continue

    keys = list(prog)
    if keys[0] > 0:
        b.add_statements(prog)
    else:
        stat = prog[keys[0]]
        if stat[0] == 'RUN':
            try:
                b.run()
            except RuntimeError:
                pass
        elif stat[0] == 'LIST':
            b.list()
        elif stat[0] == 'BLANK':
            b.del_line(stat[1])
        elif stat[0] == 'NEW':
            b.new()
@@ -40,10 +40,11 @@ class BasicInterpreter:
             if self.prog[lineno][0] == 'END' and not has_end:
                 has_end = lineno
         if not has_end:
-            print "NO END INSTRUCTION"
+            print("NO END INSTRUCTION")
             self.error = 1
             return
         if has_end != lineno:
-            print "END IS NOT LAST"
+            print("END IS NOT LAST")
             self.error = 1
 
     # Check loops
@@ -60,7 +61,7 @@ class BasicInterpreter:
                         self.loopend[pc] = i
                         break
                 else:
-                    print "FOR WITHOUT NEXT AT LINE" % self.stat[pc]
+                    print("FOR WITHOUT NEXT AT LINE %s" % self.stat[pc])
                     self.error = 1
 
     # Evaluate an expression
@@ -79,33 +80,33 @@ class BasicInterpreter:
         elif etype == 'VAR':
             var,dim1,dim2 = expr[1]
             if not dim1 and not dim2:
-                if self.vars.has_key(var):
+                if var in self.vars:
                     return self.vars[var]
                 else:
-                    print "UNDEFINED VARIABLE", var, "AT LINE", self.stat[self.pc]
+                    print("UNDEFINED VARIABLE %s AT LINE %s" % (var, self.stat[self.pc]))
                     raise RuntimeError
             # May be a list lookup or a function evaluation
             if dim1 and not dim2:
-                if self.functions.has_key(var):
+                if var in self.functions:
                     # A function
                     return self.functions[var](dim1)
                 else:
                     # A list evaluation
-                    if self.lists.has_key(var):
+                    if var in self.lists:
                         dim1val = self.eval(dim1)
                         if dim1val < 1 or dim1val > len(self.lists[var]):
-                            print "LIST INDEX OUT OF BOUNDS AT LINE", self.stat[self.pc]
+                            print("LIST INDEX OUT OF BOUNDS AT LINE %s" % self.stat[self.pc])
                             raise RuntimeError
                         return self.lists[var][dim1val-1]
             if dim1 and dim2:
-                if self.tables.has_key(var):
+                if var in self.tables:
                     dim1val = self.eval(dim1)
                     dim2val = self.eval(dim2)
                     if dim1val < 1 or dim1val > len(self.tables[var]) or dim2val < 1 or dim2val > len(self.tables[var][0]):
-                        print "TABLE INDEX OUT OUT BOUNDS AT LINE", self.stat[self.pc]
+                        print("TABLE INDEX OUT OUT BOUNDS AT LINE %s" % self.stat[self.pc])
                         raise RuntimeError
                     return self.tables[var][dim1val-1][dim2val-1]
-            print "UNDEFINED VARIABLE", var, "AT LINE", self.stat[self.pc]
+            print("UNDEFINED VARIABLE %s AT LINE %s" % (var, self.stat[self.pc]))
             raise RuntimeError
 
     # Evaluate a relational expression
@@ -145,31 +146,31 @@ class BasicInterpreter:
         elif dim1 and not dim2:
             # List assignment
             dim1val = self.eval(dim1)
-            if not self.lists.has_key(var):
+            if not var in self.lists:
                 self.lists[var] = [0]*10
 
             if dim1val > len(self.lists[var]):
-                print "DIMENSION TOO LARGE AT LINE", self.stat[self.pc]
+                print ("DIMENSION TOO LARGE AT LINE %s" % self.stat[self.pc])
                 raise RuntimeError
             self.lists[var][dim1val-1] = self.eval(value)
         elif dim1 and dim2:
             dim1val = self.eval(dim1)
             dim2val = self.eval(dim2)
-            if not self.tables.has_key(var):
+            if not var in self.tables:
                 temp = [0]*10
                 v = []
                 for i in range(10): v.append(temp[:])
                 self.tables[var] = v
             # Variable already exists
             if dim1val > len(self.tables[var]) or dim2val > len(self.tables[var][0]):
-                print "DIMENSION TOO LARGE AT LINE", self.stat[self.pc]
+                print("DIMENSION TOO LARGE AT LINE %s" % self.stat[self.pc])
                 raise RuntimeError
             self.tables[var][dim1val-1][dim2val-1] = self.eval(value)
 
     # Change the current line number
     def goto(self,linenum):
-        if not self.prog.has_key(linenum):
-            print "UNDEFINED LINE NUMBER %d AT LINE %d" % (linenum, self.stat[self.pc])
+        if not linenum in self.prog:
+            print("UNDEFINED LINE NUMBER %d AT LINE %d" % (linenum, self.stat[self.pc]))
             raise RuntimeError
         self.pc = self.stat.index(linenum)
 
@@ -183,7 +184,7 @@ class BasicInterpreter:
         self.gosub = None           # Gosub return point (if any)
         self.error = 0              # Indicates program error
 
-        self.stat = self.prog.keys()  # Ordered list of all line numbers
+        self.stat = list(self.prog)   # Ordered list of all line numbers
         self.stat.sort()
         self.pc = 0                 # Current program counter
 
@@ -284,7 +285,7 @@ class BasicInterpreter:
 
             elif op == 'NEXT':
                 if not self.loops:
-                    print "NEXT WITHOUT FOR AT LINE",line
+                    print("NEXT WITHOUT FOR AT LINE %s" % line)
                     return
 
                 nextvar = instr[1]
@@ -292,13 +293,13 @@ class BasicInterpreter:
                 loopinst = self.prog[self.stat[self.pc]]
                 forvar = loopinst[1]
                 if nextvar != forvar:
-                    print "NEXT DOESN'T MATCH FOR AT LINE", line
+                    print("NEXT DOESN'T MATCH FOR AT LINE %s" % line)
                     return
                 continue
             elif op == 'GOSUB':
                 newline = instr[1]
                 if self.gosub:
-                    print "ALREADY IN A SUBROUTINE AT LINE", line
+                    print("ALREADY IN A SUBROUTINE AT LINE %s" % line)
                     return
                 self.gosub = self.stat[self.pc]
                 self.goto(newline)
@@ -306,7 +307,7 @@ class BasicInterpreter:
 
             elif op == 'RETURN':
                 if not self.gosub:
-                    print "RETURN WITHOUT A GOSUB AT LINE",line
+                    print("RETURN WITHOUT A GOSUB AT LINE %s" % line)
                     return
                 self.goto(self.gosub)
                 self.gosub = None
@@ -358,69 +359,69 @@ class BasicInterpreter:
 
     # Create a program listing
     def list(self):
-        stat = self.prog.keys()  # Ordered list of all line numbers
+        stat = list(self.prog)   # Ordered list of all line numbers
         stat.sort()
         for line in stat:
             instr = self.prog[line]
             op = instr[0]
             if op in ['END','STOP','RETURN']:
-                print line, op
+                print("%s %s" % (line, op))
                 continue
             elif op == 'REM':
-                print line, instr[1]
+                print("%s %s" % (line, instr[1]))
             elif op == 'PRINT':
-                print line, op,
+                _out = "%s %s " % (line, op)
                 first = 1
                 for p in instr[1]:
-                    if not first: print ",",
-                    if p[0] and p[1]: print '"%s"%s' % (p[0],self.expr_str(p[1])),
-                    elif p[1]: print self.expr_str(p[1]),
-                    else: print '"%s"' % (p[0],),
+                    if not first: _out += ", "
+                    if p[0] and p[1]: _out += '"%s"%s' % (p[0],self.expr_str(p[1]))
+                    elif p[1]: _out += self.expr_str(p[1])
+                    else: _out += '"%s"' % (p[0],)
                     first = 0
-                if instr[2]: print instr[2]
-                else: print
+                if instr[2]: _out += instr[2]
+                print(_out)
             elif op == 'LET':
-                print line,"LET",self.var_str(instr[1]),"=",self.expr_str(instr[2])
+                print("%s LET %s = %s" % (line,self.var_str(instr[1]),self.expr_str(instr[2])))
             elif op == 'READ':
-                print line,"READ",
+                _out = "%s READ " % line
                 first = 1
                 for r in instr[1]:
-                    if not first: print ",",
-                    print self.var_str(r),
+                    if not first: _out += ","
+                    _out += self.var_str(r)
                     first = 0
-                print ""
+                print(_out)
            elif op == 'IF':
-                print line,"IF %s THEN %d" % (self.relexpr_str(instr[1]),instr[2])
+                print("%s IF %s THEN %d" % (line,self.relexpr_str(instr[1]),instr[2]))
            elif op == 'GOTO' or op == 'GOSUB':
-                print line, op, instr[1]
+                print("%s %s %s" % (line, op, instr[1]))
            elif op == 'FOR':
-                print line,"FOR %s = %s TO %s" % (instr[1],self.expr_str(instr[2]),self.expr_str(instr[3])),
-                if instr[4]: print "STEP %s" % (self.expr_str(instr[4])),
-                print
+                _out = "%s FOR %s = %s TO %s" % (line,instr[1],self.expr_str(instr[2]),self.expr_str(instr[3]))
+                if instr[4]: _out += " STEP %s" % (self.expr_str(instr[4]))
+                print(_out)
            elif op == 'NEXT':
-                print line,"NEXT", instr[1]
+                print("%s NEXT %s" % (line, instr[1]))
            elif op == 'FUNC':
-                print line,"DEF %s(%s) = %s" % (instr[1],instr[2],self.expr_str(instr[3]))
+                print("%s DEF %s(%s) = %s" % (line,instr[1],instr[2],self.expr_str(instr[3])))
            elif op == 'DIM':
-                print line,"DIM",
+                _out = "%s DIM " % line
                 first = 1
                 for vname,x,y in instr[1]:
-                    if not first: print ",",
+                    if not first: _out += ","
                     first = 0
                     if y == 0:
-                        print "%s(%d)" % (vname,x),
+                        _out += "%s(%d)" % (vname,x)
                     else:
-                        print "%s(%d,%d)" % (vname,x,y),
+                        _out += "%s(%d,%d)" % (vname,x,y)
 
-                print
+                print(_out)
            elif op == 'DATA':
-                print line,"DATA",
+                _out = "%s DATA " % line
                 first = 1
                 for v in instr[1]:
-                    if not first: print ",",
+                    if not first: _out += ","
                     first = 0
-                    print v,
-                print
+                    _out += v
+                print(_out)
 
     # Erase the current program
     def new(self):
@@ -44,7 +44,7 @@ def p_program_error(p):
 def p_statement(p):
     '''statement : INTEGER command NEWLINE'''
     if isinstance(p[2],str):
-        print p[2],"AT LINE", p[1]
+        print("%s %s %s" % (p[2],"AT LINE", p[1]))
         p[0] = None
         p.parser.error = 1
     else:
@@ -68,7 +68,7 @@ def p_statement_blank(p):
 
 def p_statement_bad(p):
     '''statement : INTEGER error NEWLINE'''
-    print "MALFORMED STATEMENT AT LINE", p[1]
+    print("MALFORMED STATEMENT AT LINE %s" % p[1])
     p[0] = None
     p.parser.error = 1
 
@@ -399,13 +399,13 @@ def p_empty(p):
 #### Catastrophic error handler
 def p_error(p):
     if not p:
-        print "SYNTAX ERROR AT EOF"
+        print("SYNTAX ERROR AT EOF")
 
 bparser = yacc.yacc()
 
-def parse(data):
+def parse(data,debug=0):
     bparser.error = 0
-    p = bparser.parse(data)
+    p = bparser.parse(data,debug=debug)
     if bparser.error: return None
     return p
@@ -142,16 +142,16 @@ t_CCONST = r'(L)?\'([^\\\n]|(\\.))*?\''
 
 # Comments
 def t_comment(t):
-    r' /\*(.|\n)*?\*/'
-    t.lineno += t.value.count('\n')
+    r'/\*(.|\n)*?\*/'
+    t.lexer.lineno += t.value.count('\n')
 
 # Preprocessor directive (ignored)
 def t_preprocessor(t):
     r'\#(.)*?\n'
-    t.lineno += 1
+    t.lexer.lineno += 1
 
 def t_error(t):
-    print "Illegal character %s" % repr(t.value[0])
+    print("Illegal character %s" % repr(t.value[0]))
     t.lexer.skip(1)
 
 lexer = lex.lex(optimize=1)
@@ -155,7 +155,7 @@ def p_struct_declaration_list_1(t):
     pass
 
 def p_struct_declaration_list_2(t):
-    'struct_declaration_list : struct_declarator_list struct_declaration'
+    'struct_declaration_list : struct_declaration_list struct_declaration'
     pass
 
 # init-declarator-list:
@@ -849,7 +849,7 @@ def p_empty(t):
     pass
 
 def p_error(t):
-    print "Whoa. We're hosed"
+    print("Whoa. We're hosed")
 
 import profile
 # Build the grammar
@@ -8,6 +8,9 @@
 import sys
 sys.path.insert(0,"../..")
 
+if sys.version_info[0] >= 3:
+    raw_input = input
+
 tokens = (
     'NAME','NUMBER',
     )
@@ -20,11 +23,7 @@ t_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*'
 
 def t_NUMBER(t):
     r'\d+'
-    try:
-        t.value = int(t.value)
-    except ValueError:
-        print "Integer value too large", t.value
-        t.value = 0
+    t.value = int(t.value)
     return t
 
 t_ignore = " \t"
@@ -34,7 +33,7 @@ def t_newline(t):
     t.lexer.lineno += t.value.count("\n")
 
 def t_error(t):
-    print "Illegal character '%s'" % t.value[0]
+    print("Illegal character '%s'" % t.value[0])
     t.lexer.skip(1)
 
 # Build the lexer
@@ -58,7 +57,7 @@ def p_statement_assign(p):
 
 def p_statement_expr(p):
     'statement : expression'
-    print p[1]
+    print(p[1])
 
 def p_expression_binop(p):
     '''expression : expression '+' expression
@@ -87,11 +86,14 @@ def p_expression_name(p):
     try:
         p[0] = names[p[1]]
     except LookupError:
-        print "Undefined name '%s'" % p[1]
+        print("Undefined name '%s'" % p[1])
         p[0] = 0
 
 def p_error(p):
-    print "Syntax error at '%s'" % p.value
+    if p:
+        print("Syntax error at '%s'" % p.value)
+    else:
+        print("Syntax error at EOF")
 
 import ply.yacc as yacc
 yacc.yacc()
113	ext/ply/example/calcdebug/calc.py	Normal file
@@ -0,0 +1,113 @@
# -----------------------------------------------------------------------------
# calc.py
#
# This example shows how to run the parser in a debugging mode
# with output routed to a logging object.
# -----------------------------------------------------------------------------

import sys
sys.path.insert(0,"../..")

if sys.version_info[0] >= 3:
    raw_input = input

tokens = (
    'NAME','NUMBER',
    )

literals = ['=','+','-','*','/', '(',')']

# Tokens

t_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*'

def t_NUMBER(t):
    r'\d+'
    t.value = int(t.value)
    return t

t_ignore = " \t"

def t_newline(t):
    r'\n+'
    t.lexer.lineno += t.value.count("\n")

def t_error(t):
    print("Illegal character '%s'" % t.value[0])
    t.lexer.skip(1)

# Build the lexer
import ply.lex as lex
lex.lex()

# Parsing rules

precedence = (
    ('left','+','-'),
    ('left','*','/'),
    ('right','UMINUS'),
    )

# dictionary of names
names = { }

def p_statement_assign(p):
    'statement : NAME "=" expression'
    names[p[1]] = p[3]

def p_statement_expr(p):
    'statement : expression'
    print(p[1])

def p_expression_binop(p):
    '''expression : expression '+' expression
                  | expression '-' expression
                  | expression '*' expression
                  | expression '/' expression'''
    if p[2] == '+'  : p[0] = p[1] + p[3]
    elif p[2] == '-': p[0] = p[1] - p[3]
    elif p[2] == '*': p[0] = p[1] * p[3]
    elif p[2] == '/': p[0] = p[1] / p[3]

def p_expression_uminus(p):
    "expression : '-' expression %prec UMINUS"
    p[0] = -p[2]

def p_expression_group(p):
    "expression : '(' expression ')'"
    p[0] = p[2]

def p_expression_number(p):
    "expression : NUMBER"
    p[0] = p[1]

def p_expression_name(p):
    "expression : NAME"
    try:
        p[0] = names[p[1]]
    except LookupError:
        print("Undefined name '%s'" % p[1])
        p[0] = 0

def p_error(p):
    if p:
        print("Syntax error at '%s'" % p.value)
    else:
        print("Syntax error at EOF")

import ply.yacc as yacc
yacc.yacc()

import logging
logging.basicConfig(
    level=logging.INFO,
    filename="parselog.txt"
)

while 1:
    try:
        s = raw_input('calc > ')
    except EOFError:
        break
    if not s: continue
    yacc.parse(s,debug=logging.getLogger())
17	ext/ply/example/classcalc/calc.py	Normal file → Executable file
@@ -12,7 +12,9 @@
 import sys
 sys.path.insert(0,"../..")
 
-import readline
+if sys.version_info[0] >= 3:
+    raw_input = input
+
 import ply.lex as lex
 import ply.yacc as yacc
 import os
@@ -77,7 +79,7 @@ class Calc(Parser):
         try:
             t.value = int(t.value)
         except ValueError:
-            print "Integer value too large", t.value
+            print("Integer value too large %s" % t.value)
             t.value = 0
         #print "parsed number %s" % repr(t.value)
         return t
@@ -89,7 +91,7 @@ class Calc(Parser):
         t.lexer.lineno += t.value.count("\n")
 
     def t_error(self, t):
-        print "Illegal character '%s'" % t.value[0]
+        print("Illegal character '%s'" % t.value[0])
         t.lexer.skip(1)
 
     # Parsing rules
@@ -107,7 +109,7 @@ class Calc(Parser):
 
     def p_statement_expr(self, p):
         'statement : expression'
-        print p[1]
+        print(p[1])
 
     def p_expression_binop(self, p):
         """
@@ -141,11 +143,14 @@ class Calc(Parser):
         try:
             p[0] = self.names[p[1]]
         except LookupError:
-            print "Undefined name '%s'" % p[1]
+            print("Undefined name '%s'" % p[1])
             p[0] = 0
 
     def p_error(self, p):
-        print "Syntax error at '%s'" % p.value
+        if p:
+            print("Syntax error at '%s'" % p.value)
+        else:
+            print("Syntax error at EOF")
 
 if __name__ == '__main__':
     calc = Calc()
0	ext/ply/example/cleanup.sh	Normal file → Executable file
130	ext/ply/example/closurecalc/calc.py	Normal file
@@ -0,0 +1,130 @@
# -----------------------------------------------------------------------------
# calc.py
#
# A calculator parser that makes use of closures. The function make_calculator()
# returns a function that accepts an input string and returns a result.  All
# lexing rules, parsing rules, and internal state are held inside the function.
# -----------------------------------------------------------------------------

import sys
sys.path.insert(0,"../..")

if sys.version_info[0] >= 3:
    raw_input = input

# Make a calculator function

def make_calculator():
    import ply.lex as lex
    import ply.yacc as yacc

    # ------- Internal calculator state

    variables = { }       # Dictionary of stored variables

    # ------- Calculator tokenizing rules

    tokens = (
        'NAME','NUMBER',
    )

    literals = ['=','+','-','*','/', '(',')']

    t_ignore = " \t"

    t_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*'

    def t_NUMBER(t):
        r'\d+'
        t.value = int(t.value)
        return t

    def t_newline(t):
        r'\n+'
        t.lexer.lineno += t.value.count("\n")

    def t_error(t):
        print("Illegal character '%s'" % t.value[0])
        t.lexer.skip(1)

    # Build the lexer
    lexer = lex.lex()

    # ------- Calculator parsing rules

    precedence = (
        ('left','+','-'),
        ('left','*','/'),
        ('right','UMINUS'),
    )

    def p_statement_assign(p):
        'statement : NAME "=" expression'
        variables[p[1]] = p[3]
        p[0] = None

    def p_statement_expr(p):
        'statement : expression'
        p[0] = p[1]

    def p_expression_binop(p):
        '''expression : expression '+' expression
                      | expression '-' expression
                      | expression '*' expression
                      | expression '/' expression'''
        if p[2] == '+'  : p[0] = p[1] + p[3]
        elif p[2] == '-': p[0] = p[1] - p[3]
        elif p[2] == '*': p[0] = p[1] * p[3]
        elif p[2] == '/': p[0] = p[1] / p[3]

    def p_expression_uminus(p):
        "expression : '-' expression %prec UMINUS"
        p[0] = -p[2]

    def p_expression_group(p):
        "expression : '(' expression ')'"
        p[0] = p[2]

    def p_expression_number(p):
        "expression : NUMBER"
        p[0] = p[1]

    def p_expression_name(p):
        "expression : NAME"
        try:
            p[0] = variables[p[1]]
        except LookupError:
            print("Undefined name '%s'" % p[1])
            p[0] = 0

    def p_error(p):
        if p:
            print("Syntax error at '%s'" % p.value)
        else:
            print("Syntax error at EOF")

    # Build the parser
    parser = yacc.yacc()

    # ------- Input function

    def input(text):
        result = parser.parse(text,lexer=lexer)
        return result

    return input

# Make a calculator object and use it
calc = make_calculator()

while True:
    try:
        s = raw_input("calc > ")
    except EOFError:
        break
    r = calc(s)
    if r:
        print(r)
@@ -37,7 +37,7 @@ def t_H_EDIT_DESCRIPTOR(t):
     return t
 
 def t_error(t):
-    print "Illegal character '%s'" % t.value[0]
+    print("Illegal character '%s'" % t.value[0])
     t.lexer.skip(1)
 
 # Build the lexer
17	ext/ply/example/newclasscalc/calc.py	Normal file → Executable file
@@ -14,7 +14,9 @@
 import sys
 sys.path.insert(0,"../..")
 
-import readline
+if sys.version_info[0] >= 3:
+    raw_input = input
+
 import ply.lex as lex
 import ply.yacc as yacc
 import os
@@ -80,7 +82,7 @@ class Calc(Parser):
         try:
             t.value = int(t.value)
         except ValueError:
-            print "Integer value too large", t.value
+            print("Integer value too large %s" % t.value)
             t.value = 0
         #print "parsed number %s" % repr(t.value)
         return t
@@ -92,7 +94,7 @@ class Calc(Parser):
         t.lexer.lineno += t.value.count("\n")
 
     def t_error(self, t):
-        print "Illegal character '%s'" % t.value[0]
+        print("Illegal character '%s'" % t.value[0])
         t.lexer.skip(1)
 
     # Parsing rules
@@ -110,7 +112,7 @@ class Calc(Parser):
 
     def p_statement_expr(self, p):
         'statement : expression'
-        print p[1]
+        print(p[1])
 
     def p_expression_binop(self, p):
         """
@@ -144,11 +146,14 @@ class Calc(Parser):
         try:
             p[0] = self.names[p[1]]
         except LookupError:
-            print "Undefined name '%s'" % p[1]
+            print("Undefined name '%s'" % p[1])
             p[0] = 0
 
     def p_error(self, p):
-        print "Syntax error at '%s'" % p.value
+        if p:
+            print("Syntax error at '%s'" % p.value)
+        else:
+            print("Syntax error at EOF")
 
 if __name__ == '__main__':
     calc = Calc()
@@ -5,5 +5,5 @@ To run:
 
 -  Then run 'python -OO calc.py'
 
-If working corretly, the second version should run the
+If working correctly, the second version should run the
 same way.
@@ -8,6 +8,9 @@
 import sys
 sys.path.insert(0,"../..")
 
+if sys.version_info[0] >= 3:
+    raw_input = input
+
 tokens = (
     'NAME','NUMBER',
     'PLUS','MINUS','TIMES','DIVIDE','EQUALS',

@@ -30,7 +33,7 @@ def t_NUMBER(t):
     try:
         t.value = int(t.value)
     except ValueError:
-        print "Integer value too large", t.value
+        print("Integer value too large %s" % t.value)
         t.value = 0
     return t

@@ -41,7 +44,7 @@ def t_newline(t):
     t.lexer.lineno += t.value.count("\n")
 
 def t_error(t):
-    print "Illegal character '%s'" % t.value[0]
+    print("Illegal character '%s'" % t.value[0])
     t.lexer.skip(1)
 
 # Build the lexer

@@ -65,7 +68,7 @@ def p_statement_assign(t):
 
 def p_statement_expr(t):
     'statement : expression'
-    print t[1]
+    print(t[1])
 
 def p_expression_binop(t):
     '''expression : expression PLUS expression

@@ -95,11 +98,14 @@ def p_expression_name(t):
     try:
         t[0] = names[t[1]]
     except LookupError:
-        print "Undefined name '%s'" % t[1]
+        print("Undefined name '%s'" % t[1])
         t[0] = 0
 
 def p_error(t):
-    print "Syntax error at '%s'" % t.value
+    if t:
+        print("Syntax error at '%s'" % t.value)
+    else:
+        print("Syntax error at EOF")
 
 import ply.yacc as yacc
 yacc.yacc(optimize=1)
@@ -100,7 +100,10 @@ def p_expression_name(p):
     p[0] = 0
 
 def p_error(p):
-    print "Syntax error at '%s'" % p.value
+    if p:
+        print "Syntax error at '%s'" % p.value
+    else:
+        print "Syntax error at EOF"
 
 import ply.yacc as yacc
 yacc.yacc()
@@ -42,7 +42,7 @@ def t_SECTION(t):
 # Comments
 def t_ccomment(t):
     r'/\*(.|\n)*?\*/'
-    t.lineno += t.value.count('\n')
+    t.lexer.lineno += t.value.count('\n')
 
 t_ignore_cppcomment = r'//.*'

@@ -95,7 +95,7 @@ def t_code_error(t):
     raise RuntimeError
 
 def t_error(t):
-    print "%d: Illegal character '%s'" % (t.lineno, t.value[0])
+    print "%d: Illegal character '%s'" % (t.lexer.lineno, t.value[0])
     print t.value
     t.lexer.skip(1)
ext/ply/example/yply/yply.py | 0  (Normal file → Executable file)

ext/ply/ply/cpp.py | 898  (Normal file)
@@ -0,0 +1,898 @@
# -----------------------------------------------------------------------------
# cpp.py
#
# Author:  David Beazley (http://www.dabeaz.com)
# Copyright (C) 2007
# All rights reserved
#
# This module implements an ANSI-C style lexical preprocessor for PLY.
# -----------------------------------------------------------------------------
from __future__ import generators

# -----------------------------------------------------------------------------
# Default preprocessor lexer definitions.   These tokens are enough to get
# a basic preprocessor working.   Other modules may import these if they want
# -----------------------------------------------------------------------------

tokens = (
   'CPP_ID','CPP_INTEGER', 'CPP_FLOAT', 'CPP_STRING', 'CPP_CHAR', 'CPP_WS', 'CPP_COMMENT', 'CPP_POUND','CPP_DPOUND'
)

literals = "+-*/%|&~^<>=!?()[]{}.,;:\\\'\""

# Whitespace
def t_CPP_WS(t):
    r'\s+'
    t.lexer.lineno += t.value.count("\n")
    return t

t_CPP_POUND = r'\#'
t_CPP_DPOUND = r'\#\#'

# Identifier
t_CPP_ID = r'[A-Za-z_][\w_]*'

# Integer literal
def CPP_INTEGER(t):
    r'(((((0x)|(0X))[0-9a-fA-F]+)|(\d+))([uU]|[lL]|[uU][lL]|[lL][uU])?)'
    return t

t_CPP_INTEGER = CPP_INTEGER

# Floating literal
t_CPP_FLOAT = r'((\d+)(\.\d+)(e(\+|-)?(\d+))? | (\d+)e(\+|-)?(\d+))([lL]|[fF])?'

# String literal
def t_CPP_STRING(t):
    r'\"([^\\\n]|(\\(.|\n)))*?\"'
    t.lexer.lineno += t.value.count("\n")
    return t

# Character constant 'c' or L'c'
def t_CPP_CHAR(t):
    r'(L)?\'([^\\\n]|(\\(.|\n)))*?\''
    t.lexer.lineno += t.value.count("\n")
    return t

# Comment
def t_CPP_COMMENT(t):
    r'(/\*(.|\n)*?\*/)|(//.*?\n)'
    t.lexer.lineno += t.value.count("\n")
    return t

def t_error(t):
    t.type = t.value[0]
    t.value = t.value[0]
    t.lexer.skip(1)
    return t

import re
import copy
import time
import os.path

# -----------------------------------------------------------------------------
# trigraph()
#
# Given an input string, this function replaces all trigraph sequences.
# The following mapping is used:
#
#     ??=    #
#     ??/    \
#     ??'    ^
#     ??(    [
#     ??)    ]
#     ??!    |
#     ??<    {
#     ??>    }
#     ??-    ~
# -----------------------------------------------------------------------------

_trigraph_pat = re.compile(r'''\?\?[=/\'\(\)\!<>\-]''')
_trigraph_rep = {
    '=':'#',
    '/':'\\',
    "'":'^',
    '(':'[',
    ')':']',
    '!':'|',
    '<':'{',
    '>':'}',
    '-':'~'
}

def trigraph(input):
    return _trigraph_pat.sub(lambda g: _trigraph_rep[g.group()[-1]],input)
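# Illustrative example (not part of the committed file): trigraph() is a
# plain string-to-string rewrite driven by the table above, e.g.
#
#     trigraph("??=define ARR(i) x??(i??)")  ->  "#define ARR(i) x[i]"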
# ------------------------------------------------------------------
# Macro object
#
# This object holds information about preprocessor macros
#
#    .name      - Macro name (string)
#    .value     - Macro value (a list of tokens)
#    .arglist   - List of argument names
#    .variadic  - Boolean indicating whether or not variadic macro
#    .vararg    - Name of the variadic parameter
#
# When a macro is created, the macro replacement token sequence is
# pre-scanned and used to create patch lists that are later used
# during macro expansion
# ------------------------------------------------------------------

class Macro(object):
    def __init__(self,name,value,arglist=None,variadic=False):
        self.name = name
        self.value = value
        self.arglist = arglist
        self.variadic = variadic
        if variadic:
            self.vararg = arglist[-1]
        self.source = None

# ------------------------------------------------------------------
# Preprocessor object
#
# Object representing a preprocessor.  Contains macro definitions,
# include directories, and other information
# ------------------------------------------------------------------

class Preprocessor(object):
    def __init__(self,lexer=None):
        if lexer is None:
            lexer = lex.lexer
        self.lexer = lexer
        self.macros = { }
        self.path = []
        self.temp_path = []

        # Probe the lexer for selected tokens
        self.lexprobe()

        tm = time.localtime()
        self.define("__DATE__ \"%s\"" % time.strftime("%b %d %Y",tm))
        self.define("__TIME__ \"%s\"" % time.strftime("%H:%M:%S",tm))
        self.parser = None

    # -----------------------------------------------------------------------------
    # tokenize()
    #
    # Utility function. Given a string of text, tokenize into a list of tokens
    # -----------------------------------------------------------------------------

    def tokenize(self,text):
        tokens = []
        self.lexer.input(text)
        while True:
            tok = self.lexer.token()
            if not tok: break
            tokens.append(tok)
        return tokens

    # ---------------------------------------------------------------------
    # error()
    #
    # Report a preprocessor error/warning of some kind
    # ----------------------------------------------------------------------

    def error(self,file,line,msg):
        print >>sys.stderr,"%s:%d %s" % (file,line,msg)

    # ----------------------------------------------------------------------
    # lexprobe()
    #
    # This method probes the preprocessor lexer object to discover
    # the token types of symbols that are important to the preprocessor.
    # If this works right, the preprocessor will simply "work"
    # with any suitable lexer regardless of how tokens have been named.
    # ----------------------------------------------------------------------

    def lexprobe(self):

        # Determine the token type for identifiers
        self.lexer.input("identifier")
        tok = self.lexer.token()
        if not tok or tok.value != "identifier":
            print "Couldn't determine identifier type"
        else:
            self.t_ID = tok.type

        # Determine the token type for integers
        self.lexer.input("12345")
        tok = self.lexer.token()
        if not tok or int(tok.value) != 12345:
            print "Couldn't determine integer type"
        else:
            self.t_INTEGER = tok.type
            self.t_INTEGER_TYPE = type(tok.value)

        # Determine the token type for strings enclosed in double quotes
        self.lexer.input("\"filename\"")
        tok = self.lexer.token()
        if not tok or tok.value != "\"filename\"":
            print "Couldn't determine string type"
        else:
            self.t_STRING = tok.type

        # Determine the token type for whitespace--if any
        self.lexer.input(" ")
        tok = self.lexer.token()
        if not tok or tok.value != " ":
            self.t_SPACE = None
        else:
            self.t_SPACE = tok.type

        # Determine the token type for newlines
        self.lexer.input("\n")
        tok = self.lexer.token()
        if not tok or tok.value != "\n":
            self.t_NEWLINE = None
            print "Couldn't determine token for newlines"
        else:
            self.t_NEWLINE = tok.type

        self.t_WS = (self.t_SPACE, self.t_NEWLINE)

        # Check for other characters used by the preprocessor
        chars = [ '<','>','#','##','\\','(',')',',','.']
        for c in chars:
            self.lexer.input(c)
            tok = self.lexer.token()
            if not tok or tok.value != c:
                print "Unable to lex '%s' required for preprocessor" % c

    # ----------------------------------------------------------------------
    # add_path()
    #
    # Adds a search path to the preprocessor.
    # ----------------------------------------------------------------------

    def add_path(self,path):
        self.path.append(path)

    # ----------------------------------------------------------------------
    # group_lines()
    #
    # Given an input string, this function splits it into lines.  Trailing whitespace
    # is removed.  Any line ending with \ is grouped with the next line.  This
    # function forms the lowest level of the preprocessor---grouping text into
    # a line-by-line format.
    # ----------------------------------------------------------------------

    def group_lines(self,input):
        lex = self.lexer.clone()
        lines = [x.rstrip() for x in input.splitlines()]
        for i in xrange(len(lines)):
            j = i+1
            while lines[i].endswith('\\') and (j < len(lines)):
                lines[i] = lines[i][:-1]+lines[j]
                lines[j] = ""
                j += 1

        input = "\n".join(lines)
        lex.input(input)
        lex.lineno = 1

        current_line = []
        while True:
            tok = lex.token()
            if not tok:
                break
            current_line.append(tok)
            if tok.type in self.t_WS and '\n' in tok.value:
                yield current_line
                current_line = []

        if current_line:
            yield current_line

    # ----------------------------------------------------------------------
    # tokenstrip()
    #
    # Remove leading/trailing whitespace tokens from a token list
    # ----------------------------------------------------------------------

    def tokenstrip(self,tokens):
        i = 0
        while i < len(tokens) and tokens[i].type in self.t_WS:
            i += 1
        del tokens[:i]
        i = len(tokens)-1
        while i >= 0 and tokens[i].type in self.t_WS:
            i -= 1
        del tokens[i+1:]
        return tokens

    # ----------------------------------------------------------------------
    # collect_args()
    #
    # Collects comma separated arguments from a list of tokens.  The arguments
    # must be enclosed in parentheses.  Returns a tuple (tokencount,args,positions)
    # where tokencount is the number of tokens consumed, args is a list of arguments,
    # and positions is a list of integers containing the starting index of each
    # argument.  Each argument is represented by a list of tokens.
    #
    # When collecting arguments, leading and trailing whitespace is removed
    # from each argument.
    #
    # This function properly handles nested parentheses and commas---these do not
    # define new arguments.
    # ----------------------------------------------------------------------

    def collect_args(self,tokenlist):
        args = []
        positions = []
        current_arg = []
        nesting = 1
        tokenlen = len(tokenlist)

        # Search for the opening '('.
        i = 0
        while (i < tokenlen) and (tokenlist[i].type in self.t_WS):
            i += 1

        if (i < tokenlen) and (tokenlist[i].value == '('):
            positions.append(i+1)
        else:
            self.error(self.source,tokenlist[0].lineno,"Missing '(' in macro arguments")
            return 0, [], []

        i += 1

        while i < tokenlen:
            t = tokenlist[i]
            if t.value == '(':
                current_arg.append(t)
                nesting += 1
            elif t.value == ')':
                nesting -= 1
                if nesting == 0:
                    if current_arg:
                        args.append(self.tokenstrip(current_arg))
                        positions.append(i)
                    return i+1,args,positions
                current_arg.append(t)
            elif t.value == ',' and nesting == 1:
                args.append(self.tokenstrip(current_arg))
                positions.append(i+1)
                current_arg = []
            else:
                current_arg.append(t)
            i += 1

        # Missing end argument
        self.error(self.source,tokenlist[-1].lineno,"Missing ')' in macro arguments")
        return 0, [],[]
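    # Illustrative example (not part of the committed file): for the tokens of
    # "(a, f(b,c), d)", collect_args() returns the number of tokens consumed
    # through the closing ')', args = [[a], [f,(,b,,,c,)], [d]] with leading and
    # trailing whitespace stripped from each argument, and the start index of
    # each argument. The comma nested inside f(b,c) does not start a new argument.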
    # ----------------------------------------------------------------------
    # macro_prescan()
    #
    # Examine the macro value (token sequence) and identify patch points
    # This is used to speed up macro expansion later on---we'll know
    # right away where to apply patches to the value to form the expansion
    # ----------------------------------------------------------------------

    def macro_prescan(self,macro):
        macro.patch     = []             # Standard macro arguments
        macro.str_patch = []             # String conversion expansion
        macro.var_comma_patch = []       # Variadic macro comma patch
        i = 0
        while i < len(macro.value):
            if macro.value[i].type == self.t_ID and macro.value[i].value in macro.arglist:
                argnum = macro.arglist.index(macro.value[i].value)
                # Conversion of argument to a string
                if i > 0 and macro.value[i-1].value == '#':
                    macro.value[i] = copy.copy(macro.value[i])
                    macro.value[i].type = self.t_STRING
                    del macro.value[i-1]
                    macro.str_patch.append((argnum,i-1))
                    continue
                # Concatenation
                elif (i > 0 and macro.value[i-1].value == '##'):
                    macro.patch.append(('c',argnum,i-1))
                    del macro.value[i-1]
                    continue
                elif ((i+1) < len(macro.value) and macro.value[i+1].value == '##'):
                    macro.patch.append(('c',argnum,i))
                    i += 1
                    continue
                # Standard expansion
                else:
                    macro.patch.append(('e',argnum,i))
            elif macro.value[i].value == '##':
                if macro.variadic and (i > 0) and (macro.value[i-1].value == ',') and \
                        ((i+1) < len(macro.value)) and (macro.value[i+1].type == self.t_ID) and \
                        (macro.value[i+1].value == macro.vararg):
                    macro.var_comma_patch.append(i-1)
            i += 1
        macro.patch.sort(key=lambda x: x[2],reverse=True)
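    # Illustrative example (not part of the committed file): for
    #     #define STR(x)   #x
    # the prescan records an entry in str_patch for x (string conversion),
    # while for
    #     #define CAT(a,b) a##b
    # it records 'c' (concatenation) patch entries so a and b are pasted
    # without expansion; an argument used normally gets an 'e' (expand) entry.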
    # ----------------------------------------------------------------------
    # macro_expand_args()
    #
    # Given a Macro and list of arguments (each a token list), this method
    # returns an expanded version of a macro.  The return value is a token sequence
    # representing the replacement macro tokens
    # ----------------------------------------------------------------------

    def macro_expand_args(self,macro,args):
        # Make a copy of the macro token sequence
        rep = [copy.copy(_x) for _x in macro.value]

        # Make string expansion patches.  These do not alter the length of the replacement sequence
        str_expansion = {}
        for argnum, i in macro.str_patch:
            if argnum not in str_expansion:
                str_expansion[argnum] = ('"%s"' % "".join([x.value for x in args[argnum]])).replace("\\","\\\\")
            rep[i] = copy.copy(rep[i])
            rep[i].value = str_expansion[argnum]

        # Make the variadic macro comma patch.  If the variadic macro argument is
        # empty, we get rid of the comma that precedes it.
        comma_patch = False
        if macro.variadic and not args[-1]:
            for i in macro.var_comma_patch:
                rep[i] = None
                comma_patch = True

        # Make all other patches.   The order of these matters.  It is assumed that the patch list
        # has been sorted in reverse order of patch location since replacements will cause the
        # size of the replacement sequence to expand from the patch point.
        expanded = { }
        for ptype, argnum, i in macro.patch:
            # Concatenation.   Argument is left unexpanded
            if ptype == 'c':
                rep[i:i+1] = args[argnum]
            # Normal expansion.  Argument is macro expanded first
            elif ptype == 'e':
                if argnum not in expanded:
                    expanded[argnum] = self.expand_macros(args[argnum])
                rep[i:i+1] = expanded[argnum]

        # Get rid of removed comma if necessary
        if comma_patch:
            rep = [_i for _i in rep if _i]

        return rep

    # ----------------------------------------------------------------------
    # expand_macros()
    #
    # Given a list of tokens, this function performs macro expansion.
    # The expanded argument is a dictionary that contains macros already
    # expanded.  This is used to prevent infinite recursion.
    # ----------------------------------------------------------------------

    def expand_macros(self,tokens,expanded=None):
        if expanded is None:
            expanded = {}
        i = 0
        while i < len(tokens):
            t = tokens[i]
            if t.type == self.t_ID:
                if t.value in self.macros and t.value not in expanded:
                    # Yes, we found a macro match
                    expanded[t.value] = True

                    m = self.macros[t.value]
                    if not m.arglist:
                        # A simple macro
                        ex = self.expand_macros([copy.copy(_x) for _x in m.value],expanded)
                        for e in ex:
                            e.lineno = t.lineno
                        tokens[i:i+1] = ex
                        i += len(ex)
                    else:
                        # A macro with arguments
                        j = i + 1
                        while j < len(tokens) and tokens[j].type in self.t_WS:
                            j += 1
                        if tokens[j].value == '(':
                            tokcount,args,positions = self.collect_args(tokens[j:])
                            if not m.variadic and len(args) != len(m.arglist):
                                self.error(self.source,t.lineno,"Macro %s requires %d arguments" % (t.value,len(m.arglist)))
                                i = j + tokcount
                            elif m.variadic and len(args) < len(m.arglist)-1:
                                if len(m.arglist) > 2:
                                    self.error(self.source,t.lineno,"Macro %s must have at least %d arguments" % (t.value, len(m.arglist)-1))
                                else:
                                    self.error(self.source,t.lineno,"Macro %s must have at least %d argument" % (t.value, len(m.arglist)-1))
                                i = j + tokcount
                            else:
                                if m.variadic:
                                    if len(args) == len(m.arglist)-1:
                                        args.append([])
                                    else:
                                        args[len(m.arglist)-1] = tokens[j+positions[len(m.arglist)-1]:j+tokcount-1]
                                        del args[len(m.arglist):]

                                # Get macro replacement text
                                rep = self.macro_expand_args(m,args)
                                rep = self.expand_macros(rep,expanded)
                                for r in rep:
                                    r.lineno = t.lineno
                                tokens[i:j+tokcount] = rep
                                i += len(rep)
                    del expanded[t.value]
                    continue
                elif t.value == '__LINE__':
                    t.type = self.t_INTEGER
                    t.value = self.t_INTEGER_TYPE(t.lineno)

            i += 1
        return tokens

    # ----------------------------------------------------------------------
    # evalexpr()
    #
    # Evaluate an expression token sequence for the purposes of evaluating
    # integral expressions.
    # ----------------------------------------------------------------------

    def evalexpr(self,tokens):
        # tokens = tokenize(line)
        # Search for defined macros
        i = 0
        while i < len(tokens):
            if tokens[i].type == self.t_ID and tokens[i].value == 'defined':
                j = i + 1
                needparen = False
                result = "0L"
                while j < len(tokens):
                    if tokens[j].type in self.t_WS:
                        j += 1
                        continue
                    elif tokens[j].type == self.t_ID:
                        if tokens[j].value in self.macros:
                            result = "1L"
                        else:
                            result = "0L"
                        if not needparen: break
                    elif tokens[j].value == '(':
                        needparen = True
                    elif tokens[j].value == ')':
                        break
                    else:
                        self.error(self.source,tokens[i].lineno,"Malformed defined()")
                    j += 1
                tokens[i].type = self.t_INTEGER
                tokens[i].value = self.t_INTEGER_TYPE(result)
                del tokens[i+1:j+1]
            i += 1
        tokens = self.expand_macros(tokens)
        for i,t in enumerate(tokens):
            if t.type == self.t_ID:
                tokens[i] = copy.copy(t)
                tokens[i].type = self.t_INTEGER
                tokens[i].value = self.t_INTEGER_TYPE("0L")
            elif t.type == self.t_INTEGER:
                tokens[i] = copy.copy(t)
                # Strip off any trailing suffixes
                tokens[i].value = str(tokens[i].value)
                while tokens[i].value[-1] not in "0123456789abcdefABCDEF":
                    tokens[i].value = tokens[i].value[:-1]

        expr = "".join([str(x.value) for x in tokens])
        expr = expr.replace("&&"," and ")
        expr = expr.replace("||"," or ")
        expr = expr.replace("!"," not ")
        try:
            result = eval(expr)
        except StandardError:
            self.error(self.source,tokens[0].lineno,"Couldn't evaluate expression")
            result = 0
        return result
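    # Illustrative example (not part of the committed file): with FOO defined
    # and BAR undefined, "#if defined(FOO) && !BAR" reaches evalexpr() as
    # tokens equivalent to "1L&&!0L"; the replacements above rewrite this to
    # "1L and  not 0L", which eval() reduces to a true value.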
    # ----------------------------------------------------------------------
    # parsegen()
    #
    # Parse an input string.
    # ----------------------------------------------------------------------
    def parsegen(self,input,source=None):

        # Replace trigraph sequences
        t = trigraph(input)
        lines = self.group_lines(t)

        if not source:
            source = ""

        self.define("__FILE__ \"%s\"" % source)

        self.source = source
        chunk = []
        enable = True
        iftrigger = False
        ifstack = []

        for x in lines:
            for i,tok in enumerate(x):
                if tok.type not in self.t_WS: break
            if tok.value == '#':
                # Preprocessor directive

                for tok in x:
                    if tok in self.t_WS and '\n' in tok.value:
                        chunk.append(tok)

                dirtokens = self.tokenstrip(x[i+1:])
                if dirtokens:
                    name = dirtokens[0].value
                    args = self.tokenstrip(dirtokens[1:])
                else:
                    name = ""
                    args = []

                if name == 'define':
                    if enable:
                        for tok in self.expand_macros(chunk):
                            yield tok
                        chunk = []
                        self.define(args)
                elif name == 'include':
                    if enable:
                        for tok in self.expand_macros(chunk):
                            yield tok
                        chunk = []
                        oldfile = self.macros['__FILE__']
                        for tok in self.include(args):
                            yield tok
                        self.macros['__FILE__'] = oldfile
                        self.source = source
                elif name == 'undef':
                    if enable:
                        for tok in self.expand_macros(chunk):
                            yield tok
                        chunk = []
                        self.undef(args)
                elif name == 'ifdef':
                    ifstack.append((enable,iftrigger))
                    if enable:
                        if not args[0].value in self.macros:
                            enable = False
                            iftrigger = False
                        else:
                            iftrigger = True
                elif name == 'ifndef':
                    ifstack.append((enable,iftrigger))
                    if enable:
                        if args[0].value in self.macros:
                            enable = False
                            iftrigger = False
                        else:
                            iftrigger = True
                elif name == 'if':
                    ifstack.append((enable,iftrigger))
                    if enable:
                        result = self.evalexpr(args)
                        if not result:
                            enable = False
                            iftrigger = False
                        else:
                            iftrigger = True
                elif name == 'elif':
                    if ifstack:
                        if ifstack[-1][0]:        # We only pay attention if outer "if" allows this
                            if enable:            # If already true, we flip enable False
                                enable = False
                            elif not iftrigger:   # If False, but not triggered yet, we'll check expression
                                result = self.evalexpr(args)
                                if result:
                                    enable = True
                                    iftrigger = True
                    else:
                        self.error(self.source,dirtokens[0].lineno,"Misplaced #elif")

                elif name == 'else':
                    if ifstack:
                        if ifstack[-1][0]:
                            if enable:
                                enable = False
                            elif not iftrigger:
                                enable = True
                                iftrigger = True
                    else:
                        self.error(self.source,dirtokens[0].lineno,"Misplaced #else")

                elif name == 'endif':
                    if ifstack:
                        enable,iftrigger = ifstack.pop()
                    else:
                        self.error(self.source,dirtokens[0].lineno,"Misplaced #endif")
                else:
                    # Unknown preprocessor directive
                    pass

            else:
                # Normal text
                if enable:
                    chunk.extend(x)

        for tok in self.expand_macros(chunk):
            yield tok
        chunk = []

    # ----------------------------------------------------------------------
    # include()
    #
    # Implementation of file-inclusion
    # ----------------------------------------------------------------------

    def include(self,tokens):
        # Try to extract the filename and then process an include file
        if not tokens:
            return
        if tokens:
            if tokens[0].value != '<' and tokens[0].type != self.t_STRING:
                tokens = self.expand_macros(tokens)

            if tokens[0].value == '<':
                # Include <...>
                i = 1
                while i < len(tokens):
                    if tokens[i].value == '>':
                        break
                    i += 1
                else:
                    print "Malformed #include <...>"
                    return
                filename = "".join([x.value for x in tokens[1:i]])
                path = self.path + [""] + self.temp_path
            elif tokens[0].type == self.t_STRING:
                filename = tokens[0].value[1:-1]
                path = self.temp_path + [""] + self.path
            else:
                print "Malformed #include statement"
                return
        for p in path:
            iname = os.path.join(p,filename)
            try:
                data = open(iname,"r").read()
                dname = os.path.dirname(iname)
                if dname:
                    self.temp_path.insert(0,dname)
                for tok in self.parsegen(data,filename):
                    yield tok
                if dname:
                    del self.temp_path[0]
                break
            except IOError,e:
                pass
        else:
            print "Couldn't find '%s'" % filename

    # ----------------------------------------------------------------------
    # define()
    #
    # Define a new macro
    # ----------------------------------------------------------------------

    def define(self,tokens):
        if isinstance(tokens,(str,unicode)):
            tokens = self.tokenize(tokens)

        linetok = tokens
        try:
            name = linetok[0]
            if len(linetok) > 1:
                mtype = linetok[1]
            else:
                mtype = None
            if not mtype:
                m = Macro(name.value,[])
                self.macros[name.value] = m
            elif mtype.type in self.t_WS:
                # A normal macro
                m = Macro(name.value,self.tokenstrip(linetok[2:]))
                self.macros[name.value] = m
            elif mtype.value == '(':
                # A macro with arguments
                tokcount, args, positions = self.collect_args(linetok[1:])
                variadic = False
                for a in args:
                    if variadic:
                        print "No more arguments may follow a variadic argument"
                        break
                    astr = "".join([str(_i.value) for _i in a])
                    if astr == "...":
                        variadic = True
                        a[0].type = self.t_ID
                        a[0].value = '__VA_ARGS__'
                        variadic = True
                        del a[1:]
                        continue
                    elif astr[-3:] == "..." and a[0].type == self.t_ID:
                        variadic = True
                        del a[1:]
                        # If, for some reason, "." is part of the identifier, strip off the name for
                        # the purposes of macro expansion
                        if a[0].value[-3:] == '...':
                            a[0].value = a[0].value[:-3]
                        continue
                    if len(a) > 1 or a[0].type != self.t_ID:
                        print "Invalid macro argument"
                        break
                else:
                    mvalue = self.tokenstrip(linetok[1+tokcount:])
                    i = 0
                    while i < len(mvalue):
                        if i+1 < len(mvalue):
                            if mvalue[i].type in self.t_WS and mvalue[i+1].value == '##':
                                del mvalue[i]
                                continue
                            elif mvalue[i].value == '##' and mvalue[i+1].type in self.t_WS:
                                del mvalue[i+1]
                        i += 1
                    m = Macro(name.value,mvalue,[x[0].value for x in args],variadic)
                    self.macro_prescan(m)
                    self.macros[name.value] = m
            else:
                print "Bad macro definition"
        except LookupError:
            print "Bad macro definition"

    # ----------------------------------------------------------------------
    # undef()
    #
    # Undefine a macro
    # ----------------------------------------------------------------------

    def undef(self,tokens):
        id = tokens[0].value
        try:
            del self.macros[id]
        except LookupError:
            pass

    # ----------------------------------------------------------------------
    # parse()
    #
    # Parse input text.
    # ----------------------------------------------------------------------
    def parse(self,input,source=None,ignore={}):
        self.ignore = ignore
        self.parser = self.parsegen(input,source)

    # ----------------------------------------------------------------------
    # token()
    #
    # Method to return individual tokens
    # ----------------------------------------------------------------------
    def token(self):
        try:
            while True:
                tok = self.parser.next()
                if tok.type not in self.ignore: return tok
        except StopIteration:
            self.parser = None
            return None

if __name__ == '__main__':
    import ply.lex as lex
    lexer = lex.lex()

    # Run a preprocessor
    import sys
    f = open(sys.argv[1])
    input = f.read()

    p = Preprocessor(lexer)
    p.parse(input,sys.argv[1])
    while True:
        tok = p.token()
        if not tok: break
        print p.source, tok
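For orientation, here is a minimal sketch of driving the new module by hand.
It is illustrative only and assumes nothing beyond what the listing above
defines (the module-level token rules, Preprocessor, and its parse()/token()
methods); it mirrors the __main__ block, in the module's own Python 2 style:

    import ply.lex as lex
    import ply.cpp as cpp

    lexer = lex.lex(module=cpp)      # build a lexer from cpp.py's token rules
    p = cpp.Preprocessor(lexer)
    p.parse("#define SQUARE(x) ((x)*(x))\nSQUARE(3)\n", "demo.c")
    tok = p.token()
    while tok:
        print p.source, tok
        tok = p.token()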
ext/ply/ply/ctokens.py | 133  (Normal file)
@@ -0,0 +1,133 @@
# ----------------------------------------------------------------------
# ctokens.py
#
# Token specifications for symbols in ANSI C and C++.  This file is
# meant to be used as a library in other tokenizers.
# ----------------------------------------------------------------------

# Reserved words

tokens = [
    # Literals (identifier, integer constant, float constant, string constant, char const)
    'ID', 'TYPEID', 'ICONST', 'FCONST', 'SCONST', 'CCONST',

    # Operators (+,-,*,/,%,|,&,~,^,<<,>>, ||, &&, !, <, <=, >, >=, ==, !=)
    'PLUS', 'MINUS', 'TIMES', 'DIVIDE', 'MOD',
    'OR', 'AND', 'NOT', 'XOR', 'LSHIFT', 'RSHIFT',
    'LOR', 'LAND', 'LNOT',
    'LT', 'LE', 'GT', 'GE', 'EQ', 'NE',

    # Assignment (=, *=, /=, %=, +=, -=, <<=, >>=, &=, ^=, |=)
    'EQUALS', 'TIMESEQUAL', 'DIVEQUAL', 'MODEQUAL', 'PLUSEQUAL', 'MINUSEQUAL',
    'LSHIFTEQUAL','RSHIFTEQUAL', 'ANDEQUAL', 'XOREQUAL', 'OREQUAL',

    # Increment/decrement (++,--)
    'PLUSPLUS', 'MINUSMINUS',

    # Structure dereference (->)
    'ARROW',

    # Ternary operator (?)
    'TERNARY',

    # Delimiters ( ) [ ] { } , . ; :
    'LPAREN', 'RPAREN',
    'LBRACKET', 'RBRACKET',
    'LBRACE', 'RBRACE',
    'COMMA', 'PERIOD', 'SEMI', 'COLON',

    # Ellipsis (...)
    'ELLIPSIS',

    # Comments (listed so the function rules below match the token list)
    'COMMENT', 'CPPCOMMENT',
]

# Operators
t_PLUS             = r'\+'
t_MINUS            = r'-'
t_TIMES            = r'\*'
t_DIVIDE           = r'/'
t_MOD              = r'%'
t_OR               = r'\|'
t_AND              = r'&'
t_NOT              = r'~'
t_XOR              = r'\^'
t_LSHIFT           = r'<<'
t_RSHIFT           = r'>>'
t_LOR              = r'\|\|'
t_LAND             = r'&&'
t_LNOT             = r'!'
t_LT               = r'<'
t_GT               = r'>'
t_LE               = r'<='
t_GE               = r'>='
t_EQ               = r'=='
t_NE               = r'!='

# Assignment operators

t_EQUALS           = r'='
t_TIMESEQUAL       = r'\*='
t_DIVEQUAL         = r'/='
t_MODEQUAL         = r'%='
t_PLUSEQUAL        = r'\+='
t_MINUSEQUAL       = r'-='
t_LSHIFTEQUAL      = r'<<='
t_RSHIFTEQUAL      = r'>>='
t_ANDEQUAL         = r'&='
t_OREQUAL          = r'\|='
t_XOREQUAL         = r'\^='

# Increment/decrement
t_PLUSPLUS         = r'\+\+'
t_MINUSMINUS       = r'--'

# ->
t_ARROW            = r'->'

# ?
t_TERNARY          = r'\?'

# Delimiters
t_LPAREN           = r'\('
t_RPAREN           = r'\)'
t_LBRACKET         = r'\['
t_RBRACKET         = r'\]'
t_LBRACE           = r'\{'
t_RBRACE           = r'\}'
t_COMMA            = r','
t_PERIOD           = r'\.'
t_SEMI             = r';'
t_COLON            = r':'
t_ELLIPSIS         = r'\.\.\.'

# Identifiers
t_ID = r'[A-Za-z_][A-Za-z0-9_]*'

# Integer literal
t_ICONST = r'\d+([uU]|[lL]|[uU][lL]|[lL][uU])?'

# Floating literal
t_FCONST = r'((\d+)(\.\d+)(e(\+|-)?(\d+))? | (\d+)e(\+|-)?(\d+))([lL]|[fF])?'

# String literal
t_SCONST = r'\"([^\\\n]|(\\.))*?\"'

# Character constant 'c' or L'c'
t_CCONST = r'(L)?\'([^\\\n]|(\\.))*?\''

# Comment (C-Style)
def t_COMMENT(t):
    r'/\*(.|\n)*?\*/'
    t.lexer.lineno += t.value.count('\n')
    return t

# Comment (C++-Style)
def t_CPPCOMMENT(t):
    r'//.*\n'
    t.lexer.lineno += 1
    return t
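Since ctokens.py is meant to be imported as a library, a minimal sketch of
that use (illustrative only, not part of the commit, and assuming the rule
names and the tokens list agree as in the listing above) is a lexer module
that star-imports the shared rules and adds only what is file-specific:

    # calclex_demo.py -- illustrative only
    import ply.lex as lex
    from ply.ctokens import *    # tokens list and t_* rules from above

    t_ignore = ' \t'

    def t_error(t):
        print("Illegal character '%s'" % t.value[0])
        t.lexer.skip(1)

    lexer = lex.lex()
    lexer.input("x = p->count + 42;")
    for tok in iter(lexer.token, None):
        print(tok)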
ext/ply/ply/yacc.py | 3790  (file diff suppressed because it is too large)
@@ -1,23 +1,22 @@
-from distutils.core import setup
+try:
+    from setuptools import setup
+except ImportError:
+    from distutils.core import setup
 
 setup(name = "ply",
             description="Python Lex & Yacc",
             long_description = """
-PLY is yet another implementation of lex and yacc for Python. Although several other
-parsing tools are available for Python, there are several reasons why you might
-want to take a look at PLY:
-
-It's implemented entirely in Python.
-
-It uses LR-parsing which is reasonably efficient and well suited for larger grammars.
+PLY is yet another implementation of lex and yacc for Python. Some notable
+features include the fact that it's implemented entirely in Python and it
+uses LALR(1) parsing which is efficient and well suited for larger grammars.
 
 PLY provides most of the standard lex/yacc features including support for empty
 productions, precedence rules, error recovery, and support for ambiguous grammars.
 
 PLY is extremely easy to use and provides very extensive error checking.
 """,
-            license="""Lesser GPL (LGPL)""",
-            version = "2.3",
+            license="""BSD""",
+            version = "3.2",
             author = "David Beazley",
             author_email = "dave@dabeaz.com",
             maintainer = "David Beazley",
@@ -3,7 +3,7 @@
 # -----------------------------------------------------------------------------
 import sys
 
-sys.path.append("..")
+if ".." not in sys.path: sys.path.insert(0,"..")
 import ply.lex as lex
 
 tokens = (

@@ -28,7 +28,7 @@ def t_NUMBER(t):
     try:
         t.value = int(t.value)
     except ValueError:
-        print "Integer value too large", t.value
+        print("Integer value too large %s" % t.value)
         t.value = 0
     return t

@@ -39,7 +39,7 @@ def t_newline(t):
     t.lineno += t.value.count("\n")
 
 def t_error(t):
-    print "Illegal character '%s'" % t.value[0]
+    print("Illegal character '%s'" % t.value[0])
     t.lexer.skip(1)
 
 # Build the lexer
ext/ply/test/cleanup.sh | 2  (Normal file → Executable file)

@@ -1,4 +1,4 @@
 #!/bin/sh
 
-rm -f *~ *.pyc *.dif *.out
+rm -f *~ *.pyc *.pyo *.dif *.out
ext/ply/test/lex_closure.py | 54  (Normal file)

@@ -0,0 +1,54 @@
# -----------------------------------------------------------------------------
# lex_closure.py
# -----------------------------------------------------------------------------
import sys

if ".." not in sys.path: sys.path.insert(0,"..")
import ply.lex as lex

tokens = (
    'NAME','NUMBER',
    'PLUS','MINUS','TIMES','DIVIDE','EQUALS',
    'LPAREN','RPAREN',
    )

def make_calc():

    # Tokens

    t_PLUS    = r'\+'
    t_MINUS   = r'-'
    t_TIMES   = r'\*'
    t_DIVIDE  = r'/'
    t_EQUALS  = r'='
    t_LPAREN  = r'\('
    t_RPAREN  = r'\)'
    t_NAME    = r'[a-zA-Z_][a-zA-Z0-9_]*'

    def t_NUMBER(t):
        r'\d+'
        try:
            t.value = int(t.value)
        except ValueError:
            print("Integer value too large %s" % t.value)
            t.value = 0
        return t

    t_ignore = " \t"

    def t_newline(t):
        r'\n+'
        t.lineno += t.value.count("\n")

    def t_error(t):
        print("Illegal character '%s'" % t.value[0])
        t.lexer.skip(1)

    # Build the lexer
    return lex.lex()

make_calc()
lex.runmain(data="3+4")
@@ -1 +0,0 @@
-./lex_doc1.py:18: No regular expression defined for rule 't_NUMBER'

@@ -1,9 +1,9 @@
-# lex_token.py
+# lex_doc1.py
 #
 # Missing documentation string
 
 import sys
-sys.path.insert(0,"..")
+if ".." not in sys.path: sys.path.insert(0,"..")
 
 import ply.lex as lex
 

@@ -21,10 +21,6 @@ def t_NUMBER(t):
 def t_error(t):
     pass
 
-
-import sys
-sys.tracebacklimit = 0
-
 lex.lex()
 
@@ -1,2 +0,0 @@
-./lex_dup1.py:20: Rule t_NUMBER redefined. Previously defined on line 18
-SyntaxError: lex: Unable to build lexer.

@@ -1,9 +1,9 @@
-# lex_token.py
+# lex_dup1.py
 #
 # Duplicated rule specifiers
 
 import sys
-sys.path.insert(0,"..")
+if ".." not in sys.path: sys.path.insert(0,"..")
 
 import ply.lex as lex
 

@@ -22,7 +22,7 @@ t_NUMBER = r'\d+'
 def t_error(t):
     pass
 
-sys.tracebacklimit = 0
+
 lex.lex()
@@ -1,2 +0,0 @@
-./lex_dup2.py:22: Rule t_NUMBER redefined. Previously defined on line 18
-SyntaxError: lex: Unable to build lexer.

@@ -1,9 +1,9 @@
-# lex_token.py
+# lex_dup2.py
 #
 # Duplicated rule specifiers
 
 import sys
-sys.path.insert(0,"..")
+if ".." not in sys.path: sys.path.insert(0,"..")
 
 import ply.lex as lex
 

@@ -26,7 +26,7 @@ def t_NUMBER(t):
 def t_error(t):
     pass
 
-sys.tracebacklimit = 0
+
 lex.lex()
@@ -1,2 +0,0 @@
-./lex_dup3.py:20: Rule t_NUMBER redefined. Previously defined on line 18
-SyntaxError: lex: Unable to build lexer.

@@ -1,9 +1,9 @@
-# lex_token.py
+# lex_dup3.py
 #
 # Duplicated rule specifiers
 
 import sys
-sys.path.insert(0,"..")
+if ".." not in sys.path: sys.path.insert(0,"..")
 
 import ply.lex as lex
 

@@ -24,7 +24,7 @@ def t_NUMBER(t):
 def t_error(t):
     pass
 
-sys.tracebacklimit = 0
+
 lex.lex()
@@ -1 +0,0 @@
-SyntaxError: lex: no rules of the form t_rulename are defined.

@@ -1,9 +1,9 @@
-# lex_token.py
+# lex_empty.py
 #
 # No rules defined
 
 import sys
-sys.path.insert(0,"..")
+if ".." not in sys.path: sys.path.insert(0,"..")
 
 import ply.lex as lex
 

@@ -13,7 +13,7 @@ tokens = [
     "NUMBER",
     ]
 
-sys.tracebacklimit = 0
+
 lex.lex()
@@ -1 +0,0 @@
-lex: Warning. no t_error rule is defined.

@@ -1,9 +1,9 @@
-# lex_token.py
+# lex_error1.py
 #
 # Missing t_error() rule
 
 import sys
-sys.path.insert(0,"..")
+if ".." not in sys.path: sys.path.insert(0,"..")
 
 import ply.lex as lex
 

@@ -17,7 +17,7 @@ t_PLUS = r'\+'
 t_MINUS = r'-'
 t_NUMBER = r'\d+'
 
-sys.tracebacklimit = 0
+
 lex.lex()
@@ -1 +0,0 @@
-SyntaxError: lex: Rule 't_error' must be defined as a function

@@ -1,9 +1,9 @@
-# lex_token.py
+# lex_error2.py
 #
 # t_error defined, but not function
 
 import sys
-sys.path.insert(0,"..")
+if ".." not in sys.path: sys.path.insert(0,"..")
 
 import ply.lex as lex
 

@@ -19,7 +19,7 @@ t_NUMBER = r'\d+'
 
 t_error = "foo"
 
-sys.tracebacklimit = 0
+
 lex.lex()
@@ -1,2 +0,0 @@
-./lex_error3.py:20: Rule 't_error' requires an argument.
-SyntaxError: lex: Unable to build lexer.

@@ -1,9 +1,9 @@
-# lex_token.py
+# lex_error3.py
 #
 # t_error defined as function, but with wrong # args
 
 import sys
-sys.path.insert(0,"..")
+if ".." not in sys.path: sys.path.insert(0,"..")
 
 import ply.lex as lex
 

@@ -20,7 +20,7 @@ t_NUMBER = r'\d+'
 def t_error():
     pass
 
-sys.tracebacklimit = 0
+
 lex.lex()
@@ -1,2 +0,0 @@
-./lex_error4.py:20: Rule 't_error' has too many arguments.
-SyntaxError: lex: Unable to build lexer.

@@ -1,9 +1,9 @@
-# lex_token.py
+# lex_error4.py
 #
 # t_error defined as function, but too many args
 
 import sys
-sys.path.insert(0,"..")
+if ".." not in sys.path: sys.path.insert(0,"..")
 
 import ply.lex as lex
 

@@ -20,7 +20,7 @@ t_NUMBER = r'\d+'
 def t_error(t,s):
     pass
 
-sys.tracebacklimit = 0
+
 lex.lex()
@@ -1,3 +0,0 @@
-(H_EDIT_DESCRIPTOR,'abc',1,0)
-(H_EDIT_DESCRIPTOR,'abcdefghij',1,6)
-(H_EDIT_DESCRIPTOR,'xy',1,20)

@@ -14,7 +14,7 @@
 #  such tokens
 # -----------------------------------------------------------------------------
 import sys
-sys.path.insert(0,"..")
+if ".." not in sys.path: sys.path.insert(0,"..")
 
 import ply.lex as lex
 

@@ -36,7 +36,7 @@ def t_H_EDIT_DESCRIPTOR(t):
     return t
 
 def t_error(t):
-    print "Illegal character '%s'" % t.value[0]
+    print("Illegal character '%s'" % t.value[0])
     t.lexer.skip(1)
 
 # Build the lexer
@@ -1,7 +0,0 @@
-./lex_ignore.py:20: Rule 't_ignore' must be defined as a string.
-Traceback (most recent call last):
-  File "./lex_ignore.py", line 29, in <module>
-    lex.lex()
-  File "../ply/lex.py", line 759, in lex
-    raise SyntaxError,"lex: Unable to build lexer."
-SyntaxError: lex: Unable to build lexer.

@@ -1,9 +1,9 @@
-# lex_token.py
+# lex_ignore.py
 #
 # Improperly specified ignore declaration
 
 import sys
-sys.path.insert(0,"..")
+if ".." not in sys.path: sys.path.insert(0,"..")
 
 import ply.lex as lex
 

@@ -1 +0,0 @@
-lex: Warning. t_ignore contains a literal backslash '\'
|
|||
# lex_token.py
|
||||
# lex_ignore2.py
|
||||
#
|
||||
# ignore declaration as a raw string
|
||||
|
||||
import sys
|
||||
sys.path.insert(0,"..")
|
||||
if ".." not in sys.path: sys.path.insert(0,"..")
|
||||
|
||||
import ply.lex as lex
|
||||
|
||||
|
@ -22,7 +22,7 @@ t_ignore = r' \t'
|
|||
def t_error(t):
|
||||
pass
|
||||
|
||||
import sys
|
||||
|
||||
|
||||
lex.lex()
|
||||
|
||||
|
|
ext/ply/test/lex_literal1.py | 25  (Normal file)

@@ -0,0 +1,25 @@
# lex_literal1.py
#
# Bad literal specification

import sys
if ".." not in sys.path: sys.path.insert(0,"..")

import ply.lex as lex

tokens = [
    "NUMBER",
    ]

literals = ["+","-","**"]

def t_NUMBER(t):
    r'\d+'
    return t

def t_error(t):
    pass

lex.lex()
ext/ply/test/lex_literal2.py | 25  (Normal file)

@@ -0,0 +1,25 @@
# lex_literal2.py
#
# Bad literal specification

import sys
if ".." not in sys.path: sys.path.insert(0,"..")

import ply.lex as lex

tokens = [
    "NUMBER",
    ]

literals = 23

def t_NUMBER(t):
    r'\d+'
    return t

def t_error(t):
    pass

lex.lex()
ext/ply/test/lex_many_tokens.py | 27  (Normal file)

@@ -0,0 +1,27 @@
# lex_many_tokens.py
#
# Test lex's ability to handle a large number of tokens (beyond the
# 100-group limit of the re module)

import sys
if ".." not in sys.path: sys.path.insert(0,"..")

import ply.lex as lex

tokens = ["TOK%d" % i for i in range(1000)]

for tok in tokens:
    if sys.version_info[0] < 3:
        exec("t_%s = '%s:'" % (tok,tok))
    else:
        exec("t_%s = '%s:'" % (tok,tok), globals())

t_ignore = " \t"

def t_error(t):
    pass

lex.lex(optimize=1,lextab="manytab")
lex.runmain(data="TOK34: TOK143: TOK269: TOK372: TOK452: TOK561: TOK999:")
ext/ply/test/lex_module.py | 10  (Normal file)

@@ -0,0 +1,10 @@
# lex_module.py
#

import sys
if ".." not in sys.path: sys.path.insert(0,"..")

import ply.lex as lex
import lex_module_import
lex.lex(module=lex_module_import)
lex.runmain(data="3+4")
ext/ply/test/lex_module_import.py | 42  (Normal file)

@@ -0,0 +1,42 @@
# -----------------------------------------------------------------------------
# lex_module_import.py
#
# A lexer defined in a module, but built in lex_module.py
# -----------------------------------------------------------------------------

tokens = (
    'NAME','NUMBER',
    'PLUS','MINUS','TIMES','DIVIDE','EQUALS',
    'LPAREN','RPAREN',
    )

# Tokens

t_PLUS    = r'\+'
t_MINUS   = r'-'
t_TIMES   = r'\*'
t_DIVIDE  = r'/'
t_EQUALS  = r'='
t_LPAREN  = r'\('
t_RPAREN  = r'\)'
t_NAME    = r'[a-zA-Z_][a-zA-Z0-9_]*'

def t_NUMBER(t):
    r'\d+'
    try:
        t.value = int(t.value)
    except ValueError:
        print("Integer value too large %s" % t.value)
        t.value = 0
    return t

t_ignore = " \t"

def t_newline(t):
    r'\n+'
    t.lineno += t.value.count("\n")

def t_error(t):
    print("Illegal character '%s'" % t.value[0])
    t.lexer.skip(1)
@@ -1,30 +0,0 @@
-# lex_token.py
-#
-# Missing t_error() rule
-
-import sys
-sys.path.insert(0,"..")
-
-import ply.lex as lex
-
-tokens = [
-    "PLUS",
-    "MINUS",
-    "NUMBER",
-    "NUMBER",
-    ]
-
-states = (('foo','exclusive'),)
-
-t_ignore = ' \t'
-t_PLUS = r'\+'
-t_MINUS = r'-'
-t_NUMBER = r'\d+'
-
-t_foo_NUMBER = r'\d+'
-
-sys.tracebacklimit = 0
-
-lex.lex(nowarn=1)
-
ext/ply/test/lex_object.py | 55  (Normal file)

@@ -0,0 +1,55 @@
# -----------------------------------------------------------------------------
# lex_object.py
# -----------------------------------------------------------------------------
import sys

if ".." not in sys.path: sys.path.insert(0,"..")
import ply.lex as lex

class CalcLexer:
    tokens = (
        'NAME','NUMBER',
        'PLUS','MINUS','TIMES','DIVIDE','EQUALS',
        'LPAREN','RPAREN',
        )

    # Tokens

    t_PLUS    = r'\+'
    t_MINUS   = r'-'
    t_TIMES   = r'\*'
    t_DIVIDE  = r'/'
    t_EQUALS  = r'='
    t_LPAREN  = r'\('
    t_RPAREN  = r'\)'
    t_NAME    = r'[a-zA-Z_][a-zA-Z0-9_]*'

    def t_NUMBER(self,t):
        r'\d+'
        try:
            t.value = int(t.value)
        except ValueError:
            print("Integer value too large %s" % t.value)
            t.value = 0
        return t

    t_ignore = " \t"

    def t_newline(self,t):
        r'\n+'
        t.lineno += t.value.count("\n")

    def t_error(self,t):
        print("Illegal character '%s'" % t.value[0])
        t.lexer.skip(1)


calc = CalcLexer()

# Build the lexer
lex.lex(object=calc)
lex.runmain(data="3+4")
ext/ply/test/lex_opt_alias.py | 54  (Normal file)

@@ -0,0 +1,54 @@
# -----------------------------------------------------------------------------
# lex_opt_alias.py
#
# Tests ability to match up functions with states, aliases, and
# lexing tables.
# -----------------------------------------------------------------------------

import sys
if ".." not in sys.path: sys.path.insert(0,"..")

tokens = (
    'NAME','NUMBER',
    )

states = (('instdef','inclusive'),('spam','exclusive'))

literals = ['=','+','-','*','/', '(',')']

# Tokens

def t_instdef_spam_BITS(t):
    r'[01-]+'
    return t

t_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*'

def NUMBER(t):
    r'\d+'
    try:
        t.value = int(t.value)
    except ValueError:
        print("Integer value too large %s" % t.value)
        t.value = 0
    return t

t_ANY_NUMBER = NUMBER

t_ignore = " \t"
t_spam_ignore = t_ignore

def t_newline(t):
    r'\n+'
    t.lexer.lineno += t.value.count("\n")

def t_error(t):
    print("Illegal character '%s'" % t.value[0])
    t.lexer.skip(1)

t_spam_error = t_error

# Build the lexer
import ply.lex as lex
lex.lex(optimize=1,lextab="aliastab")
lex.runmain(data="3+4")
ext/ply/test/lex_optimize.py | 50  (Normal file)

@@ -0,0 +1,50 @@
# -----------------------------------------------------------------------------
# lex_optimize.py
# -----------------------------------------------------------------------------
import sys

if ".." not in sys.path: sys.path.insert(0,"..")
import ply.lex as lex

tokens = (
    'NAME','NUMBER',
    'PLUS','MINUS','TIMES','DIVIDE','EQUALS',
    'LPAREN','RPAREN',
    )

# Tokens

t_PLUS    = r'\+'
t_MINUS   = r'-'
t_TIMES   = r'\*'
t_DIVIDE  = r'/'
t_EQUALS  = r'='
t_LPAREN  = r'\('
t_RPAREN  = r'\)'
t_NAME    = r'[a-zA-Z_][a-zA-Z0-9_]*'

def t_NUMBER(t):
    r'\d+'
    try:
        t.value = int(t.value)
    except ValueError:
        print("Integer value too large %s" % t.value)
        t.value = 0
    return t

t_ignore = " \t"

def t_newline(t):
    r'\n+'
    t.lineno += t.value.count("\n")

def t_error(t):
    print("Illegal character '%s'" % t.value[0])
    t.lexer.skip(1)

# Build the lexer
lex.lex(optimize=1)
lex.runmain(data="3+4")
ext/ply/test/lex_optimize2.py | 50  (Normal file)

@@ -0,0 +1,50 @@
# -----------------------------------------------------------------------------
# lex_optimize2.py
# -----------------------------------------------------------------------------
import sys

if ".." not in sys.path: sys.path.insert(0,"..")
import ply.lex as lex

tokens = (
    'NAME','NUMBER',
    'PLUS','MINUS','TIMES','DIVIDE','EQUALS',
    'LPAREN','RPAREN',
    )

# Tokens

t_PLUS    = r'\+'
t_MINUS   = r'-'
t_TIMES   = r'\*'
t_DIVIDE  = r'/'
t_EQUALS  = r'='
t_LPAREN  = r'\('
t_RPAREN  = r'\)'
t_NAME    = r'[a-zA-Z_][a-zA-Z0-9_]*'

def t_NUMBER(t):
    r'\d+'
    try:
        t.value = int(t.value)
    except ValueError:
        print("Integer value too large %s" % t.value)
        t.value = 0
    return t

t_ignore = " \t"

def t_newline(t):
    r'\n+'
    t.lineno += t.value.count("\n")

def t_error(t):
    print("Illegal character '%s'" % t.value[0])
    t.lexer.skip(1)

# Build the lexer
lex.lex(optimize=1,lextab="opt2tab")
lex.runmain(data="3+4")
ext/ply/test/lex_optimize3.py | 52  (Normal file)

@@ -0,0 +1,52 @@
# -----------------------------------------------------------------------------
# lex_optimize3.py
#
# Writes table in a subdirectory structure.
# -----------------------------------------------------------------------------
import sys

if ".." not in sys.path: sys.path.insert(0,"..")
import ply.lex as lex

tokens = (
    'NAME','NUMBER',
    'PLUS','MINUS','TIMES','DIVIDE','EQUALS',
    'LPAREN','RPAREN',
    )

# Tokens

t_PLUS    = r'\+'
t_MINUS   = r'-'
t_TIMES   = r'\*'
t_DIVIDE  = r'/'
t_EQUALS  = r'='
t_LPAREN  = r'\('
t_RPAREN  = r'\)'
t_NAME    = r'[a-zA-Z_][a-zA-Z0-9_]*'

def t_NUMBER(t):
    r'\d+'
    try:
        t.value = int(t.value)
    except ValueError:
        print("Integer value too large %s" % t.value)
        t.value = 0
    return t

t_ignore = " \t"

def t_newline(t):
    r'\n+'
    t.lineno += t.value.count("\n")

def t_error(t):
    print("Illegal character '%s'" % t.value[0])
    t.lexer.skip(1)

# Build the lexer
lex.lex(optimize=1,lextab="lexdir.sub.calctab",outputdir="lexdir/sub")
lex.runmain(data="3+4")
@@ -1,7 +0,0 @@
lex: Invalid regular expression for rule 't_NUMBER'. unbalanced parenthesis
Traceback (most recent call last):
  File "./lex_re1.py", line 25, in <module>
    lex.lex()
  File "../ply/lex.py", line 759, in lex
    raise SyntaxError,"lex: Unable to build lexer."
SyntaxError: lex: Unable to build lexer.
@@ -1,9 +1,9 @@
# lex_token.py
# lex_re1.py
#
# Bad regular expression in a string

import sys
sys.path.insert(0,"..")
if ".." not in sys.path: sys.path.insert(0,"..")

import ply.lex as lex

@@ -20,7 +20,7 @@ t_NUMBER = r'(\d+'
def t_error(t):
    pass

import sys


lex.lex()
@@ -1,7 +0,0 @@
lex: Regular expression for rule 't_PLUS' matches empty string.
Traceback (most recent call last):
  File "./lex_re2.py", line 25, in <module>
    lex.lex()
  File "../ply/lex.py", line 759, in lex
    raise SyntaxError,"lex: Unable to build lexer."
SyntaxError: lex: Unable to build lexer.
@@ -1,9 +1,9 @@
# lex_token.py
# lex_re2.py
#
# Regular expression rule matches empty string

import sys
sys.path.insert(0,"..")
if ".." not in sys.path: sys.path.insert(0,"..")

import ply.lex as lex

@@ -20,7 +20,7 @@ t_NUMBER = r'(\d+)'
def t_error(t):
    pass

import sys


lex.lex()
@@ -1,8 +0,0 @@
lex: Invalid regular expression for rule 't_POUND'. unbalanced parenthesis
lex: Make sure '#' in rule 't_POUND' is escaped with '\#'.
Traceback (most recent call last):
  File "./lex_re3.py", line 27, in <module>
    lex.lex()
  File "../ply/lex.py", line 759, in lex
    raise SyntaxError,"lex: Unable to build lexer."
SyntaxError: lex: Unable to build lexer.
@@ -1,9 +1,9 @@
# lex_token.py
# lex_re3.py
#
# Regular expression rule matches empty string

import sys
sys.path.insert(0,"..")
if ".." not in sys.path: sys.path.insert(0,"..")

import ply.lex as lex

@@ -22,7 +22,7 @@ t_POUND = r'#'
def t_error(t):
    pass

import sys


lex.lex()
@@ -1,2 +0,0 @@
lex: t_NUMBER not defined as a function or string
SyntaxError: lex: Unable to build lexer.
@@ -1,9 +1,9 @@
# lex_token.py
# lex_rule1.py
#
# Rule defined as some other type
# Rule function with incorrect number of arguments

import sys
sys.path.insert(0,"..")
if ".." not in sys.path: sys.path.insert(0,"..")

import ply.lex as lex

@@ -20,7 +20,7 @@ t_NUMBER = 1
def t_error(t):
    pass

sys.tracebacklimit = 0


lex.lex()
29 ext/ply/test/lex_rule2.py Normal file
@@ -0,0 +1,29 @@
# lex_rule2.py
#
# Rule function with incorrect number of arguments

import sys
if ".." not in sys.path: sys.path.insert(0,"..")

import ply.lex as lex

tokens = [
    "PLUS",
    "MINUS",
    "NUMBER",
    ]

t_PLUS = r'\+'
t_MINUS = r'-'
def t_NUMBER():
    r'\d+'
    return t

def t_error(t):
    pass



lex.lex()
27 ext/ply/test/lex_rule3.py Normal file
@@ -0,0 +1,27 @@
# lex_rule3.py
#
# Rule function with incorrect number of arguments

import sys
if ".." not in sys.path: sys.path.insert(0,"..")

import ply.lex as lex

tokens = [
    "PLUS",
    "MINUS",
    "NUMBER",
    ]

t_PLUS = r'\+'
t_MINUS = r'-'
def t_NUMBER(t,s):
    r'\d+'
    return t

def t_error(t):
    pass

lex.lex()
@@ -1,7 +0,0 @@
lex: states must be defined as a tuple or list.
Traceback (most recent call last):
  File "./lex_state1.py", line 38, in <module>
    lex.lex()
  File "../ply/lex.py", line 759, in lex
    raise SyntaxError,"lex: Unable to build lexer."
SyntaxError: lex: Unable to build lexer.
@@ -3,7 +3,7 @@
# Bad state declaration

import sys
sys.path.insert(0,"..")
if ".." not in sys.path: sys.path.insert(0,"..")

import ply.lex as lex

@@ -23,17 +23,17 @@ t_NUMBER = r'\d+'
def t_comment(t):
    r'/\*'
    t.lexer.begin('comment')
    print "Entering comment state"
    print("Entering comment state")

def t_comment_body_part(t):
    r'(.|\n)*\*/'
    print "comment body", t
    print("comment body %s" % t)
    t.lexer.begin('INITIAL')

def t_error(t):
    pass

import sys


lex.lex()
@@ -1,8 +0,0 @@
lex: invalid state specifier 'comment'. Must be a tuple (statename,'exclusive|inclusive')
lex: invalid state specifier 'example'. Must be a tuple (statename,'exclusive|inclusive')
Traceback (most recent call last):
  File "./lex_state2.py", line 38, in <module>
    lex.lex()
  File "../ply/lex.py", line 759, in lex
    raise SyntaxError,"lex: Unable to build lexer."
SyntaxError: lex: Unable to build lexer.
@@ -3,7 +3,7 @@
# Bad state declaration

import sys
sys.path.insert(0,"..")
if ".." not in sys.path: sys.path.insert(0,"..")

import ply.lex as lex

@@ -23,17 +23,17 @@ t_NUMBER = r'\d+'
def t_comment(t):
    r'/\*'
    t.lexer.begin('comment')
    print "Entering comment state"
    print("Entering comment state")

def t_comment_body_part(t):
    r'(.|\n)*\*/'
    print "comment body", t
    print("comment body %s" % t)
    t.lexer.begin('INITIAL')

def t_error(t):
    pass

import sys


lex.lex()
@@ -1,8 +0,0 @@
lex: state name 1 must be a string
lex: No rules defined for state 'example'
Traceback (most recent call last):
  File "./lex_state3.py", line 40, in <module>
    lex.lex()
  File "../ply/lex.py", line 759, in lex
    raise SyntaxError,"lex: Unable to build lexer."
SyntaxError: lex: Unable to build lexer.
@@ -1,9 +1,9 @@
# lex_state2.py
# lex_state3.py
#
# Bad state declaration

import sys
sys.path.insert(0,"..")
if ".." not in sys.path: sys.path.insert(0,"..")

import ply.lex as lex

@@ -25,17 +25,17 @@ t_NUMBER = r'\d+'
def t_comment(t):
    r'/\*'
    t.lexer.begin('comment')
    print "Entering comment state"
    print("Entering comment state")

def t_comment_body_part(t):
    r'(.|\n)*\*/'
    print "comment body", t
    print("comment body %s" % t)
    t.lexer.begin('INITIAL')

def t_error(t):
    pass

import sys


lex.lex()
@@ -1,7 +0,0 @@
lex: state type for state comment must be 'inclusive' or 'exclusive'
Traceback (most recent call last):
  File "./lex_state4.py", line 39, in <module>
    lex.lex()
  File "../ply/lex.py", line 759, in lex
    raise SyntaxError,"lex: Unable to build lexer."
SyntaxError: lex: Unable to build lexer.
@@ -1,9 +1,9 @@
# lex_state2.py
# lex_state4.py
#
# Bad state declaration

import sys
sys.path.insert(0,"..")
if ".." not in sys.path: sys.path.insert(0,"..")

import ply.lex as lex

@@ -13,7 +13,7 @@ tokens = [
    "NUMBER",
    ]

comment = 1

states = (('comment', 'exclsive'),)

t_PLUS = r'\+'

@@ -24,17 +24,17 @@ t_NUMBER = r'\d+'
def t_comment(t):
    r'/\*'
    t.lexer.begin('comment')
    print "Entering comment state"
    print("Entering comment state")

def t_comment_body_part(t):
    r'(.|\n)*\*/'
    print "comment body", t
    print("comment body %s" % t)
    t.lexer.begin('INITIAL')

def t_error(t):
    pass

import sys


lex.lex()
@@ -1,7 +0,0 @@
lex: state 'comment' already defined.
Traceback (most recent call last):
  File "./lex_state5.py", line 40, in <module>
    lex.lex()
  File "../ply/lex.py", line 759, in lex
    raise SyntaxError,"lex: Unable to build lexer."
SyntaxError: lex: Unable to build lexer.
@@ -1,9 +1,9 @@
# lex_state2.py
# lex_state5.py
#
# Bad state declaration

import sys
sys.path.insert(0,"..")
if ".." not in sys.path: sys.path.insert(0,"..")

import ply.lex as lex

@@ -13,7 +13,6 @@ tokens = [
    "NUMBER",
    ]

comment = 1
states = (('comment', 'exclusive'),
          ('comment', 'exclusive'))

@@ -25,17 +24,16 @@ t_NUMBER = r'\d+'
def t_comment(t):
    r'/\*'
    t.lexer.begin('comment')
    print "Entering comment state"
    print("Entering comment state")

def t_comment_body_part(t):
    r'(.|\n)*\*/'
    print "comment body", t
    print("comment body %s" % t)
    t.lexer.begin('INITIAL')

def t_error(t):
    pass

import sys

lex.lex()
@@ -1 +0,0 @@
lex: Warning. no error rule is defined for exclusive state 'comment'
@@ -1,9 +1,9 @@
# lex_state2.py
# lex_state_noerror.py
#
# Declaration of a state for which no rules are defined

import sys
sys.path.insert(0,"..")
if ".." not in sys.path: sys.path.insert(0,"..")

import ply.lex as lex

@@ -13,7 +13,6 @@ tokens = [
    "NUMBER",
    ]

comment = 1
states = (('comment', 'exclusive'),)

t_PLUS = r'\+'

@@ -24,17 +23,16 @@ t_NUMBER = r'\d+'
def t_comment(t):
    r'/\*'
    t.lexer.begin('comment')
    print "Entering comment state"
    print("Entering comment state")

def t_comment_body_part(t):
    r'(.|\n)*\*/'
    print "comment body", t
    print("comment body %s" % t)
    t.lexer.begin('INITIAL')

def t_error(t):
    pass

import sys

lex.lex()
@@ -1,7 +0,0 @@
lex: No rules defined for state 'example'
Traceback (most recent call last):
  File "./lex_state_norule.py", line 40, in <module>
    lex.lex()
  File "../ply/lex.py", line 759, in lex
    raise SyntaxError,"lex: Unable to build lexer."
SyntaxError: lex: Unable to build lexer.
@@ -1,9 +1,9 @@
# lex_state2.py
# lex_state_norule.py
#
# Declaration of a state for which no rules are defined

import sys
sys.path.insert(0,"..")
if ".." not in sys.path: sys.path.insert(0,"..")

import ply.lex as lex

@@ -13,7 +13,6 @@ tokens = [
    "NUMBER",
    ]

comment = 1
states = (('comment', 'exclusive'),
          ('example', 'exclusive'))

@@ -25,17 +24,16 @@ t_NUMBER = r'\d+'
def t_comment(t):
    r'/\*'
    t.lexer.begin('comment')
    print "Entering comment state"
    print("Entering comment state")

def t_comment_body_part(t):
    r'(.|\n)*\*/'
    print "comment body", t
    print("comment body %s" % t)
    t.lexer.begin('INITIAL')

def t_error(t):
    pass

import sys

lex.lex()
@@ -1,7 +0,0 @@
(NUMBER,'3',1,0)
(PLUS,'+',1,2)
(NUMBER,'4',1,4)
Entering comment state
comment body LexToken(body_part,'This is a comment */',1,9)
(PLUS,'+',1,30)
(NUMBER,'10',1,32)
@@ -1,9 +1,9 @@
# lex_state2.py
# lex_state_try.py
#
# Declaration of a state for which no rules are defined

import sys
sys.path.insert(0,"..")
if ".." not in sys.path: sys.path.insert(0,"..")

import ply.lex as lex

@@ -13,7 +13,6 @@ tokens = [
    "NUMBER",
    ]

comment = 1
states = (('comment', 'exclusive'),)

t_PLUS = r'\+'

@@ -26,11 +25,11 @@ t_ignore = " \t"
def t_comment(t):
    r'/\*'
    t.lexer.begin('comment')
    print "Entering comment state"
    print("Entering comment state")

def t_comment_body_part(t):
    r'(.|\n)*\*/'
    print "comment body", t
    print("comment body %s" % t)
    t.lexer.begin('INITIAL')

def t_error(t):

@@ -39,8 +38,6 @@ def t_error(t):
t_comment_error = t_error
t_comment_ignore = t_ignore

import sys

lex.lex()

data = "3 + 4 /* This is a comment */ + 10"
@@ -1 +0,0 @@
SyntaxError: lex: module does not define 'tokens'
@@ -1,9 +1,9 @@
# lex_token.py
# lex_token1.py
#
# Tests for absence of tokens variable

import sys
sys.path.insert(0,"..")
if ".." not in sys.path: sys.path.insert(0,"..")

import ply.lex as lex

@@ -14,8 +14,6 @@ t_NUMBER = r'\d+'
def t_error(t):
    pass

sys.tracebacklimit = 0

lex.lex()
@@ -1 +0,0 @@
SyntaxError: lex: tokens must be a list or tuple.
@@ -1,9 +1,9 @@
# lex_token.py
# lex_token2.py
#
# Tests for tokens of wrong type

import sys
sys.path.insert(0,"..")
if ".." not in sys.path: sys.path.insert(0,"..")

import ply.lex as lex

@@ -16,7 +16,6 @@ t_NUMBER = r'\d+'
def t_error(t):
    pass

sys.tracebacklimit = 0

lex.lex()
@@ -1,2 +0,0 @@
lex: Rule 't_MINUS' defined for an unspecified token MINUS.
SyntaxError: lex: Unable to build lexer.
Some files were not shown because too many files have changed in this diff.