250 lines
7 KiB
Text
250 lines
7 KiB
Text
|
PLY (Python Lex-Yacc) Version 1.2 (November 27, 2002)
|
||
|
|
||
|
David M. Beazley
|
||
|
Department of Computer Science
|
||
|
University of Chicago
|
||
|
Chicago, IL 60637
|
||
|
beazley@cs.uchicago.edu
|
||
|
|
||
|
Copyright (C) 2001 David M. Beazley
|
||
|
|
||
|
$Header: /home/stever/bk/newmem2/ext/ply/README 1.1 03/06/06 14:53:34-00:00 stever@ $
|
||
|
|
||
|
This library is free software; you can redistribute it and/or
|
||
|
modify it under the terms of the GNU Lesser General Public
|
||
|
License as published by the Free Software Foundation; either
|
||
|
version 2.1 of the License, or (at your option) any later version.
|
||
|
|
||
|
This library is distributed in the hope that it will be useful,
|
||
|
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||
|
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||
|
Lesser General Public License for more details.
|
||
|
|
||
|
You should have received a copy of the GNU Lesser General Public
|
||
|
License along with this library; if not, write to the Free Software
|
||
|
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
||
|
|
||
|
See the file COPYING for a complete copy of the LGPL.
|
||
|
|
||
|
Introduction
|
||
|
============
|
||
|
|
||
|
PLY is a 100% Python implementation of the common parsing tools lex
|
||
|
and yacc. Although several other parsing tools are available for
|
||
|
Python, there are several reasons why you might want to consider PLY:
|
||
|
|
||
|
- The tools are very closely modeled after traditional lex/yacc.
|
||
|
If you know how to use these tools in C, you will find PLY
|
||
|
to be similar.
|
||
|
|
||
|
- PLY provides *very* extensive error reporting and diagnostic
|
||
|
information to assist in parser construction. The original
|
||
|
implementation was developed for instructional purposes. As
|
||
|
a result, the system tries to identify the most common types
|
||
|
of errors made by novice users.
|
||
|
|
||
|
- PLY provides full support for empty productions, error recovery,
|
||
|
precedence specifiers, and moderately ambiguous grammars.
|
||
|
|
||
|
- Parsing is based on LR-parsing which is fast, memory efficient,
|
||
|
better suited to large grammars, and which has a number of nice
|
||
|
properties when dealing with syntax errors and other parsing problems.
|
||
|
Currently, PLY builds its parsing tables using the SLR algorithm which
|
||
|
is slightly weaker than LALR(1) used in traditional yacc.
|
||
|
|
||
|
- Like John Aycock's excellent SPARK toolkit, PLY uses Python
|
||
|
reflection to build lexers and parsers. This greatly simplifies
|
||
|
the task of parser construction since it reduces the number of files
|
||
|
and eliminates the need to run a separate lex/yacc tool before
|
||
|
running your program.
|
||
|
|
||
|
- PLY can be used to build parsers for "real" programming languages.
|
||
|
Although it is not ultra-fast due to its Python implementation,
|
||
|
PLY can be used to parse grammars consisting of several hundred
|
||
|
rules (as might be found for a language like C). The lexer and LR
|
||
|
parser are also reasonably efficient when parsing typically
|
||
|
sized programs.
|
||
|
|
||
|
The original version of PLY was developed for an Introduction to
|
||
|
Compilers course where students used it to build a compiler for a
|
||
|
simple Pascal-like language. Their compiler had to include lexical
|
||
|
analysis, parsing, type checking, type inference, and generation of
|
||
|
assembly code for the SPARC processor. Because of this, the current
|
||
|
implementation has been extensively tested and debugged. In addition,
|
||
|
most of the API and error checking steps have been adapted to address
|
||
|
common usability problems.
|
||
|
|
||
|
How to Use
|
||
|
==========
|
||
|
|
||
|
PLY consists of two files : lex.py and yacc.py. To use the system,
|
||
|
simply copy these files to your project and import them like standard
|
||
|
Python modules.
|
||
|
|
||
|
The file doc/ply.html contains complete documentation on how to use
|
||
|
the system.
|
||
|
|
||
|
The example directory contains several different examples including a
|
||
|
PLY specification for ANSI C as given in K&R 2nd Ed. Note: To use
|
||
|
the examples, you will need to copy the lex.py and yacc.py files to
|
||
|
the example directory.
|
||
|
|
||
|
A simple example is found at the end of this document
|
||
|
|
||
|
Requirements
|
||
|
============
|
||
|
PLY requires the use of Python 2.0 or greater. It should work on
|
||
|
just about any platform.
|
||
|
|
||
|
Resources
|
||
|
=========
|
||
|
|
||
|
More information about PLY can be obtained on the PLY webpage at:
|
||
|
|
||
|
http://systems.cs.uchicago.edu/ply
|
||
|
|
||
|
For a detailed overview of parsing theory, consult the excellent
|
||
|
book "Compilers : Principles, Techniques, and Tools" by Aho, Sethi, and
|
||
|
Ullman. The topics found in "Lex & Yacc" by Levine, Mason, and Brown
|
||
|
may also be useful.
|
||
|
|
||
|
Given that this is the first release, I welcome your comments on how
|
||
|
to improve the current implementation. See the TODO file for things that
|
||
|
still need to be done.
|
||
|
|
||
|
Acknowledgments
|
||
|
===============
|
||
|
|
||
|
A special thanks is in order for all of the students in CS326 who
|
||
|
suffered through about 25 different versions of these tools :-).
|
||
|
|
||
|
Example
|
||
|
=======
|
||
|
|
||
|
Here is a simple example showing a PLY implementation of a calculator with variables.
|
||
|
|
||
|
# -----------------------------------------------------------------------------
|
||
|
# calc.py
|
||
|
#
|
||
|
# A simple calculator with variables.
|
||
|
# -----------------------------------------------------------------------------
|
||
|
|
||
|
tokens = (
|
||
|
'NAME','NUMBER',
|
||
|
'PLUS','MINUS','TIMES','DIVIDE','EQUALS',
|
||
|
'LPAREN','RPAREN',
|
||
|
)
|
||
|
|
||
|
# Tokens
|
||
|
|
||
|
t_PLUS = r'\+'
|
||
|
t_MINUS = r'-'
|
||
|
t_TIMES = r'\*'
|
||
|
t_DIVIDE = r'/'
|
||
|
t_EQUALS = r'='
|
||
|
t_LPAREN = r'\('
|
||
|
t_RPAREN = r'\)'
|
||
|
t_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*'
|
||
|
|
||
|
def t_NUMBER(t):
|
||
|
r'\d+'
|
||
|
try:
|
||
|
t.value = int(t.value)
|
||
|
except ValueError:
|
||
|
print "Integer value too large", t.value
|
||
|
t.value = 0
|
||
|
return t
|
||
|
|
||
|
# Ignored characters
|
||
|
t_ignore = " \t"
|
||
|
|
||
|
def t_newline(t):
|
||
|
r'\n+'
|
||
|
t.lineno += t.value.count("\n")
|
||
|
|
||
|
def t_error(t):
|
||
|
print "Illegal character '%s'" % t.value[0]
|
||
|
t.skip(1)
|
||
|
|
||
|
# Build the lexer
|
||
|
import lex
|
||
|
lex.lex()
|
||
|
|
||
|
# Precedence rules for the arithmetic operators
|
||
|
precedence = (
|
||
|
('left','PLUS','MINUS'),
|
||
|
('left','TIMES','DIVIDE'),
|
||
|
('right','UMINUS'),
|
||
|
)
|
||
|
|
||
|
# dictionary of names (for storing variables)
|
||
|
names = { }
|
||
|
|
||
|
def p_statement_assign(t):
|
||
|
'statement : NAME EQUALS expression'
|
||
|
names[t[1]] = t[3]
|
||
|
|
||
|
def p_statement_expr(t):
|
||
|
'statement : expression'
|
||
|
print t[1]
|
||
|
|
||
|
def p_expression_binop(t):
|
||
|
'''expression : expression PLUS expression
|
||
|
| expression MINUS expression
|
||
|
| expression TIMES expression
|
||
|
| expression DIVIDE expression'''
|
||
|
if t[2] == '+' : t[0] = t[1] + t[3]
|
||
|
elif t[2] == '-': t[0] = t[1] - t[3]
|
||
|
elif t[2] == '*': t[0] = t[1] * t[3]
|
||
|
elif t[2] == '/': t[0] = t[1] / t[3]
|
||
|
|
||
|
def p_expression_uminus(t):
|
||
|
'expression : MINUS expression %prec UMINUS'
|
||
|
t[0] = -t[2]
|
||
|
|
||
|
def p_expression_group(t):
|
||
|
'expression : LPAREN expression RPAREN'
|
||
|
t[0] = t[2]
|
||
|
|
||
|
def p_expression_number(t):
|
||
|
'expression : NUMBER'
|
||
|
t[0] = t[1]
|
||
|
|
||
|
def p_expression_name(t):
|
||
|
'expression : NAME'
|
||
|
try:
|
||
|
t[0] = names[t[1]]
|
||
|
except LookupError:
|
||
|
print "Undefined name '%s'" % t[1]
|
||
|
t[0] = 0
|
||
|
|
||
|
def p_error(t):
|
||
|
print "Syntax error at '%s'" % t.value
|
||
|
|
||
|
import yacc
|
||
|
yacc.yacc()
|
||
|
|
||
|
while 1:
|
||
|
try:
|
||
|
s = raw_input('calc > ')
|
||
|
except EOFError:
|
||
|
break
|
||
|
yacc.parse(s)
|
||
|
|
||
|
|
||
|
|
||
|
|
||
|
|
||
|
|
||
|
|
||
|
|
||
|
|
||
|
|
||
|
|
||
|
|
||
|
|
||
|
|
||
|
|
||
|
|
||
|
|