1. Introduction
This document provides some insight into the eBNF parser developed for the
ebnf module.
The high-level functions are called externally by the user.
The other routines are documented to make the code easier to understand and to facilitate further changes.
2. High-level BNF functions
The main routine is the top-level user routine.
The bnf routine is the actual processing routine,
the parse routine matches the input sequence to a given grammar,
the dump routine prints the parsed grammar and its start symbol for debugging, and
the grammar routines are used to build a grammar from different input formats.
2.1 Debug
The ebnf package already includes an extensive debug mode to help
developers.
A debug variable can be set to values ranging from 0, which produces no
debug information, to 5, the highest debug level.
pyburg.debug=0 # no debug
A debug value higher than 0 reports errors while processing the grammar,
reports that no match was found for the start symbol while processing the
input, and prints the final cost of the tree.
A debug value higher than 1 also prints the reduced rules and reports a
missing goal variable or a failure to produce a grammar from the input
arguments.
A debug value higher than 2 includes labeling information about tree nodes
and rules.
A debug value higher than 3 reports costs.
A debug value higher than 4 prints reduce state information and closure setup.
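The thresholds above can be pictured as a level-gated trace helper. The following is only a minimal sketch: the helper name trace and the sample messages are hypothetical and are not taken from the module's source; only the "higher than" comparison follows the description above.

```python
import sys

debug = 0  # assumed module-level setting: 0 is silent, 5 is the most verbose


def trace(threshold, message, stream=sys.stderr):
    """Emit message only when debug is strictly higher than threshold."""
    if debug > threshold:
        print(message, file=stream)
        return True
    return False


debug = 2
trace(0, "error: no match for start symbol")  # emitted, since 2 > 0
trace(1, "reduced rule: expr -> term")        # emitted, since 2 > 1
trace(2, "label: node 3")                     # suppressed, since 2 > 2 is false
```

Comparing against a single integer keeps each level a superset of the lower ones, which matches the cumulative wording of the list above.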
2.2 Main
The main(argv) function is the high-level function called when the
module is invoked directly.
When the list, or tuple, contains two arguments, the first
is taken as the grammar filename, whose contents are processed, and the
second as an input data file.
The routine processes the input, given the grammar, and exits the process
with code 0 (zero) if the input is accepted by the grammar, or with
code 2 (two) if the input is rejected by the grammar.
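The exit-code convention can be sketched as follows. This is an illustration under assumptions, not the module's code: run and matcher are hypothetical names, argv is assumed to hold only the two filenames (no program name), and the behaviour for any other argument count is not described in this document, so the sketch returns None in that case.

```python
import sys


def run(argv, matcher):
    """Return the exit status for a two-element argv, or None otherwise.

    matcher is a hypothetical callable taking (grammar_file, input_file)
    and returning True when the input is accepted by the grammar.
    """
    if len(argv) != 2:
        return None  # other argument counts are outside this sketch
    grammar_file, input_file = argv
    accepted = matcher(grammar_file, input_file)
    return 0 if accepted else 2  # 0 = accepted, 2 = rejected


def main(argv, matcher):
    """Exit the process with the status computed by run."""
    status = run(argv, matcher)
    if status is not None:
        sys.exit(status)
```

Keeping the status computation in run, separate from the call to sys.exit, lets the convention be checked without terminating the interpreter.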
2.3 BNF
The bnf(filename, data, debug) function matches an input data
sequence to a grammar, given the grammar's filename.
The optional debug parameter activates a multi-level verbose mode.
def bnf(filename, data, debug=False)
2.4 parse
The parse(data, gram, nterm) function matches an input data
sequence to a grammar (gram), given a starting nonterminal symbol, nterm.
When invoked externally, the nonterminal (nterm) should be the grammar's
start symbol.
Internally, however, the routine is invoked recursively for every potential
nonterminal, with the input data sequence adjusted accordingly.
If no gram or nterm is given, the values previously returned by the bnf
routine are used.
The routine uses a global variable, recurs, to keep track of unlimited
recursion and is therefore not reentrant.
def parse(data, gram=None, nterm=None)
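The recursive matching described above can be illustrated with a small backtracking matcher. This is a sketch under stated assumptions and not the module's implementation: the function names, the depth limit and the (tag, text) symbol encoding ("T" for terminal, "N" for nonterminal) are hypothetical, and the sketch tracks positions in the input instead of adjusting the data sequence itself.

```python
RECURSION_LIMIT = 50  # hypothetical cut-off for runaway recursion
recurs = 0            # global depth counter; sharing it makes the sketch non-reentrant


def ends(data, pos, gram, nterm):
    """Return every position where nterm can stop matching data from pos."""
    global recurs
    if recurs >= RECURSION_LIMIT:
        return set()  # give up on this branch instead of recursing forever
    recurs += 1
    try:
        found = set()
        for rule in gram[nterm]:
            positions = {pos}
            for tag, text in rule:
                step = set()
                for p in positions:
                    if tag == "T":
                        if data.startswith(text, p):
                            step.add(p + len(text))
                    else:
                        step |= ends(data, p, gram, text)  # recurse on nonterminal
                positions = step
                if not positions:
                    break  # this rule cannot match; try the next one
            found |= positions
        return found
    finally:
        recurs -= 1


def matches(data, gram, nterm):
    """True when nterm derives the whole of data."""
    return len(data) in ends(data, 0, gram, nterm)


gram = {
    "expr": [[("N", "term"), ("T", "+"), ("N", "expr")], [("N", "term")]],
    "term": [[("T", "a")], [("T", "b")]],
}
print(matches("a+b", gram, "expr"))
print(matches("a+", gram, "expr"))
```

Because the depth counter is a module-level global, two interleaved calls would corrupt each other's count, which is the same reentrancy limitation noted above.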
2.5 dump
The dump(gram, start) function is a debug routine that prints the
parsed grammar and start symbol.
If no gram or start is given, the values previously returned by the bnf
routine are used.
def dump(gram=None, start=None)
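As an illustration of what such a debug printout might look like, here is a minimal sketch. The output layout, the name dump_sketch and the (tag, text) symbol encoding are assumptions made for this example; the actual output format of dump is not specified in this document.

```python
def dump_sketch(gram, start):
    """Print the start symbol and one line per nonterminal, and return the text."""
    lines = ["start: " + start]
    for nterm, rules in gram.items():
        alternatives = [
            # terminals are shown quoted, nonterminals by name
            " ".join(repr(text) if tag == "T" else text for tag, text in rule)
            for rule in rules
        ]
        lines.append(nterm + " : " + " | ".join(alternatives))
    text = "\n".join(lines)
    print(text)
    return text


dump_sketch({"term": [[("T", "a")], [("T", "b")]]}, "term")
```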
2.6 grammar
The grammar(data) function builds a grammar structure and
determines its start symbol, given its textual description
data as a character string.
A grammar structure is a Python dictionary whose keys are nonterminal
symbols, as strings, and whose values are Python lists of rules.
Each rule is a Python list of terminal and nonterminal symbols,
tagged by type and represented as strings.
def grammar(data)
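To make the structure concrete, the sketch below builds such a dictionary from a small textual description. Everything specific here is an assumption for illustration: the input syntax (one production per line, alternatives separated by |, terminals in single quotes), the (tag, text) pair used as the type tag, the choice of the first production as the start symbol, and the name build_grammar. The module's real input formats and tagging scheme may differ.

```python
def build_grammar(text):
    """Return (gram, start) for a description in the assumed one-line-per-rule syntax."""
    gram, start = {}, None
    for line in text.splitlines():
        if not line.strip():
            continue
        head, _, body = line.partition(":")
        head = head.strip()
        if start is None:
            start = head  # assumption: the first production names the start symbol
        rules = gram.setdefault(head, [])
        for alternative in body.split("|"):
            rule = []
            for symbol in alternative.split():
                quoted = len(symbol) >= 2 and symbol[0] == "'" and symbol[-1] == "'"
                if quoted:
                    rule.append(("T", symbol[1:-1]))  # terminal, quotes stripped
                else:
                    rule.append(("N", symbol))        # nonterminal
            rules.append(rule)
    return gram, start


gram, start = build_grammar("expr : term '+' expr | term\nterm : 'a' | 'b'\n")
print(start)
print(gram["term"])
```

The sketch splits on ":" and "|" naively, so terminals containing those characters or spaces are not handled; a real reader would need a proper tokenizer.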
3. High-level eBNF functions
The eBNF parser uses the same structure as the BNF parser, and
the routines have the same names.
The internal grammar representation is the same; only the
syntactic sugar is different.
The module containing the routines is called ebnf.py.