1. Introduction
The BNF package provides two modules (bnf and ebnf) for grammar
descriptions in the BNF and eBNF formats, with the same operators.
The two descriptions are very similar, and a quoting scheme was introduced
in order to allow the definition of any input sequence.
The grammar description can be parsed as an integrity check, and
input sequences can then be tested to check whether they conform to the
given grammar.
Processing each input sequence yields a boolean result (True or False).
The input sequence matching is performed by a simple LL parser with
backtracking.
This implies that not every grammar is accepted, but there is an
equivalent LL grammar that can be used to describe the target language.
Users must adapt their initial grammar accordingly in order to
use this tool.
2. Overview
BNF syntax:
- non-terminals between <>
- rules end at newline \n
- assign with ::=
- operators:
  - alternative derivations separated by |
  - group items between ()
  - optional items between {} or with postfix ? operator
  - zero or more repetitions with postfix * operator
  - one or more repetitions with postfix + operator
  - set of terminal values between []: in set [aeiou], not in set [^aeiou], or ranges [a-z]
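For readers familiar with regular expressions, the postfix and set operators listed above read much like their regex counterparts. The sketch below uses Python's standard re module purely as an analogy; it is not how the package is implemented, and the names ANALOGUES and accepts are illustrative only.

```python
import re

# Regex analogues of the BNF operators (an analogy only, not the package's code).
# Spaces in the BNF rules separate items; they are not part of the input.
ANALOGUES = {
    "a b? c": r"ab?c",        # optional item
    "a b* c": r"ab*c",        # zero or more repetitions
    "a b+ c": r"ab+c",        # one or more repetitions
    "[aeiou]": r"[aeiou]",    # one character in the set
    "[^aeiou]": r"[^aeiou]",  # one character not in the set
    "[a-z]": r"[a-z]",        # one character in the range
}

def accepts(rule: str, text: str) -> bool:
    """Return True when the regex analogue of `rule` matches all of `text`."""
    return re.fullmatch(ANALOGUES[rule], text) is not None

print(accepts("a b+ c", "abbc"))  # True
print(accepts("a b+ c", "ac"))    # False
print(accepts("a b* c", "ac"))    # True
print(accepts("[^aeiou]", "x"))   # True
```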
The BNF compiler uses an LL parser with backtracking:
- no left-recursion: <X> ::= <X> ...
- no sequences such as a+ a
- longest rule first: the rule <X> ::= a | a b must be written as <X> ::= a b | a
- special chars <>(){}[]|+*?:= must each be quoted with \
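The "longest rule first" restriction is easiest to see with a toy matcher. The sketch below assumes that alternatives are tried in order and that the first one matching a prefix of the input is kept; this is an assumption made for illustration, not a description of the package's internals, and match_first and accepts are hypothetical names.

```python
def match_first(alternatives, text):
    """Try each alternative in order and commit to the first matching prefix.

    Returns the number of characters consumed, or None when nothing matches.
    """
    for alt in alternatives:
        if text.startswith(alt):
            return len(alt)  # commit: later alternatives are never tried
    return None

def accepts(alternatives, text):
    """Accept only if the committed alternative consumes the whole input."""
    consumed = match_first(alternatives, text)
    return consumed is not None and consumed == len(text)

# <X> ::= a | a b  : the short alternative wins and "b" is left over
print(accepts(["a", "ab"], "ab"))  # False
# <X> ::= a b | a  : the long alternative is tried first
print(accepts(["ab", "a"], "ab"))  # True
print(accepts(["ab", "a"], "a"))   # True
```

Under that assumption, placing the longer alternative first lets both "ab" and "a" be accepted, whereas the original ordering never reaches the longer one.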
eBNF syntax:
- terminal symbols must be quoted between "": "if"
- rules end with ; not a newline
3. Parsing language descriptions
Grammar example in BNF for a Python tuple of integer literals (tuple.bnf):
<tuple> ::= \( <body> \) | \( \)
<body> ::= <elem> <num> | <elem>
<elem> ::= <num> , <elem> | <num> ,
<num> ::= <dig> <num> | <dig>
<dig> ::= [0-9]
The tuple example in eBNF becomes (tuple.ebnf):
tuple ::= '(' body ')' | '(' ')' ;
body ::= elem num | elem ;
elem ::= num ',' elem | num ',' ;
num ::= dig num | dig ;
dig ::= [0-9] ;
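As an independent cross-check of which strings the tuple grammar should accept, the same language can be written as a regular expression. The sketch below derives it by hand from the rules above and uses only Python's standard re module; TUPLE_RE and is_tuple are illustrative names and are not part of the package.

```python
import re

# Hand-derived from the grammar:
#   num   = [0-9]+
#   elem  = (num ",")+
#   body  = elem num?
#   tuple = "(" body ")" | "(" ")"
TUPLE_RE = re.compile(r"\((?:[0-9]+,)+(?:[0-9]+)?\)|\(\)")

def is_tuple(text: str) -> bool:
    """Return True when the whole of `text` is in the tuple language."""
    return TUPLE_RE.fullmatch(text) is not None

for sample in ["(12,34,)", "(12,34)", "(1,)", "()", "(12)", "(,)", "(12,34,)\n"]:
    print(repr(sample), is_tuple(sample))
```

The first four samples are in the language; "(12)" is rejected because a one-element tuple needs its trailing comma, and the last sample is rejected because of the trailing newline.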
4. Matching input sequences
Test if an input sequence matches the above grammar with:
echo -n "(12,34,)" | python3 -m ebnf tuple.ebnf
The printed result is True if the input sequence is accepted by the
grammar, and False otherwise.
If the input sequence is stored in a file (sequence.txt), pass it as the second argument:
python3 -m ebnf tuple.ebnf sequence.txt
Note: the input sequence must not contain a newline (\n) if the grammar
does not support it (use echo -n).
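Files saved from an editor often end with a newline, which a grammar such as the tuple example would reject. A minimal sketch for producing sequence.txt without one, using only the Python standard library; the helper name write_sequence is hypothetical:

```python
from pathlib import Path

def write_sequence(path: str, sequence: str) -> None:
    """Write `sequence` to `path` with any trailing newline characters removed."""
    # newline="" stops Python from translating line endings on write.
    with open(path, "w", newline="") as handle:
        handle.write(sequence.rstrip("\r\n"))

write_sequence("sequence.txt", "(12,34,)\n")
print(Path("sequence.txt").read_bytes())  # b'(12,34,)'
```

The resulting file can then be passed as the second argument, as in the command above.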
When no arguments are given, the grammar is read from the terminal and,
after a first EOF (end-of-file: ctrl-D on Unix or ctrl-Z on Windows),
the input sequence is read:
prompt$ python3 -m bnf
<x> ::= a b+ c
input sequence: end with EOF (^D) or use ^D^D to end with no EOL
abbc
True
prompt$
Set the environment variable DEBUG=1 for verbose output
(DEBUG=2 for more verbose output):
echo -n "(12,34,)" | DEBUG=1 python3 -m ebnf tuple.ebnf
In the interactive Python interpreter:
>>> from bnf import grammar, parse
>>> grammar("<x> ::= a b+ c\n")
>>> parse("abbc")