Yacc(1) reads the grammar specification in the file
filename and generates an LALR(1) parser for it. The parser
consists of a set of LALR(1) parsing tables and a driver routine
written in the C programming language. Yacc(1) normally
writes the parse tables and the driver routine to the file
y.tab.c.
The following options are available:
-b file_prefix
Use file_prefix as the prefix prepended to the output file
names instead of the character y.
-d
Write the header file y.tab.h.
-l
Do not insert #line directives in the generated code.
The #line directives let the C compiler relate errors in the
generated code to the user's original code. Any #line
directives specified by the user will be retained. By default, the
directives are inserted.
-p symbol_prefix
Use symbol_prefix as the prefix prepended to
yacc(1)-generated symbols. The default prefix is the string
yy.
-r
Produce separate files for code and tables. The code file is
named y.code.c, and the tables file is named
y.tab.c.
-t
Change the preprocessor directives generated by yacc(1)
so that debugging statements will be incorporated in the compiled
code.
-v
Write a human-readable description of the generated parser to
the file y.output.
The yacc(1) input file consists of three sections,
separated by a line with just %% in it:
definitions
%%
rules
%%
user code
The definitions section contains declarations; it can also
contain C comments (delimited by /* and */), or a
literal block of C code, copied to the beginning of the generated
file. This literal block usually contains declarations and #include
lines. The following keywords can be used in the definitions
section:
%{...%}
Delimits a literal block of C code. The literal block usually
contains declarations of variables and functions used by the code
in the rules section.
%left operator
Declares an operator left-associative. Operators must be declared
in increasing order of precedence.
%nonassoc operator
Declares an operator non-associative. Operators must be declared in
increasing order of precedence.
%right operator
Declares an operator right-associative. Operators must be declared
in increasing order of precedence.
%start rulename
Declares the rule at which the parser should start parsing. Normally,
this is the first rule in the rules section, but this declaration
explicitly names a different rule.
%token name ...
Defines a symbolic token, a terminal symbol (one that the parser
will not attempt to reduce). Tokens are represented internally by
integer values; you can assign a value directly to a token with the
%token directive, but this is not recommended. Other tokens
are individual characters in single quotes or are defined by
%left, %right, and %nonassoc.
%type <typename> name ...
Declares a non-terminal symbol as having a particular type. The
type name must already have been defined as a member of a %union.
%union
Identifies all possible C types a symbol value can have; this union
is declared as type YYSTYPE in the generated source file. The
format is:
%union {
...field declarations
}
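As a rough illustration of how these declarations fit together, here is a minimal sketch of a complete input file for a small integer calculator. The token NUMBER, the non-terminals line and expr, the union member ival, and the helper ipow are all names invented for this example; the rules and user code are included only so that the file is complete. The <ival> tag on %token is standard yacc syntax for giving a token a value type, although it is not described above.

```yacc
%{
/* Literal block: copied verbatim to the top of the generated C file. */
#include <stdio.h>
#include <ctype.h>

int yylex(void);
int yyparse(void);
void yyerror(const char *msg);

/* Hypothetical helper: integer power for exponents >= 0. */
static int ipow(int base, int exp)
{
    int result = 1;

    while (exp-- > 0)
        result *= base;
    return result;
}
%}

/* Every symbol value is one member of this union (type YYSTYPE). */
%union {
    int ival;
}

%token <ival> NUMBER    /* terminal carrying an int value */
%type  <ival> expr      /* non-terminal carrying an int value */

/* Lowest precedence first; each later line binds tighter. */
%nonassoc '<'
%left '+' '-'
%left '*' '/'
%right '^'

%start line

%%

line : expr '\n'        { printf("%d\n", $1); }
     ;

expr : expr '<' expr    { $$ = $1 < $3; }
     | expr '+' expr    { $$ = $1 + $3; }
     | expr '-' expr    { $$ = $1 - $3; }
     | expr '*' expr    { $$ = $1 * $3; }
     | expr '/' expr    { $$ = $3 ? $1 / $3 : 0; /* avoid dividing by 0 */ }
     | expr '^' expr    { $$ = ipow($1, $3); }
     | '(' expr ')'     { $$ = $2; }
     | NUMBER           /* no action: defaults to { $$ = $1; } */
     ;

%%

/* Minimal hand-written lexer: skips blanks, returns NUMBER for digit
   runs, and returns every other character (including '\n') as itself. */
int yylex(void)
{
    int c;

    while ((c = getchar()) == ' ' || c == '\t')
        ;
    if (c == EOF)
        return 0;               /* 0 tells the parser input has ended */
    if (isdigit(c)) {
        yylval.ival = c - '0';
        while (isdigit(c = getchar()))
            yylval.ival = yylval.ival * 10 + (c - '0');
        ungetc(c, stdin);       /* push back the first non-digit */
        return NUMBER;
    }
    return c;
}

void yyerror(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
}

int main(void)
{
    return yyparse();
}
```

Assuming the file is saved as calc.y (a name chosen here), running yacc calc.y and compiling the resulting y.tab.c should give a program that reads one expression line from standard input. For the input 1+2*3 it should print 7, because '*' is declared after '+' and therefore binds tighter.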
The rules section is the heart of the grammar. Each rule starts
with a non-terminal symbol and a colon, followed by a list of
symbols, literal tokens, and actions. The list can be empty. For
example, this rule states that a time is an hour, a minute, and a
second value, joined by colons:
time: hour ':' minute ':' second ;
(This assumes we've already defined hour, minute, and second.) The
semicolon at the end is traditional, but optional. If consecutive
rules have the same left-hand side, rules after the first can start
with a vertical bar, rather than the name and a colon. This is
somewhat easier to read but the semicolon must be omitted before
the vertical bar.
An action in a rule is a C compound statement to be
executed whenever the parser reaches that point in the grammar. In
the action, the name $$ stands for the symbol on the
left-hand side, and a dollar sign followed by a digit n
stands for the nth symbol on the right-hand side. In the
example above:
time: hour ':' minute ':' second
{ printf("time is %d:%d:%d\n", $1, $3, $5); }
;
If no action is given for a rule, the default action is:
{ $$ = $1; }
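The time rule can be turned into a small complete grammar along the following lines. This is only a sketch under stated assumptions: the token NUMBER, the shorter hour:minute alternative, and the hand-written yylex are additions made for illustration. No %union is declared, so symbol values are plain ints. The second alternative starts with a vertical bar, and the three single-symbol rules rely on the default action.

```yacc
%{
#include <stdio.h>
#include <ctype.h>

int yylex(void);
int yyparse(void);
void yyerror(const char *msg);
%}

%token NUMBER           /* hypothetical token: a run of decimal digits */

%%

time : hour ':' minute ':' second
           { printf("time is %d:%d:%d\n", $1, $3, $5); }
     | hour ':' minute  /* no semicolon before the vertical bar */
           { printf("time is %d:%d\n", $1, $3); }
     ;

hour   : NUMBER ;       /* no action: defaults to { $$ = $1; } */
minute : NUMBER ;
second : NUMBER ;

%%

/* Minimal lexer: skips white space, returns NUMBER for digit runs,
   and returns any other character (such as ':') as itself. */
int yylex(void)
{
    int c;

    while ((c = getchar()) == ' ' || c == '\t' || c == '\n')
        ;
    if (c == EOF)
        return 0;               /* 0 tells the parser input has ended */
    if (isdigit(c)) {
        yylval = c - '0';
        while (isdigit(c = getchar()))
            yylval = yylval * 10 + (c - '0');
        ungetc(c, stdin);       /* push back the first non-digit */
        return NUMBER;
    }
    return c;
}

void yyerror(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
}

int main(void)
{
    return yyparse();
}
```

In each action, $1, $3, and $5 refer to hour, minute, and second; $2 and $4 are the ':' literals, whose values are not used.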
The user code section contains routines called from the
actions. It is copied directly into the C file.
If there are rules that are never reduced, the number of such
rules is reported on standard error. If there are any LALR(1)
conflicts, the number of conflicts is reported on standard
error.
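A grammar like the following sketch is one way to see a conflict report. The token NUMBER and the stub routines are assumptions made for the example. Because '+' is given no associativity with %left or %right, the parser cannot decide, after reading expr '+' expr with another '+' coming, whether to reduce or to keep shifting, so yacc should report one shift/reduce conflict on standard error (the exact wording of the message varies between implementations).

```yacc
%{
#include <stdio.h>

int yylex(void);
int yyparse(void);
void yyerror(const char *msg);
%}

%token NUMBER           /* hypothetical token */

%%

expr : expr '+' expr    /* ambiguous: 1+2+3 can group either way */
     | NUMBER
     ;

%%

/* Placeholder lexer: always reports end of input. It exists only so
   that the generated file links; the grammar is the point here. */
int yylex(void)
{
    return 0;
}

void yyerror(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
}

int main(void)
{
    return yyparse();
}
```

Adding %left '+' to the definitions section resolves the ambiguity in favor of left-to-right grouping, and the conflict should no longer be reported.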