Writing a JBurg Specification

Example

tl1.jburg is a simple BURG specification with many comments. Take a moment to study it; the 1.1.1.1 branch is the simplest.

General File Layout

File Naming Conventions

JBurg specifications commonly take the file extension .jburg, e.g., tl1.jburg.

Elements of a JBurg Specification

A JBurg specification is made up of comments, a directives section, a rules section, and a cost functions section.

The sections must be supplied in order, but within a section, the directives, rules, and cost functions may be written in any order. Comments may appear anywhere in the specification.

Comments

JBurg specifications may contain // to-end-of-line style comments
and /* C-style comments. */

JBurg Directives

Required Directives

The required directives supply the Java class types of the BURM's input and output.

INodeType Directive

EBNF	`"INodeType" inode_type:multipart_identifier SEMI`
Usage	One INodeType directive may be specified. If multiple INodeType directives are specified, the last one wins. The INodeType directive specifies the Java class of the input tree.
Example	`INodeType TestINode;`

ReturnType Directive

EBNF	`"ReturnType" return_type:multipart_identifier SEMI`
Usage	One ReturnType directive per specification. Last one wins. The ReturnType directive specifies the type of objects that JBurg expects to be returned by reductions' action code.
Example directive	`ReturnType InstructionList;`
Example action	`return new InstructionList ( new LDC ( cp.addString( "Hello world" ) ) );`

Optional Directives

BURMProperty Directive

EBNF	`"BURMProperty" property_type:multipart_identifier property_name:IDENTIFIER SEMI`
Usage	Optional. Specify as many BURMProperty directives as necessary. Each BURMProperty directive inserts a property into the generated BURM. A property consists of a private field of the specified type, a getProperty_name() method, and a setProperty_name() method. To form the method names, the property name is lower-cased, its first character is upper-cased, and the literal "get" or "set" is prepended; for example, the property classGen yields getClassgen() and setClassgen().
Example JBurg directive	`BURMProperty org.apache.bcel.generic.ClassGen classGen;`
Example code that sets the property	`ClassGen classGen = new ClassGen ( className, superclassName, inputName, ACC_PUBLIC | ACC_FINAL | ACC_SUPER, null ); MyReducer emitter = new MyReducer(); emitter.setClassgen ( classGen );`
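
The naming rule described above can be sketched in plain Java. This is only an illustration of the rule as this page states it, not JBurg's own implementation; the class `PropertyNames` and the method `accessorName` are hypothetical names, and the sketch assumes a non-empty property name.

```java
import java.util.Locale;

/** Hypothetical helper that mirrors the accessor-naming rule described above. */
final class PropertyNames {
    /**
     * Lower-cases the property name, upper-cases its first character,
     * and prepends the given prefix ("get" or "set").
     * Assumes propertyName is not empty.
     */
    static String accessorName(String prefix, String propertyName) {
        String lower = propertyName.toLowerCase(Locale.ROOT);
        return prefix + Character.toUpperCase(lower.charAt(0)) + lower.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(accessorName("get", "classGen")); // getClassgen
        System.out.println(accessorName("set", "classGen")); // setClassgen
    }
}
```

Note that the interior capital in `classGen` is lost, which is why the example above calls `setClassgen` rather than `setClassGen`.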

Header Directive

EBNF	`"header" { Java code }`
Usage	Optional, but usually present. One per specification; if multiple header directives are present, the last one wins. The header directive supplies a block of Java code that is copied verbatim (including braces) into the generated BURM's Java source file, before the `class` definition. This allows the BURG specification to pass in `import` directives, etc.
See also	Package directive
Example	`header { /** This code generator recognizes assignment statements, simple expressions, and function calls. */ /* We're using the BCEL bytecode library in this example. */ import org.apache.bcel.generic.*; }`

Implements Directive

EBNF	`"implements" implements_interface_name:multipart_identifier SEMI`
Usage	As many implements directives may be specified as necessary. The implements directive specifies an interface that the generated BURM is to implement. This is often necessary to give the BURM multiple interfaces that contain the symbolic names of the INodes' operators.
Example	`implements MyParserConstants; implements MyNodeTypes; implements org.apache.bcel.Constants;`

Package Directive

EBNF	`"package" package_name:multipart_identifier SEMI`
Usage	One Package directive per specification; last one wins. The Package directive specifies the name of the package (if any) that is to contain the generated BURM.
See also	Header directive. The package name can also be coded into the header block; recommended practice is to use the Package directive, in case future versions of JBurg learn specialized processing procedures for a specific package.
Example	`package jburg.test.tl1.reducer;`

Rules Section

Each rule produces a goal. A goal is similar to a reduction in a parser generator, in reverse: a successful goal replaces the input AST with an output object. In the case of a code generator, the output is usually a code fragment, e.g., a BCEL InstructionList or an ABC InstructionList.

Each rule is also associated with a cost. In most cases, a cost is a simple integer; it can also be computed via a function, with the AST as a parameter.

Pattern-Matching Rules

EBNF	`IDENTIFIER EQUALS operator_specification cost_specification { Java code }`
Usage	This rule reduces an AST node of a particular kind (NODE_KIND) with zero, one, or two subgoals. The subgoals are similar to non-terminals in a parser generator.
Example 1	`int = PLUS(int i1, int i2): 1 { /* code to add two int values */ }` This pattern-matching rule produces an int, given a PLUS node with two children that can both satisfy the "int" goal.
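
To make the role of the cost concrete, here is a minimal sketch of how the total cost of the "int" goal could be accumulated bottom-up: the rule's own cost plus the cost of satisfying each subgoal. It is not code that JBurg generates. The class `PatternCostSketch`, the `Node` type, the operator codes, and the cost of 1 for the literal rule are all assumptions made for illustration.

```java
/** Hypothetical sketch: bottom-up cost labelling for the "int" goal. */
final class PatternCostSketch {
    // Assumed operator codes; a real specification takes these from the parser's node types.
    static final int INTEGER_LITERAL = 1;
    static final int PLUS = 2;

    /** Minimal stand-in for an input tree node with at most two children. */
    static final class Node {
        final int operator;
        final Node left;
        final Node right;

        Node(int operator, Node left, Node right) {
            this.operator = operator;
            this.left = left;
            this.right = right;
        }
    }

    static Node literal() {
        return new Node(INTEGER_LITERAL, null, null);
    }

    static Node plus(Node left, Node right) {
        return new Node(PLUS, left, right);
    }

    /** Cost of reducing the node to "int": the rule's cost plus the cost of each subgoal. */
    static int costOfInt(Node n) {
        switch (n.operator) {
            case INTEGER_LITERAL:
                return 1; // assumed cost for: int = INTEGER_LITERAL(void)
            case PLUS:
                // int = PLUS(int i1, int i2): 1
                return 1 + costOfInt(n.left) + costOfInt(n.right);
            default:
                throw new IllegalArgumentException("no rule produces int from operator " + n.operator);
        }
    }

    public static void main(String[] args) {
        Node tree = plus(literal(), plus(literal(), literal()));
        System.out.println(costOfInt(tree)); // 5
    }
}
```

When several rules can produce the same goal from a node, the BURM keeps the candidate with the lowest total; this sketch has only one rule per operator, so no comparison is needed.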

Terminal Rules

EBNF	`IDENTIFIER EQUALS IDENTIFIER LPAREN VOID RPAREN { Java code }`
Usage	These are trivial pattern rules, where the pattern consists solely of the leaf.
Example	`int = INTEGER_LITERAL(void) { code to implement integer literal }`

Simple Transformation Rules

EBNF	`IDENTIFIER EQUALS IDENTIFIER SEMI`
Usage	Transformation rules allow the code generator to use one goal to satisfy another goal.
Example	`numeric_value = int;`
	This transformation tells the BURM that the "int" goal can satisfy the "numeric_value" goal. Since this is a simple transformation, the cost will be carried over from the int goal.

Complex Transformation Rules

EBNF	`IDENTIFIER EQUALS IDENTIFIER cost_specification { Java code }`
Usage	This transformation rule also allows the code generator to use one goal to satisfy another, but specifies some additional processing that accomplishes the transformation. The cost spec should only consider the cost of the transformation code; the code generator will add in the cost of the original node.
Example	`int = numeric_value : 1 { code to convert an arbitrary number to an int}`
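
The difference between the two kinds of transformation comes down to how the cost is derived. The sketch below restates that arithmetic in Java; the class `TransformationCostSketch` and its method names are hypothetical and are not part of JBurg.

```java
/** Hypothetical sketch of the cost arithmetic for transformation rules. */
final class TransformationCostSketch {
    /** Simple transformation, e.g. numeric_value = int; the source goal's cost is carried over. */
    static int simpleTransformation(int sourceGoalCost) {
        return sourceGoalCost;
    }

    /**
     * Complex transformation, e.g. int = numeric_value : 1 { ... };
     * the rule's own cost is added to the cost of the source goal.
     */
    static int complexTransformation(int sourceGoalCost, int transformationCost) {
        return sourceGoalCost + transformationCost;
    }

    public static void main(String[] args) {
        int intCost = 3; // assumed cost of having produced the "int" goal
        int numericValueCost = simpleTransformation(intCost);
        System.out.println(numericValueCost);                           // 3
        System.out.println(complexTransformation(numericValueCost, 1)); // 4
    }
}
```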

Cost Functions Section

EBNF	`IDENTIFIER LPAREN RPAREN { Java code }`
Usage	Cost functions are Java code that returns an int value. The value is used to compute the cost of a particular candidate reduction. The BURM searches for the lowest total cost sequence of reductions to rewrite an input subtree, so low values mean "good cost," higher values mean "less desirable." The cost function has a single implicit parameter, `p`, the input node that is to be analyzed.
Example	`/** @return a low cost (1) if the given node's int value is within the range representable in a signed byte, otherwise a prohibitively high cost. */ canBIPUSH() { int v = p.intValue(); return (v >= -128 && v <= 127) ? 1 : 1000000; }`
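
For readers who want to see why a very large return value works as a veto, the following sketch rewrites the cost function as an ordinary Java method with an explicit parameter and compares it against a competing rule. Everything here is illustrative: the `IntNode` interface stands in for the specification's INode type, `chooseRule` and the fixed cost of 2 for an LDC-based rule are assumptions, and the range check uses the signed byte range because the JVM's `bipush` instruction takes a signed byte operand.

```java
/** Hypothetical sketch: a cost function steering the choice between two rules. */
final class CostFunctionSketch {
    /** Stand-in for the INode type; only intValue() is needed here. */
    interface IntNode {
        int intValue();
    }

    /** Low cost if the value fits in a signed byte, otherwise a prohibitively high cost. */
    static int canBIPUSH(IntNode p) {
        int v = p.intValue();
        return (v >= Byte.MIN_VALUE && v <= Byte.MAX_VALUE) ? 1 : 1000000;
    }

    /** Picks the cheaper of two assumed rules for an integer literal. */
    static String chooseRule(IntNode p) {
        int bipushCost = canBIPUSH(p);
        int ldcCost = 2; // assumed fixed cost of a rule that loads the constant with LDC
        return bipushCost <= ldcCost ? "BIPUSH" : "LDC";
    }

    public static void main(String[] args) {
        System.out.println(chooseRule(() -> 100));   // BIPUSH
        System.out.println(chooseRule(() -> 70000)); // LDC
    }
}
```

The cost function never rejects a rule outright; it only makes the rule so expensive that any alternative wins the lowest-cost comparison.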
