Index

EPP 1.1.0 Architecture Overview


Required Knowledge

To understand the EPP documents, you need basic knowledge of the Java language and of compilers. Specifically, you must understand the following terms.
	pass
	recursive descent parser
	token
	literal
	AST (Abstract Syntax Tree)
	non-terminal
	bootstrap
Expert-level knowledge should not be required, however.

In addition, the following concepts, which are familiar to Lisp programmers, appear.

macro, immutable object, symbol, S-expression, backquote macro, dynamic variable

I will explain these concepts as far as I can, so that readers without Lisp experience can understand them.

EPP Description Language, Ld-2

The source code of EPP is written in the Java language extended with EPP. Specifically, the following five plug-ins are used. The source code for these five plug-ins is itself written using the same five plug-ins. In addition, there is Common Lisp source code for the EPP main portion, as well as source code for the five plug-ins. Bootstrapping from these lets EPP work even if you only have Java.

Of the five plug-ins, the SystemMixin plug-in is especially important. It implements a new object-oriented language called Ld-2 on top of Java.

(Application Programs)
EPP Plug-ins
EPP Core / Java Grammar definition
Object-Oriented Language Ld-2
JavaVM

The Ld-2 language has a special inheritance mechanism that differs from that of the Java language. With this mechanism, a single class can be divided into multiple components called "mixins", each of which can be described separately. A class is built by merging multiple mixins.

In the current implementation, Ld-2 classes created by merging mixins are not compatible with classes of the Java language, and are defined using a different syntax. The syntax for calling methods is also different.
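
The idea of assembling one class from separately described mixins can be approximated in plain Java. The sketch below is only an analogy and is not Ld-2 syntax: the names MixinSketch, Processor, Mixin, merge, TAG_A, and TAG_B are all hypothetical, and each mixin is modeled as a function that wraps the behavior built so far.

```java
import java.util.List;
import java.util.function.UnaryOperator;

/** Plain-Java analogy for building one class from several mixins (NOT Ld-2 syntax). */
final class MixinSketch {
    /** The behavior being assembled: a single method that every mixin may extend. */
    interface Processor {
        String process(String source);
    }

    /** A mixin receives the behavior built so far and returns an extended version. */
    interface Mixin extends UnaryOperator<Processor> {}

    /** Merges mixins in list order; each later mixin wraps the earlier ones. */
    static Processor merge(Processor base, List<Mixin> mixins) {
        Processor result = base;
        for (Mixin m : mixins) {
            result = m.apply(result);
        }
        return result;
    }

    // Two hypothetical mixins. Each calls the previous definition, then adds its own step.
    static final Mixin TAG_A = next -> source -> next.process(source) + " [A]";
    static final Mixin TAG_B = next -> source -> next.process(source) + " [B]";

    public static void main(String[] args) {
        Processor standard = source -> source;   // stand-in for the base behavior
        Processor merged = merge(standard, List.of(TAG_A, TAG_B));
        System.out.println(merged.process("src")); // prints "src [A] [B]"
    }
}
```

Changing the list passed to merge yields a differently configured object without editing any of the individual pieces, which is the property the mixin mechanism is meant to provide.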


EPP Main Routine

When EPP is invoked, a program called the EPP main routine runs. It is a Java class with the following name.
      jp.go.etl.epp.epp.Epp 
The EPP main routine builds a separately configured EPP preprocessor for each file to be processed, and invokes each of those preprocessors.

An EPP preprocessor is an Ld-2 class created by merging multiple mixins. A preprocessor for a specific file is built by creating an Ld-2 class combining the mixin for the plug-in that was specified at the beginning of the file, and the "mixin that defines the standard preprocessor".

Plug-ins can only extend the behavior of EPP preprocessors. They cannot extend the behavior of the EPP main routine. However, by creating a subclass of the jp.go.etl.epp.epp.Epp Java class, you can create a customized EPP main routine.

EPP executes the translation process on a file-by-file basis by default. However, if you specify the -global option, EPP goes into Global Processing Mode and will process all files globally.


EPP Preprocessor

When the EPP preprocessor is invoked from the EPP main routine, its initialization method is called. After that, the input file is processed through the following four passes: the parsing pass, the macro expansion pass, the type checking pass, and the code emitting pass.

The parsing pass calls the lexical analyzer as required. The lexical analyzer reads character by character from EppInputStream, the class that serves as the input stream for EPP.

Plug-ins can extend EppInputStream, the lexical analyzer, the parsing pass, the macro expansion pass, the type checking pass, and the code emitting pass.

You can also add passes before and after the four passes. For further information on adding passes, refer to EPP Preprocessor Core.
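
The pass structure described above can be pictured as an ordered list of steps to which extra steps are attached at either end. The following is a minimal sketch under stated assumptions: PassPipelineSketch, Pass, tracing, addFirst, and addLast are hypothetical names and are not part of the EPP API, and each stand-in pass only records that it ran instead of doing real work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical model of a pass pipeline; none of these names come from EPP itself. */
final class PassPipelineSketch {
    /** A named step that transforms the data handed over by the previous step. */
    static final class Pass {
        final String name;
        final UnaryOperator<String> body;

        Pass(String name, UnaryOperator<String> body) {
            this.name = name;
            this.body = body;
        }
    }

    private final List<Pass> passes = new ArrayList<>();

    PassPipelineSketch() {
        // The four standard passes, in the order given in the text.
        for (String name : List.of("parse", "macro-expand", "type-check", "emit")) {
            passes.add(tracing(name));
        }
    }

    /** Builds a stand-in pass that appends its own name to a trace string. */
    static Pass tracing(String name) {
        return new Pass(name, trace -> trace + name + ";");
    }

    /** Adds a pass that runs before the four standard passes. */
    void addFirst(Pass pass) {
        passes.add(0, pass);
    }

    /** Adds a pass that runs after the four standard passes. */
    void addLast(Pass pass) {
        passes.add(pass);
    }

    /** Runs every pass in order and returns the accumulated trace. */
    String run() {
        String data = "";
        for (Pass pass : passes) {
            data = pass.body.apply(data);
        }
        return data;
    }

    public static void main(String[] args) {
        PassPipelineSketch pipeline = new PassPipelineSketch();
        pipeline.addFirst(tracing("before"));
        pipeline.addLast(tracing("after"));
        System.out.println(pipeline.run());
        // prints "before;parse;macro-expand;type-check;emit;after;"
    }
}
```

In the real preprocessor the data flowing between passes is an abstract syntax tree rather than a string; the string trace is used here only to make the execution order visible.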


Data Structure

Within the preprocessor, three data structures are particularly important: those that describe tokens, abstract syntax trees, and types.

Tokens are the data returned by the lexical analyzer.

The abstract syntax tree is created by the parsing pass, then translated by the type checking pass, and then passed to the code emitting pass, which converts it to character strings that are finally written to the output file.

The abstract syntax tree has nodes that have type information and those that do not. The type checking pass converts an abstract syntax tree without type information into an abstract syntax tree with type information.

The three data types mentioned above are all immutable objects. That is, you cannot modify their internal state from within a program.

Plug-ins cannot define subclasses of the classes that describe tokens or abstract syntax trees. The data structures for tokens and abstract syntax trees are very versatile, so new tokens and syntaxes can be expressed without adding new subclasses.
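
To illustrate how one immutable, general-purpose node type can represent new syntax without subclassing, here is a minimal sketch. It assumes an S-expression-like shape (a tag plus child nodes); the names TreeSketch, Node, withChild, and the "unless" tag are hypothetical and do not reflect the actual EPP classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical immutable tree; the real EPP token and tree classes may differ. */
final class TreeSketch {
    /** A node is a tag plus children. No field can change after construction. */
    static final class Node {
        private final String tag;
        private final List<Node> children;

        Node(String tag, Node... children) {
            this.tag = tag;
            // Defensive copy, so later changes to the caller's array cannot leak in.
            this.children = Collections.unmodifiableList(new ArrayList<>(Arrays.asList(children)));
        }

        String tag() {
            return tag;
        }

        List<Node> children() {
            return children;
        }

        /** "Modification" returns a new node and leaves this one untouched. */
        Node withChild(int index, Node replacement) {
            List<Node> copy = new ArrayList<>(children);
            copy.set(index, replacement);
            return new Node(tag, copy.toArray(new Node[0]));
        }

        @Override
        public String toString() {
            if (children.isEmpty()) {
                return tag;
            }
            StringBuilder sb = new StringBuilder("(").append(tag);
            for (Node child : children) {
                sb.append(' ').append(child);
            }
            return sb.append(')').toString();
        }
    }

    public static void main(String[] args) {
        // A new construct is just a new tag; no new subclass is needed.
        Node unless = new Node("unless", new Node("done"), new Node("return"));
        Node changed = unless.withChild(0, new Node("ready"));
        System.out.println(unless);  // prints "(unless done return)"
        System.out.println(changed); // prints "(unless ready return)"
    }
}
```

Because nodes never change, a tree can be shared freely between passes; a pass that transforms the tree builds new nodes and reuses the untouched subtrees.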


The Principle of the Extendable Parser

The parser is basically written in recursive descent style. By adding a mixin that extends a parser method, you can add a new alternative to a non-terminal. By backtracking and performing context-sensitive processing, the parser can handle grammars that are not LL(1). It can also handle left-recursive rules.
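
The idea of adding an alternative to an existing non-terminal can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the EPP parser: ExtensibleParserSketch, Alternative, NonTerminal, addAlternative, and the "unless" statement are hypothetical, the alternatives are kept in a list instead of being merged through mixins, and neither left recursion nor context-sensitive processing is shown. Backtracking here simply means that every alternative is tried again from the same saved position.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical recursive descent parser whose non-terminals accept new alternatives. */
final class ExtensibleParserSketch {
    /** One alternative: returns the position just after a match, or -1 on failure. */
    interface Alternative {
        int parse(List<String> tokens, int pos);
    }

    /** A non-terminal that tries its alternatives in order, newest first. */
    static final class NonTerminal {
        private final List<Alternative> alternatives = new ArrayList<>();

        void addAlternative(Alternative alternative) {
            alternatives.add(0, alternative);
        }

        int parse(List<String> tokens, int pos) {
            for (Alternative alternative : alternatives) {
                // Each attempt starts from pos again, which is the backtracking step.
                int end = alternative.parse(tokens, pos);
                if (end >= 0) {
                    return end;
                }
            }
            return -1;
        }
    }

    /** True if the token at pos exists and equals the expected text. */
    static boolean at(List<String> tokens, int pos, String expected) {
        return pos < tokens.size() && tokens.get(pos).equals(expected);
    }

    /** Standard grammar: statement ::= "return" ";" */
    static NonTerminal standardStatement() {
        NonTerminal statement = new NonTerminal();
        statement.addAlternative((t, p) -> at(t, p, "return") && at(t, p + 1, ";") ? p + 2 : -1);
        return statement;
    }

    /** Hypothetical extension: statement ::= "unless" statement */
    static void installUnless(NonTerminal statement) {
        statement.addAlternative((t, p) -> at(t, p, "unless") ? statement.parse(t, p + 1) : -1);
    }

    public static void main(String[] args) {
        NonTerminal statement = standardStatement();
        List<String> tokens = List.of("unless", "return", ";");
        System.out.println(statement.parse(tokens, 0)); // prints -1: "unless" is unknown
        installUnless(statement);
        System.out.println(statement.parse(tokens, 0)); // prints 3: all tokens consumed
    }
}
```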

For further information, please refer to the following paper (in Japanese):
"高いモジュラリティと拡張性を持つ構文解析器" ("A Parser with High Modularity and Extensibility")


Related Information

For further information regarding writing plug-ins, please refer to the following.
Error Handling
Dynamic Variables



The Role of Each Source File

The source code for the EPP main portion is located under epp/src/level0/epp in the distribution package. The major functions of the files are shown below.
Epp.java		EPP main routine definition
EppCore.java		EPP preprocessor main portion
EppInputStream.java	EPP input stream
Lex.java		lexical analyzer
CompUnit.java		Java program top level syntax definition
TypeDecl.java		class, interface, method, field syntax definition
Statement.java		statement syntax definition
ExpNonTerm.java		non-terminal definition related to expressions
Exp.java		expression syntax definition
TypeSystem.java		definition of the Type class and definition of type semantics
TypeCheck.java		definition of the standard Java type checking object
FileSig.java		definition of class types and separate compilation processing
