The Minimal Imperative Language XIM
We can regard an XIM program as an abstract syntax tree representation of a program written in a high-level concrete imperative language with variables of type float, expressions involving arithmetic operators, the usual Boolean expressions, an assignment statement, one conditional construct (if-then-else), and one iteration construct (while). The formal syntax of XIM is specified using an XML Schema (available electronically; see "Resource Center," page 5). XIM constructs are straightforward. Every statement element in XIM has a @seq attribute, which acts like the symbolic address of the element. The @next attribute gives the address of the next instruction to be executed. In <while> and <if> constructs, attributes @true_next (@false_next) determine control flow when the condition is true (false). The values of these attributes are automatically generated, using XSLT, before execution starts. Listing One is an XIM program that computes 5!, demonstrating the use of assignment and while constructs. Listing Two is the same program in pseudocode.
<?xml version="1.0"?> <program> <vars> <var_declare name="fact"> 1 </var_declare> <var_declare name="last"> 0 </var_declare> <var_declare name="nb"> 5 </var_declare> </vars> <main> <assign varn="last"> <var_use name="nb"/> </assign> <while> <condition> <boolop opname="gt"> <var_use name="last"/> <num> 1</num> </boolop> </condition> <statement_list> <assign varn="fact"> <op opname="*"> <var_use name="fact"/> <var_use name="last"/> </op> </assign> <assign varn="last"> <op opname="-"> <var_use name="last"/> <num> 1</num> </op> </assign> </statement_list> </while> <end/> <!-- program termination --> </main> </program>
var fact <-- 1 var last <-- 0 var nb <-- 5 begin last <-- nb while (last>1) do fact <-- fact*last last <-- last-1 end while end
Operational Semantics of XIM in XSLT
Operational semantics specify the meaning of a program in a source language by giving rules for how the state of an abstract (or real) machine changes for each construct of the source language. The states of the abstract machine can be represented as < code, memory > pairs (called "configurations"), where code represents the remaining computation, and memory is the current contents of the memory of the abstract machine. A transition function specifies the transition from one configuration of the machine to the next. The transition function defines the operational semantics of the language and can also act as an interpreter for it. Using XSLT for specifying the operational semantics of programming languages, a configuration consists of the memory, program, and current value of the "program counter," all inside an XML document. The transition function is specified as XSLT templates.
An XSLT stylesheet implements the operational semantics of XIM. Before execution starts, the stylesheet transforms the XIM program to make it ready for interpretation. First the "program counter" is introduced as a variable in the program, and symbolic addresses are inserted into instructions as @seq attributes. Then the values in the @next, @true_next, and @false_next attributes of the instructions are determined and inserted into the code. These attributes specify the address of the next instruction to execute, as in attribute grammars used in syntax-directed compilation. After the initialization stage, other templates are applied to the transformed XIM program to execute it. The application of templates is an iterative process, which stops when the next instruction is the <end> element.
The top-level template (available electronically) of the stylesheet matches the root and performs the application of templates for initialization and interpretation. Templates for inserting the program counter and calling the Sequencer template (available electronically) insert the "program counter" as an XIM variable named PC and symbolic addresses as @seq attributes. The Sequencer template (available electronically) achieves the insertion of symbolic addresses through the use of the <number> element of XSLT. Values in the @next, @true_next, and @false_next attributes are then inserted by templates that perform a case-by-case analysis of a given node to determine its type, inspect its position relative to other instructions, and set the control flow information as jump addresses accordingly. The template for the implementation of the iterative steps is Interpret and the template for determining the type of current instruction and calling the corresponding template for its execution is Execute; both templates are available electronically. The Assignment template (also available electronically) receives an assignment statement as an XML subtree in the $c parameter. The template also has access to the XML document as a whole via the $prg parameter. The name of the program variable to whom a new value is to be assigned is copied into the XSLT variable $varname. The whole <memory> element of the XML document is recreated to reflect the updated value of the variable, as well as the updated value of the "program counter" PC variable. The template Construct (available electronically) handles the execution of <while> and <if> statements. The instruction, which is an <if> or <while> construct, comes into parameter $c and the code of the program as a whole comes in parameter $prg. The truth value of the condition is determined and assigned to the XSLT variable $condition. Then the <memory> element, with all its children except the PC, is copied. The value of the PC variable is updated depending on the truth value of $condition. When it is "true," the value of the @true_next attribute of the <condition> child of the context node is assigned to the PC variable. Otherwise, the value of the @false_next attribute is assigned to it.
The Evaluate template (available electronically), takes a parameter $n holding an XML subtree representing an expression. Figures 1 and 2 are arithmetic and Boolean expressions, respectively. This template also has access to the whole XIM program in the parameter $p. It first checks whether the incoming parameter is a numerical literal (<num>), a variable (<var_use>), or a complex expression (<op> or <boolop>). If the incoming expression in parameter $n is a <num>, it returns the content of the <num> element. If it is a <var_use> element (an r-value), the value of the <var_declare> element referenced by <var_use> is returned. If the incoming expression is an <op> or <boolop>, the type of operation required is determined from the @opname attribute. The possible values in the @opname attribute for the <op> element are *, +, -, /, intdiv, and mod, while the options for <boolop> are or, and, not, lt, gt, eq, ne, ge, and le, which have their conventional meanings.
Boolean expressions are evaluated fully in the same manner as arithmetic expressions (see the code fragment from template Evaluate that handles Boolean expressions, available electronically).