The Parasol Programming Language
An object-oriented language that supports network and parallel computing
Robert Jervis
Bob is an independent consultant and can be reached at Wizard Consulting Services Inc., 17645 Via Sereno, Monte Sereno, CA 95030 or [email protected].
A programming language is a manifesto from its creator declaring what's good and bad in programming. The good becomes a feature, the bad an error.
I created Parasol (Parallel Systems Object Language) to implement an operating system. In 1982 I tried to build an extensible version of UNIX, using a spare PDP-11 at work after hours. I started coding the system in C and got a multitasking kernel running, but extensibility proved to be a problem. I set aside the project until I owned my own computer and had the time to go further. By the time I had resumed the project in 1989, I had become convinced C wasn't up to the task, so I designed Parasol.
Although C and Smalltalk are its primary sources, Parasol's design was influenced by many languages, including C++, CLU, Algol, and Turbo Pascal. Parasol had to be as efficient as C, while incorporating some aspect of the object-oriented capabilities of Smalltalk. When I designed Parasol I was working on a C++ compiler project, so I knew that C++ implemented classes in a way that avoided the performance issues of Smalltalk.
I made two decisions at the outset that determined the general outline of Parasol. First, while using C as a starting point for ideas, Parasol did not have to accept ANSI C code. Second, instances of classes did not have to be "first class" objects.
This latter point is important. In making a programming language, it would be nice if all objects could be treated as uniformly as possible. In APL, arrays are considered first class because almost any operator that can be applied to a scalar value can be applied to an array. In Smalltalk, all objects are first class because they have a type derived from a single common ancestor and, other than some necessary magic glue in some of the low-level classes to do arithmetic and control flow efficiently, all Smalltalk classes are written in Smalltalk itself.
Smalltalk lets you add classes that are just as capable as built-ins because the language syntax is very simple. By contrast, C has a complex type declaration and expression syntax and a rich set of scalar numeric types and operators. So to make user-defined classes first class, a language such as C++ must add many features like references, operator overloading, constructors, and destructors. C++ is made more complicated by the need for all that syntax to define new types.
I avoided this with Parasol. All structures in Parasol are considered classes and can have method functions defined for them. Since they're structures, they can't be used with arithmetic operators. Consequently, a minimum of new concepts is needed in Parasol beyond what is already found in C.
In the last three years I've added distributed and parallel programming constructs to Parasol, including interprocess messages and multithreaded applications.
The Parasol language itself (including the name) is in the public domain. The current implementation is for a 32-bit stand-alone operating system that uses DOS disk files. It is available as unsupported shareware and includes the operating system and all source code for the compiler and libraries. This implementation is still a research project, so no promises about bugs. It does run on most desktop 80386/486 DOS systems, but it doesn't recognize DOS 6 compressed disk partitions. A SPARC/UNIX implementation is in the works.
Parasol 0.5 is available electronically from DDJ (see "Availability," page 3). Alternatively, registered versions can be purchased directly from me.
Declaration Syntax
Parasol declaration syntax is closer to Pascal than to C. Example 1(a) is a simplified version of the general form of a Parasol declaration. Functions are declared in almost the same way; see Example 1(b). A new type name can be introduced with the code in Example 1(c). For objects, you can declare more than one name in a single declaration by simply using a list of identifiers separated by commas; see Example 1(d).
Like many Algol-like languages, Parasol is block structured (although more in the spirit of C than Pascal, since functions can't be nested). All symbols must be unique within their own scope of definition, but symbol names can have distinct declarations in any number of different scopes. A reference in an expression always refers to the symbol definition in the "closest" enclosing scope. If you refer to a symbol defined in the same scope, it doesn't matter whether the definition occurs before or after the reference. Thus the pairs of statements in Examples 1(e) and 1(f) are equivalent in the body of a function.
Local declarations in a function body don't have to occur at the top of each block. Like C++, they can occur anywhere in the block.
Exposure determines how accessible a symbol is outside its own scope. A Parasol exposure can be public, private, or visible. For example, an object declared at file scope can be global (accessible in other modules) by using public or visible exposure, or local to its own file by using private exposure. In most circumstances, the exposure of an object will default to private if you don't specify otherwise. This encourages encapsulation, since you must explicitly decide which symbols are public to the outside world. Public objects can be read or written anywhere, but visible objects can only be modified from within their own scope. Private objects can neither be read nor written outside their own scope. A public integer might be declared as in Example 2(a).
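For readers who think in C++, the three exposures can be approximated with a rough sketch. C++ has no keyword equivalent to visible, so a read-only accessor stands in for it here. The class Counter and all of its member names are hypothetical and chosen only for illustration.

```cpp
// Approximating Parasol's three exposures in C++.
// Counter and its members are hypothetical names, not part of Parasol.
class Counter {
public:
    int total = 0;                 // like "public": read or written anywhere

    // Like "visible": readable anywhere, modifiable only inside the class.
    int hits() const { return hits_; }

    // Code inside the class scope may modify every member.
    void record() {
        ++hits_;
        ++secret_;
        ++total;
    }

    // Lets outside code confirm the private state without reading it.
    bool consistent() const { return hits_ == secret_; }

private:
    int hits_ = 0;                 // backing store for the "visible" value
    int secret_ = 0;               // like "private": no outside access at all
};
```

Outside code can assign to total and call hits(), but any attempt to assign to hits_ or to read secret_ is rejected at compile time.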
Parasol's numeric types include two integral and one floating-point type: signed, unsigned, and float. These are the only truly built-in Parasol numeric types. You can specify a size in bits for each of these types. A variety of type synonyms are predefined for commonly used sizes. The actual amount of memory an object consumes is left up to the compiler, however. For example, Example 2(b) shows some of the predefined type names and their sizes for the Intel 32-bit implementation.
Note that if you omit a size, the compiler picks a default. For example, plain signed, unsigned, and float objects all happen to be 32 bits wide on the Intel implementation.
You should use the type synonyms, especially for the floating-point types, because the exact sizes will vary from one machine to another. The compiler chooses actual sizes that are at least as large as declared and that are an efficient fit for the machine. Thus, int would typically be 16, 32, or 64 bits wide. The long type must be the widest integer size available on that machine and is typically either 32 or 64 bits wide.
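The "at least as large as declared" rule has a loose counterpart in the minimum-width integer types of the C++ standard library. The sketch below maps a few Parasol synonyms onto those types. The parasol_ prefixed aliases and the bits_of helper are hypothetical names, and the mapping is an assumption made for illustration, not something Parasol defines.

```cpp
#include <climits>
#include <cstdint>

// Hypothetical C++ stand-ins for some Parasol type synonyms.
// The "least" types promise a minimum width, not an exact one.
using parasol_byte  = std::uint_least8_t;    // unsigned[8]
using parasol_short = std::int_least16_t;    // signed[16]
using parasol_int   = std::int_least32_t;    // signed[32]
using parasol_long  = std::int_least64_t;    // signed[64]

// Storage size, in bits, that the compiler actually chose for a type.
template <typename T>
constexpr int bits_of() {
    return static_cast<int>(sizeof(T) * CHAR_BIT);
}

// The declared size is a lower bound on what the compiler may pick.
static_assert(bits_of<parasol_byte>()  >= 8,  "byte holds at least 8 bits");
static_assert(bits_of<parasol_short>() >= 16, "short holds at least 16 bits");
static_assert(bits_of<parasol_int>()   >= 32, "int holds at least 32 bits");
static_assert(bits_of<parasol_long>()  >= 64, "long holds at least 64 bits");
```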
You can declare integral bit fields with exact sizes by defining them within a packed structure. You should only resort to exact bit sizes in declarations for externally specified data formats, like system control blocks.
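C++ bit-fields give a feel for what an exact-width declaration looks like, with one caveat: C++ leaves the ordering and packing of bit-fields to the implementation, so this is only an approximation of a Parasol packed structure. ControlBlock, its field names, and make_block are hypothetical.

```cpp
#include <cstdint>

// Hypothetical control block with exact field widths.
// The bit ordering within the word is implementation-defined in C++.
struct ControlBlock {
    std::uint32_t ready    : 1;    // single flag bit
    std::uint32_t priority : 3;    // values 0..7
    std::uint32_t channel  : 12;   // values 0..4095
};

// Build a block, masking each argument down to its field width so that
// out-of-range inputs wrap instead of silently depending on truncation.
inline ControlBlock make_block(unsigned ready, unsigned priority,
                               unsigned channel) {
    ControlBlock cb{};
    cb.ready    = ready    & 0x1u;
    cb.priority = priority & 0x7u;
    cb.channel  = channel  & 0xFFFu;
    return cb;
}
```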
The ref keyword means "pointer-to." Parentheses enclose the arguments to a function. Unlike C, empty parentheses mean that no arguments are allowed. For example, an integer absolute-value function might be declared as in Example 2(c). Square brackets, on the other hand, declare an array. A buffer of 512 bytes might be declared as in Example 2(d).
More complex data types are constructed by stringing declarators together in left-to-right order. For example, Example 2(e) declares x as a pointer to a function returning a pointer to an array of ten singles.
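To see what the left-to-right reading buys you, compare the same type written in C++. The sketch below assumes that single corresponds to a C++ float, and the names Row, RowMaker, table, and get_table are hypothetical. It builds the type in steps with aliases, then writes it as one inside-out C-style declarator, and checks that the two agree.

```cpp
#include <type_traits>

using single = float;            // assumption: single maps to float

// Step-by-step construction, mirroring the Parasol reading order.
using Row      = single[10];     // [10] single
using RowMaker = Row* ();        // () ref [10] single
RowMaker* x_readable = nullptr;  // ref () ref [10] single

// The same type as a single C-style declarator, read from the inside out.
single (*(*x_raw)())[10] = nullptr;

static_assert(std::is_same<decltype(x_readable), decltype(x_raw)>::value,
              "both declarations name the same type");

// A function with a matching signature, so the pointers have a target.
Row table = {};
Row* get_table() { return &table; }
```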
Classes
You declare a class by enclosing a list of declarations inside curly braces, with some optional modifiers in front of the curly braces. This enclosed list, as in Example 3(a), creates a new structure or class type.
Example 3(b) declares a structure named point (for a 2-D graphics package). In this example, the symbols x and y are members of the class. The whole declaration gives this new type a name: point. Objects of type point can now be declared and manipulated.
Class modifiers are union, packed, or inherit. The union keyword declares a C-like union, where all object members overlap one another. The packed keyword signals that bit-sized members should be packed into words as densely as possible. The inherit keyword (followed by the name of a class type) declares that this is a derived class.
You can declare anything inside a class, including other classes. There are some differences between declarations made inside a class and outside. Objects declared inside a class are not static by default, but instead are fields of a structure. Functions can be declared inside a class as well, where they are called "methods." Methods are not called in quite the same way as normal functions.
You call a method by designating an object of its class in the call expression. The syntax requires that you name the object to the left of the method name; see Example 3(c). Listing One (page 110) shows an example of an object, O1, with one public method, func. The main routine contains a call to that method.
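As a point of comparison, the object in Listing One could be written in C++ roughly as follows. This is a sketch of the same idea, not a translation rule: C++ uses a dot between the object and the method, where Parasol simply places the object to the left of the method name.

```cpp
// A C++ rendering of the idea in Listing One: a single object whose
// class type has no name.
struct {
private:
    int hidden = 0;              // not accessible outside the object
public:
    void record(int i) { hidden = i; }
    int func(int i) const { return i * 3 + hidden; }
} O1;
```

The Parasol call O1 func(5) corresponds to O1.func(5) here.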
Inheritance
Parasol supports only single inheritance, so when you declare a derived class you can name only one base class. The memory for an object is laid out fairly simply, with the memory for the members of the base class first, followed by the memory for the derived class members; see Figure 1. Space is allocated in a derived class for newly defined members, even if they have the same name as a member of the base class.
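The layout described here can be modeled in C++ by embedding the base structure as the first member, which is how the same effect is usually obtained in C. The sketch uses explicit embedding rather than C++ inheritance because C++ does not promise a particular layout for derived classes that add data members. Base, Derived, and their fields are hypothetical.

```cpp
#include <cstddef>

struct Base {
    short x;
    short y;
};

// Base members come first, followed by the members the derived type adds.
struct Derived {
    Base  base;   // occupies the start of the object
    short x;      // same name as Base::x, but it still gets its own storage
    long  z;
};

static_assert(offsetof(Derived, base) == 0,
              "the base part sits at the start of the object");
static_assert(offsetof(Derived, x) >= sizeof(Base),
              "new members follow the base part");
```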
Parasol is like C++ in that you can refer to exposed (public or visible) members of a class using the dot or arrow operators (depending on whether you have an object or a pointer to the object). You can also refer to members from within method function code.
Just as local block scopes nest within the body of a function, in Parasol the body of a class forms a scope, which all of the enclosed methods share. Listing Two (page 110) illustrates a method, hypot, which computes the hypotenuse of a right-triangle for the coordinates of the point object. This method refers to the two members, x and y, of the enclosing class. Remember that members are just fields in a structure, so these references must be to some object. Since you must mention an object in a call to any method, it is this object that you are actually referring to. In effect, the address of the mentioned object is passed as a hidden parameter in a call to a method.
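The hidden parameter can be made visible by writing the method twice in C++: once as a member function, and once as a free function that takes the object pointer explicitly. This is a sketch of the concept; the article does not describe the calling sequence Parasol's compiler actually generates, and point_hypot is a hypothetical name.

```cpp
#include <cmath>

struct Point {
    short x, y;

    // Method form: x and y refer to the object named in the call.
    float hypot() const {
        float f = x, g = y;
        return std::sqrt(f * f + g * g);
    }
};

// Roughly the same code with the hidden object pointer written out
// as an explicit first parameter.
inline float point_hypot(const Point* self) {
    float f = self->x, g = self->y;
    return std::sqrt(f * f + g * g);
}
```

Both forms compute the same value for a given point; the only difference is whether the object address appears in the parameter list.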
In more complicated situations (such as where you have derived classes), the chain of base classes forms a set of nested scopes as well. Thus, when matching names to variables, after the compiler has exhausted all the local block scopes inside a function, it next looks in the list of members of the enclosing class. If the desired symbol is not found there, before the compiler moves to examine the scope enclosing the type (usually file scope), it looks along the chain of base classes. The effect of this is that you can redefine methods in subclasses.
You can use two keywords to access the hidden object parameter. The self keyword is a pointer to the object passed to the method. Its type is a pointer to the enclosing class. In a derived class, you can also use the keyword super to refer to the same object. This keyword has the same value as self but is a pointer to the base class of the object.
The super keyword comes in handy if you want to call methods in the base class that have been redeclared in the subclass.
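C++ follows a similar lookup order, and a qualified call such as Shape::name() plays roughly the role that super plays in Parasol. The classes below are hypothetical and use statically bound methods only.

```cpp
#include <string>

struct Shape {
    std::string name() const { return "shape"; }

    // Lookup starts in Shape, so this always finds Shape::name().
    std::string describe() const { return "a " + name(); }
};

struct Circle : Shape {
    // Redefines name(); lookup inside Circle finds this version first.
    std::string name() const { return "circle"; }

    // Shape::name() reaches the base version that Circle redeclared,
    // much like a call through super.
    std::string both() const { return name() + " is a " + Shape::name(); }
};
```

Because these methods are statically bound, describe() called on a Circle still reports "a shape".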
Inheritance provides an excellent way to help organize and document interfaces. By exploiting the redefinition of methods in derived classes, you can design a much more structured and well-organized program than with C.
Polymorphism
The Parasol windowing library defines a set of common capabilities that all windows share. Thus all windows have a redraw function that gets called when a window is moved or resized. In C, you would have to implement such a capability in a couple of different ways.
One way is to have a master redraw function that is a huge switch statement. For every type of window in the program, you would have a case that controls the redraw of that window. This means that every time you add a new window type to your program, you need to modify this function. Since the window library has several functions besides redraw, there are several different switch statements in different places, each requiring updating whenever a new window type is added.
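A minimal sketch of the switch-based approach, written in C-style C++, shows where the maintenance cost comes from. WindowKind, Window, and the returned strings are hypothetical placeholders for real drawing code.

```cpp
#include <string>

enum class WindowKind { Plain, Editor, Menu };

struct Window {
    WindowKind kind;   // tag that says which kind of window this is
};

// Master redraw function: every new window type needs another case
// here, and in every other function that switches on the kind.
inline std::string redraw(const Window& w) {
    switch (w.kind) {
    case WindowKind::Plain:  return "plain";
    case WindowKind::Editor: return "editor";
    case WindowKind::Menu:   return "menu";
    }
    return "unknown";
}
```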
A better way, in C, is to define a set of function pointers that you store with each window. At run time, when you actually create a window, the appropriate function pointers are copied to the object. That way, when you need to redraw a window, you simply call through the pointer. This has the advantage of allowing you to cleanly add window types without having to change existing code.
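The function-pointer technique might look like the following in C-style C++. All names are hypothetical, and the strings stand in for real drawing operations.

```cpp
#include <string>

struct Window;
using RedrawFn = std::string (*)(const Window&);

struct Window {
    std::string title;
    RedrawFn    redraw;   // copied into the object when it is created
};

inline std::string redraw_plain(const Window& w) {
    return "frame:" + w.title;
}

inline std::string redraw_editor(const Window& w) {
    return "frame+text:" + w.title;
}

// Creation functions install the appropriate pointer at run time.
inline Window make_plain(const std::string& title) {
    return Window{title, redraw_plain};
}

inline Window make_editor(const std::string& title) {
    return Window{title, redraw_editor};
}

// Library code redraws any window by calling through the stored pointer,
// so adding a window type requires no change here.
inline std::string refresh(const Window& w) {
    return w.redraw(w);
}
```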
Parasol (like C++) provides convenient syntax to make the function pointer solution easy to implement. In Parasol, you simply declare a method with the keyword dynamic. Then the compiler arranges for an object of the type containing the method to use a run-time pointer in all calls of the method. In Listing Three (page 110), a window class is defined, containing a redraw function that accepts no arguments and produces no return value. Then, an editor class is created that defines a new version of redraw that does the specific redrawing operations for an editor.
Dynamic functions in Parasol are more restricted than statically bound methods. For a statically bound method, there are no restrictions on how you redefine the arguments or the return value of the function; they can be arbitrarily changed in a subclass. A dynamic function, on the other hand, must be redeclared with the same arguments and return type as it was originally defined with in the base class.
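C++ draws a similar line between ordinary and virtual member functions (C++ additionally permits covariant return types, which are not shown). In this sketch, with hypothetical class and method names, the statically bound scale is redefined with a different signature, while the virtual area must keep its signature, which the override keyword asks the compiler to check.

```cpp
struct Base {
    int scale(int v) const { return v * 2; }   // statically bound
    virtual int area() const { return 1; }     // dynamically bound
    virtual ~Base() = default;
};

struct Derived : Base {
    // Statically bound: free to change the arguments and return type.
    double scale(double v, double factor) const { return v * factor; }

    // Dynamically bound: the signature must match the base declaration.
    int area() const override { return 42; }
};

// The caller knows only the base type, so the signature has to be fixed.
inline int area_of(const Base& b) { return b.area(); }
```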
If you look again at Listing Three, the function at the bottom is passed a pointer to a window object, and it calls the redraw function. If the argument passed actually points to an editor and not a window, the call will automatically go to the version of the redraw method for an editor at run time. Because the caller doesn't know the actual object it is calling, at compile time, the arguments and return type must be fixed for all versions of the redraw function.
Note also that the body of the editor's redraw function uses super seemingly to call itself. In fact, because this call uses super, the code generated is a call to the redraw of the window class (the base for editor). This is a common construct in the windowing library.
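For comparison, the structure of Listing Three can be sketched with C++ virtual functions. The log member is a hypothetical addition that records which redraw bodies ran, since the listing elides the drawing code itself.

```cpp
#include <string>
#include <vector>

struct Window {
    std::vector<std::string> log;   // hypothetical: records what was drawn

    virtual void redraw() { log.push_back("window frame"); }
    virtual ~Window() = default;
};

struct Editor : Window {
    void redraw() override {
        Window::redraw();           // plays the role of `super redraw()`
        log.push_back("editor text");
    }
};

// Knows only about Window; the call is resolved at run time.
inline void func(Window* p) { p->redraw(); }
```

When func receives the address of an Editor, the editor's redraw runs, and it in turn runs the window's redraw first.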
C++ provides essentially the same capabilities as dynamic functions, but just calls them "virtual" functions instead. Parasol also lets you assign a pointer to a subclass object to a pointer to the base class without a cast. C++ accepts this assignment only when the base class is inherited publicly, and rejects it for a privately inherited base (the default for a class declared with the class keyword). As long as you are copying from a more specific subclass to the more general base, this copy is considered legal in Parasol (though not in the other direction, of course). In Listing Three, you can assign the address of an editor object to a pointer to window, but not the address of a window object to a pointer to editor.
Units
One of the real shortcomings of C++ is that class definitions are written in header files and included in each compilation unit. Consequently, common information describing a class is replicated in all these separate modules. Methods have to be written outside the class so that they can be placed outside the headers. A great deal of effort has been put into C++ compilers to overcome the inefficiencies and complexities that arise from these constraints.
Parasol overcomes these limitations by changing the program structure. In Parasol, you simply write source units (analogous to modules in Modula-2 or units in Turbo Pascal). When you declare an object public in a unit, that object's definition becomes available to other compilations. To gain access to a unit's public symbols (such as types, objects, and functions), another unit must explicitly include the unit; see Example 4(a).
One advantage of this scheme is that public symbols can be duplicated in different units of the same program. This means that libraries obtained from different sources won't have public symbols that clash. If two units sharing a common public symbol are both included into a third source, you can still disambiguate references using the :: operator; see Example 4(b).
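C++ namespaces offer a partial analogy for the disambiguation in Example 4(b). They cover only the naming side: a Parasol unit also makes its definitions available to other compilations, which a namespace alone does not do. The initial values and the sum function are hypothetical.

```cpp
// Two "units" that each export a public symbol named xyz.
namespace aa { int    xyz = 7;   }
namespace bb { double xyz = 2.5; }

// A third "unit" that uses both and qualifies each reference with ::.
inline double sum() {
    return aa::xyz + bb::xyz;
}
```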
Messages and Threads
Parasol allows you to define special objects that can exchange messages with other processes. An object receiving messages must be defined as a subclass of the built-in class called "external." The subclass then defines a set of methods, each marked with the gate keyword. A client can send messages to the object by first obtaining a special far pointer to it from the operating system or messaging library. You actually send a message by simply calling a gate method using the far pointer. The Parasol compiler generates code to send the arguments (as the body of the message) and wait for a reply, which then becomes the return value.
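The work that the Parasol compiler does for a gate call can be imitated by hand in C++. The sketch below is an in-process simulation under several assumptions: the Message and Reply formats, the gate numbering, and the Adder and AdderProxy classes are all hypothetical, and a direct function call stands in for the real message transport between processes.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical wire format: a gate number plus the call's arguments.
struct Message {
    std::uint32_t gate;
    std::vector<std::int32_t> args;
};

struct Reply {
    std::int32_t value;
};

// Server side: stands in for a subclass of `external` with two gates.
class Adder {
public:
    enum Gate : std::uint32_t { Add = 0, Total = 1 };

    // Decode an incoming message and run the matching gate method.
    Reply dispatch(const Message& m) {
        switch (m.gate) {
        case Add:
            total_ += m.args.at(0);
            return Reply{total_};
        case Total:
            return Reply{total_};
        }
        throw std::invalid_argument("unknown gate");
    }

private:
    std::int32_t total_ = 0;
};

// Client side: stands in for the far pointer. Each method packs its
// arguments into a message, sends it, and returns the reply value.
class AdderProxy {
public:
    explicit AdderProxy(Adder& server) : server_(server) {}

    std::int32_t add(std::int32_t n) {
        return send(Message{Adder::Add, {n}}).value;
    }

    std::int32_t total() {
        return send(Message{Adder::Total, {}}).value;
    }

private:
    // A direct call replaces the real interprocess transport here.
    Reply send(const Message& m) { return server_.dispatch(m); }

    Adder& server_;
};
```

To the client, add and total look like ordinary method calls, which is the property the gate mechanism is meant to provide.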
These capabilities make for a natural scheme for defining client/server applications. The server is simply an object subclassed from external, and the client is any Parasol program.
External objects are designed to operate as separate processes. The Parasol library includes facilities to start and control these threads. The object-oriented capabilities of Parasol have proven useful even for multithreaded applications. Since Parasol's libraries are designed to use objects, there are few if any static variables (which tend to cause trouble for multithreaded applications).
Conclusion
I began designing Parasol with the idea of fixing some syntax problems in C and adding a minimum of new features. The changed syntax certainly presents a barrier for people with large bodies of C code, but the binary import/export mechanism of units and the object-oriented extensions are real enhancements to C. Parasol is simpler than C++, so I spend my time coding solutions, not exploring exotic features.
Parasol has a number of features I haven't even mentioned here, including exceptions, but altogether it is still a fairly compact language. The compiler I've written is fast (compiling over 60,000 lines per minute on my 66MHz 486), and the code is as good as unoptimized C. Parasol should optimize at least as easily as C; I just haven't had time to write an optimizer. I'm now working on a Parasol-to-C translator to make the language more readily available to people.
Parasol began as yet another object-oriented language, and while it has advantages, who needs another one of those? Now that Parasol has messages and threads, it is more than just another OO language. Network and parallel computing need new languages that give the programmer some help. I think Parasol does just that.
Example 1: (a) A typical Parasol declaration; (b) declaring a function; (c) introducing a new type name; (d) declaring more than one name in a single declaration; (e) and (f) these statements are equivalent in the body of a function
(a) name: exposure type-declaration = initializer;
(b) name: exposure type-declaration = { statements; }
(c) name: exposure type type-declaration;
(d) i, j, k: int;
(e) i: int;
    i = 7;
(f) i = 7;
    i: int;
Example 2: (a) declaring a public integer; (b) predefined type names and their sizes; (c) declaring an integer absolute-value function; (d) declaring a buffer of 512 bytes; (e) declaring a pointer to a function returning a pointer to an array of ten singles.
(a) i: public int;
(b) byte:     public type unsigned[8];
    short:    public type signed[16];
    int:      public type signed[32];
    long:     public type signed[64];
    single:   public type float[32];
    double:   public type float[64];
    extended: public type float[80];
(c) abs: (x: int) int = { return x >= 0 ? x : -x; }
(d) buffer: [512] byte;
(e) x: ref () ref [10] single;
Example 3: (a) Creating a new structure or class type; (b) declaring a structure named point; (c) Parasol syntax requires that you name the object to the left of the method name
(a) class-modifiers { declarations; }
(b) point: type {
        public:
        x, y: short;
    };
(c) object method ( arguments );
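Example 4: (a) including a unit to gain access to its public symbols; (b) using the :: operator to disambiguate public symbols that share a name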
(a) Unit a:
        xyz: public int;
    Unit b:
        include a;
        ... xyz ...
(b) Unit aa:
        xyz: public int;
    Unit bb:
        xyz: public double;
    Unit cc:
        include aa, bb;
        ... aa::xyz ...
        ... bb::xyz ...
Figure 1: Memory layout for a Parasol object.
[LISTING ONE]

// O1 is an object with class type.
// The type of O1 is anonymous.
O1: {
        hidden: int;
    public:
        record: (i: int) = { hidden = i; }
        func: (i: int) int = { return i * 3 + hidden; }
};

main: entry () = {
    x: int;

    O1 record(3);
    x = O1 func(5);               // Method call
    printf("Value is %d\n", x);   // Prints 'Value is 18'
}

[LISTING TWO]

include math;

point: type {
    x, y: short;

    hypot: () single = {
        f, g: short;

        f = x;
        g = y;
        return sqrt(f * f + g * g);
    }
};

[LISTING THREE]

window: type {
    public:
    redraw: dynamic () = { ... }
};

editor: type inherit window {
    public:
    redraw: dynamic () = { super redraw(); ... }
};

func: (p: ref window) = { p redraw(); }
Copyright © 1993, Dr. Dobb's Journal