Adding support for DSPs
Pete is a programmer and can be reached at [email protected].
Small-C is a subset of the full C language. Originally introduced by Ron Cain (DDJ, July 1980), Small-C is based on the same concepts and syntax as K&R C. Foremost among its features, the Small-C compiler is open source and available for a variety of platforms and microcontrollers. All in all, there are probably a dozen or more flavors of Small-C, ranging from versions for the original 8086 to the 6502. But one target platform Small-C hasn't been available for is Motorola's DSP56800 series of digital signal processors (http://e-www.motorola.com/webapp/sps/site/taxonomy.jsp?nodeId=0127956292). In this article, I present my port of Small-C to the DSP56800. The Small-C compiler, the DSP56800 port, and additional tools and information are available electronically from DDJ (see "Resource Center," page 5) and at http://petegray.newmicros.com/.
For the host development platform, I selected MS-DOS because both Small-C 2.2 for the 8086 and Motorola's JTAG flash loader run under it. The development workspace was a tougher choice, however. With the original Small-C, the host and target machines were the same, and the compiler could compile itself. However, when host and target machines differ (as with cross compilers), this isn't possible, so the host required an additional compiler to compile the cross compiler. In this case, I settled on DJGPP, which is based on gcc, and includes the RHIDE integrated development environment (IDE).
The DSP56800-based target I had in mind was the IsoPod, a controller board from New Micros (http://www.newmicros.com/) that includes a variety of general-purpose digital I/O lines, serial channels, timers, outputs, and decoders that let it control motors, robots, and the like. Compared to the 8086, the DSP56F805 has a memory map that resembles a maze. It also has a much different architecture and instruction set. Hence, assembly language for the DSP looks nothing like the assembly language for the 8086.
Thanks to the book A Small-C Compiler: Language, Usage, Theory, and Design, by James E. Hendrix (also available electronically; see http://www.ddj.com/), understanding how the compiler works (parsing, frame construction, p-codes, and so on) was straightforward. However, the relationship between the compiler, generated assembly language, and target system required more investigation. For one thing, documentation for the assembly language and 8086 processor is difficult to come by. Some of the original p-codes and related assembly language techniques are based on knowledge and experience of the assembly language and other features of the assembler or the 8086 hardware itself.
Having previously decided to leave as much of the original compiler intact as possible (and because I prefer to work using the "minimum effort for maximum gain" principle), I limited changes to only those things that required changing. Obviously, this meant most of the initial development would be around any OS-specifics in the original, all of the p-codes, and target hardware specifics.
From a coding perspective, the first task was to compile the original compiler. GCC complained about some of the sequencing (lack of function prototypes) and about one or two of the function and variable names, but those were easily fixed. Calls to the CLIB routines were commented out until I decided what to do about them.
Framed and Reframed
C allows the use of localized variables. This means, for example, that a variable is not necessarily visible outside of the area of declaration. Consequently, a variable declared inside a function may not be accessible to the main program. In C, this is known as "scope," which means that variables may be transient; that is, they are temporary and discarded when a function has completed execution. A smart compiler would, therefore, be designed with an effective mechanism for dealing with this; otherwise, memory usage would be inefficient.
The Small-C compiler deals with this by generating assembly-language code that constructs an area of memory (a "stack frame") each time it encounters a function call. So a function call causes the compiler to generate code that reserves memory used to pass parameters, hold "local scope" variables, and release the memory when the function completes execution. It does this by adjusting the stack pointer.
The 8086 stack grows downward towards address 0000 (that is, the stack pointer is decremented for a PUSH operation). Consequently, as you might expect from a compiler designed for this architecture, the previously mentioned stack frames also grow downward towards 0000.
Alas, the architecture of the DSP56800 mandates that the DSP's stack grow in the opposite direction to the 8086: the stack pointer is incremented for a PUSH operation. Obviously, I couldn't leave the design as it was, because an asynchronous (hardware) interrupt could potentially destroy data constructed in a downward stack frame design. The apparent solutions were to either change the compiler to generate upward frames (that grow away from address 0000), or create a "software stack pointer," which would be independent from the "real" stack pointer.
There are two main issues concerning the modification of the compiler to generate upward frames. First, it's a complex operation that requires a great deal of work. Second, subsequent ports based upon a compiler modified this way might actually require that the whole scheme be reversed if, for example, a subsequent port was to be performed for an architecture supporting the original downward stack.
Although the creation of a software stack pointer would be difficult, it would not be as difficult as the upward frame option. In addition, the separation of the software stack pointer from the real stack pointer would facilitate easier subsequent ports to architectures with either an upward or downward stack.
This implementation lets me keep the existing downward stack frame design by using the software stack pointer. It avoids potential data corruption caused by asynchronous interrupts by keeping a separate (real) stack pointer that operates according to the architectural constraints of the hardware.
Isolation
Another consideration with the compiler is that of the CLIB.LIB library, which contains many Standard C functions, some of which are suited for compilers rather than cross compilers targeting a microcontroller. Given this suitability, and the fact that the IsoPod package contained just such a cross compiler, assembler, and flash loader, I decided to remove the dependency of the compiler on the library. This was fairly straightforward and involved the removal of a few of the optional calls, such as the poll() function, and the inclusion of the getarg() function directly into CC1.C (one of the compiler source modules).
Words, Bytes, and P-Codes
In simple terms, the 8086 handles memory in 8-bit bytes, while the DSP56800 is a 16-bit microcontroller that handles memory in 16-bit words. I took a quick-and-dirty approach to this difference, and forced the compiler to deal with a word (16 bits) whenever it thought that it was dealing with a byte (8 bits). Hence, the generated assembly language only references words. Admittedly, this isn't the greatest solution, but it is certainly the fastest and easiest to implement.
Generally speaking, each generated pseudocode causes the production of one or more assembly language statements or directives. For example, in the 8086 compiler, the p-code ADD12 is described like this:
code[ADD12]="\211;ADD12\nADD AX,BX\n";
while in the DSP cross compiler, the same p-code is described like:
code[ADD12]="\211;ADD12\nmove\t" SECO ", X0\nadd \tX0," PRIM "\n";
First, the assembler notation of the 8086 is the reverse of the notation of the DSP. Hence, OP DST, SRC becomes OP SRC, DST. Second, substitution is used for flexibility. SECO and PRIM are defined in CC4.C in this manner: #define PRIM "A". This makes even more sense because the compiler sees the CPU as a pair of registers (primary and secondary) and a stack pointer.
Another issue relating to the reversed assembler notation is that of partial instructions. Some p-codes generate a partial assembly language statement, where the remainder of the statement would be generated by a subsequent p-code. To overcome the notation reversal, I had to add a couple of new p-codes to the cross compiler and modify the optimizer to generate them appropriately:
code[MOVn] = "\000;MOVn\nmove\t<N>,";
code[MEM] = "\000X:<m>\t ; MEM\n";
One further complication with p-codes is that the DSP doesn't allow certain instructions to follow certain other instructions (it invalidates the pipeline). Due to this restriction, nop instructions had to be inserted into a few of the p-codes at the appropriate place.
Testing
To test the basic functionality of the cross compiler, I wrote a suite of more than 20 Small-C programs, each exercising one or more features of the language. These programs tested variable scope and pointers, generation of the assembly language for if...else and switch statements, and so on. Because of the removal of the library dependency, constraints of the DSP instruction set, and good coding practices, I ended up creating a new support module (vecinit.asm) that contains code required for the correct operation of the DSP and various routines needed by the cross compiler. Also, testsci2.c, a Small-C test program utilizing the SCI interface (the RS-232 and hyperterm running on a PC), is available electronically.
Assembly
I wrote the assembler from scratch, designing it specifically to support the Small-C cross compiler. It's basically a subset of Metrowerks CodeWarrior (http://www.metrowerks.com/MW/Develop/Embedded/DSP56800/Default.htm). In fact, assembly language generated by the compiler assembles without modification if pasted directly into CodeWarrior. The primary difference from CodeWarrior is that the assembler (sa568) doesn't recognize sections. The effect is that there's no "data hiding" due to scope rules in the assembler: all code and data belong to a common section. Many of the assembler directives simply aren't required for Small-C, so they weren't included.
The assembler takes the output from the compiler and generates the hex code for the s-records, which are then flashed onto the DSP using Motorola's free JTAG flash loader program. The assembler also generates a file (sa568.tab) that contains details of what it actually did.
Since I'd never ported a compiler or written an assembler before, it was only natural that I'd overlook something. It was at this phase of the project that I realized I had no way of knowing (other than flashing the microcontroller and trying to run a program) whether the output from the assembler was valid. In other words, I'd forgotten about a disassembler.
The disassembler was written from scratch, using absolutely none of the code used in the assembler. Because the two programs share no code, a bug in the assembler is unlikely to be mirrored in the disassembler, so it is more likely to show up when the assembler's output is disassembled.
The Secondary Port
Once the software was stable, I ported the cross compiler, assembler, and JTAG flash loader from a DOS host to a Linux host. Because I'd used DJGPP to port and develop the cross compiler and assembler, the secondary port for these programs was relatively painless, consisting mainly of line terminator differences from the perspective of each operating system. The JTAG port was a little more interesting.
Linux Flasher
Before even looking at the source code for Motorola's JTAG flash loader, I knew that the level of difficulty would be directly proportional to the tightness of the coupling between the program and the operating system.
It turned out that the primary difference at the OS level involved how to interface directly with the parallel port, which is the interface for the JTAG cable and the physical transport for flashing programs onto the DSP. The original DOS program uses _inp and _outp for (hardware) port access. Linux uses inb and outb, and the parameters to outb are reversed, compared to _outp. I handled this difference by simple substitution in the code (in the DOS version, inp and outp were defined to be _inp and _outp):
#define outp(x,y) outb(y,x)
#define inp(x) inb(x)
However, there's a subtle problem associated with this technique. Linux does not allow direct access to hardware (the parallel port) from a normal user space program. Being a pretty smart OS, Linux allows privileged programs to grant access to I/O ports via functions like ioperm, which is how I modified the flash loader to run under Linux.
Windows 2000/NT/XP suffer from a similar restriction. The program UserPort.exe (and the associated UserPort.sys) let the JTAG flash loader function in a DOS box from within each of these operating systems.
Style and Fashion
Where possible, my modifications to the compiler are accompanied by the original code (which has simply been commented out). There are two reasons for this. First, as I change someone else's code, I like to keep the original source available for reference. This helps me track any problems I may have accidentally introduced, and it generally helps in the testing and debugging effort. Second, I want anyone else who feels the urge to enhance the compiler to see what I've done and even the things I've attempted that haven't quite worked.
A Brighter Future
With respect to the cross compiler, the supporting module (vecinit.asm), although functionally adequate, would benefit from an overhaul. Register usage could be tidied up to make more registers available for general use. When I receive more feedback, a FAQ document will be created. As I obtain more peripheral devices, I'll add more code examples. Because of feedback I received after the original release, I added preliminary support for interrupts. I also merged the two (DOS and Linux) code trees, to decrease future development time. Subsequent to the initial port, I've joined forces with NMI to maintain and support the package according to the needs of the community of users. This has made it possible for me to extend support to a wide range of the 56800 chips, and a good selection of code is now available from http://petegray.newmicros.com/, including many sensor and servo examples. I'm currently enhancing the assembler so it can be used as a standalone, which involves implementing instructions that weren't actually required in order for Small-C to function, as well as allowing many new variants of previously implemented instructions.
Conclusion
Porting Small-C, writing the assembler, and completing the port has been a wonderful learning experience and has given me a fascinating introduction to the previously undiscovered domain of the microcontroller. I would recommend this approach to any determined software engineer who is prepared to make the effort, research, plan, and produce something useful.
DDJ