An STL Error Message Decryptor for Visual C++
Leor Zolman
Believe it or not, there's helpful information buried in those template error messages. It just takes a little Perl to dig it out.
This article describes a tool I wrote for the Microsoft Visual C++ environment to radically simplify STL-related error messages. The tool is implemented as a C++ program that wraps the standard Visual C++ compiler command, CL.EXE, transparently filtering its diagnostic output through a Perl script and displaying the results of the filtering. The Perl script supports both the stock VC++ 5/6 library and the drop-in replacement library available from Dinkumware, Ltd. Once installed, operation of the VC++ compiler is unaffected when performing both command-line and IDE-based compilations.
The Background
As both an author and instructor of various programming courses for the corporate environment, I've tried to shape each five-day course I develop by asking myself the question, "What are the most important things to be said about this language within the constraints of a five-day format?" In the case of "C++ for non-C Programmers," my answer has been to take the historical (and conventional) approach of teaching "C with iostreams" in the first half of the course, the "better C" features of C++ as an interlude, and C++'s object-oriented features for the rest of the week. Moving at a rapid clip, with time for only one or two labs per day, we barely got to polymorphism by Friday afternoon; time had run out before we'd had a chance to discuss exception handling and templates, let alone the STL.
But change is in the air. It seems to me that the C++ community is now solidly in the throes of yet another paradigm shift, and its name is the STL. Just as the object-oriented capabilities of C++ took a while to sink into the brains of procedural-minded C programmers (they're still sinking into mine), the generic programming approach so fundamental to the STL portion of the Standard C++ library is finally getting some serious consideration.
I recall a conversation with Scott Meyers, Dan Saks, and others at a C/C++ conference over five years ago, debating the merits of Bjarne Stroustrup's new approach to teaching C++: begin with the high-level constructs, such as classes and templates, and work down to the low-level implementation details (like pointers) much later on. I didn't buy it at the time and continued teaching C++ the old, plodding C-first way.
Then, much later, I finally learned the STL. All of a sudden everything Bjarne had been saying began to make sense to me. There are an awful lot of fairly complex and interesting things you can do with the STL without ever having to write a single asterisk in your programs (unless you need to do some multiplication, of course).
The final nail in the coffin of old-style C++ education, as I see it, is Andrew Koenig and Barbara Moo's recent book, Accelerated C++. I read half of this book in flight from Boston to Portland, Oregon, on my way to assist in Scott Meyers' Effective STL seminar. Upon arrival in Portland, I had already decided that the approach taken by Koenig and Moo was the way I wanted to teach "C++ for non-C Programmers," and perhaps "C++ for C Programmers" as well.
There was one obstacle, however, and it was a major one. The compiler that virtually all of my clients use for training C++ programmers under Windows is Microsoft's Visual C++. I've been happily using this IDE (albeit not for STL) in the classroom for years, and have always considered it both a friendly training environment and an effective development environment for my personal programming projects.
The problem arises when you are programming with the STL and make an error. Since the STL is composed mostly of templates taking lots of type-name arguments (some required, some optional via default arguments), data types that are simple to declare and use at the application level yield the "diagnostics from hell" [1] when used incorrectly, even if the mistakes are trivial. It would be putting it mildly to suggest that beginning C++ programmers might be discouraged from using STL due to the daunting complexity of many of the compiler's error messages.
That left me with a choice of two evils: either teach C++ the old way to keep the error messages straightforward, or teach C++ the new way and force beginners to deal with messages whose sum total comprehensible content boils down to "there is an error somewhere on this line."
I found a third option, reminiscent of how James T. Kirk dealt with the Kobayashi Maru no-win scenario: reprogram the computer!
The Script
Consider the program error.cpp (Listing 1). When compiling under standard Visual C++ 6.0, the resulting error message is shown in Figure 1 [2]. The message is actually saying that the insert method of this multimap requires a pair<string, int> as a parameter, but we've provided an int instead, and a suitable conversion does not exist.
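The kind of mistake described here can be sketched as follows. This is a hypothetical stand-in, not the actual Listing 1 (which is available only by download); the function name build_index and the key "answer" are made up for illustration. The offending call is left commented out so that the fragment compiles.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical example of the mistake discussed in the text;
// not the author's error.cpp.
std::multimap<std::string, int> build_index()
{
    std::multimap<std::string, int> index;

    // index.insert(42);  // wrong: insert() expects a key/value pair, not a bare int

    // Correct: supply both the key and the mapped value as a pair.
    index.insert(std::make_pair(std::string("answer"), 42));
    index.insert(std::make_pair(std::string("answer"), 43));
    return index;
}
```

Uncommenting the first insert call is the sort of one-token slip that produces a very long diagnostic, because the compiler spells out every template argument of the multimap, including the defaulted ones.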
Figure 2 shows the result of filtering the text of Figure 1 through the Perl script STLFilt.pl (available for download at <www.cuj.com/code>). Now you can see what Visual C++ was really trying to tell you all along. I call what this Perl script does "STL Error Decryption" because it takes cryptic STL-related error messages and distills them down to their essence, translating the template-generated type information into a form as close to that of the original source code as possible.
Some of my specific goals for the script were:
- Eliminate unnecessary keywords (class, struct, __CDECL, etc.) and default template parameters (such as allocators and less<...> functors) completely.
- Reduce the fully expanded standard container/iterator template type names to just the basic container type and the simplest possible representation of the parameter types.
- If possible, make the resulting messages fit into the fine-print error message line at the bottom of the Visual Studio IDE [3]. Of course, the number of characters there is dependent upon your specific display settings. My display is usually 1280x1024, so that is what I used as my measuring stick. Fortunately, few final messages require the full length of the line (or more) at that resolution, so most messages should still end up fitting in the available space even at lower screen resolutions. If not, you can always scroll them in the usual manner within the output window.
- By allowing some customization of the Perl script, give users a choice of how to handle iterator types: either have the script delete all details of iterator types or have it leave the (decrypted) iterator types intact. For example, consider the program error2.cpp (Listing 2). The original, unfiltered error message produced by the insert statement is about 2,700 characters long. (You don't really want to see it, do you?) With the Perl variable $iter_policy set to 1 (output shown in Figure 3), the filter preserves iterator type-name qualifiers, so you know precisely what type of iterator is being referred to.
With $iter_policy set to 3, only the word "iter" remains [4], further shortening the error message. You can usually deduce the specific type of the iterator in these cases by examining the type information associated with the function in question. In Figure 4, for example, there's not much "iter" could logically refer to other than multiset<string>::iterator.
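To make the first two goals concrete, here is a minimal sketch of the kind of substitution involved, written in C++ with std::regex rather than Perl. It is not taken from STLFilt.pl: the function name strip_noise, the patterns, and the sample message format are all assumptions, and it only handles an allocator whose argument contains no nested angle brackets.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: removes "noise" from one compiler message.
std::string strip_noise(std::string msg)
{
    // Drop elaborated-type keywords and the calling-convention marker
    // (the lowercase spelling __cdecl in compiler output is assumed here).
    static const std::regex keywords(R"(\b(class|struct|__cdecl)\s+)");
    msg = std::regex_replace(msg, keywords, "");

    // Drop a defaulted allocator argument such as ",std::allocator<int> ".
    // [^<>]* means the allocator's own argument must be bracket-free.
    static const std::regex allocator(R"(,\s*std::allocator<[^<>]*>\s*)");
    msg = std::regex_replace(msg, allocator, "");

    return msg;
}
```

Under those assumptions, a type spelled class std::vector<int,class std::allocator<int> > comes out as std::vector<int>.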
Installation/Configuration
The STL filter package consists of three files [5]:
- CL.CPP: The C++ program that runs in place of Microsoft's standard CL.EXE. It invokes the original CL command, pipes the output to Perl to accomplish the decryption, and returns the exit status of the original compilation as its final exit status so that the VC++ IDE knows whether or not to proceed on to the link phase after compilation.
- STLFilt.pl: The Perl script itself.
- STLFilt.bat: A DOS batch file that turns decryption mode on and off by toggling the existence of a special sentinel file. When off, compilation produces conventional error messages. When on, those messages are filtered through the Perl script.
To install the package, follow these instructions:
1. Install Perl on your machine. One source of a great Perl distribution is <www.activestate.com>. It is a freely distributed software package.
2. Determine where your STLFilt.pl file, your Perl interpreter, and your toggle-control file are to reside on your system. Configure the symbolic constants FILTER_SCRIPT, PERL_EXE, and FILT_FILE at the top of CL.CPP to specify these three pathnames, and compile CL.CPP. Note that FILT_FILE's filename must end with the extension .ON.
3. Modify the FILT_BASE variable in STLFILT.BAT to show the base name of the FILT_FILE variable defined in CL.CPP. For example, if FILT_FILE in CL.CPP is defined as FILTERING.ON, the setting in STLFILT.BAT should read:
set FILT_BASE=FILTERING
4. Rename the standard (Microsoft-supplied) CL.EXE (in your ...\VC98\BIN directory) to CL2.EXE. (If you'd prefer to rename it to something else, change the definition of the symbol STANDARD_CL in CL.CPP accordingly.)
5. Place CL.EXE (from step 2) and STLFILT.BAT somewhere along your system PATH.
6. To enable filtering, open a DOS box and enter the command:
stlfilt on
CL.EXE should now work both from the command line and from within the IDE, since Visual Studio invokes it to perform a compilation. (That's why it has to be named CL.EXE.)
To disable error decryption (and see full error messages), either use CL2.EXE directly from the command line, or disable filtering in conjunction with the new CL.EXE by executing:
stlfilt off
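The on/off mechanism amounts to testing for the existence of a marker file. A minimal sketch of that idea follows; the function names are hypothetical, and the way the real CL.CPP and STLFILT.BAT create, test, and remove the file may well differ.

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical check: filtering is "on" when the sentinel file can be opened.
bool filtering_enabled(const std::string& sentinel_path)
{
    std::ifstream probe(sentinel_path.c_str());
    return probe.good();
}

// Hypothetical toggle: create the sentinel to turn filtering on,
// delete it to turn filtering off.
void set_filtering(const std::string& sentinel_path, bool on)
{
    if (on) {
        std::ofstream marker(sentinel_path.c_str());  // creates an empty file
    } else {
        std::remove(sentinel_path.c_str());           // harmless if already absent
    }
}
```

A wrapper built this way needs no configuration state of its own: each compilation simply looks for the file (for example, one named FILTERING.ON) and decides whether to pipe the compiler output through the filter.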
Implementation Notes
My original CL.CPP had about 10 lines of code that essentially boiled down to the following:
return system("<original CL command> | perl stlfilt.pl");
In that version, I had the Perl script return a status value of 0 or 1 depending on whether or not it ever saw the word "Error" on its input. This approach actually seemed to work when I ran CL from the command line, but the Visual Studio IDE failed to get the message and always proceeded on to the link phase, even if there were fatal errors during compilation. The present version (over 200 lines) handles the piping operation directly via Windows system calls. The status returned by CL.EXE is now, appropriately, the actual status value returned by the compiler. This keeps the IDE happy.
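For comparison, on a POSIX system the same goal (capture the compiler's output, yet hand back the compiler's own exit status) can be sketched with popen and pclose. This is a stand-in and not the Win32 code used in CL.CPP; the function name run_and_capture is hypothetical.

```cpp
#include <cstdio>
#include <string>
#include <sys/wait.h>  // WIFEXITED, WEXITSTATUS (POSIX only)

// Hypothetical helper: runs `command` through the shell, appends everything
// it writes to standard output to `output`, and returns the command's exit
// status, or -1 if it could not be run or did not exit normally.
int run_and_capture(const std::string& command, std::string& output)
{
    FILE* pipe = popen(command.c_str(), "r");
    if (!pipe)
        return -1;

    char buf[4096];
    while (std::fgets(buf, sizeof buf, pipe))
        output += buf;  // a filter would transform the text here

    const int status = pclose(pipe);  // waits for the child process
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The point of the design is that the status comes from the compiler process itself rather than from whatever sits at the end of a pipeline, which is what a build tool needs in order to decide whether to go on to the link phase.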
For anyone thinking of adapting this method to the Unix environment, I'd recommend using my original approach in the Perl script: have it return 1 if the word "Error" is ever spotted and 0 otherwise. Then the parent process (shell script or makefile) can invoke C++ and the Perl script using a pipe and receive the proper result status directly from the Perl process.
As for the Perl script, I basically began by completely deleting the simplest noise components of the messages (the words class, struct, etc., as mentioned above) and reducing vanilla instances of string to just "string". Those reductions alone made a major dent. For containers and iterators, I scanned the raw error messages for patterns, mapped them to their corresponding containers, and wrote Perl statements to ferret out the essentials for each type. This was mostly not very glamorous, trial-and-error grunt work. As I puzzled out each standard type, I ran regression tests to see what previous work had been broken, consolidated redundant transformations, and plodded on.
When I had all the standard container types working with built-in types and simple classes, I thought I was done. However, the Koenig/Moo book (on which I plan to base a new C++ course) uses constructs with nested templates, such as
map<string, vector<int> >
It seemed to me that a multiple-pass approach might suffice to deal with such nested templates; in theory, n levels of nesting could be decrypted by making n passes over the error message text. In practice, it actually seems to work that way! I surrounded the bulk of the filter with a big for-next loop and began creating test cases. Some issues cropped up with respect to allocators, functors, and the order in which their declarations were stripped from container and iterator declarations, but soon it all fell into place, and it seems to be doing the trick.
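The multiple-pass idea can be illustrated with a small sketch, again in C++ with std::regex rather than Perl and not derived from STLFilt.pl. The names one_pass and decrypt_nested, the patterns, and the cap of eight passes are assumptions. Each pass strips only those defaulted allocator and less arguments whose own argument has at most one level of angle brackets; deeper nesting would need a wider pattern or a bracket-counting scanner.

```cpp
#include <regex>
#include <string>

// One pass: remove ",std::allocator<...>" and ",std::less<...>" where the
// "..." part contains no brackets or only single-level <...> groups.
std::string one_pass(const std::string& msg)
{
    static const std::string arg = R"([^<>]*(?:<[^<>]*>[^<>]*)*)";
    static const std::regex defaults(
        R"(,\s*std::(?:allocator|less)<)" + arg + R"(>\s*)");
    return std::regex_replace(msg, defaults, "");
}

// Repeat passes until the text stops changing (or the cap is reached).
// Inner types simplified by one pass become matchable in the next.
std::string decrypt_nested(std::string msg, int max_passes = 8)
{
    for (int pass = 0; pass < max_passes; ++pass) {
        const std::string next = one_pass(msg);
        if (next == msg)
            break;
        msg = next;
    }
    return msg;
}
```

With a fully expanded map<string, vector<int> > as input, the first pass removes the inner allocators and the less functor but cannot yet match the map's own allocator, because its argument still contains a nested allocator. The second pass removes it, leaving std::map<std::string,std::vector<int>>.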
As I write this, if I so much as look at another regular expression, my vision starts to blur and I feel a panic attack coming on; so, I've decided not to scramble to support the several other solid, popular C++ libraries out there with this version of the filter [6]. Perhaps by the time this sees print, I'll have added support of other libraries to the script, or, if problems arise trying to process different libraries with the same script, I'll have developed some alternate versions of STLFilt.pl for those libraries. In that case, the updated and/or alternate versions will be available for download at <www.bdsoft.com/tools/stlfilt.html>. If someone else beats me to it, however, so much the better!
Acknowledgments
Thanks to Dave Smallberg for the basic idea (wish I'd thought of it myself!) and to Scott Meyers for putting on his wonderful ESTL seminar in January 2001, the event that inspired this project. And a very special thank you to Thomas Becker, who wrote and kindly contributed the native Windows inter-process communication code in CL.CPP that replaced my original one-line pipe system call.
Footnotes
[1] To be fair, the STL error message complexity problem is not specific to Visual C++; most compilers, in fact, put out STL diagnostics that are more or less similar.
[2] Actually, Figure 1 is the result of compiling directly from the command line. If I compile error.cpp from within the IDE, then in addition to the errors in Figure 1, I also get a slew of obnoxious warnings about identifiers being truncated to 255 characters.
[3] I'm sure Microsoft has an official name for this area at the bottom of the IDE, but never having seen a Visual C++ manual, I have no idea what that name might be.
[4] You can even customize the text used to substitute for the word "iterator" by changing the value of $newiter in the configurable portion of the Perl script, STLFilt.pl. That value is "iter" by default, but I've also used "IT" with pleasing results. In fact, using "IT" is my personal preference, but I didn't want to impose the all-caps format on anyone who might find it offensive.
[5] There aren't any points I make in this article that particularly require the listings to be on hand, so I relegate them all solely to download-land. Furthermore, if CUJ were to print, say, excerpts from the Perl script, the lines would run way too long for the available column space since I haven't yet figured out how to wrap Perl program lines containing long, single-line regular expressions. Does anyone happen to know the secret?
[6] While playing with STLPort's library, however, I couldn't resist putting in two small changes (the detection of an optional underscore in a few strategic places, and removal of the STL:: qualifier) that cut the length of VC++'s messages by an average of 50 percent when including the STLPort header files. But I'd better stop now before I get sucked into doing everything else.
Leor Zolman designed and implemented one of the first C language development systems for personal computers, the BDS C Compiler, back in 1979. Since then he has worked as a staff member at the C/C++ Users Journal and published a book on C programming (Illustrated C, Prentice-Hall, 1991). Leor currently authors and delivers corporate training seminars in Java, Unix (introductory, Korn Shell programming, and system administration), Perl, C, C++, and the Standard Template Library. He can be reached at [email protected].