Uncaught Exceptions: The Game of the Name
Bobby Schmidt
Straight talk on the name and content of Redmond's compilers, and why operator bool() is rarely the right choice.
Copyright © 2002 Robert H. Schmidt
Over the past few years, successive versions of Microsoft's Visual C++ compiler have featured familiar numerical branding: 4.0, 4.2, 5.0, 6.0. Following the logic of this pattern, many people refer to the most recently released compiler as version 7.0. I'm part of that group, having made reference to Visual C++ v7 in my previous column. It all makes such simple, perfect sense.
As Scott Meyers is fond of saying: "What was I thinking?"
You see, Microsoft has no product called Visual C++ v7 or 7.0 or 007 or anything remotely 7-esque. What they do have is a compiler first released this past February under the full name Microsoft Visual C++ .NET 2002. I just learned this factoid from Visual C++ product management.
I don't know about you, but I'd never before seen or heard that full appellation. At best I knew the product as Visual C++ .NET. But 2002? Even the main page of Microsoft's official Visual C++ .NET product site [1] doesn't use this name. The name does appear on the site's support page [2], although there it's called Visual C++ .NET (2002). As is often the case with C++, the parentheses might not be significant.
It gets better. Visual C++ .NET 2002 is just the general product name; the SKU (or stock-keeping unit) name is more elaborate. For example, if you buy the non-optimizing or academic version of Visual C++ .NET, the SKU you purchase is Microsoft Visual C++ .NET Standard 2002. Or if you buy the optimizing compiler as part of the cheapest Visual Studio .NET, your SKU is Microsoft Visual Studio .NET Professional 2002.
For some reason Im suddenly reminded of Westminster Kennel Club dog names.
You'll notice that this month's CUJ highlights all things .NET 2002 and features a new supplement titled C++ .NET Solutions. I particularly direct your attention to Stan Lippman's article on the Managed Extensions to C++. I will be writing a lot about these extensions in my MSDN columns [3]. If you care about such things, or think you will start caring, I recommend Stan's piece as an appropriate starting place.
Waist Deep in the Big Muddy
Q
Hi Bobby,
I really only have questions, never answers. This month, I'll only plague you with one:
if (cin) { /* ... */ }
if (cout << x) { /* ... */ }
and so on.
I believe this type of syntax can be achieved with any user-defined type when operator void *() is defined within the type. I have many C++ books, modern and older, but few seem to discuss this, and those that do don't exactly glorify it. I would love to know the dirt on it: the pros and the cons. I personally love the clean and concise syntax, so why don't people use it more often?
I'm a fairly new subscriber, so I apologize if you've addressed this recently. Thanks for any and all help.
Regards, Shane Neph
A
The technique you cite compactly queries a stream's error state. I'll give a synopsis of the technique and then ruminate a bit on its merits. While all of the identifiers I mention below are declared in namespace std, I'll be omitting the std:: qualifiers for clarity.
cin and cout have types istream and ostream, both of which are derived from ios. That base in turn derives from ios_base [4]. These bases conspire to implement a stream's error state.
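In outline, the lineage looks like this (simplified; note [4] has the template details):
ios_base
    ios (really basic_ios<char>)
        istream (the type of cin)
        ostream (the type of cout)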
Conceptually, each stream maintains its error state as a set. The set contains any or all of three elements, each representing a particular error aspect:
- loss of stream integrity
- end of input
- failure to read/write correct characters
An empty set represents a good or error-free state.
Physically, the set is held in a value of type ios_base::iostate, which is an implementation-defined integer or enumeration, something that can handle bitwise operations. The value is created through the bitwise ORing of these flags:
- ios_base::badbit
- ios_base::eofbit
- ios_base::failbit
When all bits are off, all bets are on, and the state value is 0. ios_base defines the special symbol goodbit to represent this error-free state [5].
You can query the error state with ios::rdstate(), which returns the entire set of bit flags [6]. Examples:
if (cin.rdstate() & ios_base::eofbit) // ...
if (cin.rdstate() & (ios_base::eofbit | ios_base::badbit)) // ...
if (cin.rdstate() == ios_base::goodbit) // ...
If you find this a bit chatty, ios provides more succinct equivalents:
if (cin.eof()) // ...
if (cin.eof() || cin.bad()) // ...
if (cin.good()) // ...
Yes, there is a fail() call, but no, it doesn't work like the others. You might expect that these would be equivalent:
- cin.fail()
- cin.rdstate() & ios_base::failbit
They are not. For reasons unclear to me, these two are actually equivalent:
- cin.fail()
- cin.rdstate() & (ios_base::badbit | ios_base::failbit)
That is, fail() returns true if either badbit or failbit is set. This ruins the symmetry of bad/badbit and eof/eofbit and (in my mind) makes the stream-state semantics more complex.
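You can see the asymmetry for yourself with a small sketch; it uses only standard calls, and clear() sets the state to exactly the value you pass:
#include <iostream>
int main()
{
    using namespace std;
    cin.clear(ios_base::badbit);   // state is now badbit alone
    cout << cin.fail() << '\n';    // prints 1: fail() reports badbit too
    cout << ((cin.rdstate() & ios_base::failbit) != 0) << '\n'; // prints 0
    return 0;
}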
As if that's not confusing enough, ios supports an even terser form:
if (!cin)
which is equivalent to:
if (cin.fail())
and implemented as:
if (cin.operator!())
For symmetry and naturalness, ios also allows:
if (cin)
which is unsurprisingly equivalent to:
if (!cin.fail())
For this to work, cin must turn itself into some type implicitly convertible to bool. Of the possible types available, the C++ Standard committee chose void * and defined ios as declaring operator void *().
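To see the machinery, here is a minimal sketch of how such a conversion pair might be declared. This is my illustration of the mechanism, not the library's actual source:
class stream_like
{
public:
    stream_like() : state_(0) {}
    bool fail() const { return state_ != 0; }
    operator void *() const   // non-null means "has not failed"
    {
        return fail() ? 0 : const_cast<stream_like *>(this);
    }
    bool operator!() const { return fail(); }
private:
    int state_;
};
With this in place, if (obj) tests the void * against null, and if (!obj) calls operator !() directly.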
Why did the committee choose void * and not the apparently more obvious bool? According to P. J. Plauger, the committee at one point considered bool, until the members realized they had unleashed such horrors as:
int i = cin + 2;
vector<int> v(cout);
void * allows presumably milder horrors:
delete cin;
I still don't know why ios declares operator !(), which seems redundant. If ios didn't declare operator !(), then:
if (!cin)
would actually be:
if (!cin.operator void *())
and evaluate to:
if (cin.fail())
just as operator !() does. PJP suspects the operator is leftover DNA that the committee neglected to snip off. Angelika Langer surmises that the committee defined operator !() on purpose, to keep users from defining their own nonsensical implementation.
I'm not an unbridled fan of this extreme shorthand. If streams were simple objects with simple states, so that mapping the real state to logical TRUE/FALSE were unambiguous, I'd more likely endorse the technique. But as things stand, there are multiple reasonable interpretations for what statements such as:
if (!cin)
and:
if (cout << x)
could mean. Indeed, I can well imagine a novice programmer reading:
while (cin) // read more stuff
as:
while (!cin.eof()) // read more stuff
Given this real potential for ambiguity, I prefer more verbose and explicit forms such as:
if (!cin.fail())
and:
if (cout << x, !cout.fail())
As Angelika Langer and Klaus Kreft describe [7], input streams can have funky interactions between their ios_base::eofbit and ios_base::failbit states, so that you often need to test for both. Accordingly, I follow the authors' recommendation to use:
if (!cin.good())
as a general-purpose test for input stream failure.
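Here is one way that policy can play out in a read loop. This is my sketch, and the int extraction is purely for illustration:
#include <iostream>
int main()
{
    using namespace std;
    int x;
    for (;;)
    {
        cin >> x;
        if (cin.fail())     // covers both failbit and badbit, per above
            break;
        // process x: the extraction succeeded, even if eofbit is now set
    }
    if (!cin.eof())         // eofbit alone means a clean end of input
    {
        // real trouble: bad data or loss of stream integrity
    }
    return 0;
}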
If you want to adapt the streams technique for your own classes, I can recommend implicit conversion to a bool-like type only if one obvious and reasonable interpretation for that conversion exists. If a user of your class sees:
your_class yc;
// ...
if (yc)  // ...
if (!yc) // ...
and has any doubt about the meaning of yc and !yc in a logical context, syntactic clarity has just yielded to semantic obscurity.
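By way of contrast, here is a hypothetical class where the conversion has exactly one sensible reading. The name file_handle and its members are my invention, not a library type:
#include <cstdio>
class file_handle
{
public:
    explicit file_handle(const char *name)
        : fp_(std::fopen(name, "r")) {}
    ~file_handle() { if (fp_) std::fclose(fp_); }
    operator void *() const { return fp_; }     // null means "did not open"
    bool operator!() const { return fp_ == 0; }
private:
    std::FILE *fp_;
    file_handle(const file_handle &);            // copying disallowed
    file_handle &operator=(const file_handle &);
};
A user seeing if (fh) can plausibly read it only one way: did the file open? That is the standard this technique must meet.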
(Many thanks go to P. J., Angelika, and Chuck Allison for their help on this item.)
Doo-Wacka-Doo
Q
Bobby,
I have a query arising from your response to Dave Styco's letter in your May 2002 column, where you define a macro like so:
#define THROW(e) \
    do \
    { \
        /* ... */ \
    } \
    while (false)
I've noticed the one-time do-while loop idiom above appearing frequently in macros. Presumably the motivation for using it is to introduce a new scope, allowing the declaration of in-line variables that are local to the macro. (Although this doesn't apply in the case of your THROW macro; I assume you were leaving room for future expansion.)
What puzzles me is why people bother with the (to my mind) redundant loop construct; the following naked block achieves the same effect:
#define MACRO(args) \
    { \
        /* declarations */ \
        /* ... */ \
    }
A good compiler will recognize that the loop test is constant in the former example and optimize it away, so it doesn't involve any run-time overhead. But it still offends my aesthetic sense to have extraneous keywords present. Is it just convention, or is there a good reason?
My reading of Appendix A.9 in K&R (2nd Edition) suggests that the latter has been legal since at least the first ANSI C draft. In K&R's terminology, compound statements can be used anywhere a simple statement can, so it's not that support for naked blocks has been introduced inadvertently by dual C/C++ compilers.
Anyway, keep up the good work.
Regards, Kieran Tully
A
As I mention later in this column, I'm all for aesthetic sense, so I must have some other clever motive for this macro style.
You've actually stumbled upon a central mystery of educational writing: how much do we assume of our readers? In this particular case, I used (without explanation) a macro idiom that I assumed would be familiar. But as your question demonstrates, familiar does not always mean understood.
Fortunately the answer is actually quite simple:
- By wrapping a block (a.k.a. compound statement) in a do/while pair, you turn the block into a do statement.
- According to the C and C++ grammars, do statements end with a semicolon (;) while compound statements do not.
- Implemented as a do statement, and thus requiring a trailing semicolon, the THROW macro syntactically appears more like a real function call.
For example, if THROW were implemented as a plain block, then the construct:
if (x)
    THROW(1);
f();
could be written unnaturally as:
if (x)
    THROW(1)
f();
without the semicolon. Implemented as a do statement, THROW requires the semicolon and avoids the unnaturalness.
Next consider the related example:
if (x)
    THROW(1);
else
With the do/while definition of THROW, this expands to:
if (x)
    do { /* ... */ } while (false); // ; okay
else
A plain block or compound statement wouldn't do:
if (x)
    { /* ... */ }; // oops, can't have ; here
else
As you surmise, an optimizer worth its name will remove the do/while scaffolding; but even if the optimizer doesn't, the loop will never iterate. The net semantic result is that of a compound statement; only the syntactic appearance changes.
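To make the whole idiom concrete, here is a self-contained sketch. The macro body (a throw of runtime_error) is my own invention for illustration, not the definition from the May column:
#include <stdexcept>
#define THROW(e)                        \
    do                                  \
    {                                   \
        throw std::runtime_error(#e);   \
    }                                   \
    while (false)
void f(int x)
{
    if (x)
        THROW(bad_x);   // semicolon required, just like a function call
    else
        ;               // and no dangling-else trouble
}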
Bugaboo
Q
Hi there.
I would like to know why the attached code always uses the create<C2> instantiation in Visual C++ v6 SP5. It should use create<C1>. The output should be:
C1
and not:
C2
It works on Visual C++ v7.
In Visual C++ v6, what is a good guideline for source code organization for templates? Visual C++ v6/v7 does not have the keyword export, and without it, complying with the one definition rule appears awkward.
A
I spent quite a while reducing your sample to a bare minimum. My distillation appears as Listing 1.
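Listing 1 is not reproduced here, but a hypothetical reduction along the same lines, reconstructed from the symptoms with invented names, might look like this:
#include <iostream>
struct C1 { static void print() { std::cout << "C1\n"; } };
struct C2 { static void print() { std::cout << "C2\n"; } };
template<typename T>
void create()    // T appears only as an explicit argument: never deduced
{
    T::print();
}
int main()
{
    create<C1>();      // VC6 SP5 wrongly prints "C2" here
    if (false)
        create<C2>();  // unreachable, yet its mere presence matters
    return 0;
}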
What you've run into is a pretty amazing bug. My experimenting showed that any of these changes makes the bug go away:
- Removing the unreachable call to create<C2>.
- Defining an explicit specialization for create<C2>.
- Changing create so that its template argument can be deduced.
Exploring the generated code, I finally tracked down the problem: the compiler generates the same decorated name for all create calls, regardless of the template argument. As a result, all calls route through the same code. How that code is generated depends on which template arguments the compiler sees.
Visual C++ v7 (sorry, .NET 2002) generates the correct decorated names and code.
As for template organization: at the time of this writing, nobody completely implements the export keyword. [See Sutter's Mill in this issue for an update on the state of export support. --cda] So the organization you use in general will apply to Visual C++ in particular. That most likely means inclusion-model templates, implemented in headers and included in multiple translation units.
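In concrete terms, the inclusion model puts the template's definition, not just its declaration, in the header. A sketch with hypothetical names:
// twice.h
#ifndef TWICE_H
#define TWICE_H
template<typename T>
T twice(const T &t)   // defined right here in the header
{
    return t + t;
}
#endif
Every translation unit that includes twice.h sees the same definition, which is precisely where the ODR complications you mention come in.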
You are right about ODR complications, something I wrote about in my March 2002 column. But truthfully, export and separation-model templates do not eliminate ODR problems. In fact, the separation model may actually make the problems worse. John Spicer from Edison Design Group has written a nice treatise on this subject [8].
Erratica
In my March column, I ran a reader question about parallel ports in DOS and asked Diligent Readers who understood the question to write me. Happily, several of you responded. I particularly want to thank Diligent Readers Aki Peltonen, Chris Howe, and Paul Turelinckx for helping lift me from my ignorance. I'll summarize your answers in this space next time; I just wanted to acknowledge your help now.
At the same time, a couple of readers felt that I mocked the original questioner. For one of those readers, my answer apparently helped convince him to drop his subscription. I have written those readers privately and won't debate them here. My remarks are for everyone else.
In that item, I consciously and deliberately omitted the questioner's name so that I'd specifically avoid any appearance of personal mockery. I wrote that the question was inappropriate for my column, not inappropriate in general. My attempted tone was light humor, nothing more. I even poked fun at myself.
In my almost seven years writing for CUJ, this is the first time I've received any such criticism. Maybe I did cross some invisible line. I'm still not sure. But if ever I'm tempted toward that same humoristic direction, I'll pay more attention to the potential (if unintended) effect.
I'm encouraged that most people who wrote me saw the intended humor. (One reader even claimed to fall out of his chair laughing.) And even without any humorous veneer, I still would have rejected the original question as inappropriate for my column: it was not at all specific to C and C++, was quite platform-bound, and was way outside my zone of expertise.
One constantly enjoyable aspect of my CUJ career is the artistic freedom I'm granted, for I do consider myself an artist, with this space as my canvas. I don't put on a persona to write this column; I write as I speak. I know that my style is unorthodox, especially for this field. I make pop-culture allusions and inside jokes that you might not get. I insert personal experience and observation, as I'm doing now. I often look for truth within truth, so that by answering a programming question I can also illuminate some larger reality. And yeah, sometimes I just try to be funny.
In the end, I can control and take responsibility only for what I write, and not for what you read into my writing. As the reader, and as the one controlling your reading, please remember this: my desire is truth, aesthetics, and entertainment. I never, ever, set out to offend.
Notes
[1] <http://msdn.microsoft.com/visualc/Default.asp>
[2] <http://msdn.microsoft.com/visualc/support/default.asp>
[3] <http://msdn.microsoft.com/columns/deepc.asp>
[4] Why the two-tier derivation? Because std::ios is really a typedef for std::basic_ios<char, std::char_traits<char> > and specific to streams of char, while std::ios_base is a non-template class common to all streams.
[5] I find the name goodbit unintuitive, since 1) goodbit's not the opposite of badbit, as one might expect, and 2) goodbit's actually a set, not a single element (bit), and thus would more logically be called goodbits or goodset or goodstate.
[6] Please don't ask me why this accessor function is rdstate and not readstate, when the paired mutator function is setstate and not ststate.
[7] Angelika Langer and Klaus Kreft. Standard C++ IOStreams and Locales (Addison-Wesley, 2000), p. 35.
[8] <http://std.dkuug.dk/jtc1/sc22/wg21/docs/papers/1996/N0872.pdf>
Although Bobby Schmidt makes most of his living as a writer and content strategist for the Microsoft Developer Network (MSDN), he runs only Apple Macintoshes at home. In previous career incarnations, Bobby has been a pool hall operator, radio DJ, private investigator, and astronomer. You may summon him on the Internet via [email protected].