
Uncaught Exceptions: Party Like It's 1999


January 1999/Uncaught Exceptions

Copyright © 1998 Robert H. Schmidt

To ask Bobby a question about C or C++, send email to [email protected], use subject line: Questions and Answers, or write to Bobby Schmidt, C/C++ Users Journal, 1601 W. 23rd St., Ste. 200, Lawrence, KS 66046.

I can hardly believe I'm writing for a 1999 magazine issue [1]. When I was a kid, I expected us to live like the Jetsons by 1999. Talk about diminished expectations: thanks to the infamous Millennium Bug, a.k.a. Y2K, I feel lucky to still get money out of my ATM and electricity out of my wall sockets. Maybe the "broken" programs are just foretelling the future. After all, if all the doom-and-gloom Y2K predictions come true, life really will resemble that of 1900. Time to break out my Barber dimes.

My uncle, an Internet newbie, ran across a couple of pessimistic Y2K sites [2]. Alarmed, he forwarded me the URLs and asked for my comments. While spelunking through the sites, I found an oblique reference to a pair of Salon Magazine articles written by Ellen Ullman [3]. In those articles she discusses the culture of programming from an insider's perspective. I found her articles a witty and engaging read — highly recommended.

Foreign Affairs

Q

I am looking for information about the extensions of the C standard by ISO in 1995. I know that multibyte and wide characters are concerned but I don't know exactly how and if, in practice, these new features are of great importance and if they are included in most compilers. — Claude Delannoy

A

The extensions are part of Amendment 1 to the original C Standard, and add support for non-US character sets. Most of the new features are concentrated in the headers

<iso646.h>
<wchar.h>
<wctype.h>

and the libraries implementing them. Amendment 1 was approved in 1994. By now, several years later, I expect most decent compilers are supporting the Amendment, especially if they implement both C and C++. (Such internationalization is part of the C++ Standard.)

Are these features of great importance? If you want to make your programs maximally portable around the world, or if you want your compiler to understand someone else's internationalized code, then yes, they are important. But if you're content to restrict yourself to a US audience, or to use some other mechanism for internationalization, then no, they probably are not important.

C is "tuned" to American letters, numbers, and punctuation: the source code syntax is most streamlined and elegant in the ASCII world, and the runtime library interface is simplest for ASCII. Once you start using the Amendment 1 features — especially if you restrict yourself to the ISO 646 subset of American characters — your code becomes less elegant and harder to read. And thanks to C's loose type rules, you can accidentally mix traditional "narrow" characters with international "wide" characters.
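As a rough illustration of these points, here is a minimal sketch that touches all three headers from C++. The function names count_upper and widen_silently are my own inventions for the example, not part of any library, and the classifier's behavior assumes the default "C" locale.

```cpp
#include <cassert>
#include <cstring>
#include <cwchar>
#include <cwctype>
#include <iso646.h>   // alternative spellings (and, not_eq, ...); keywords in C++, macros in C

// Count uppercase letters in a wide string, using the <wctype.h> classifier.
std::size_t count_upper(const wchar_t *text)
{
    std::size_t n = 0;
    for (; *text not_eq L'\0'; ++text)      // not_eq is the ISO 646 spelling of !=
        if (std::iswupper(static_cast<std::wint_t>(*text)))
            ++n;
    return n;
}

// The loose typing at work: a narrow literal quietly initializes a
// wide character, and no diagnostic is required.
wchar_t widen_silently()
{
    wchar_t w = 'A';
    return w;
}
```

Note that the wide versions need their own function family (std::wcslen rather than std::strlen, std::iswupper rather than std::isupper), which is part of why internationalized code reads less cleanly.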

In practice I suspect relatively few programmers make much use of these features. Those who are concerned with internationalization often opt for other quasi-standards (TCHARs anyone?). Most projects I've done have assumed ASCII, often explicitly. With my Microsoft projects in particular, the process often went as follows:

1. Finish an intentionally all-ASCII version, maybe with a token genuflection towards Unicode.

2. Port the completed code to other languages as a distinct project.

3. Fix all the bugs arising because of the original ASCII assumptions.

I think this approach is a mistake, much like retrofitting const correctness to a finished project. If you suspect you'll want to use your code outside the US, you should factor that into the design from the start. (To be fair, those MS project groups may have changed their wicked ways since.)

I'll also add that C is probably not the best language choice in this regard. C++ can do a better job, by making a character-like class that hides many of the internationalization concerns. Java is also good here, since Unicode is its native tongue.
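One way to read the C++ suggestion, short of writing a full character class, is to lean on the library's own character abstraction. The sketch below assumes std::basic_string and std::char_traits are acceptable stand-ins; count_char is a hypothetical helper written for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Works for std::string, std::wstring, or any other basic_string.
// The character type is deduced once, from both arguments together.
template<class CharT>
std::size_t count_char(const std::basic_string<CharT> &text, CharT wanted)
{
    std::size_t n = 0;
    for (typename std::basic_string<CharT>::size_type i = 0; i < text.size(); ++i)
        if (std::char_traits<CharT>::eq(text[i], wanted))
            ++n;
    return n;
}
```

Because CharT must deduce to the same type from both arguments, a call that mixes a wide string with a narrow character is rejected at compile time instead of slipping through.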

Sets-R-Us

Q

I am writing to you in the hope that you can either (a) provide a solution to my problem or (b) tell me once and for all it is impossible! Either way it will be a relief!!

Basically, I am trying to find a way in C++ using templates (or anything else in the language!) where I can create "bitsets" that work identically to Pascal/Modula-2 sets. That is, the compiler will enforce rules about the range of valid bits that can be set. In Pascal a set can be created from an enumeration, and the compiler will ensure that any time an operation is applied to a variable of that set, only valid values are passed in.

The Standard C++ Library has a bitset template, but it takes general size_t values for indexing, and throws runtime exceptions if there is an out-of-range value.

By the way, the main reason I would like to achieve this is not really for out-of-range handling, but rather to stop errors (which have occurred in my company many times over the years) from people using the wrong "constant" value in a set operation. Pascal prevents these errors by not allowing the wrong types to be used in set operations. Can this be done in C++? — Karl Lean

A

You can't exactly replicate Pascal's behavior here, since C++ does not support sets directly. However, by combining a class and an enumeration, you can get fairly close. A full-blown set implementation pulls in quite a number of design considerations, and could easily fill an entire column series on its own; I'll touch on just a few possibilities here.

As a first pass, consider

enum Element
    {
    A,
    B,
    C
    };
     
class Set
    {
public:
    Set &insert(Element const &);
    Set intersect(Set const &) const;
    // ... and so on
    };
     
Set s1, s2, s3;
s1.insert(A);
s2.insert(B);
s3 = s1.intersect(s2);

In this example objects of type Set can hold Elements of value A, B, and C. Because Element is an enumeration type, calls like

s1.insert(7); // error, type mismatch

result in a compile-time error, as you desire. Unfortunately, there's nothing to stop an insistent programmer from inserting the nefarious

s1.insert((Element) 7);

Pascal supports several built-in operators for sets; in our C++ analogue, operator overloading can make the sets look more "native":

class Set
    {
public:
    Set &operator+=(Element const &);
    Set operator&(Set const &)const;
    // ...
    };
     
s1 += A;        // was s1.insert(A);
s2 += B;        // was s2.insert(B);
s3 = s1 & s2;   // was s3 = s1.intersect(s2);

Normally I find operator overloading among the most dangerous and overwrought features of C++. In this instance, though, I think considered use of operators can bring both elegance and utility.

To use an expression like

if (s1 & s2)

define Set::operator& to return a bool or something convertible to a bool. As written, Set::operator& returns a Set by value. If that returned Set turned itself into a bool via Set::operator bool, then

if (s1 & s2)

would really be

if ((s1.operator&(s2)).
    operator bool())

In this scenario, you'd define Set::operator bool to return false for an empty set and true otherwise.

With an appropriate operator| representing set union (one that accepts Elements and yields a Set, so that the built-in integer | is not chosen), you can initialize sets by

Set s1(A | B | C);

You may find that * and + make a more logical representation for intersection and union; after all, these are the built-in set operators Pascal uses. In that case, define operator* and operator+ instead of operator& and operator|. In no event should you overload operator&& and operator||, since that could lead to sequence-point surprises [4].
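Pulling the operator ideas together, a minimal sketch might look like the following. It is not a full design: the storage (a single unsigned used as a bit mask), the contains member, and the helper sets_overlap are assumptions of mine, and the explicit conversion operator requires C++11 or later.

```cpp
#include <cassert>
#include <type_traits>

enum Element { A, B, C };

class Set
{
public:
    Set() : bits_(0) {}
    Set(Element e) : bits_(1u << e) {}   // an Element converts to a one-member Set

    Set &operator+=(Element e) { bits_ |= 1u << e; return *this; }
    Set operator&(Set const &rhs) const { Set r(*this); r.bits_ &= rhs.bits_; return r; }
    Set operator|(Set const &rhs) const { Set r(*this); r.bits_ |= rhs.bits_; return r; }
    bool contains(Element e) const { return (bits_ & (1u << e)) != 0; }

    // explicit (C++11): usable in if (s1 & s2), but never silently as an int
    explicit operator bool() const { return bits_ != 0; }

private:
    unsigned bits_;   // assumed storage: one bit per Element value
};

// Union of two bare Elements; without this overload, A | B would be
// the built-in integer | applied to the promoted enumerators.
inline Set operator|(Element lhs, Element rhs) { return Set(lhs) | Set(rhs); }

// An int does not convert implicitly to Element, so s += 7 is rejected.
static_assert(!std::is_convertible<int, Element>::value, "ints are not Elements");

bool sets_overlap(Set const &x, Set const &y)
{
    if (x & y)   // really (x.operator&(y)).operator bool()
        return true;
    return false;
}
```

With these definitions, Set s1(A | B | C) builds the union from the left: A | B yields a Set, and C then converts to a one-member Set for the second |.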

To make Set work for arbitrary element types, craft a template

template<class Element>
class Set
    {
    // ... same as before
    };

Managing the Set storage is a bit tricky: the template doesn't necessarily know the Element's lower and upper bounds, and therefore can't know how many different Element values it could be asked to hold. Possible solutions:

  • Pick a fixed maximum Set size.
  • Pick a starting Set size, and let the Set grow dynamically over time.
  • Specialize the template for element types of known ranges.
  • Require that element types be more sophisticated, with traits or properties that Set can query.
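The last option can be sketched as below, with the storage delegated to std::bitset. SetTraits, its count member, and the Color enumeration are hypothetical names introduced only for this illustration; the sketch also assumes the enumerators run contiguously from zero.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical traits: each element type reports how many values it has.
// The primary template is left undefined, so unsupported types fail to compile.
template<class E> struct SetTraits;

enum Color { Red, Green, Blue };

template<> struct SetTraits<Color>
{
    static const std::size_t count = 3;
};

template<class E>
class Set
{
public:
    Set &operator+=(E e) { bits_.set(static_cast<std::size_t>(e)); return *this; }
    Set operator&(Set const &rhs) const { Set r(*this); r.bits_ &= rhs.bits_; return r; }
    bool contains(E e) const { return bits_.test(static_cast<std::size_t>(e)); }
    std::size_t size() const { return bits_.count(); }

private:
    std::bitset<SetTraits<E>::count> bits_;   // sized from the traits at compile time
};
```

Since operator+= takes an E, a Set<Color> accepts only Color values; an enumerator of some other type, or a bare integer, is a compile-time error. A value forced in by a cast can still be out of range, in which case std::bitset throws at run time.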

Finally, as I'm sure many Diligent Readers are busting to tell me, you can adapt the set and multiset templates implemented in the Standard C++ library.
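A minimal sketch of that adaptation follows, assuming a plain std::set of the enumeration is enough. The typedef ElementSet and the helper intersect are names I made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>

enum Element { A, B, C };

typedef std::set<Element> ElementSet;

// Intersection via the standard algorithm; both inputs are already sorted
// because std::set keeps its elements in order.
ElementSet intersect(ElementSet const &x, ElementSet const &y)
{
    ElementSet out;
    std::set_intersection(x.begin(), x.end(), y.begin(), y.end(),
                          std::inserter(out, out.begin()));
    return out;
}
```

The type checking comes for free, since insert takes an Element, at the cost of node-based storage instead of a compact bit mask.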

Vertigo

Q

First off let me tell you, I love reading your column in CUJ. It is very informative and teaches me a lot about the language I didn't know.

One thing I've been fooling around with is trying to make type-cast operators for smart pointers safer. One example is in Microsoft's ATL (Active Template Library) 3.0: the CComPtr class has an operator-> that returns the interface pointer. Before version 3.0 you could call AddRef and Release on the object and screw up its reference count. They fixed this by doing something like this:

class NoRefCountInterface : public Interface
    {
    // ...
private:
    AddRef();
    Release();
    };

That way if you called the members via CComPtr::operator-> the calls would fail to compile, since the members are private. So I thought that I could do that with operator delete:

class NoDeleteInterface :
    public Interface
    {
private:
    static void
    operator delete(void *);
    // ... is not implemented
    };
     
class NoDeletePtr
    {
public:
    operator NoDeleteInterface*();
    // ...
    };

I was expecting statements such as these would be illegal:

NoDeletePtr x;
// ...
delete x;

But they compile fine on Microsoft Visual C++ 6.0! In fact I get linker errors because operator delete has no body!

I understand there is an operator delete and a delete operator (thanks to Scott Meyers' book), but I don't understand what is going on in the above situation. Should I get a compile error? — Justin Rudd

A

Remember how Alfred Hitchcock had a cameo role in many of the movies he directed? Even in Lifeboat, where he couldn't just stroll through the set, his picture appeared in a newspaper or magazine on screen. I've decided Scott Meyers is my column's Alfred Hitchcock. Even when he's not a guest star, he still manages to get his name mentioned. Maybe I should start wearing garlic to ward him off.

Anyway, I took the liberty of de-Microsoftizing and otherwise changing the names in your code example, to make the role of each piece more obvious. I also pared down your example to something more succinct.

I've not used Microsoft's ATL, so I have to gloss over some details. In particular, I'm not sure if CComPtr is a template, or how it binds to an interface object. I'm guessing the proper usage is modeled on auto_ptr and looks something like

CComPtr<NoRefCount> p(new NoRefCount); // ?

However the binding happens, when you try to compile

p->AddRef();

what you are probably getting is

(p.operator->())->AddRef();

The expression p.operator->() presumably returns a pointer to its interface object (NoRefCount here). Because AddRef is declared private in NoRefCount, the call won't compile.
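Here is a small sketch of that mechanism, using made-up stand-ins rather than the real ATL classes. Ptr, Interface, NoRefCount, and the can_call_addref detector are all hypothetical, and the detector relies on C++11 rules that treat an access failure as a substitution failure.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

class Interface
{
public:
    Interface() : refs_(1) {}
    virtual ~Interface() {}
    virtual unsigned AddRef() { return ++refs_; }
    virtual unsigned Release() { return --refs_; }
    unsigned refs() const { return refs_; }
private:
    unsigned refs_;
};

class NoRefCount : public Interface
{
private:
    // Redeclared private: calls made through a NoRefCount * no longer compile.
    virtual unsigned AddRef() { return Interface::AddRef(); }
    virtual unsigned Release() { return Interface::Release(); }
};

template<class T>
class Ptr
{
public:
    explicit Ptr(T *p) : p_(p) {}
    ~Ptr() { delete p_; }
    T *operator->() const { return p_; }   // p->m() means (p.operator->())->m()
private:
    Ptr(Ptr const &);              // copying disabled to keep the sketch simple
    Ptr &operator=(Ptr const &);
    T *p_;
};

// Compile-time probe: is p->AddRef() well formed for a pointer type P?
template<class P, class = void>
struct can_call_addref : std::false_type {};

template<class P>
struct can_call_addref<P,
    decltype(static_cast<void>(std::declval<P &>()->AddRef()))>
    : std::true_type {};
```

In this sketch can_call_addref<Ptr<Interface> > should come out true and can_call_addref<Ptr<NoRefCount> > false, which is the compile-time rejection described above expressed as a testable value.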

In your second scenario, when you write the expression delete x, what really happens (once x has been converted to a pointer) is this sequence:

1. The destructor of the object that pointer addresses is called.

2. Some operator delete is called, with the pointer passed in as the argument.

You want to control which operator delete is called in the second step. In particular, you want NoDeleteInterface::operator delete to be called. For this to happen, the x in delete x must convert to a pointer to NoDeleteInterface, and you expect that to happen by x implicitly calling operator NoDeleteInterface*. The net result would then be

NoDeleteInterface *p =
    x.operator NoDeleteInterface *();
p->~NoDeleteInterface();
NoDeleteInterface::operator delete(p);

So here's the real question: does x implicitly convert itself to a pointer to NoDeleteInterface this way? The real answer, as usual, is teased from the C++ Standard. In section 5.3.5 ("Delete") we read:

The operand [of a delete expression] shall have a pointer type, or a class type having a single conversion function to a pointer type...

If the operand has a class type, the operand is converted to a pointer type by calling the above-mentioned conversion function, and the converted operand is used in place of the original operand...

The delete-expression will call a deallocation function [operator delete]...

Access and ambiguity control are done for both the deallocation function and the destructor.

I've omitted other passages, but you should get the gist. The official answer seems to be yes, the implicit conversion will occur, and the converted-to type's private operator delete will be selected. Because access control applies to that deallocation function, the delete expression should then fail to compile outside the class. Just for yucks, I tried your code on four different translators. Two made the implicit conversion, two did not: a draw.
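To watch the conversion happen, here is a sketch with the class-specific operator delete left public so that the code compiles and the call can be counted. Resource, Handle, and delete_through_handle are hypothetical names of mine, not anything from your code.

```cpp
#include <cassert>
#include <new>

struct Resource
{
    static int deletes;   // counts calls to the class-specific operator delete
    static void operator delete(void *p)
    {
        ++deletes;
        ::operator delete(p);
    }
};
int Resource::deletes = 0;

// Hypothetical handle with exactly one conversion function to a pointer type.
struct Handle
{
    Resource *ptr;
    operator Resource *() const { return ptr; }
};

int delete_through_handle()
{
    Handle h = { new Resource };
    delete h;   // h converts to Resource *, then Resource::operator delete runs
    return Resource::deletes;
}
```

Moving Resource::operator delete into a private section should turn the delete h line into an access error on a conforming translator.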

I did find one interesting quirk with MSVC 5: if you add an explicit destructor to NoDeleteInterface, the above code compiles, even though the class's operator delete is still private. Check out your original sample on your version of MSVC, and see if you have an explicit destructor. By tracing through the debugger, I found the private operator delete was indeed being called.

You Say Tomato<>

Q

I read (and even reread) your CUJ columns with interest.

In your May 98 column on template metaprogramming you gave the following example (reformatted to my taste :-)):

template < int x >
struct f
{
   static const int value =
      x + f<x-1>::value;
};
     
struct f<1>
{
   static const int value = 1;
};

According to Stroustrup's The C++ Programming Language, 3rd Edition, the specialization should be written as

template <>
struct f<1>
{
   static const int value = 1;
};

This makes sense as the special case of the more general case where we have more than one template parameter, as in

template < typename T, int N >
class Array
{
   ...
   T _data[N];
};
     
template < typename T >
class Array<T,0>
{
   ...
   std::vector<T> _data;
};

Is the "template <>" optional in the standard, or is it just not yet required by the current compilers?

Thank you for an interesting column and for considering my question.

Regards — Hans Salvisberg

A

Compiler writers, and the programs they write, are not infallible; but those writers spend a lot more time than I studying and interpreting the C++ Standard. So while I don't rely on a compiler's verdict as the sole arbiter of program correctness, I do compile my code examples as a sanity check before submitting them for publication. If my interpretation of the Standard varies wildly from my compilers' interpretations, I dig deeper to resolve the difference.

In this instance all my compilers accepted the template<>-less code, and I didn't notice the error, so I saw no interpretation difference to resolve. Oops. As I researched your question, I found it strange that translators from four different vendors (EDG, GNU, Metrowerks, and Microsoft) all accepted my published version. That led me to think that perhaps some loophole in the C++ Standard sanctioned this code, but I never found such a loophole. In short, I goofed in that column, and should not have omitted the template<> prefix.

I eventually learned from P.J. Plauger that the template<>-less syntax was permissible under earlier C++ incarnations. Towards the end of C++ standardization, some compiler implementers demanded the addition of the keyword typename and other aids, to help resolve template-parsing ambiguities. template<> is apparently one of those aids.

The code I published contained no such parsing ambiguities, so the compilers were able to understand my intent. Even so, I still should have used template<>, since my aim is to show you conforming and well-formed code wherever I reasonably can. Thanks for helping to keep me honest.
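For the record, a conforming version of both examples might look like the sketch below. The size members are additions of mine so that the classes have something observable; the elided parts of your Array remain elided.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Primary template: value is x + (x-1) + ... + 1.
template<int x>
struct f
{
    static const int value = x + f<x - 1>::value;
};

// Explicit (full) specialization: the empty template<> prefix is required.
template<>
struct f<1>
{
    static const int value = 1;
};

template<typename T, int N>
class Array
{
public:
    std::size_t size() const { return N; }
private:
    T _data[N];
};

// Partial specialization: one parameter remains, so the prefix is not empty.
template<typename T>
class Array<T, 0>
{
public:
    std::size_t size() const { return _data.size(); }
private:
    std::vector<T> _data;
};
```

Seen side by side, template<> is just the partial-specialization prefix with nothing left in its parameter list.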

Notes

[1] Whenever I see that date, I expect to see the 9s lined up a little ragged, like the digits on a car odometer that's about to turn over.

[2] Luckily for him, my family lives in Sedona, AZ. If the sky really falls after Y2K, he's already in the remote hills everyone else will be running to. Although come to think of it, life in Sedona is already in a time warp, so the post-Y2K world would seem normal.

[3] <http://www.salonmagazine.com/21st/feature/1997/10/cov_09ullman.html>

[4] If you're not sure what I mean by "sequence-point surprises" then send me a question about it! That way I can answer in print, turning this column into a perpetual motion machine (each answer begetting a new question, ad infinitum). Just another form of job security for your Friendly Contributing Editor.

Bobby Schmidt is a freelance writer, teacher, consultant, and programmer. He is also an alumnus of Microsoft, a speaker at the Software Development and Embedded Systems Conferences, and an original "associate" of (Dan) Saks & Associates. In other career incarnations, Bobby has been a pool hall operator, radio DJ, private investigator, and astronomer. You may summon him on the Internet via [email protected].

