Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

The Learning C++urve


December 1996/The Learning C++urve

Diligent readers point out a few bugs and join Bobby in singing the woes of Hungarian Notation.

Copyright © 1996 Robert H. Schmidt


Literally as I write, I'm ensconsed at the San Jose Hilton with Dan "const-correct" Saks, where we --- along with P.J. Plauger -- are teaching at the Embedded Systems Conference. The chaos reigning here is not unlike that in my own life, at least short-term. For the next six weeks, I will be slowly moving into my new home; by the time you read this, the new contact information (listed at the bottom of this page in atom-sized print) will be correct.

Moving always brings me a spirit of searching through, weeding out, and closing off. In that spirit, I have found myself pondering accumulated e-missives from Diligent Readers Like You. Readers responding to questions I've posed or rebutting points I've raised. Readers who fervently hope I will give them their fleeting moment of fame in these august pages. Readers who wonder why they even bother to write me, who at this moment have fingers poised over their Send keys, about to cancel their subscriptions.

To those readers I say Take Heart! Your wait is over. As I clean through my home, I also clean through my backlog of unresolved issues. Once I've addressed the past, I'll turn to the future, describing the direction for my next article series.

Mind Your P's and Q's

David Beukers discovered a bug in my first Mutations column (June 1996). I wrote

This explains the allowed conversion of void * to int * in


  void *v;
  int *i = v; // OK

What allowed conversion? I really fell asleep at the wheel on this one, and should have written

This explains the allowed conversion of int * to void * in


  int *i;
  void *v = i; // OK

David also indirectly described a subtle flaw in my explanation. I asserted that conversions like the above are safe because everything a void * can do, an int * can do to its data. As an analogy, I suggested that v's domain of behavior is a subset of i's. I then wrote that


int *i = 0;
int const *ci = i;

succeeds for similar reasons: everything an int const * can do, an int * can do, suggesting ci's behavior is a subset of i's. In these examples, I unwittingly compared two different aspects of pointer behavior: the assumptions they can make about their data, and the actions they can take with that data.

By saying "void * is a subset of int *", I'm really saying "the assumptions a void * can make about its data form a subset of the assumptions an int * can make about its data." An int * loses type information as it converts to a void *, meaning the void * can make fewer assumptions:


int *i;
void *v = i; // OK -- referenced
             // data loses type
             // information
i = v;       // error -- referenced
             // data can't assume
             // type information

Were I to focus instead on what the pointers can do with the data, the relationship continues: void * cannot dereference its data, while int * can, making void * the behavior subset. Taking this line of reasoning a step further yields my second example: int const * cannot change its data while int * can, implying const int * is the subset.

Put more succintly and correctly, pointers convert from one type to a more restrictive type. void * is more restrictive than int *, so you can turn an int * into a void *. int const * is more restrictive than int *, so you can also turn int * into an int const *.

But what about turning int const * into void *? In the "assumptions about the data" respect, the conversion is more restrictive -- int const * assumes its data is an int, while void * can't. But in the "what can be done with the data" respect, the conversion is less restrictive -- the int const * promises not to change its pointed-to object, while void * makes no such promise. Because the conversion is not unassailably more restrictive, the compiler disallows it.

Syntax Coloring

In my August 1996 column, I remarked that a coding style such as


/*
void f()
    {
    //...
    }
*/

had an advantage over


#if 0
void f()
    {
    //...
    }
#endif

While I knew several syntax-coloring editors that could render /*...*/ differently from "normal" code, I knew of no editor that similarly handled #if 0...#endif, and challenged readers to let me know of any syntax-coloring editors that could.

Andy Lester and Dmitriy N. Vasilev write that ObjectMaster 2.2 for the Macintosh can color #if 0 code. Andy further notes that ObjectMaster is primarily an object brower which also makes for "a pretty good code-cranking-and-editing environment."

I have not used ObjectMaster, so I have to take Andy's and Dmitriy's assertions on faith. I find it interesting that this browser/editor runs on Brand A's systems but not on Brand M's. I think I'll suggest to the latter that they add such coloring to their Visual C++ editor.

C++ Casts

In that same August column, I explained my use of the phrase "RTTI casts" to describe C++-specific cast syntax like dynamic_cast, const_cast, and so on. I also asked anyone who knew the real name of these casts to step forward and sign in please.

To my rescue comes Herb Sutter, who tells me such casts are called (amazingly enough) "new-style casts." I will start calling them such in my future scrivings, until the C++ standard caprciously changes their name to something far more clever like "heckle-decl-speckle-casts."

Initializing Array of char

Bob Nevitt has trouble compiling some of my code from September. Here is that code, stripped down:

typedef char const name_part[7];

struct name
    {
    name_part last;
    name_part first;
    };
struct name moi = {"Schmidt", "Robert"};

While this works fine under C, under C++ his compiler complains on the last line:


const member 'name' in class without constructors

What Bob found was a typo in my original example. Where I wrote


typedef char const name_part[7];

I had intended to write


typedef char name_part[7]; // no 'const'

Even with this change, the example still won't compile in C++, as I mentioned at the bottom of Page 75 -- but at least then it fails "correctly" (i.e., because the array is too short for its initializer).

With the typo intact, Bob's compiler is (incorrectly) objecting to struct name having const data members. Such members must be initialized at construction time, since C++ has no other way to set their values[1].

Typically, C++ initialization occurs in a constructor initializer list[2]; but, since it doesn't have constructors, C can't use this method, and must rely on the more traditional


struct name moi = {"Schmidt", "Robert"};

initialization syntax. By implication, C code ported to C++ must also allow such initialization; otherwise we'd be forced to artificially add constructors to structs just to make them compile.

On a semi-related note, the C++ committees have just voted to give quoted character string literals type char const [] instead of char []. While you'd think this now renders invalid statements like


char *Spiro =
    "nattering nabobs of negativism";

the committees have provided a (deprecated) conversion allowing a char * to initialize this way; otherwise scads of ported C code would suddenly up and die.

Archaeology

Sanjiv K. Bhatia also read the September issue, which contained the fourth part of my Mutations. Unfortunately, he missed the first three parts, and wondered how he could get back issues.

To Sanjiv, and to any others who miss issues and find they simply can't live without my prosaic elegance, I have sad news: although I write for a Miller Freeman, Inc. publication, I do not in fact work for Miller Freeman, Inc.

But, while I cannot verify this from actual experience, I'm reasonably sure that a group of people incarcerated, er, living in Lawrence, Kansas do work

for Miller Freeman, Inc. Those people, the hardy staff of C/C++ Users Journal, are waiting breathlessly to sell you back issues; try their fabulous web site at http://www.cuj.com. [Gee, some of us actually do like it here. Also try [email protected] or 800-444-4881 or +1-913-841-1631 -- mb]

Hungarian Goulash

As I suspected would happen, I've received more mail on my Hungarian diatribe (October 1996) than I have on any other column. I'm guessing I'll get more Hungarian mail after I submit this column for publication, and will discuss that mail next month.

So far, every letter writer has agreed with me to some extent, which means either my position is self-evidently correct, or my e-mail client is filtering out senders with @microsoft.com in their addresses.

Of all this mail, the loudest amens come from Arch D. Robison and Les Kopari. Arch writes:

That was a great article on Hungarian notation! I was laughing out loud over the imagery, and strongly agree with your "muddled design" remarks. Thanks for putting in witty print what I'm sure many programmers have privately thought.

Well thank you Arch -- I now nominate you President of my fan club!

Les weighs in with Great article...humorous.

Being a Hungarian I feel obligated to point out that Simonyi wrote his Ph.D. thesis at Stanford on this subject, if you can believe it. It had a much more impressive title than "Hungarian Notation."

The stuff reminds me of the gibberish that COBOL programmers tend to use in that both look like alphabet soup after a while. There's no excuse for this junk in C++. Thanks for lending your voice of sanity.

Ouch! I think I'll nominate Les Sergeant at Arms.

On a less fawning note, Steven Lee writes that he appends Class to C++ class names. This of course mimics Microsoft prepending C to such names. But I have to ask: if you append Class to classes, do you also append Struct to structures? For in C++, structures are classes that default to public access, period. It strikes me as inconsistent to give one a wart and not the other.

Steven follows Microsoft's lead by prepending m_ to class member names. He sees the latter as solving problems like


c::f(int n)
    {
    m_n = n; // OK, names are different
    }

However, if the primary goal is avoiding a name collision, I offer an alternative approach:


c::f(int n)
    {
    this->n = n; // OK, both names are same
    }

G. Wade Johnson also uses a private subset of Hungarian, differentiating arrays from pointers:


char szBuffer[10];

char *pszBuffer = szBuffer;

As he correctly writes, such names alert you to problems like

size_t length = sizeof(pszBuffer); // oops!

I'm of mixed mind here. C's insidious way of transmuting arrays into pointers is often maddening, and I've made this sizeof mistake more than once. On the other hand, I'd argue that in a mature C++ project, native arrays should be replaced by "smart" array-like class templates, eliminating the array/pointer problem.

G. Wade extends Hungarian with the two warts my and our:


class c
    {
public:
    c();
    static int ourCount;
    int myData;
    };

c::c() : myData(0)
    {
    ++ourCount;
    }

ourCount is used by all of "us," that is, by all objects of the class. Conversely, myData is specific to "me," to one particular object. This device breaks down somewhat in usage outside a c member function, since the notion of "us" no longer makes sense:


int main()
    {
    // who is "us"?
    if (c::ourCount > 0)
    // ...
    }

Respecting Privacy

A college student (whose mail and name I unfortunately lost) wrote that while he agreed with my take on Hungarian, he disagreed with my habit of making all data members private. He felt that Microsoft's tendency towards protected data made for better coupling between base and derived classes.

This is really a matter of perspective, and depends greatly on how you view protected within the access control spectrum. I used to see protected as a slightly less secure variation of private, but now I see it as essentially public with suggested usage. The fact is, protected is really more a hint by the class designer than a mandate by the language:

class base
    {
protected:
    int protected_data;
    };
        
class base_wannabe : public base
    {
public:
    base::protected_data;
    };
base_wannabe x;

x.protected_data = 0; // OK

Here I'm freely able to manipulate protected data outside a derived class, which is probably not the base class author's intent. This gives the same net effect as if the data had been declared public in the first place.

In my view, then, there are two kinds of members: private and non-private. Once you move a member out of private, you are making it available to the whole world whether you wish that or not.

Distant Early Warning

The new year starts a new column arc, perhaps my most ambitious yet. A few months back (in the Mutations series) I showed you how to migrate your existing C code to C++ largely intact. If you never plan to change your newly-ported code, or to write new code, then you can stop reading me for a few months. However, if you intend to modify this code over time, integrate it with native C++ code, or write new C code that will eventually mix better with C++, stay tuned.

In the past I've shown you how to technically alter your C code; now I show how to alter your C design and mindset. Remember, much of C++ is really C idiom wrapped in a prettier and more secure syntactic package. Instead of abandoning all you've learned about C, you can apply that knowledge in a different context.

We'll start by taking C code and slowly massaging it into a design more compatible with C++'s strengths. Even if you never plan to port to C++, I'll argue that these changes will improve your C as C, not just as embyonic C++. C++'s extensions to C exist for good reasons, and you can apply those reasons even if you don't switch languages.

After a point, our options will become limited unless we adopt C++ (recall the similar wall we hit with the Boolean series a year ago). Once we've made that leap, a whole new world of language features difficult or impossible to mimic in C become available. These features exist as means to larger ends: improved and more diverse memory management, bug prevention, error detection and recovery, enhanced maintainability and portability, and so on.

In sum, this series will not only help you migrate code from traditional C design to more object-oriented C++ design, but it will also help you use these designs to solve specific programming problems. I don't see myself selling you on C++; I do see myself selling you on design strategies that happen to be better supported by C++.

Notes

1. Okay, you can always cast away the constness of such members -- but if you do, please tell me, so I know not to buy stock in your company.

2. In this particular example, a constructor initializer list wouldn't work anway, since it can't initialize array members. Changing the members to char const * const or string const would allow a constructor initializer to work and still keep the members const.

Bobby Schmidt is a freelance writer, teacher, consultant, and programmer. He is also a member of the ANSI/ISO C standards committee, an alumnus of Microsoft, and an original "associate" of (Dan) Saks & Associates. In other career incarnations, Bobby has been a pool hall operator, radio DJ, private investigator, and astronomer. You may summon him at 3543 167th Ct NE #BB-301, Redmond WA 98052; by phone at (206) 881-6990, or via Internet e-mail as [email protected].


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.