The New C++: C and C++: Wedding Bells?
by Herb Sutter
Copyright © 2002 Herb Sutter
For several years now, C and C++ have been following roughly parallel but slightly divergent evolutionary paths. Even if you program in only one of C or C++, you should care about this. This article is a summary of what's going on, why you should care, and what might just happen in the future.
Standard C and Standard C++
Standard C and Standard C++ are clearly related languages. Full stop.
I say the above right at the outset in an attempt to get rid of what I see as an unuseful red herring. Renowned experts have argued that they are sibling languages, both derived from K&R C and incorporating features from each other. Other renowned experts feel that, although Standard C has adopted some Standard C++ features, Standard C++ is a successor language that's built on Standard C after all, the C++ Standard incorporates the C Standard by reference. This question of heritage may be interesting to students of history or of the politics of standards, but is ultimately somewhat irrelevant to the practicing programmer.
What is important to us folks in the trenches is that Standard C and Standard C++ are clearly related languages and it's important even to those folks who use only one of the languages, because the two languages already share a de facto community. Most compiler vendors that support C++ also support C in the same compiler, and many tools target both languages. As Stroustrup argues persuasively in his recent C/C++ Users Journal article series [1, 2, 3], there are real benefits to being part of a large community.
The Divergence Issue
It is for these reasons that the languages are closely related, and that they share a de facto community that the existing divergences between the languages are worrisome and, if unaddressed, could increase further in the up-and-coming versions of those standards, C0x and C++0x.
There are already enough differences between the two languages' latest standards, C99 and C++98. For example, consider the existing differences and overlap between these two languages and their predecessor Standard C89 in Figure 1, which is a copy of a slide presented by Stroustrup in [4] and which he has graciously allowed me to reproduce here.
The good news is that there is a large degree of commonality between the languages; C89 is almost a proper subset of C++98. The interesting (and possibly dismaying) thing that Stroustrup points out is that there are already features that fall into each of the seven categories of full, partial, or no overlap; the encouraging (and less dismaying) thing is that, so far at least, the features in the more incompatibly overlapping regions are still generally few and minor.
Figure 1 also illustrates that C99 and C++98 are moving forward from C89 in somewhat different directions; to see what I mean, draw two arrows from the center of the C89 bubble, one through the middle of the C99 bubble and the other through the middle of the C++98 bubble.
Why is it important that we start to converge, rather than diverge, these two streams of related language standards? Consider Stroustrup's "nightmare scenario," (Figure 2) taken also from [4].
There is a small but vocal group of people (mostly in C, some in C++) who feel that programmers would be better off if C and C++ just gave up and permanently parted ways, to be true to their different natures, to boldly sail off across separate seas in search of their own unique adventures. I have a hard time accepting this "divided we stand" thesis, but some experts do feel strongly about it.
The time is ripe, some influential experts believe, to seriously consider harmonizing Standard C and Standard C++ while still allowing each to address its own vision and strengths. I believe they are correct, and harmonization is both desirable for the community and in the interest of the continued strength of both language standards.
Today's C, and C99
Well, what exactly is the animal we call "C"? Ideally, the answer would be simply "the latest version of Standard C." After all, presumably the community would want to plan to use the latest version of a standard, including all its Amendments and TCs (Technical Corrigenda), just as we would normally want to plan to upgrade to the latest version of a software product, including all its latest dot-release updates and patches. Today, that would be C99, which has had no Amendments or TCs issued yet.
But when software product XYZ version 2.0 comes along, we don't always upgrade to the latest version immediately or at all, do we? We weigh whether the cost of the upgrade makes sense, whether we need XYZ 2.0's new features, and if so, whether we need them immediately or can wait a year for some patches or dot-releases to be issued that will increase our confidence that we're getting a stable product. Even the third-party tools vendors who produce software that work with XYZ aren't always quick to update their tools to support the latest product version (to the chagrin of those who want to use them together).
Thus it is that, in practice and to some degree, the answer to "What is C?" is different for different people and companies. In the pre-ANSI/ISO days, "C" usually meant 1978-vintage K&R C [5], and I know of at least one major multi-platform software company that in the mid-1990s still mandated that its core product's code be written in K&R-compatible C, because the company targeted scores of platforms and some of those had C compilers that were never updated to conform to ISO/ANSI C. In the Standard C days, the question that still remains is, which Standard? We have at least five levels to choose from:
- C89
- C89 plus the 1994 TC1 (Technical Corrigendum 1)
- C89 plus the 1995 Amendment 1
- C89 plus the 1995 Amendment 1 plus the 1996 TC2 (Technical Corrigendum 2)
- C99, the second complete version of the C Standard
As P.J. Plauger notes in [6], "it seems that all the hard work done by the C committee throughout the 1990s has done more to proliferate dialects than to reduce them."
In practice, of course, it's not quite that bad. TCs don't add features; they're just noncontroversial patches, and it's good to install the latest patches of a product. We can assume that if a TC is available to a base standard, it ought to be applied. So the real choices today of which C to use and to conform to boils down to the following:
- C89 + TC1
- C89 + Amendment 1 + TC2 (the most current widely-adopted Standard C)
- C99
Pretty much every C compiler vendor supports C89 + TC1. Most support C89 + Amendment 1 + TC2, so this is the big target the most up-to-date version of Standard C that is actually widely adopted by compiler vendors and programmers. (Indeed, C89 + Amendment 1 is what's incorporated wholesale into the C++98 Standard.)
The current major issue with industry conformance to the C Standard is with C99. Today, most compiler vendors do not support all the things in C99. Few that I know of have plans to support it completely anytime soon. That's a real pity and threatens to marginalize the C99 Standard. (Even ISO and ANSI standards can get ignored. It's just not pretty when it happens, because it amounts to community rejection of what the standards committee did.)
While developing C99, the C standards committee did conscientiously attempt to avoid gratuitous incompatibilities with C++ features. Still, from the C++ programmer's point of view, an important aspect of C99 is that it contains various language and library extensions that are incompatible with C++. For example, C++ already has complex as a standard library template; C99 also provides complex, but as a built-in type. For another, C++ has function overloading; C99 sort of has function overloading, but for its own standard library functions only, not for C programs in general. What's to be done?
Reconcilable Differences
There are already some signs of the committees drawing closer. For several years, the C and C++ standards committees have agreed to hold their semiannual meetings in the same locations, on adjacent weeks. This makes it somewhat easier for people who are interested in both standards to attend both meetings and is a very good step forward. Unfortunately, two weeks of meetings is so long that you have to be pretty determined to actually be there for both weeks, and so while this is promising it is still only a tentative first step.
There is still, admittedly and unfortunately, some air of distrust among some C and C++ standards committee members. Members of both bodies, while open to compromise, quite naturally fear "being compromised" by a potential merger of the two standards and standards bodies. Most members of both committees appear to me to be open to the idea of harmonization and have expressed genuine goodwill toward their colleagues in the other committee. On the other extreme, a few in each body are worried about being taken over. I hope that with more light and closer contact these fears can gradually diminish.
In [2] and [3], Bjarne Stroustrup details some specific cases where C and C++ are incompatible, and how these incompatibilities might be resolved including changes to both C and C++ where appropriate, so as to minimize the impact of each piece of harmonization on users. In fact, interestingly, Stroustrup concludes that in many cases the onus of change for compatibility ought to be on C++, not C, and that he would support changes to C++ as part of a reconciliation for the sake of C compatibility that he would never support by themselves in C++ alone. I think he's right on the money, and showing this olive branch is an important step in clearing the path toward harmonization. For balance, in [6], Plauger argues that harmonization might not really be needed, and that the community will decide what they accept and what they reject; even so, I submit that having increasingly incompatible standards and just saying that at worst we'll let the community reject one by voting with its feet is not in anyone's best interests, and in particular it's not in the best interests of the committee whose standard could be marginalized.
Here is what I believe ought to be done, and why it is in the best interests of both language standards:
- Harmonize (and possibly integrate) the standards. C++98 already incorporates C89 + Amendment 1 by reference. Ideally, Standard C would be a true subset of Standard C++. This does not mean tossing away the work the C committee has done since C89 rather, Standard C++ would have to change to accommodate C's presence hospitably and naturally, although adjustments to C would also be needed where a small change to C would be the least compromise and break the least amount of existing user code. This helps the many compiler vendors who already and widely support both C++ and C in the same product, it helps the programmers who use both languages, and it makes "official" the larger combined de facto community that benefits even those programmers who use only one of the languages.
- Keep C99's enhancements. For one example, consider that the C89 standard library contains unsafe APIs like sprintf() that are prone to buffer overruns. These have been largely fixed in C99 with the addition of now-standardized length-checked versions, such as snprintf() (see [7] for more details) . Of course we'd want to keep such improvements; this is a clear advance that shouldn't be lightly shunted aside, and if we can build on it and continue such improvements, so much the better.
- Specifically, adopt and integrate the C99 Standard library in C++0x. Plauger in [8] has been kind enough to do the legwork and research necessary to see what's needed and how to achieve it. It's not impossible, and in fact Plauger's company, Dinkumware, ships a combined library that claims to achieve near-perfect conformance to both C++98 and C99 at once.
I believe that harmonizing the standards in this way would help C99 directly: it would encourage adoption. Today, most compiler vendors I know of have no plans to implement all of C99's changes, which puts C99 in the awkward position of a standard that is danger of being marginalized or possibly even ignored. On the other hand, C++ compiler vendors (most of whom are also C compiler vendors) are closely tracking the next C++ Standard, C++0x, because they want to achieve full conformance to C++0x. If C++0x were to be harmonized with C0x so that C++0x would keep all or virtually all of C99, it would be the biggest force for C99 adoption that could exist, and certainly a much bigger motivation than is available today for the clear marketplace legitimization of C99.
Going forward as a unified language, some things must be owned by the C subset. In particular, let me call out the issue of deprecating insecure APIs: the C99 Standard goes partway toward correcting the buffer overrun exposures in the C Standard library by providing safer versions, but didn't deprecate the insecure versions, which would have put the world on notice that they might disappear in a future version of the Standard and encouraging people to switch their code to use the safer functions. Deprecating dangerous APIs is a good thing to do, but since they're part of the C subset, the C experts should own them and handle changes to them.
Summary
I believe it is in the C and C++ community's interest to harmonize Standard C and Standard C++. Based on what we know today, the potential benefits are great, the risks small or nonexistent. (Indeed, I have yet to hear about any significant risks in harmonization; even the inventor of the C++ language is happy to make changes to C++ that involve minor compromises for the sake of full compatibility, and the analyses presented so far of what it would take to reconcile the differences have not shown any issues that would threaten to break a lot of code.) It is my hope that the C and C++ committees will be able to take at least the next step or two to investigate further the possibility of harmonization, and I'll do what I can to help that process along.
Acknowledgments
Thanks to Bjarne Stroustrup and Bill Plauger for their comments on drafts of this article, and for their longtime interest and investment in the C++ programming community.
References
[1] B. Stroustrup. "C and C++: Siblings," C/C++ Users Journal, July 2002.
[2] B. Stroustrup. "C and C++: A Case for Compatibility," C/C++ Users Journal, August 2002.
[3] B. Stroustrup. "C and C++: Case Studies in Compatibility," C/C++ Users Journal, September 2002.
[4] B. Stroustrup. Possible Directions for C++0x (talk presented at ACCU in April 2002).
[5] B. Kernighan and D. Ritchie. The C Programming Language, 1st edition (Prentice-Hall, 1978).
[6] P.J. Plauger. "The C/C++ Programming Language," C/C++ Users Journal, October 2002.
[7] H. Sutter. "Sutter's Mill: The String Formatters of Manor Farm," C/C++ Users Journal, November 2001.
[8] P.J. Plauger. "Proposed C99 Library Additions to C++ (Revised)," ISO/ANSI C++ standards committee paper (ISO/IEC JTC1/SC22/WG21 paper N1372, ANSI/NCITS J16 paper 02-0030).
Herb Sutter (<www.gotw.ca>) is secretary of the ISO/ANSI C++ standards committee, author of the acclaimed books Exceptional C++ and More Exceptional C++, and one of the instructors of The C++ Seminar (<www.gotw.ca/cpp_seminar>). In addition to his independent writing and consulting, he is also C++ community liaison for Microsoft.