Enforcements
Andrei Alexandrescu and Petru Marginean
You know a concept really sticks when it undergoes the transformation from a proper noun to a common noun. Kleenex, Xerox, Q-Tips...right? It should come as no surprise, then, that I was pleased to hear during a Microsoft workshop that you can now use "modern C++ design" with the new Visual C++ .NET. By that, the speaker referred (at least in my imagination) to a compendium of template-based techniques that Modern C++ Design [1] helped popularize.
Of all the Generic<Programming> articles, there is one that, due to its success, is undergoing the proper-to-common-noun "grammatical promotion." That's ScopeGuard [2], which is increasingly becoming "scope guard" in conjunction with the general technique of planting "undo" actions on the normal execution path and dismissing them when a complex operation succeeds. In a quirk of fate, my most popular article was not entirely my work, but the result of a fruitful cooperation with Petru Marginean. I'm all the more happy, then, to work with him again on the article you're now reading.
Last time, we talked about assertions, a powerful design proofing and debugging mechanism. Today, we'll discuss the release-time counterpart of assertions: enforcements, which are comfortable on-the-fly condition verifiers. Much as ScopeGuard did, the macro ENFORCE dramatically reduces the amount of coding you need to dedicate to error handling. ScopeGuard and ENFORCE work very well independently, but they are best when used together: ENFORCE is the exception initiator, and ScopeGuard is the exception propagator.
ASSERT(false);
Although the "Assertions" article [4] asserted that assertions are cool (and in an assertive manner I'd say), the code attached to the "Assertions" article contained a couple of bugs. These bugs were first revealed by Paul A. Renard, who wrote:
- Add ASSERT(true) as the first item in main's try block. You'll find that the user is unfortunately queried about a true assertion. I moved that test (regarding holds_) from DoHandle to Handle, which is what the text in your article implies should be done anyway.
- Add yet another test for ASSERT(false), maybe just before the current one. Then answer G when queried. You'll still be queried on the next ASSERT, which is probably not what you want. As the article text implies, you can check for global ignores with the help of a static item in the Asserter class. The one inside DoHandle is too late to prevent the query from happening. Intertwined with this problem is that since the query is made, an input is extracted from cin. That brings me to the next observation.
- I assume that cin.get() is just a suggestion. Since that method doesn't extract terminators from the input stream, when you answer the first-ever query with I, say, the query from the second ASSERT will actually read \n as a response. The input stream really contains I\n. There is cin.getline(), which isn't much better, but for different reasons. The main problem is the assumption that cin is only being used for responses to ASSERTs.
Users who read other things from cin (which may be redirected from a file!) will be baffled by ASSERT's asynchronous use of the same stream. However, it is likely that users who use cin have also worked out a smarter cin mechanism that can be used in place of the cin.get() in AskUser.
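The leftover-newline problem from the third remark can be reproduced without a console. The sketch below is only an illustration, not the code from the "Assertions" article: the helper names are hypothetical, and a std::istringstream stands in for cin.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Reads one response character with get(): a single character is
// extracted and the '\n' that follows it stays in the stream.
inline char ReadResponseWithGet(std::istream& in) {
    return static_cast<char>(in.get());
}

// Reads a whole line and returns its first character ('\0' for an
// empty line), so the terminator never leaks into the next query.
inline char ReadResponseWithGetline(std::istream& in) {
    std::string line;
    std::getline(in, line);
    return line.empty() ? '\0' : line[0];
}

// Answers two queries from the input "I\nG\n" with each helper.
inline void DemoResponses() {
    std::istringstream a("I\nG\n");
    assert(ReadResponseWithGet(a) == 'I');
    assert(ReadResponseWithGet(a) == '\n');  // stale newline, not 'G'

    std::istringstream b("I\nG\n");
    assert(ReadResponseWithGetline(b) == 'I');
    assert(ReadResponseWithGetline(b) == 'G');
}
```

Reading a full line fixes the stale terminator, but it does not address the larger concern raised above: the application may be using the same stream for its own input.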
Enforcements
Imagine you have acres of code in front of you, and you gaze over it. You know you'll have to plow your way through that code, and you also know you'll have to write some more acres of new code.
A very useful activity is to look through the code and try to find some common pattern. You try to understand what the concept behind that pattern is. Very likely, all the occurrences are not incidental but rather different incarnations of the same underlying concept. Then, you try to formulate the abstraction once so you can concisely express all pattern occurrences as realizations of that concept.
Now stop and look at the sizes of the different patterns you might analyze. If the patterns are huge, referring to entire applications as basic blocks, for example, then you're working with architectural patterns.
If the patterns are medium-sized, spanning multiple objects and/or functions, then you're a design patterns hunter.
If the patterns are rather small, involving as little as 3-10 lines of code, then you're in idiom land.
Finally, if the patterns deal with just 1-2 lines of code, you're getting nitpicky about coding style and formatting.
These four classes of magnitude span a range that is equivalent (on the small end) to the view of a building at the atomic level... In real architecture, minute imperfections in the atomic structure don't matter. In software architecture, any and all of those defects can bring down the whole "building" (or, to use a more familiar term, "build"). Conversely, you need to use proper techniques at all scales to succeed. If you're enamored with the minute details and neglect the big picture, you'll waste talent in building intricate arabesques that aren't visible from a distance. If you focus on the big and disregard the details, you'll construct a giant with clay feet.
Such is the extraordinary challenge of writing software, a challenge that people working in other trades never fully understand.
But let's not digress.
Enforcements fall in the category of idioms. More specifically, enforcements greatly simplify error-checking code without impacting readability and the fluency of the normal flow.
The idea stems from the following observation. Whenever you throw an exception, you do it as the result of a Boolean test, something like this:
    if (some test) throw SomeException(arguments);

If this pattern appears all over the place, why not put it in a little function:
    template <class E, class A>
    inline void Enforce(bool condition, A arg)
    {
        if (!condition) throw E(arg);
    }

and use it like this:
    Widget* p = MakeWidget();
    Enforce<std::runtime_error>(p != 0, "null pointer");
    Enforce<std::runtime_error>(!cout.fail(), "cout is in error");
    cout << p->ToString();

So far, so good. Now, let's make a couple of important observations.
First, the condition tested is not always Boolean. It might also be a pointer or an integral type. Second, it is very likely that you are going to use the tested value right after testing it. For instance, you may wish to make sure a pointer is not null before using it, or you may want to use a file handle after creating it. So let's modify Enforce so it has filtering semantics by passing back the value received:
    template <class E, class A, class T>
    inline T& Enforce(T& obj, A arg)
    {
        if (!obj) throw E(arg);
        return obj;
    }

    template <class E, class A, class T>
    inline const T& Enforce(const T& obj, A arg)
    {
        if (!obj) throw E(arg);
        return obj;
    }

(Two versions are needed, for const and non-const objects.) You can add two overloads to express the fact that you often know what exception you'll throw and what type of argument it takes:
    template <class T>
    inline T& Enforce(T& obj, const char* arg)
    {
        return Enforce<std::runtime_error, const char*, T>(obj, arg);
    }

    template <class T>
    inline const T& Enforce(const T& obj, const char* arg)
    {
        return Enforce<std::runtime_error, const char*, T>(obj, arg);
    }

If you also agree with passing a generic argument (message) to std::runtime_error, the call can be further simplified. All you need to do is add an extra couple of overloads:
    template <class T>
    inline T& Enforce(T& obj)
    {
        return Enforce<std::runtime_error, const char*, T>(obj, "Enforcement error");
    }

    template <class T>
    inline const T& Enforce(const T& obj)
    {
        return Enforce<std::runtime_error, const char*, T>(obj, "Enforcement error");
    }

Now, with only these simple additions, the code becomes considerably more expressive:
    Enforce(cout) << Enforce(MakeWidget())->ToString();

In one line you not only make a widget and print it to the console, but you also signal any errors that might occur in the process! Should you also want to free the created Widget automatically, you add auto_ptr to the mix:
    Enforce(cout) << Enforce(auto_ptr<Widget>(MakeWidget()))->ToString();

Wow! Not bad at all, especially when you compare it with the competition.
Without disrupting the normal execution flow, Enforce nicely filters errors out of the way. Thus, Enforce provides a convenient means for checking and weeding out error conditions.
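To see the filtering semantics at work, here is a small self-contained sketch. The two Enforce overloads follow the ones shown above; the Widget type, the MakeWidget(int) factory, and the Describe function are hypothetical stand-ins introduced only for this illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Filtering Enforce: throws E(arg) when the object tests false,
// otherwise hands the same object back to the caller.
template <class E, class A, class T>
inline T& Enforce(T& obj, A arg) {
    if (!obj) throw E(arg);
    return obj;
}

template <class E, class A, class T>
inline const T& Enforce(const T& obj, A arg) {
    if (!obj) throw E(arg);
    return obj;
}

// Hypothetical widget type and factory, for illustration only.
struct Widget {
    std::string ToString() const { return "widget"; }
};

inline Widget* MakeWidget(int id) {
    static Widget instance;
    return id >= 0 ? &instance : nullptr;  // null signals failure
}

// Creates, checks, and uses the widget in a single expression.
inline std::string Describe(int id) {
    return Enforce<std::runtime_error>(MakeWidget(id), "null widget")->ToString();
}

inline void DemoDescribe() {
    assert(Describe(7) == "widget");
    bool thrown = false;
    try { Describe(-1); } catch (const std::runtime_error&) { thrown = true; }
    assert(thrown);
}
```

The temporary pointer returned by MakeWidget binds to the const overload and lives until the end of the full expression, which is why the result can be dereferenced right away.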
It's very important to make error handling as comfortable to the programmer as possible. This is because error handling is, unfortunately, often considered unpaid work. Managers don't value error handling as a feature. Consequently, hurried, overworked, underspecked [3] programmers are left to cross their fingers and hope that cout will always be in a good state and MakeWidget never returns the null pointer. And crossing fingers is not really a good programming technique.
Embellishing Enforce
The message "Enforcement error," as it appears in the code above, is not particularly helpful, so we've got to do something about it. Fortunately, Petru's inspiration is unstoppable. As Seinfeld told Kramer with admiration: "That brain never stops working!"
First off, some good information to pack in the error message would be the offending __FILE__ and __LINE__. Seeing the expression that failed would be informative as well. As we did with Asserter [4], we'll build a little class that holds this information for us:
    template <class Ref>
    class Enforcer
    {
        Ref obj_;
        const char* const locus_;
    public:
        Enforcer(Ref obj, const char* locus) : obj_(obj), locus_(locus) {}
        Ref Enforce()
        {
            if (!obj_) throw std::runtime_error(locus_);
            return obj_;
        }
    };

The obj_ member holds the object that's being tested. The locus_ member is the aforementioned information about the file, the line, and the expression.
Why did we call Enforcer's template argument Ref and not the traditional T? The explanation is that we'll always instantiate Enforcer with a reference type (not a value type), which will save us a lot of duplication down the road. (If you've ever had to write very similar functions for const and non-const references, you know what we're talking about.)
Ok, now to create Enforcer objects, we'll rely on a little function so that we benefit from that type deduction thing:
    template <typename T>
    inline Enforcer<const T&> MakeEnforcer(const T& obj, const char* locus)
    {
        return Enforcer<const T&>(obj, locus);
    }

    template <typename T>
    inline Enforcer<T&> MakeEnforcer(T& obj, const char* locus)
    {
        return Enforcer<T&>(obj, locus);
    }

We now just need to add the icing to the cake: the promised macro. We know you hate macros, and you're not alone, but we hate repeatedly typing __FILE__ and __LINE__ even more.
    #define STRINGIZE(something) STRINGIZE_HELPER(something)
    #define STRINGIZE_HELPER(something) #something
    #define ENFORCE(exp) \
        MakeEnforcer((exp), "Expression '" #exp "' failed in '" \
        __FILE__ "', line: " STRINGIZE(__LINE__)).Enforce()

The STRINGIZE and STRINGIZE_HELPER macros are the complicated litany needed by the preprocessor to turn the number __LINE__ into a string. (No, #__LINE__ doesn't work.) I never knew exactly why and how these macros work (they have something to do with preprocessing phases... aw, traumatic memories start coming back to my mind! Stop, doctor!) - and, frankly, I'd be more interested in knowing how the sewer system in NYC works than in the details of this business. Suffice it to say that STRINGIZE(__LINE__) yields a string containing the current line number. For those keeping the score at home, [6] provides a thorough explanation.
This column's long-standing tradition is to not care about compiler idiosyncrasies, so let's just mention en passant that the STRINGIZE trick won't outsmart MSVC's preprocessor, which yields mysterious strings such as (__LINE__Var+7) as the result of STRINGIZE(__LINE__).
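For readers who do want to peek at the stringizing business, the short sketch below contrasts one-level and two-level stringizing on a conforming preprocessor. STRINGIZE_DIRECT and LOCUS are hypothetical names introduced for this illustration; they are not part of the article's code.

```cpp
#include <cassert>
#include <string>

#define STRINGIZE_DIRECT(something) #something   // hypothetical one-level variant
#define STRINGIZE_HELPER(something) #something
#define STRINGIZE(something) STRINGIZE_HELPER(something)

// Hypothetical helper macro: builds the same locus text that ENFORCE uses.
#define LOCUS(exp) \
    "Expression '" #exp "' failed in '" __FILE__ "', line: " STRINGIZE(__LINE__)

// '#' stringizes its argument as written, without expanding it first.
inline std::string DirectLine() { return STRINGIZE_DIRECT(__LINE__); }

// The extra macro call expands __LINE__ to a number before '#' is applied.
inline std::string ExpandedLine() { return STRINGIZE(__LINE__); }

// Adjacent string literals are concatenated at compile time.
inline std::string SampleLocus() { return LOCUS(p != 0); }

inline void DemoStringize() {
    assert(DirectLine() == "__LINE__");
    assert(ExpandedLine().find_first_not_of("0123456789") == std::string::npos);
    assert(SampleLocus().find("Expression 'p != 0' failed in '") == 0);
}
```

Because the whole locus is assembled from string literals, it costs nothing at run time beyond passing a pointer.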
On the bright side, Enforcer's initialization is as cheap as two pointer assignments, and it saves very useful information. You can easily add information about the date of the file and the date of the build, as well as nonstandard information such as __FUNCTION__.
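Putting the pieces of this section together gives the following sketch. It assumes that the non-const MakeEnforcer overload takes a T& parameter; BumpChecked and CheckedSum are hypothetical functions added to show how the Ref parameter adapts to the argument.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

template <class Ref>
class Enforcer {
    Ref obj_;
    const char* const locus_;
public:
    Enforcer(Ref obj, const char* locus) : obj_(obj), locus_(locus) {}
    Ref Enforce() {
        if (!obj_) throw std::runtime_error(locus_);
        return obj_;
    }
};

template <typename T>
inline Enforcer<const T&> MakeEnforcer(const T& obj, const char* locus) {
    return Enforcer<const T&>(obj, locus);
}

template <typename T>
inline Enforcer<T&> MakeEnforcer(T& obj, const char* locus) {
    return Enforcer<T&>(obj, locus);
}

#define STRINGIZE(something) STRINGIZE_HELPER(something)
#define STRINGIZE_HELPER(something) #something
#define ENFORCE(exp) \
    MakeEnforcer((exp), "Expression '" #exp "' failed in '" \
    __FILE__ "', line: " STRINGIZE(__LINE__)).Enforce()

// Non-const lvalue: Ref is int&, so the checked object can be modified.
inline int BumpChecked(int& counter) {
    ENFORCE(counter) += 1;
    return counter;
}

// Temporary: Ref is const int&, so the value is only read.
inline int CheckedSum(int a, int b) {
    return ENFORCE(a + b);
}

inline void DemoEnforce() {
    int counter = 1;
    assert(BumpChecked(counter) == 2);
    assert(CheckedSum(2, 3) == 5);
}
```

One class template serves both cases; only the tiny MakeEnforcer function has to be written twice.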
Supporting Arguments and Custom Predicates with ENFORCE
ENFORCE is a nice concept, but don't you hate it when you take something out in the field and you notice it's not quite as applicable to the real world as that article writer made it look?
We did, and we noticed two important shortcomings.
First, passing a custom string in addition to, or instead of, the default file, line, and expression information is often desirable.
Second, ENFORCE only tests things against zero with the ! operator. However, in real life, sometimes the "wrong" value that needs to be checked is not zero. Many APIs that use integral IDs, including the low-level C file functions in <io.h>, return -1, and not zero, to signal an error. Some other APIs use a symbolic constant. And COM uses a more complex condition: If the returned value is zero (the symbol S_OK), everything's fine. If the returned value is less than zero, that means a failure occurred, and the actual number returned gives information about the nature of the error. If the returned value is greater than zero, the state is "success with info," so there's something valuable in the returned value [5].
Clearly we need a more flexible checking and reporting framework. We need to be able to configure Enforcer on two dimensions (predicate and argument passing mechanism), preferably at compile time so that the enforcement mechanism doesn't introduce more overhead than the equivalent hand-written code. (A sanity check is always worth doing: does some abstraction, when brought back to concrete, compare well with the equivalent not abstracted solution?)
Policy-based design fits this problem like a glove. So Enforcer gets advanced in rank from a simple class template to a class template with two policy parameters. The first policy is the predicate policy (which deals with the testing), and the second policy is the raising policy (which deals with constructing and throwing the exception object).
    template<typename Ref, typename P, typename R>
    class Enforcer
    {
        ... use the two policies (see next section) ...
    };

The two policies have very simple interfaces. Here's how the default policies would look:
    struct DefaultPredicate
    {
        template <class T>
        static bool Wrong(const T& obj)
        {
            return !obj;
        }
    };

    struct DefaultRaiser
    {
        template <class T>
        static void Throw(const T&, const std::string& message, const char* locus)
        {
            throw std::runtime_error(message + '\n' + locus);
        }
    };

Implementation Details (and Neat Tricks)
Ok, now it shouldn't be too hard to have Enforcer use its two policies to test values and throw exceptions.
A nice-to-have feature would be to allow the user to format an arbitrarily baroque message in case of an error; furthermore, that baroque formatting (which could be quite costly at run time) should be avoided unless an exception will really be thrown. With some inspiration and the proverbial 99% of perspiration, we devised a mechanism that fulfills these requirements.
Let's show the code and then proceed with explanations. The final class Enforcer is shown below.
    template<typename Ref, typename P, typename R>
    class Enforcer
    {
    public:
        Enforcer(Ref t, const char* locus) : t_(t), locus_(P::Wrong(t) ? locus : 0)
        {
        }
        Ref operator*() const
        {
            if (locus_) R::Throw(t_, msg_, locus_);
            return t_;
        }
        template <class MsgType>
        Enforcer& operator()(const MsgType& msg)
        {
            if (locus_)
            {
                // Here we have time; no need to be super-efficient
                std::ostringstream ss;
                ss << msg;
                msg_ += ss.str();
            }
            return *this;
        }
    private:
        Ref t_;
        std::string msg_;
        const char* const locus_;
    };

    template <class P, class R, typename T>
    inline Enforcer<const T&, P, R> MakeEnforcer(const T& t, const char* locus)
    {
        return Enforcer<const T&, P, R>(t, locus);
    }

    template <class P, class R, typename T>
    inline Enforcer<T&, P, R> MakeEnforcer(T& t, const char* locus)
    {
        return Enforcer<T&, P, R>(t, locus);
    }

    #define ENFORCE(exp) \
        *MakeEnforcer<DefaultPredicate, DefaultRaiser>(\
        (exp), "Expression '" #exp "' failed in '" \
        __FILE__ "', line: " STRINGIZE(__LINE__))

Alright, so Enforcer defines two new operators: operator* and the templated operator(). Also, note that the ENFORCE macro prepends the "*" to the MakeEnforcer call. How does this all work, and why the scaffolding?
Say you write the following:
    Widget* pWidget = MakeWidget();
    ENFORCE(pWidget);

The ENFORCE macro expands to something like:
    *MakeEnforcer<DefaultPredicate, DefaultRaiser>((pWidget),
        "Expression 'pWidget' failed in 'blah.cpp', line: 7")

MakeEnforcer gets called, creating an object of type:
    Enforcer<Widget*&, DefaultPredicate, DefaultRaiser>

(pWidget is a non-const lvalue, so the T& overload of MakeEnforcer is selected with T being Widget*.) That object is created with its two-argument constructor. Notice that locus_ is initialized to a non-null pointer only if P::Wrong(t) is true. In other words, locus_ points to useful information only if an exception ought to be thrown; otherwise it's null.
To the object thusly created, operator* is applied. Unsurprisingly, if locus_ is non-null, R::Throw is called. Otherwise, the object being analyzed is just passed back.
On to a more interesting example. Consider the code below:
    Widget* pWidget = MakeWidget();
    ENFORCE(pWidget)("This widget is null and it shouldn't!");

Here, after the Enforcer object is created as above, operator() comes into play. That operator either appends the incoming information to the msg_ member, or ignores it altogether if pWidget is non-null and there's no error. In other words, the normal execution path is as fast as a test. Here's the beauty of it all: the real work is done only in case of an error.
Because operator() is templated and uses a std::ostringstream, it supports anything that you could send to cout. Furthermore, operator() returns *this, so you can chain successive calls to it. Consider this illustrative example:
    int n = ...;
    Widget* pWidget = MakeWidget(n);
    ENFORCE(pWidget)("Widget number ")(n)(" is null and it shouldn't!");

We don't know about you, but we were thoroughly pleased with this design. Or, which one of conciseness, expressiveness, and efficiency don't you like?
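The claim that formatting happens only on the failure path can be checked with a small experiment. The sketch below reuses the Enforcer design from this section; CountedText (a type that counts how often it is streamed) and TryEnforce are hypothetical helpers added for the demonstration.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

struct DefaultPredicate {
    template <class T> static bool Wrong(const T& obj) { return !obj; }
};
struct DefaultRaiser {
    template <class T>
    static void Throw(const T&, const std::string& message, const char* locus) {
        throw std::runtime_error(message + '\n' + locus);
    }
};

template <typename Ref, typename P, typename R>
class Enforcer {
public:
    Enforcer(Ref t, const char* locus)
        : t_(t), locus_(P::Wrong(t) ? locus : 0) {}
    Ref operator*() const {
        if (locus_) R::Throw(t_, msg_, locus_);
        return t_;
    }
    template <class MsgType>
    Enforcer& operator()(const MsgType& msg) {
        if (locus_) {  // format only when the check has already failed
            std::ostringstream ss;
            ss << msg;
            msg_ += ss.str();
        }
        return *this;
    }
private:
    Ref t_;
    std::string msg_;
    const char* const locus_;
};

template <class P, class R, typename T>
inline Enforcer<const T&, P, R> MakeEnforcer(const T& t, const char* locus) {
    return Enforcer<const T&, P, R>(t, locus);
}
template <class P, class R, typename T>
inline Enforcer<T&, P, R> MakeEnforcer(T& t, const char* locus) {
    return Enforcer<T&, P, R>(t, locus);
}

#define STRINGIZE(something) STRINGIZE_HELPER(something)
#define STRINGIZE_HELPER(something) #something
#define ENFORCE(exp) \
    *MakeEnforcer<DefaultPredicate, DefaultRaiser>( \
        (exp), "Expression '" #exp "' failed in '" \
        __FILE__ "', line: " STRINGIZE(__LINE__))

// Hypothetical type that counts how many times it is streamed.
struct CountedText {
    static int& Uses() { static int n = 0; return n; }
};
inline std::ostream& operator<<(std::ostream& os, const CountedText&) {
    ++CountedText::Uses();
    return os << "counted";
}

// Returns the exception text, or an empty string when the check passes.
inline std::string TryEnforce(int value) {
    try {
        ENFORCE(value)("value was ")(value)(CountedText());
        return "";
    } catch (const std::runtime_error& e) {
        return e.what();
    }
}

inline void DemoLazyFormatting() {
    const int before = CountedText::Uses();
    assert(TryEnforce(5).empty());
    assert(CountedText::Uses() == before);      // nothing was formatted
    const std::string text = TryEnforce(0);
    assert(CountedText::Uses() == before + 1);  // formatted exactly once
    assert(text.find("value was 0counted\n") == 0);
}
```

Note that the arguments themselves are still evaluated on the success path; what is skipped is the stream insertion and string concatenation.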
Customizing the Predicate and Raiser Policies
The policy-based Enforcer provides important hooks that allow unbounded variation. For example, checking handle values against -1 (instead of zero) is now a six-liner:
    struct HandlePredicate
    {
        static bool Wrong(long handle)
        {
            return handle == -1;
        }
    };

    #define HANDLE_ENFORCE(exp) \
        *MakeEnforcer<HandlePredicate, DefaultRaiser>((exp), "Expression '" #exp "' failed in '" \
        __FILE__ "', line: " STRINGIZE(__LINE__))

Don't forget that an enforcement returns its incoming value, which confers a lot of expressiveness to the client code:
    const int available = HANDLE_ENFORCE(_read(file, buffer, buflen));

The line above reads data from a file, records the number of bytes read, and signals a possible error without missing a beat. Cool!
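To try HANDLE_ENFORCE without a real file, the sketch below swaps _read for a hypothetical FakeRead function that returns -1 on demand. The Enforcer here is a trimmed version (no operator()) kept only as small as the example needs; ReadOrThrow is likewise an illustrative name.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

struct HandlePredicate {
    static bool Wrong(long handle) { return handle == -1; }
};
struct DefaultRaiser {
    template <class T>
    static void Throw(const T&, const std::string& message, const char* locus) {
        throw std::runtime_error(message + '\n' + locus);
    }
};

// Trimmed Enforcer: same two policies, without the message operator().
template <typename Ref, typename P, typename R>
class Enforcer {
public:
    Enforcer(Ref t, const char* locus)
        : t_(t), locus_(P::Wrong(t) ? locus : 0) {}
    Ref operator*() const {
        if (locus_) R::Throw(t_, std::string(), locus_);
        return t_;
    }
private:
    Ref t_;
    const char* const locus_;
};

template <class P, class R, typename T>
inline Enforcer<const T&, P, R> MakeEnforcer(const T& t, const char* locus) {
    return Enforcer<const T&, P, R>(t, locus);
}
template <class P, class R, typename T>
inline Enforcer<T&, P, R> MakeEnforcer(T& t, const char* locus) {
    return Enforcer<T&, P, R>(t, locus);
}

#define STRINGIZE(something) STRINGIZE_HELPER(something)
#define STRINGIZE_HELPER(something) #something
#define HANDLE_ENFORCE(exp) \
    *MakeEnforcer<HandlePredicate, DefaultRaiser>( \
        (exp), "Expression '" #exp "' failed in '" \
        __FILE__ "', line: " STRINGIZE(__LINE__))

// Hypothetical stand-in for an API that returns -1 on error.
inline int FakeRead(bool ok, int bytes) { return ok ? bytes : -1; }

// Reads, records the byte count, and throws on failure in one statement.
inline int ReadOrThrow(bool ok, int bytes) {
    const int available = HANDLE_ENFORCE(FakeRead(ok, bytes));
    return available;
}

inline void DemoHandleEnforce() {
    assert(ReadOrThrow(true, 42) == 42);
    assert(ReadOrThrow(true, 0) == 0);  // zero bytes is not an error here
}
```

With the default predicate, a legitimate result of zero bytes would have been reported as a failure; the custom predicate is what makes the convention explicit.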
Similarly, you can define handy new policies and XYZ_ENFORCE macros for your application with ease. There would be one XYZ_ENFORCE macro per error encoding convention. Most applications would use 1-4 different conventions. In practice we've encountered the following common conventions:
- Good ole ENFORCE that we discussed above. Compares using operator! and throws std::runtime_error.
- HANDLE_ENFORCE. Compares against -1 and throws some exception.
- COM_ENFORCE. The result is deemed wrong if it is negative. The raising policy retrieves the error message from the COM-return code and packs it in the exception object being thrown. We believe that this truly is an invaluable tool in building serious COM applications.
- CLIB_ENFORCE. Many functions in the C Standard Library return zero on error and have you inspect errno for details about the error. CLIB_ENFORCE works best in conjunction with a nice Raiser policy that converts errno to text and puts the text into the exception being thrown.
- WAPI_ENFORCE. Some functions in the Windows API return zero on success and a positive error code in case of error.
It's very easy to accommodate new error-encoding conventions.
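As a rough illustration of how little each convention requires, here are sketches of predicates for three of the conventions listed above, plus a raiser that reports errno. ComPredicate, ClibPredicate, ErrnoRaiser, and the EnforceWith helper are hypothetical names; EnforceWith is a simplified stand-in for Enforcer so that the example stays short, not the article's implementation.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <stdexcept>
#include <string>

// -1 signals failure (handle-style APIs).
struct HandlePredicate {
    static bool Wrong(long handle) { return handle == -1; }
};

// Negative values signal failure (COM-style result codes).
struct ComPredicate {
    static bool Wrong(long code) { return code < 0; }
};

// Zero or null signals failure (many C library functions).
struct ClibPredicate {
    template <class T> static bool Wrong(const T& result) { return !result; }
};

// Raiser from the text: message plus locus.
struct DefaultRaiser {
    template <class T>
    static void Throw(const T&, const std::string& message, const char* locus) {
        throw std::runtime_error(message + '\n' + locus);
    }
};

// Hypothetical raiser that adds the text for the current errno.
struct ErrnoRaiser {
    template <class T>
    static void Throw(const T&, const std::string& message, const char* locus) {
        const int saved = errno;  // read errno before anything can change it
        throw std::runtime_error(message + std::strerror(saved) + '\n' + locus);
    }
};

// Simplified stand-in for Enforcer: applies predicate P and raiser R
// to a value and returns the value when it passes.
template <class P, class R, class T>
inline T EnforceWith(T value, const char* locus) {
    if (P::Wrong(value)) R::Throw(value, std::string(), locus);
    return value;
}

inline void DemoConventions() {
    const long bytes = EnforceWith<HandlePredicate, DefaultRaiser>(0L, "read");
    assert(bytes == 0);  // zero is fine for handle-style APIs
    const long info = EnforceWith<ComPredicate, DefaultRaiser>(1L, "com");
    assert(info == 1);   // positive means success with info
    bool thrown = false;
    try { EnforceWith<ComPredicate, DefaultRaiser>(-1L, "com"); }
    catch (const std::runtime_error&) { thrown = true; }
    assert(thrown);
}
```

A real COM raiser would translate the result code into a message through the platform's own facilities, which are deliberately left out of this sketch.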
Conclusion
We find the notion of automated value enforcement extremely useful, to the extent that coding is fun with it and hard without it. Enforcement is way more than condensing a couple of lines of code into one. Enforcements allow you to concentrate on the normal flow of the application and filter out, naturally and expressively, unneeded values. This filtering is achieved through filter functions that return their incoming argument.
A couple of macro tricks add useful information at a very low run-time cost.
Parameterization through template arguments implements a policy-based design that looks good not only on paper but also in practice. The policy-based approach produces a design with low run-time overhead (as little as a handwritten if) and a highly configurable framework that can accommodate the most peculiar error-encoding conventions.
We tested the attached code with Microsoft Visual C++ .NET Everett Beta and gcc 3.2. Enjoy!
Bibliography and Notes
[1] A. Alexandrescu. Modern C++ Design (Addison-Wesley Longman, 2001).
[2] Andrei Alexandrescu and Petru Marginean. "Simplify your Exception-Safe Code"
[3] To underspec means to hand insufficient specifications to a programmer.
[4] Andrei Alexandrescu. "Assertions"
[5] This convention looks nice at face value, but somehow led to some backwards convention. In COM-land, S_OK is zero, and S_FALSE is 1. Don't kill the messenger!
[6] http://www.jaggersoft.com/pubs/CVu10_1.html
About the Authors
Andrei Alexandrescu is a Ph.D. student at University of Washington in Seattle, and author of the acclaimed book Modern C++ Design. He may be contacted at www.moderncppdesign.com. Andrei is also one of the featured instructors of The C++ Seminar (<http://thecppseminar.com>).
Petru Marginean is Vice President at Bear Stearns, New York. He has more than 9 years of experience as a C++ developer. He can be reached at [email protected].