Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Handles and Exception Safety, Part 4 -- Tracking References without Counters


C++ Made Easier: Handles and Exception Safety, Part 4 -- Tracking References without Counters

Introduction and Review

The articles in this series so far have centered on the notion of a handle class. We can attach an object of a handle class to an object of another class and use the handle in place of the object to which it is attached. We would like to be able to copy a handle independently from copying the associated object. Accordingly, we must allow for the possibility that two or more handles might refer to the same object:

In such a case, when we destroy a handle, we must avoid destroying the object to which a handle is attached unless we know that no other handles are attached to that object. We have solved this problem so far giving each object an associated use count, which counts how many handles are attached to that object.

The first kind of use count that we developed was intrusive, in the sense that it resides in the object itself. Then we moved on to a non-intrusive use count, which is a separate object. Most recently, we defined a class, named Use, which encapsulates the idea of a use count:

class Use {
public:
  Use(): refcnt (new unsigned(1)) 
    { }
  Use(const Use& u): refcnt (u.refcnt) 
    { ++*refcnt; }
  ~Use() {
    if (--*refcnt == 0) delete refcnt;
  }
  Use& operator=(const Use&);
  operator size_t() const {
    return *refcnt;
  }
private:
  unsigned* refcnt;
};

Use& Use::operator=(const Use& u)
{
  ++*u.refcnt;
  if (--*refcnt == 0)
    delete refcnt;
  refcnt = u.refcnt;
  return *this;
}
We can use this class as part of a handle class by including a member of type Use as part of the handle. To illustrate this technique, we have been using the classic example of a class named Shape, which acts as a handle attached to an object from a hierarchy rooted in Shape_base:

class Shape {
public:
  Shape(Shape_base*b): p(b) { }
  ~Shape() {
    if (u == 1) delete p;
  }

  Shape& operator=(const Shape&);
  // Cause this handle to be the
  // only one referring to its
  // object, by copying the object
  // if necessary.
  void make_unique() {
    // Are there other objects
    // that share this one?
    if (u > 1) {
      // This definition might
      // throw.
      Use new_u;
      // If the following assignment 
      // throws, no state change has 
      // happened yet.
      p = p->clone();
      // Now we can change the
      // state safely.
      u = new_u;
    }
  }

  void draw() const { p->draw(); }
  // We can put other functions
  // here as needed

private:
  Use u;
  Shape_base* p;
};
Suppose, for example, that we were to write:

Shape s1(new Circle);
Shape s2(s1);
We would then have a data structure that looks like this:

Here, u is the Use object that is part of each Shape, and p is the pointer to Shape_base that is also part of each Shape object.

The Problem

Our last article concerned itself specifically with the make_unique function, the purpose of which is to ensure that a handle is the only one associated with its particular object. Before we arrived at the present version of make_unique, we tried to define it as:

  // Incorrect -- not exception-safe.
  if (u > 1) {  // Are there other objects that share this one?
    u = Use();     // If so, create a new copy.
    p = p->clone();
  }
This strategy is exception-unsafe. The reason is that when we evaluate p->clone() to form a copy of the object to which our handle is attached, we have already set u to refer to a new use-count object. If evaluating p->clone() throws an exception, we cannot simply propagate this exception to our caller because u refers to a newly created use count with a value of one. Therefore, when our handle object is destroyed, it will decrement the counter to zero and will then destroy the object to which our handle is attached. This destruction will happen even if there are other handles attached to the same object.

The form we actually used for the make_unique function guards against this possibility by avoiding any change to the value of u until after successfully calling p->clone.

Unfortunately, the fact that constructing a Use object might throw an exception causes another, more serious problem. Consider our earlier example:

// Incorrect -- not exception-safe
Shape s1(new Circle);
This usage is so obvious that you probably did not notice anything wrong with it. Indeed, even though we have now marked it with a comment noting that it is exception-unsafe, you may have difficulty finding the reason. The point is that because class Shape has a member of class Use, it is possible for the act of allocating a Shape object to throw an exception if there is insufficient memory for the Use member to allocate its counter. Consider what happens in that case:

  • Part of constructing s1 is to execute its constructor. To do so, it is necessary to evaluate the constructor's arguments before executing the constructor itself. Accordingly, the new Circle object must be allocated before anything else happens.
  • Part of constructing s1 is constructing s1's Use member. Let's assume that the Use member throws an exception during construction. Then this exception will propagate out through the Shape object, causing an exit from the block that contains the definition of that Shape object.
  • As part of exiting that block, s1, which is presumably a local variable, is destroyed. However, there is no way of destroying the Circle object that was allocated by new Circle to pass to the Shape constructor. After all, the Shape constructor has not yet had a chance even to examine the address of the Circle object, let alone store it for later deletion.
In other words, as a side effect of the fact that the Use constructor might throw an exception, the seemingly obvious usage:

Shape s1(new Circle);
has the possibility of creating a memory leak. Moreover, this possibility is particularly insidious because it will happen only when new Circle succeeds and the Shape constructor then runs out of memory. In other words, the memory leak happens in circumstances that are obscure enough that it is difficult for ordinary testing strategies to reveal it.

How can we avoid such obscure problems?

The Solution

The memory leak is possible for a reason that we have seen before: the program changes the state of the system and then does something that might throw an exception. Specifically, it executes new Circle, which changes the system state by allocating a resource, and then constructs a Shape object, which might throw an exception as part of constructing its Use member.

We had a similar problem in the definition of make_unique: the original version of make_unique assigned a new value to u, changing the state, and then tried to execute p->clone(), which might throw an exception. We solved the problem there by taking care to do the operations in the right order: first we created a Use object, which might throw an exception, but did not change the state; next we called p->clone(), which might throw an exception or change the state, but not both; and finally -- only after p->clone() was known to have succeeded -- we changed the state of u.

Unfortunately, it is difficult to use the same strategy here. For example, we might write:

// might throw, but no state change yet
Shape s1;
// might throw or change state, but not both
Shape_base* p = new Circle;
try {
  s1 = Shape(p);  // might throw
} catch(...) {
  delete p;
  throw;
}
but who wants to have to go through such gyrations merely to create a handle?

A much simpler way of avoiding this problem is to ensure that:

Shape s1(new Circle);
contains only a single operation that is capable of throwing an exception. We do not control the definition of class Circle, so we have no chance of making new Circle immune to exceptions. Therefore, we shall have to avoid exceptions in the Shape constructor. In order to do so, we shall have to redesign the constructor so that it does not allocate any dynamic memory, which implies that we shall have to avoid any dynamic memory allocation in the constructor of class Use.

How is such a redesign possible? One way is to go back to intrusive reference counting. As long as the reference count is part of the Shape_base object itself, no extra allocation is needed. However, suppose we want to stick with a non-intrusive strategy. Where can we possibly store the reference-counting information?

The answer is that although we do not wish to intrude on the Shape_base hierarchy, there is no reason not to intrude on the Use objects themselves. Suppose that instead of making each Use object refer to a separate counter, we instead link together all of the Use objects that refer to a particular Shape_base object. Then we might have a data structure that looks like this:

Here, the box marked forw in each handle object points to the next object in sequence, and the box marked back points to the previous object in the same sequence.

It should be clear that this data structure can do almost everything for us that our earlier data structure, with its dynamically allocated counter, can do, and that the new data structure does not require any dynamic memory allocation. We say "almost everything" because the new data structure does not offer the ability to determine quickly the value of the counter. If we want to know how many handles refer to an object, we have little recourse but to visit all of the handles and count them.

Fortunately, we do not normally care how many handles refer to an object; we care only whether that number is one. Therefore, we can replace the operator size_t in our Use class with, say, a member function named only that returns a bool indicator of whether its object is the only one in its chain.

With that change in mind, we can implement our revised Use class:

class Use {
public:
  Use(): forw(this), back(this) { }
  ~Use() { remove(); }
  Use(const Use& u) { insert(u); }
  Use& operator=(const Use& u) {
    if (this != &u) {
      remove();
      insert(u);
    }
    return *this;
  }

  bool only() const;
private:
  mutable const Use* forw;
  mutable const Use* back;
  void insert(const Use&) const;
  void remove() const;
};
We are defining the constructor, destructor, copy constructor, and assignment operator in terms of two functions, insert and remove, which we shall define soon. The assignment operator is the only one of these functions that presents any difficulty. It checks for self-assignment, in which case it does nothing. Otherwise, it removes the current object from the list that it occupies and inserts it into right-hand side's list. In either case, it returns *this. We note in passing that the usual rule of thumb -- that assignment operators that check for self-assignment are exception-unsafe -- does not apply to this example because this assignment operator does nothing that could possibly throw an exception.

The insert function will insert our object into the list corresponding to another Use object; the remove function will remove our object from the (only) list in which it resides.

We define forw and back as being mutable because we will want the insert and remove to be able to modify these pointers even for a const object. For example, note that the copy constructor calls insert with a reference to a const Use object. In order for insert to put that object into a doubly linked list, it must be able to modify that object's pointers even though the object is const. The way to arrange for such modification is to use mutable.

Next, let us write the insert and remove functions. The easiest way to understand these functions is to draw a picture of the data structure yourself and add and remove links as needed:

void Use::insert(const Use& u) const {
  this->back = &u;
  this->forw = u.forw;
  this->back->forw = this->forw->back = this;
}

void Use::remove() const {
  this->forw->back = this->back;
  this->back->forw = this->forw;
}
All that remains is to define the only function. This function turns out to be nearly trivial, because Use object is the only one in its list if and only if it is linked to itself:

bool Use::only() const {
  return this == this->forw;
}
In order to use the revised Use class, we need to change our Shape class slightly, because we can no longer convert a Use object to an integer. Instead, we call the only function directly:

class Shape {
public:
  Shape(Shape_base* b): p(b) { }
  ~Shape() { if (u.only()) delete p; }

  Shape& operator=(const Shape&);
  void make_unique() {
    // Are there other objects that share
    // this one?
    if (!u.only()) {  
      // If it throws, no state change yet.
      p = p->clone();
      // This statement cannot throw!
      u = Use();
    }
  }

  void draw() const { p->draw(); }
  // Insert other functions as needed

private:
  Use u;
  Shape_base* p;
};
Note the simplification of the make_unique function. We now call p->clone() first; either it throws an exception or it doesn't. If it throws an exception, there haven't been any state changes yet, so we're safe. If it doesn't throw its exception, we assign the result to p. This assignment changes the state -- but now, assigning Use() to u cannot throw an exception because it no longer allocates any dynamic memory! Therefore, this sequence of operations is safe.

Discussion

When programs get into trouble over exceptions, the act of throwing the exception itself is rarely the cause or the trouble. Rather, the trouble comes when the program allocates a resource that it fails to free in the exceptional case.

One strategy for avoiding such trouble is to catch exceptions and to free explicitly whatever resources were allocated before the exception happened. However, it is often possible to simplify such programs by doing first whatever might throw the exception and only then allocating resources that might have to be freed explicitly.

We saw one example of this strategy when we rewrote:

{
  u = Use();
  p = p->clone();
}
as:

{
  Use new_u;
  p = p->clone();
  u = new_u;
}
Here, only the call to p->clone() allocated a resource, so we took care to call it only after creating the Use object, which might have thrown an exception.

When we learned that even this strategy was insufficient to make:

Shape s1(new Circle);
work in all cases, we used the same strategy again when we rewrote the definition of Use to ensure that it would never throw an exception.

Our rewritten Use class combines the advantage of the intrusive approach, which avoids having to allocate dynamic memory, with the advantage of the non-intrusive approach, which avoids the need to rewrite the class to which the handle is attached.

About the Authors

Andrew Koenig is a member of the Large-Scale Programming Research Department at AT&T's Shannon Laboratory, and the Project Editor of the C++ standards committee. A programmer for more than 30 years, 15 of them in C++, he has published more than 150 articles about C++ and speaks on the topic worldwide. He is the author of C Traps and Pitfalls and co-author of Ruminations on C++.

Barbara E. Moo is an independent consultant with 20 years' experience in the software field. During her nearly 15 years at AT&T, she worked on one of the first commercial projects ever written in C++, managed the company's first C++ compiler project, and directed the development of AT&T's award-winning WorldNet Internet service business. She is co-author of Ruminations on C++ and lectures worldwide.


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.