Herb Sutter (http://www.gotw.ca/) is a leading authority and trainer on C++ software development. He chairs the ISO C++ Standards committee and is a Visual C++ architect for Microsoft, where he is responsible for leading the design of C++ language extensions for .NET programming. His two most recent books are Exceptional C++ Style and C++ Coding Standards (available in August and October 2004). Jim Hyslop is a senior software designer for Leitch Technology International. He can be reached at jhyslop@ieee.org.
Wendy found me outside the conference room. "Hey pardner." "Oh. Hey," I responded. "You looking for me?"
"Oh yeah. Yeah. Kerry found something interesting. You'd better come and have a look at this."
I grinned, perhaps a little too gleefully. "Some horrible code that Bob wrote?"
"Uh, sort of..." she trailed off. I frowned and followed her to Kerry's desk. After our hellos, Wendy said: "Kerry, show him."
"You've got this collection generator," said our little intern, and showed me codemy own code. A small adrenaline chill started to spread in my lower back.
"My code?"
Wendy gave a wan smile. "Yep. Your code," she confirmed. Simplified, what they showed me boiled down to this:
class Thing { /*...*/ };
vector<Thing*>* MakeCollection();
"Okay, right," I said. "I remember this. I checked it in last week. Is there a problem?"
"Ah, yes," Kerry said quietly. "There sure is somewhere. See, I'm using the vector you're returning and I'm leaking like crazy."
"Oh, that. You have to go and delete"
"But I do delete! Look here," Kerry interrupted, clearly frustrated. He showed me:
vector<Thing*>* v = MakeCollection();
// ...
delete v;
"Right," I leaned forward, reaching for the keyboard, "but that's not enough. Like I was about say, you have to go and delete each object the container owns, then the container itself. But you don't need to call all that yourself, because there's a DeleteCollection helper that does it for you." I pulled up the API and showed them:
void DeleteCollection( vector<Thing*>* );
"So," I continued, "instead of delete v;, you have to write DeleteCollection( v );. That's all."
"Oh."
Kerry and Wendy thought about this. Finally Wendy said: "Wonderful. So those Things are really owned by the collection."
"Uh-huh."
She seemed dubious. "I dunno. This all seems pretty fragile, doesn't it?"
"Uh, well, maybe," I allowed. "I, uh, didn't actually think anyone else would ever use this code. I wrote it as a helper for myself so I, uh..."
"Yes?" she prompted sweetly.
"Yes?" Kerry added tartly.
"...I, uh, took a shortcut."
"You were sloppy, pardner," she gibed gently.
"I took a shortcut," I smiled sheepishly.
"Well, all I know is that my code leaked and Bob blamed me," Kerry put it into more personal perspective. At that point, I suddenly understood his frustration a lot better.
"All right, I'll take my lumps," I agreed. "So, Kerry, how would you fix it?"
That slowed him down. "Um."
"Think smart," Wendy prompted helpfully. I shot her a look to tell her to let him work it out himself, but she took pleasure in making life easier for Kerry: "Shall I give you a pointer?"
Then Kerry smiled, and I could see the penny had dropped. "Smart pointer!" he said. "I could use a smart pointer instead of a dumb pointer!"
"Right," I accepted it. "So go on, fix my code." Gleefully Kerry took the keyboard and changed it to:
auto_ptr< vector<Thing*> > MakeCollection();
"I guess it's good enough," Wendy said. "That'll work. It's not what I'd have written, but it'll work."
"It is one of the three things auto_ptr is good for," I agreed. "Copying an auto_ptr has transfer of ownership semantics, and that explicitly documents that you're passing ownership back to the caller. MakeCollection is a 'source' function."
"What are the other two things it's good for?" Kerry asked.
"As a local automatic object to own a heap object that's local to a function," and "As a function parameter to a 'sink' function that assumes ownership," Wendy and I responded simultaneously.
Kerry blinked. But, encouraged and undaunted, he nodded and continued his changes:
auto_ptr< vector< auto_ptr<Thing> > > MakeCollection();
"Noooo!" Wendy and I reacted in unison, making Kerry jump.
"Why not? Just the other day Bob showed me this kind of code and I..." Kerry trailed off, realizing that he was not citing the most reliable authority.
Snap. This time all three of us jumped a little, and turned to find the Guru standing behind us and smiling beatifically. Her voice drifted among us: "No, child. Never that."
"Uh, hi," we all said.
"Go on, apprentice," she said to me, gesturing with the thinner-than-usual tome she had just closed. "Why never defile a container with auto_ptr objects?"
"That's easy, everyone knows by now. They're not value types like ints that you can just copy around like containers assume they can," I shrugged. "Destructive copy semantics, woo-hoo. Bad trip. Just try sorting a beast like that..." I grabbed a whiteboard marker and wrote:
vector< auto_ptr<Thing> > v; // evil, icky, yucky
sort( v.begin(), v.end(), DerefLess() ); // compares dereferenced values
I jabbed at the sort call with the marker. "After you do that, typically some of your pointers will be null and some Things will have been deleted."
"Indeed, my child, it is just as the prophet Meyers and others preach." She waved her thin tome. [1, 2, 3] "But pray tell once again, the three uses of auto_ptr?"
This time we hesitated. "Sources, sinks, and locals?" I made it a question because now I was less certain.
"Correct," she agreed. "But now that shared_ptr is added to the Blessed Standard Library, it is a better substitute for auto_ptr...for all three uses of auto_ptr. Indeed, there is little reason left to ever use auto_ptr. And, naturally, blessed code will avoid using bald pointers alone."
"Oh!" Kerry exclaimed, surprised.
"Yes, all three main uses of auto_ptr can be replaced with shared_ptr and be at least as righteous." She wrote:
shared_ptr<T1> Source(); // at least as good as auto_ptr<T1>
void Sink( shared_ptr<T2> ); // at least as good as auto_ptr<T2>
void f() {
  shared_ptr<T3> local( new T3 ); // at least as good as auto_ptr<T3>
}
"And," the Guru continued, "shared_ptrs should be preferred in general because they are safe to be stored in containers. If you use shared_ptr all the time, my child, you will never forget and accidentally write auto_ptr where you should not." She amended my whiteboard scrawlings:
vector< shared_ptr<Thing> > v; // much better
sort( v.begin(), v.end(), DerefLess() ); // compares dereferenced values
"This," she declared, "works correctly. It is the default correct way to have owning collections of heap-based objects. And so you really want this..." Moving to the keyboard, she typed:
shared_ptr< vector< shared_ptr<Thing> > > MakeCollection();
"There are other improvements that could be made to let this function live a more blessed life," she continued, "but this corrects the direct evils of the original. And there is no further need of DeleteCollection, which may be removed and rest in peace."
"I get it," Kerry enthused. "And it's still better than what he wrote" I bristled slightly. "because if you return it by a bald pointer, the user might not delete it, but if you return it by shared_ptr, all is okay. Cool."
"Indeed. There are three or four main motivations for wanting to have a container of pointers," the Guru said softly. "Apprentice, pray name one?"
"Uhhh," I uhhhed, thinking furiously. "Well, in this case, I did it because I wanted a polymorphic container that could store objects of different but related types. That is, sometimes I insert Things and sometimes I insert SpecificThings that are derived from Thing." I showed her some other code in my original module:
vector<Thing*>* v = MakeCollection(); // an owning container
v->push_back( new Thing ); // ok
v->push_back( new SpecificThing ); // ok
DeleteCollection( v );
"But then why not still use a smart pointer?" Wendy asked.
"Indeed," the Guru smiled, and amended the code:
vector< shared_ptr<Thing> > v; // a safe owning container
v.push_back( shared_ptr<Thing>( new Thing ) ); // ok
v.push_back( shared_ptr<Thing>( new SpecificThing ) ); // ok
// no need for DeleteCollection
"Okay, let me make sure I've got this. So one motivation for containers of pointers is to have a polymorphic container that can hold objects of different but related types," I mused half-aloud, "and if the collections own their objects, there's no drawback to using shared_ptrs and plenty of reasons to use them. But what about other scenarios...?" I trailed off, still thinking and looking at the ceiling.
"Index containers!" Wendy suddenly put in.
"Pray tell?" the Guru encouraged her to continue.
Wendy brightened. "When you have one main container that holds objects, maybe sorted one way, and you want to be able to access it using different sort orders without resorting the container each time. So you have secondary containers of pointers into the main container, and those are sorted by whatever you want...using dereferenced compare functors such as DerefLess. It's just like a database where you have indexes that provide fast sorted access into a main database table."
"And the main difference...?" the Guru prompted.
"They're nonowning," I offered.
"Indeed," she agreed. "So prefer to use a secondary container of iterators, instead of a secondary container of pointers, for that case. Can you think of other reasons to hold objects by pointer, rather than by value?"
"If they aren't value-like types?" Wendy offered.
"Well done," nodded the Guru. "Types that are not value-like must be held by pointer, and again preferably a smart pointer, specifically by default shared_ptr."
Wendy smirked. "Well, auto_ptr is a nonvalue-like type. You could even correctly have a container< shared_ptr< auto_ptr<T> > >, although the auto_ptr extra indirection wouldn't add any value. So to speak."
"More realistically," the Guru added, "might be a container< shared_ptr<DatabaseLock> > or container< shared_ptr<TcpConnection> >. Classes for locks and such resources are often not copyable and least like values."
Wendy's smirk got even wider, and I suspected she'd thought of a bad pun. "Just the other day I was reading a book about garbage collection, and one of the three main approaches is to use owning smart pointers. So smart pointers are good for collections of objects in two ways," she quipped, "these kinds of collections of objects, and garbage collections of objects. Ha."
"What about optional values?" I continued brainstorming. "Say you want to have a map<SomeType, Thing*> express that some SomeType objects have no associated Thing. Using a pointer as the value lets you do that directly, because the pointer can be null if there's no associated Thing. But, hmm, I guess you can do that with shared_ptr, too, and that would be safer than bald pointers. If the map owns its values."
"Indeed."
Kerry looked confused. "Uh, this is fun and everything, but isn't there a simple rule I could learn?"
The Guru held up a hand to stop us all, and then spoke into the silence: "The two areas are pointers and containers. For owned pointers: Avoid bald pointers and auto_ptrs, and instead prefer shared_ptrs. For nonowned pointers into containers: Avoid bald pointers, and instead prefer iterators. For other nonowned pointers: Prefer to find an alternative to using a bald pointer.
"For containers: Store only values and smart pointers in containers. Specifically, use either a container<value_type> to hold objects directly by value and ensure that they really do have value semantics, or else use a container< some_refcounted_smart_ptr<any_type> >, preferably a container< shared_ptr<any_type> >.
"If you don't," she added with a twinkle in her eye, "you'll wish you had." And she glided away...
References
- [1] Meyers, Scott. Effective STL, Addison-Wesley, 2001.
- [2] Stroustrup, Bjarne. The C++ Programming Language, Special Third Edition, Section 14.4.2, Addison-Wesley, 2000.
- [3] Sutter, Herb. Exceptional C++, Addison-Wesley, 2000.