Letters to the editor may be sent via email to [email protected], or via the postal service to Letters to the Editor, C/C++ Users Journal, 1601 W. 23rd St., Ste 200, Lawrence, KS 66046-2700.
Jonathan Ringle's article, "Singleton Creation the Thread-Safe Way" (CUJ, October 1999), intrigued many readers. Some of them pointed out potential problems in his implementation. We don't have space to print all their letters here, but Jonathan has graciously sent us this summary of the issues brought up (edited for brevity), as well as suggested fixes. Look for an expanded discussion of this topic on our Code Review web page at www.cuj.com/forum/. mb
Dear CUJ,
There are three issues that need to be pointed out with respect to my article, "Singleton Creation the Thread-Safe Way": 1) the use of a volatile flag, 2) initialization of static mutexes, and 3) multiprocessor cache coherency [1].
1. There is a potential danger that a couple of the variables used in the implementation may be put into registers by the compiler. If a variable is put into a register, then other threads will potentially not see changes made to the variable across thread context switches. This problem can be avoided with the use of the volatile cv-qualifier. Using volatile ensures two things: a) that the compiler will not put the variable into a register, and b) that the compiler doesn't optimize away the double-check of the variable as being superfluous code.
We could start out by making the static Singleton* _instance volatile, but that quickly gets messy. Either Singleton must provide volatile member functions or the caller of Singleton::instance must cast away the return value's volatile-ness using const_cast.
So, rather than play the casting game, I've decided to separate the functionality of the flag from the pointer, as in:
    class Singleton
    {
    public:
        static Singleton& instance();
        //...
    private:
        static Singleton* _instance;
        static volatile bool _is_created;
        //...
    };

    // Note: no "static" keyword on the out-of-class definition.
    Singleton& Singleton::instance()
    {
        if(!_is_created)
        {
            //.. double-checked locking omitted
        }
        return *_instance;
    }
2. Initialization of static mutexes. The code for the Critical_Section::acquire method could potentially be called before the constructor for Critical_Section is called. In fact this would occur for the very same reason that got me started on all this. (See the anecdote at the start of the article.) This issue can be resolved by creating a static instance of a mutex type object that requires no constructor code at all. This can be done by leveraging the guaranteed zero-initialization of non-local statics before anything else. In section 3.6.2 the C++ Standard (http://www.maths.warwick.ac.uk/cpp/pub/) states, "The storage for objects with static storage duration (3.7.1) shall be zero-initialized (8.5) before any other initialization takes place" [2]. Because the type relies so strongly on being initialized in this fashion, the ability to declare an instance of it should not be publicly accessible.
Listing 1 shows an implementation of a mutex type object, class Interlocked_Mutex, which will work on the Windows NT platform. (I have not verified if it will work on other Windows platforms.)
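The idea of a lock that needs no constructor code can be illustrated portably. The following is only a rough sketch, not the Interlocked_Mutex of Listing 1: it assumes a compiler with std::atomic and std::thread (which postdate this letter), and the names ZeroInitLock, g_lock, g_counter, and run_counter_demo are made up for the illustration. The lock is a spin lock whose all-zero state means "unlocked," so a static instance is usable as soon as static storage has been zero-initialized.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical stand-in for a constructor-free mutex. There is no
// user-written constructor; the all-zero state means "unlocked".
struct ZeroInitLock {
    std::atomic<int> locked;

    void acquire() {
        // Spin until this thread is the one that changes 0 to 1.
        while (locked.exchange(1, std::memory_order_acquire) != 0) {
            std::this_thread::yield();
        }
    }

    void release() {
        locked.store(0, std::memory_order_release);
    }
};

static ZeroInitLock g_lock;  // static storage: starts out zeroed (unlocked)
static long g_counter = 0;   // only touched while g_lock is held

// Starts `threads` workers that each increment g_counter `per_thread`
// times under the lock, then returns the final total.
long run_counter_demo(int threads, int per_thread) {
    g_counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                g_lock.acquire();
                ++g_counter;
                g_lock.release();
            }
        });
    }
    for (auto& worker : pool) {
        worker.join();
    }
    return g_counter;
}
```

For brevity the sketch leaves the type publicly declarable; a production version would restrict who can create instances, since an instance with automatic storage would not be zeroed.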
3. Multiprocessor cache coherency. The double-checked locking optimization works well within the confines of a uniprocessor machine. However, the game rules change when you move to a multiprocessor platform. This is especially true if the platform performs very aggressive memory caching optimizations, in which read and write operations can execute "out of order" across multiple CPU caches [1]. Without modification, the double-checked locking optimization cannot be declared safe on such systems.
The double-checked locking pattern is an optimization that tries to remove the need for an expensive lock for an infrequently called critical section of code. The pattern exposes "read-only" parts outside the context of the lock. The reader thread depends very much upon an ordering: the visibility of the writes in main memory as done by the writer thread must be in the same order as the instructions issued to the CPU. On a uniprocessor system, this is not a problem even if the processor writes things "out of order," because a thread context switch will make sure that the write-back cache is flushed before the next thread gets control. However, on a multiprocessor platform, the "out of order" writes become visible to threads running on other CPUs! So it is possible for a thread on CPU number 2 to see _is_created as true, and therefore skip the lock, before the data contained within Singleton, or even the _instance pointer, is available in main memory. The timing window for these conditions to occur is probably in the nanosecond range, but it's the type of bug that I'm sure would drive me mad trying to find.
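The ordering requirement described above can be stated explicitly with the atomic operations of later C++ standards (C++11 onward), which did not exist when this letter was written. The following is a minimal sketch under that assumption, not the article's implementation: the volatile flag is swapped for a std::atomic<bool>, the lock is a std::mutex, and the member names and the value() accessor are invented for the example. The release store on the flag is what keeps the earlier writes (the object's contents and the pointer) from becoming visible after the flag does.

```cpp
#include <atomic>
#include <mutex>

class Singleton {
public:
    static Singleton& instance();
    int value() const { return value_; }  // hypothetical payload accessor

private:
    Singleton() : value_(42) {}
    int value_;

    static Singleton* instance_;
    static std::atomic<bool> is_created_;  // flag kept separate from the pointer
    static std::mutex lock_;
};

Singleton* Singleton::instance_ = nullptr;
std::atomic<bool> Singleton::is_created_(false);
std::mutex Singleton::lock_;

Singleton& Singleton::instance() {
    // First check, outside the lock. The acquire load pairs with the
    // release store below: a thread that reads true here is guaranteed
    // to also see the constructed object and the pointer to it.
    if (!is_created_.load(std::memory_order_acquire)) {
        std::lock_guard<std::mutex> guard(lock_);
        // Second check, inside the lock, in case another thread created
        // the object while this one was waiting.
        if (!is_created_.load(std::memory_order_relaxed)) {
            instance_ = new Singleton;
            // Publish only after the object and pointer are written.
            is_created_.store(true, std::memory_order_release);
        }
    }
    return *instance_;
}
```

Without the acquire/release pairing (for example, with a plain or volatile flag), nothing in the language prevents the flag from appearing set on another CPU before the object it guards.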
Given all the complications involved, I would conclude that the use of the double-checked locking optimization is not worth the trouble on a multiprocessor architecture.
I have used the following references in my research of these issues:
[1] Douglas Schmidt. "Double Checked Locking Optimization," http://www.cs.wustl.edu/~schmidt/patterns/Double-Checked-Locking.pdf.gz
[2] Andrew Koenig. Working Paper for Draft Proposed International Standard for Information Systems Programming Language C++ (2 December 1996). http://www.maths.warwick.ac.uk/cpp/pub/, section 3.6.2.
[3] Compaq Computer Corporation. Alpha Architecture Handbook, http://ftp.digital.com/pub/Digital/info/semiconductor/literature/alphaahb.pdf, chapters 4.11, 5.
[4] A newsgroup discussion thread on comp.programming.threads, "MP safe Singleton using double-checked locking," that ran last year from 05/28/98 to 06/04/98. I used http://www.deja.com to search for this discussion thread.
[5] Intel Architecture Software Developer's Manual, Volume 2: Instruction Set Reference, pgs 3-230, 3-456.
Jonathan Ringle
Dear CUJ,
In the September 1999 issue, P.J. Plauger wrote about the limits to the Standard C library. As he stated in his article, "So there is more than one way to head off this problem, and nearly four decades in which to do it. For my part, I hope to be retired before then." I for one hate this, because the more limitations we as programmers place upon ourselves, the more we end up fixing or reworking old problems. I started programming before the PC and remember the 640K barrier. How much time, code, and device-driver effort did those of us who lived through DOS 6.22 consume because someone placed a limit out there?
I would agree that 2037 is far enough away that some of us will be retired, but most of us will be working in 2007, when banks hit limits as they calculate a 30-year mortgage. Our company has already been hit by limits with dates as we deal with companies that are over 100 years old. As we continue to move forward we should look to ways that do not impose limits, especially within standards; or the standards should list the ramifications of those limits.
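The arithmetic behind these dates can be checked in a few lines. This sketch assumes the limit in question is a signed 32-bit count of seconds since January 1, 1970 (the letter does not spell out the representation), and the function names are invented for the example. Under that assumption the limit falls early in 2038, so 2037 is the last full year, and 30-year terms beginning around 2007-2008 are the first to reach it.

```cpp
#include <cstdint>
#include <ctime>

// Calendar year (UTC) containing the last second that a signed 32-bit
// count of seconds since 1970 can represent. Assumes the host's own
// time_t can hold INT32_MAX, which is true of 32- and 64-bit time_t.
int last_representable_year() {
    std::time_t limit = static_cast<std::time_t>(INT32_MAX);
    const std::tm* utc = std::gmtime(&limit);
    return utc->tm_year + 1900;
}

// First origination year for which a loan of `term_years` matures in
// the year that contains the limit.
int first_affected_start_year(int term_years) {
    return last_representable_year() - term_years;
}
```

Any schedule that stores maturity dates in such a field would need a wider representation well before the limit year itself arrives.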
The real Y2K issue will be 2001/2002, when companies return to normal program enhancement and maintenance. All the people that are working on visas return home and we have a drastic change to our economy, as these people leave behind houses. I hope that our current Government does not say, "For my part, I hope not to be in office at that time."
Leland F. Carpenter
847/645-5747
I quite agree. I was being facetious, and not very clear about it. pjp
Dear CUJ,
Everyone keeps saying that "objects in arrays can be initialized only with the default constructor" (quote from Pete Becker's column in CUJ, April 1999, for example). However, the following code compiles and runs very nicely under Borland compilers ranging from BC++ 4.5 to BCB 4.0:
    #include <iostream.h>

    struct A
    {
        A () : _n( -1 )
            { cout << "construct default object\n"; }
        A ( int n ) : _n( n )
            { cout << "construct A(" << n << ")\n"; }
        A ( int n, int m ) : _n( n )
            { cout << "construct A(" << n << ',' << m << ")\n"; }
        A ( const A & rhs ) : _n( rhs._n )
            { cout << "copy (A" << _n << ")\n"; }
        A & operator= ( const A & rhs )
        {
            cout << "A(" << _n << ") is assigned A(" << rhs._n << ")\n";
            _n = rhs._n;
            return *this;
        }
        int _n;
    };

    int main(int, char*[])
    {
        // Five initializers for six elements: the last is default-constructed.
        A array[6] = { A( 0 ), A( 1 ), A( 2, 0 ), 3, A() };
        return 0;
    }
The output is:
    construct A(0)
    construct A(1)
    construct A(2,0)
    construct A(3)
    construct default object
    construct default object
All objects are constructed in place; there are no copy constructor or copy assignment calls. Removing the default constructor still allows creating an array with the first four elements.
I respect Borland for making great efforts to track the developing standard and to bring BCB to full conformance to the final standard. Are they misinterpreting the standard, or can we use non-default constructors to initialize array elements after all?
Thank you for a very interesting and useful magazine!
Regards,
Hans Salvisberg Berne
Switzerland
[email protected]
I checked with P.J. Plauger, and he sees nothing in the C++ Standard that prohibits the form of initialization you present. I'll bet the statement you keep hearing about "default constructors only" assumes the array is being dynamically allocated:
A *array = new A[6];
There's no place here to specify individual constructor arguments, so the compiler calls the default constructor for each element. mb
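The two cases can be put side by side. The sketch below is an illustration only, written for a current compiler rather than the Borland versions mentioned in the letter; it trims struct A down to its constructors, and the helper function names are invented. Each helper returns the stored values so the choice of constructor is visible without relying on printed output.

```cpp
#include <vector>

struct A {
    A() : n(-1) {}                 // default constructor
    A(int n_) : n(n_) {}           // one-argument constructor
    A(int n_, int) : n(n_) {}      // two-argument constructor
    int n;
};

// Array with an initializer list: each element uses the constructor
// named by its initializer; the sixth, which has no initializer,
// falls back to the default constructor.
std::vector<int> brace_initialized() {
    A array[6] = { A(0), A(1), A(2, 0), 3, A() };
    std::vector<int> values;
    for (const A& a : array) {
        values.push_back(a.n);
    }
    return values;
}

// Array from new[]: no per-element arguments can be given in this
// form, so every element is default-constructed.
std::vector<int> dynamically_allocated() {
    A* array = new A[6];
    std::vector<int> values;
    for (int i = 0; i < 6; ++i) {
        values.push_back(array[i].n);
    }
    delete[] array;
    return values;
}
```

Here brace_initialized() yields 0, 1, 2, 3, -1, -1, while dynamically_allocated() yields six copies of -1.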
Dear Mike Bertrand,
My name is Ivan Zelina. I am a software developer with Micromine Pty Ltd in Perth, Western Australia. We develop mining and exploration software. Recently I read your article in C/C++ Users Journal on TrueType fonts ("TrueType Font Secrets," CUJ, August 1999). Reading it I got an idea how to solve a problem I have been struggling to solve for a few months. (Thanks a lot for the idea.)
We use TrueType fonts to display various symbols and we need to be able to draw a border around each character as displayed to achieve a "Halo" or "Glow" effect. We have tried two complex solutions but they did not work well. What we are going to try now is quite simple. We will vectorize the character and then draw the character outline using a thick line. Then we will just do a simple CDC::TextOut over the vectorized character, with parts of the thick border sticking out from underneath.
I have just tried to use your CGlyph class combined with CDC::TextOut. Calling CGlyph::Draw and CDC::TextOut will generate characters of equal size but at different positions. Output produced by CDC::TextOut is offset to the bottom right when compared to the character outline generated by CGlyph::Draw.
Well, my question is if there is a simple way to work out the offsets needed for CGlyph::Draw to make it draw at the same position as CDC::TextOut. I do not want to use too much of your time. Please, let me know if there is a simple answer.
Regards,
Ivan Zelina
Mike Bertrand replies:
Hi Ivan,
Good hearing from you.
You probably know about SetTextAlign, which gives you some flexibility to determine where TextOut's text goes. For glyph placement, look at the GetGlyphOutline parameter that is a pointer to a GLYPHMETRICS structure, defined as follows in MSDN:
    typedef struct _GLYPHMETRICS {
        UINT  gmBlackBoxX;
        UINT  gmBlackBoxY;
        POINT gmptGlyphOrigin;
        short gmCellIncX;
        short gmCellIncY;
    } GLYPHMETRICS;
where:
gmBlackBoxX specifies the width of the smallest rectangle that completely encloses the glyph (its black box).
gmBlackBoxY specifies the height of the smallest rectangle that completely encloses the glyph (its black box).
gmptGlyphOrigin specifies the x- and y-coordinates of the upper left corner of the smallest rectangle that completely encloses the glyph.
gmCellIncX specifies the horizontal distance from the origin of the current character cell to the origin of the next character cell.
gmCellIncY specifies the vertical distance from the origin of the current character cell to the origin of the next character cell.
This enables you to calculate the character's bounding box, including the very bottom, calculate the distance between adjacent characters, etc.
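As a rough illustration of that calculation, the sketch below derives the black box and the next cell origin from the fields listed above. It makes several assumptions that should be checked against your own setup: the structs are plain stand-ins for the Win32 types so the code builds anywhere, the helper names are invented, device y is taken to grow downward, and gmptGlyphOrigin is taken to be measured from the character's origin on the baseline with y growing upward.

```cpp
// Stand-ins for POINT, GLYPHMETRICS, and RECT (assumption: field names
// and meanings follow the definition quoted above).
struct Point { long x; long y; };
struct GlyphMetrics {
    unsigned gmBlackBoxX;
    unsigned gmBlackBoxY;
    Point    gmptGlyphOrigin;
    short    gmCellIncX;
    short    gmCellIncY;
};
struct Rect { long left, top, right, bottom; };

// Black box in device coordinates for a glyph whose character origin
// sits at (originX, baselineY). Assumes device y grows downward and
// gmptGlyphOrigin.y is measured upward from the baseline.
Rect black_box(const GlyphMetrics& gm, long originX, long baselineY) {
    Rect r;
    r.left   = originX + gm.gmptGlyphOrigin.x;
    r.top    = baselineY - gm.gmptGlyphOrigin.y;
    r.right  = r.left + static_cast<long>(gm.gmBlackBoxX);
    r.bottom = r.top + static_cast<long>(gm.gmBlackBoxY);
    return r;
}

// Origin of the next character cell along the text run.
Point next_origin(const GlyphMetrics& gm, long originX, long baselineY) {
    Point p = { originX + gm.gmCellIncX, baselineY + gm.gmCellIncY };
    return p;
}
```

If the text is drawn with its reference point on the baseline (SetTextAlign with TA_BASELINE), the same origin could presumably be fed to both drawing paths; with the default top-left alignment the two reference points differ, which may account for an offset of the kind described.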
Regards,
Mike Bertrand