Still Going in Circles
Dear DDJ,
As usual, the July 1990 issue proved to me that my DDJ subscription is worth every penny. I enjoyed all the articles, including Mr. Paterson's on drawing circles, but I'd like to add my two cents on the subject, nonetheless.
Mr. Paterson's analysis of the circle drawing problem using calculus was interesting, but there are some simplifications overlooked which can be found if one applies some algebra and a little elbow grease. Allow me to borrow from Mr. Zigon, Mr. Lee, and Mr. Paterson, and state the problem as follows. Given a point (x,y) known to be near the circle, find the next contiguous point. Furthermore, we will confine ourselves to points in the octant defined by x >= 0, y < = r, x < = y.
Since we are only considering points on the octant from (0, r) traveling clockwise, the next point will either be (x+1, y) or (x+1, y-1). The problem then becomes to find which is closer to a perfect circle. We know that if (x,y) is on the circle, (x+1,y) is above or on the circle and (x+1,y-1) is below or on the circle. I won't prove this, but a drawing should be convincing enough. If the distance from the circle to (x+1,y) is greater than the distance from (x+1,y-1) to the circle, then we will choose (x+1,y-1), otherwise we will choose (x+1,y) as the next point. If the two distances are identical, we'll arbitrarily choose (x+1,y) as the next point.
Let e(x,y) be the error function, which is the distance from a perfect circle of radius r centered on the origin to the point (x,y).e(x,y) is expressed as e(x,y) = x2 + y2 - r2.
e( ) can be easily derived by using the definition of a circle and Pythagoras' Theorem. At this point, the circle drawing algorithm looks like Example 1. The goal now is to reduce the amount of computation needed in the loop. Most of it is being performed in the if statement. The problem is the inequality which computes e( ) twice:
Example 1
y =r ; x =0 ; while (x <= y) <SUP> plot8 (x, y) ; if (e(x + 1, y) > -e(x + 1, y -1)) y-- ; x++ ; </SUP> e(x+1,y) > - e(x+1,y+1)
Algebra to the rescue. First, expand the right-hand side, and substitute using the definition of e( ).
e(x+1, y -1) = (x +1)<SUP>2</SUP> + (y - 1)<SUP>2</SUP> - r<SUP>2</SUP> e(x+1,y -1) = (x+1)<SUP>2</SUP> + y<SUP>2</SUP> - 2y+1 - r<SUP>2</SUP> e(x+1,y-1) = e(x+1,y) - 2y+1
Substitute into the original inequality:
e(x+1,y) > - e(x+1,y)+2y -1 Add e(x+1,y) - 2y+1 to both sides: 2e(x + 1, y) - 2y + 1 Divide by 2: e(x+1,y) - y>0
Note that the 1/2 drops out, since we are performing an integer comparison. Now e( ) is computed only once instead of twice each time through the loop. The algorithm now looks like Example 2.
Example 2
y = r ; x = 0 ; while (x ,= y) { plot8 (x, y) ; if (e (x + 1, y) - y > 0) y-- ; x++ ;
Now, to get rid of the remaining e( ). The above expressions showed that e(x+1,y-1) can be expressed in terms of e(x+1,y). If e(x+1,y) can be expressed in terms of e(x,y), the calculation of e( ) can be written in an iterative form (each new value of e( ) computed by modifying the previous e( )), which is (hopefully) easier to compute.
e(x+1,y) = (x+1)<SUP>2</SUP> + y<SUP>2</SUP> - r<SUP>2</SUP> e(x+1,y) = x<SUP>2</SUP> +2x+1+y<SUP>2</SUP> - r<SUP>2</SUP> e(x+1,y) = e(x,y)+2x+1
Adding a variable, e, to accumulate e( )-y and changing the multiplication by 2 to a left shift by 1 yields the final version shown in Example 3.
Example 3
y = r ; x = 0 ; e = -y ; /* e(o, r) == 0, since this point is on the circle */ while (x <= y) { plot8 (x, y) ; e += (x << 1) + 1 ; if (e > 0) { y-- ; e -= (y << 1) ; } x++ ;
There's my version of the circle plotting algorithms. Plotting a circle of radius 1,000,000 took 3.81 seconds for Mr. Paterson's algorithm versus 1.51 seconds for mine. To generate an ellipse, the plot( ) routine may be modified as Mr. Paterson suggests.
Michael P. Lindner
Basking Ridge, New Jersey
Pointers: Far vs. Huge vs. Based
Dear DDJ,
On page 85 in the August 1990 DDJ, Bruce Schatzman gives a useful description of the new based pointer technique available under MS C 6.0. Unfortunately, he makes some incorrect statements regarding far pointers.
The discussion claims that far pointers are the only solution available for data arrays larger than 64K. This is simply not true, as far, near, and based pointers all suffer from the 64K limitation on contiguous data size. The only available solution to this limitation is the huge pointer.
Technically, near, far, and based pointers are represented using variations of segmented notation. near pointers assume an implied segment called DGROUP and only require a 2-byte offset. far pointers, on the other hand, make no assumptions about the segment. They require 4 bytes; the high word holds the segment, while the low word holds the offset. based pointers represent a middle ground between near and far. The segment and offset are split into two variables; one variable holds the segment, while the other holds the offset. This leads to improved efficiency where numerous references to pointers of a common segment are made.
In contrast, huge pointers are represented using 4-byte absolute notation. The use of absolute notation supports contiguous blocks of data which are larger than 64K, however, additional code overhead is incurred to transform the absolute notation into segmented notation at runtime.
Hope this untangles a few pointers and thanks for a great magazine.
Kent Funk
Manhattan, Kansas
Bruce responds: Kent is correct. Both far pointers and based pointers suffer from the 64K addressing limitation, which is the size of the offset (16 bits). The reference to far pointers in the last paragraph of the article should have been a reference to huge pointers. My apologies and thanks to Kent for the correction.
DOS + 386 = 4 Gigabytes
Dear DDJ,
In my article "DOS + 386 = 4 Gigabytes!" (DDJ, July 1990) there is a compatibility problem with some 386 machines. To correct this, you must replace the a20( ) routine in Listing Three, page 110. The correct routine is:
/* This code is the same */ void a20(int flag) { if (inboard) { outp(INBA20,flag?INBA20OFF); } /* changes start here */ else { keywait( ); outp(0x64,0xD1); keywait( ); outp(0x60,flag?0xDF:0XDD); keywait( ); outp(0x64,0xFF); keywait( ); } }
The code in the article works on some 386 machines, but fails on Compaq and Dell computers. The code above works on all machines I tested, including Everex, Dell, Compaq, and CompuAdd.
My apologies for any inconvenience this has caused. I'd like to thank John Hamilton of Quad-S for providing several of the test machines.
Al Williams
League City, Texas
Dear DDJ,
In reviewing the code supplied with "DOS + 386 = 4 Gigabytes!" by Al Williams (DDJ, July 1990), I have identified some potentially dangerous practices that are avoidable.
The first is that the state of the IRQ interrupt flag (IF) is cleared and then forcibly set at the end of the routine "protsetup." In case this routine is called while interrupts are disabled, it will reenable them. A safer method is to PUSHF, CLI, and then POPF (POPF should be avoided on code that may run on an 80286, but this is 80386 or above, only).
Generally, interrupts really are enabled in such context, so programmers tend to be lax, but it should not be left up to chance. At least the fact that interrupts will be enabled on exit should be explicitly documented.
Secondly, after changes to the protection-enabled (PE) flag, a short jump is performed to flush the prefetch queue. While this is recommended by the 80386 Programmer's Reference Manual(Ahern-Wahlstrom, 1986) when going into protected mode, section 14.5 clearly states that a farjump is required to put "appropriate values in the access rights of the CS register [sic]."
In fact, it is desirable to do farjumps on both entry and exit of protected mode. This assures the kind of coherency that greatly simplifies the transition and facilitates the use of an In Circuit Emulator (ICE) or software debugger. If one is single-stepping through such code, the dumps and restores of invalid CS values could disrupt the orderly execution of instructions.
The lack of explicit CS descriptor loading should not actually cause the code to fail because no far transitions are performed while in protected mode. This assures the cached descriptor information will remain valid. On the other hand, this introduces an unnecessary architectural dependence that may interfere with operation on yet-to-be-released 80386 compatible processors.
I hope these recommendations help you and Mr. Williams to produce and distribute code that is as robust and widely compatible as possible.
Thomas Roden
Irvine, California
Al responds: I want to thank Thomas for his interest in "DOS + 386 = 4 Gigabytes!" I'm glad he took the time to explore it in such detail. I'd like to address his issues one at a time.
His comments about reloading the CS register for protected mode are certainly true -- true, that is, if you are writing a protected-mode application. In this case, however, the same caching that allows us to use the 4-gigabyte segments allows us to use this handy technique. Of course, if the code ran with interrupts enabled, or made any long calls or jumps, we would have to force the CS register to reload . This would have required another GDT entry, and was deemed unnecessary. Also, if you look at the code for my DOS extender (Part I in this issue), the CS register is, in fact, loaded as you suggest. Thanks again and I hope I have answered any concerns.
[Editor's note: Thomas is the author of "Four Gigabytes in Real Mode," published in Programmer's Journal (November/December 1989).]
Zortech Heard From
Dear DDJ,
It was with considerable interest that I read the article titled "Collections in Turbo C++" by Bruce Eckel in your August 1990 issue.
The article seems to suggest that factors such as compilation speed, code size and speed, and support for development of Microsoft Windows 3.0 and OS/2 are irrelevant or certainly not worth his mention. For those interested in such features, Zortech C++ remains the platform of choice. Our benchmarks clearly show Turbo C++ compiles slower, and generates bigger, slower code. Additionally, Turbo C++ currently has no support for the Windows and OS/2 operating environments.
Surprisingly, a big deal was made of Zortech not supporting the C++ language feature pointers to members. As of July 1, Zortech introduced this feature in its version 2.1 release. It's particularly surprising, since at the time the article was submitted, Bruce Eckel and DDJ had beta copies in their possession that clearly implemented this feature.
Perhaps of greatest concern was that Bruce Eckel's business relationship with Borland International was not made clear on the front of the article in the traditional fashion. It is standard journalistic practice to reveal to readers any conflicts of interest at the time of publication.
Paul Leathers
President, Zortech Inc.
DDJ responds: First of all, the article was about collections in Turbo C++, not collections in Zortech C++. No comparisons were made or intended so there was no reason to discuss Zortech's features, admirable as they may be. As you said, when the article was written, Zortech C++ 2.1 was still in beta. DDJ covers software only when readers can buy it off the shelf.
There was no tie between Bruce and Borland when Bruce wrote the article. After the magazine was available on newsstands, Bruce began teaching an "Introduction to C++ " seminar as part of Borland's roving OOP educational program. You're right-on about magazines publishing at the front of the article any of the author's interests that might suggest possible bias. Happily, DDJ is one of the few magazines to adhere to this practice.
Editor Update
Dear DDJ,
I have to take exception to the sidebar "Regular Expressions" in the article "Awk as a C Code Generator," by Wahhab Baldwin (August 1990). The Microsoft Editor (M) does indeed support parentheses and + in regular expressions. In fact, it supports two flavors of each.
Parentheses may be used for simple grouping, as is typical. In addition, braces provide grouping and retain the matched substring for use in a replacement expression. This is done by specifying $1, $2, etc., in the replacement expression to indicate what matched the first pair of braces, the second pair, etc.
The simple plus sign is supported, and is defined to match the shortest string that gives a successful match for the regular expression as a whole. Also, the hash sign is also used to mean one or more occurances of the previous subexpression, but in this case it is defined to match the maximum number of occurences that gives a match for the expression as a whole. The difference between these two forms is only of significance in search and replace operations. For example, replacing foo(bar)+ with spam in the text foobarbar will give spambar. On the other hand, replacing foo(bar)# with spam in the text foobarbar will give spam.
Paul J. Ste. Marie
CIS 72200,1324
Guess Who's Coming to Dinner?
Dear DDJ,
Shocking but true. In the August 1990 DDJ, both Douglas N. Franklin and Michael Swaine are right in their assertions about the "Dining Philosophers" problem. Mr. Franklin suggests it can be solved and Michael suggests it cannot. The key is that they are talking about different forms of the same problem. The problem solved by Mr. Franklin is not that same problem proved unsolvable by deterministic algorithms in Algorithmics by David Harel (Addison-Wesley, 1987). To quote the problem from Mr. Harel on page 288:
Can the dining philosophers problem be solved without a doorman [Mr. Franklin's first solution], and without resorting to shared memory [Mr. Franklin's second solution] or its equivalents? In other words, is there a fully distributed, fully symmetric solution to the problem that does not employ any additional processors? Here "fully distributed" means that there is no central shared memory and the protocols may use only distributed variables.... "Fully symmetric" means that the protocols for all philosophers are essentially identical [ruling out Mr. Franklin's third solution].
Mr. Harel goes on to prove that this highly restricted form of the dining philosophers problem can't be solved with deterministic algorithms. Mr. Harel provides the following algorithm, page 304:
1 do the following again and again forever: 1.1 carry out private activities until hungry; 1.2 toss coin to choose a direction, left or right, at random; 1.3 wait until fork lying in chosen direction is available, and then lift it; 1.4 if other fork is not available do the following: 1.4.1 put down fork that was lifted; 1.4.2 go to 1.2; 1.5 otherwise (i.e. other fork is available) lift other fork; 1.6 critical section: eat to your heart's content; 1.7 put down both forks (and back to 1.1);
and then proves that it solves the restricted form of the problem. Note that this algorithm depends on randomness, the point of Mr. Swaine using the reference in his article.
It is unfortunate Mr. Swaine did not make it clear that Mr. Harel was referring to this highly restricted form of the problem, not the classic form solved by Mr. Franklin. But, Mr. Franklin should have given Mr. Swaine the benefit of the doubt and looked up Algorithmics in his local library, which I consider Mr. Franklin's homework, before accusing Mr. Swaine of not doing his. I suggest Mr. Franklin, or anyone else even marginally interested in the theory of algorithms, pick up a copy of Algorithmics. It is an excellent book.
Charles P. Jazdzewski
Watsonville, California
Errata: Correction for Example 1, "Dining Philosophers" problem, August 1990 "Letters."
#define N 5 #define take_fork(num) down (s[num]) #define put_fork(num) up (s[num]) typedef int semaphore; semaphore s[N]; philosopher(i) int i; { while (TRUE) { think(); take_fork(i); take_fork((i+1) % N); eat(); put_fork(i); put_fork((i+1) % N); } }