A Ray-Casting Engine in C++
Trading realism for speed
Mark Seminatore
Mark is a mechanical engineer with General Electric. He has been programming for over 15 years and specializes in numerical analysis and optimization. Mark can be reached on CompuServe at 72040,145.
Ray casting is a real-time, three-dimensional rendering technique that's at the heart of many computer-graphics applications, particularly games. The allure of ray casting is its capacity to produce very realistic 3-D images at extremely high frame rates.
Ray casting is similar enough to ray tracing that some books treat the two synonymously. Both techniques trace imaginary rays of light from the display plane through a data structure representing a 3-D world. Both look for intersections between these rays and the 3-D world to determine what is visible. Ray casting, however, sacrifices realism for speed by placing constraints on the 3-D world. In this article, I'll discuss the theory behind ray casting, present a ray-casting engine called "Raycastr," and address issues such as performance and optimization.
Ray-Casting Basics
In ray casting, the 3-D world is represented by a 2-D map of wall cubes--fixed-size cubes with a texture-mapped image on each face. The 2-D map is just an overhead view of the 3-D world. Typically, you would like to have several types of wall cubes, such as a brick wall and a wooden wall. So, a style code is assigned to the wall cubes and open spaces are given a style code of 0. A simple 2-D map might then look something like Figure 1. Note that the room is totally enclosed by wall cubes of style 1 (brick walls), and there is a style-2 wall cube (perhaps a wooden wall) on the right-most wall. The point of view (POV) location and orientation is shown with a ">". (POV identifies the current x,y location and view angle within the 3-D world. For realism, the POV encompasses a 60-degree field of vision. Only wall slices 30 degrees to either side of the POV are visible.) An imaginary ray of light is then "cast" from the POV through the 2-D map. The ray continues until it finds a wall cube or the map boundary. When a wall cube is hit, you use the ray endpoints as an index into the 2-D map and look up the style code. What the ray actually sees is a wall slice--a vertical piece of a wall cube, the smallest visual element in our 3-D world. (Only one wall slice can be drawn for each column of the video display. The image generated by ray casting is made up of a number of wall slices; everything visible is either a floor, a ceiling, or a wall slice.) If the ray reaches the map boundaries, it's assumed that the ray saw nothing.
Once the ray endpoints (x2,y2) are known, consider the ray to be the hypotenuse of a right triangle. Then, use the Pythagorean theorem to calculate the distance from the POV (x1,y1) to the wall slice. In Figure 2, the distance calculated is relative to a single point, the POV. If you use this distance, however, your view of the 3-D world will be distorted into a curved image. This is sometimes referred to as the "fish-bowl" effect. To solve this problem, you need the distance relative to the flat plane of the video display, which is found by multiplying the distance by the cosine of the view angle. This cosine-corrected distance can then be used to determine the height of the wall slice. I make the assumption that the perceived wall height is proportional to 1/Distance'. The ratio of wall height to viewport height is the scale factor used for drawing the wall slice on the screen. (The viewport height is the part of the video screen used to display the ray-casting image; viewport width determines the number of ray casts required, while height scales the texture-mapped wall slices.) The ray endpoints also allow you to calculate what column of the wall cube was struck. The column determines what part of the bitmap tile to draw on the wall slice. For example, if a ray strikes column 6 of a wall cube, draw column 6 of the bitmap tile on the wall slice.
For a single ray, you know the style code, the distance (and therefore the height), and which column of the wall cube is visible. You continue to cast rays for each column of the viewport. Once this information is known, a single screen of graphics can be generated. The entire process is repeated continuously for the duration of the program.
Design Decisions
In building the Raycastr ray-casting engine, I had to make several design decisions, many driven by performance. The code uses the standard VGA mode 13h (320x200x256). You could use a different video mode, such as the tweaked VGA mode X (see Michael Abrash's "Graphics Programming" column, DDJ, July 1991), or one of the Super VGA modes. Mode 13h is the easiest to understand and implement. The bitmap tiles are 256-color PCX files with a resolution of 64x64, sharing a single color palette.
For keyboard control, I avoided DOS and BIOS interrupts and installed a custom keyboard-interrupt handler. This improves response and eliminates the keyboard buffer, which causes an annoying beep when it overflows. The viewport size is 240 columns by 120 rows. A smaller viewport would improve performance but the goal is to provide a reasonably large view space.
The bitmap tiles are stored in an array of pointers; see Listing One. The first array element is a NULL pointer since style code 0 is reserved for open spaces. The array index corresponds to the wall-cube style code. Next, I define a data structure for the POV. The structure holds the current (x,y) location and view angle within the 3-D world; see Listing Two. A data structure is also needed to hold the results of the ray casts for each video column. An array of this structure will hold the wall-cube style, wall-slice column, and the distance for each viewport column.
Several tables are required for efficient texture mapping. The first, in Listing One, is a table of offsets into video memory; this avoids calculating offsets inside loops. The second table relates the wall-slice height to distance. The last is the ratio of bitmap height to wall-slice heights.
The Map Representation
The previous explanation of ray casting made use of the Pythagorean theorem to calculate the distance from the POV to a wall cube. In PC graphics, however, you cannot afford square roots or floating-point math of any kind. Ideally, the inner loops should involve only integer addition or subtraction. Therefore, you must find another method of calculating the distance from the POV to a wall cube. The solution is to cast two rays instead of one: One ray looks only for walls in the XZ plane, the other for walls in the YZ plane. Also, since parallel walls are a multiple of a fixed distance (64 units) from each other, you only need to look for walls at these locations. This optimizes ray casting considerably.
Consider an XZ ray with the POV at coordinates 96,96 and a view angle of 0 degrees (facing right). The ray is cast at an angle of 30 degrees relative to the POV. You find the distance dx from the POV to the first x-wall the ray strikes. The distance dy from the POV to the x-wall is dx multiplied by the tangent of the ray angle. Now check the 2-D map for a wall cube at location x+dx and y+dy. If there is no wall cube at this location, you add 64 to x and add dy to y, then check again. This continues until a wall cube is found or the end of the map is reached. When a wall cube is found, the ray endpoints are returned.
A similar process is followed for the YZ ray: Find the distance dy to the first Y-wall struck by the ray. The distance dx from the POV to the y-wall, however, is now dy multiplied by 1/tangent of the view angle. Check the yz-wall map for a wall cube. If no wall cube is found, add 64 to y and dx to x and continue. Compare the distance for both rays and use only the shorter one. The path of the rays and the trigonometry involved is easier to visualize if they are drawn as right triangles. Consider the two right triangles in Figure 3. The 2-D world map must now be reformatted so that it represents wall boundaries instead of wall cubes. Since each wall cube has two xz walls, there is an additional x-wall boundary, likewise for the y and z walls. The wall-boundary maps are stored in arrays of type char, which are allocated at run time; see Listing One.
All distance calculations use fixed-point arithmetic for performance reasons. In addition, the standard C floating-point trigonometry functions cannot be used. Instead, all the trigonometry values are precalculated and stored as fixed-point numbers in look-up tables. The data structures for these tables are arrays of long integers allocated at run time.
The Raycastr Code
I compiled Raycastr using Borland C++ 3.1 and Turbo Assembler, but it should be reasonably compatible with other compilers. The complete source code, which consists of several modules, is available electronically; see "Availability," page 3. The code can logically be grouped into three phases: initialization, animation, and clean-up and exit. The main() function (see Listing Three) calls Initialize(), which in turn calls several other initialization routines. InitKeyboard() installs the new keyboard-interrupt handler, InitVideo() sets the video mode to 13h, and InitHeightTable() and InitTrigTables() build the various lookup tables. Finally, InitBitmap() loads the palette file and the PCX bitmaps, and InitMap() loads the 2-D map file and builds the wall arrays.
The animation phase starts with EventLoop(), which checks the keyboard status. If the keyboard status indicates motion forward or backward, EventLoop() calls HitTest() to validate the movement and updates the POV state. The HitTest() function is really a simplified version of GenView(): It calls the ray functions to find the distance from the POV to wall cubes in the direction of movement. If the distance is too short, the function returns a nonzero value. The return value determines whether movement in a particular direction is valid.
EventLoop() then calls GenView(), which calls RayXZ() and RayYZ() for each viewport column. RayXZ() and RayYZ() cast rays to find visible wall cubes. Based on the results of the ray functions, GenView() fills in the View[] structure, stores the distance to the farthest wall cube in FarthestWall, and then calls DrawFrame(); see Listing Four. DrawFrame() uses the information in the View[] array to perform the texture mapping required to generate one frame of graphics. The function finds the height (h) of the shortest wall slice using FarthestWall to index HeightTable[]. If h is greater than VIEW_HEIGHT, no ceiling or floor is drawn. Otherwise, the ceiling color is drawn from the top row of the viewport to the row (VIEW_HEIGHT-h)/2. The floor is drawn similarly, starting from the bottom viewport row.
Next, wall slices are drawn for each viewport column. The Display pointer is initialized to ScreenBuffer, plus the viewport column, and Bmp points to the wall-slice style bitmap. The height (h) of each wall slice is found using View[]. The distance and height are used to look up the scaling factor, ScaleFactor, from ScaleTable[]. Next, calculate the starting row relative to the center of the viewport. DrawFrame() then loops over the height, drawing a bitmap pixel in ScreenBuffer, updating the Display pointer, and then adding ScaleFactor to BmpOffset. This is the texture-mapping loop that draws the bitmaps on each wall slice. After the wall slices are drawn in system memory, the entire frame is copied to video memory.
When the animation is finished, main() calls CleanUp(), which calls RestoreVideo(), FreeMem(), and RestoreKeybd(). These routines restore the initial video mode, free all allocated memory, and reenable the DOS/BIOS keyboard-interrupt handler, respectively. Finally, main() displays performance statistics, including total frames, clock ticks, and frames per second.
Ray-Casting Optimizations
There are several ways to improve performance. First, the program was written and tested in C. This served to validate the algorithms used by the code. On a 20-MHz 386SX, the code produces nearly six frames per second, not too bad for a C program. Next, Turbo Profiler shows that more than 60 percent of the execution time is spent in the DrawFrame() function. This is not surprising, since DrawFrame() draws 28,800 pixels per frame. At 24 frames per second, it must draw and copy roughly 700,000 pixels per second. For this reason, our optimization efforts will concentrate mainly on DrawFrame().
The first optimization, loop unrolling, improves the ratio of value-added code to loop overhead code. In DrawFrame(), a for loop accumulates the sum of a scaling factor for texture mapping a wall slice. Since the amount of code inside the loop is small, more time is spent processing the loop code (increments, compares, and jumps) than the code inside the loop. In order to unroll the loop, I take advantage of the fact that the row variable is always a multiple of two in order to perform two additions instead of one for the same amount of loop code. The loop is now faster because it spends more time doing useful work; see Figure 4. Optimizing compilers will attempt to unroll such loops, but it's often simpler to do it by hand.
When performance counts, there's no substitute for assembly language. The easiest way to convert a function from C to assembler is to use the assembly output of the compiler as a starting point. Conversion to assembly provides three advantages over compiler-generated code. The first is avoiding redundant segment-register loads. Segment values rarely change, yet most compilers insist on reloading segment registers inside loops. Another advantage is the ability to keep values in registers longer and avoid memory references. Most compilers use register variables when possible, but a good assembly programmer is hard to beat. Assembly language also allows us to use REP STOSW and REP MOVSW as often as possible. These fast, compact instructions are an excellent way to draw and move graphics in memory.
The real-mode 386 assembly-language instructions REP STOSD and REP MOVSD allow data to be processed 32 bits at a time; the effect of this is most significant when the source and destination are in system memory. The BlitTest utility (available electronically) demonstrates this by testing the performance of 8-, 16- and 32-bit block transfers to both system and video memory. An additional benefit of 386 code is the ability to perform integer arithmetic using the 32-bit general-purpose registers. This speeds up fixed-point math considerably. For example, using 16-bit code, the addition of two long-integer values typically requires six instructions and six memory operands. The same addition in 32-bit code requires just three instructions and three memory operands. Figure 5 illustrates this with assembly-language versions of RayXZ() and RayYZ() that use 16- and 32-bit fixed-point math.
Finally, you can write tighter loops by avoiding memory operands and keeping everything in registers. Many times, however, there are simply not enough registers available for this approach. The solution is to replace the memory operands with immediate values and change the values at run time. All that is needed is a MOV instruction with a code-segment override. The hardest part is knowing where in the code segment to write the new values. The answer lies in the .LST generated by TASM. Write the self-modifying code, assemble it, then check the .LST file to verify that the code is overwriting your dummy value. Note that if the modified code follows too closely (without an intervening branch) you may have to clear the prefetch queue using a JMP $+2. This breaks the optimization, so be careful. One more caveat: By definition, self-modifying code is difficult to debug. Use this optimization only when the benefits outweigh those of all other options.
Conclusion
The assembly-language version of DrawFrame() uses all these optimizations heavily, and the results are impressive--nearly an 80 percent increase in performance. And since ray casting only requires addition in the inner loops, it is very fast. However, this also means that drawing the graphics will always be the limiting factor affecting performance. For slower 286 and 386SX machines, it may be faster to draw the graphics in off-screen video memory and then flip the video pages. This requires a planar graphics mode such as mode X. The low cost of the OUT instruction on the 286 (3 cycles versus 16 on the 486) and slower system memory make this option more attractive.
Figure 1: Map of a simple, enclosed room.
Figure 2: Cosine-corrected distance; Distance'=Ray Length*cos(q).
Figure 3: Ray trigonometry.
Figure 4: (a) Original loop; (b) unrolled loop.
Copyright © 1995, Dr. Dobb's Journal
(a) for(j=row;j<0;j++)
BmpOffset+=ScaleFactor;
(b) for(j=row;j<0;j+=2)
BmpOffset+=ScaleFactor+ScaleFactor;
Figure 5: Assembly-language versions of RayXZ() and RayYZ(). (a) Using 16-bit fixed-point math; (b) using 32-bit fixed-point math.
(a)
mov ax,[fp1] ; get lower word of fp1
mov dx,[fp1+2] ; get upper word of fp1
add ax,[fp2] ; add in lower word of fp2
adc dx,[fp2+2] ; add in upper word of fp2
mov [fp1],ax ; store lower word result
mov [fp1+2],ax ; store upper word result
(b)
mov eax,[fp1] ; get a fixed point value
add eax,[fp2] ; add two fixed point numbers
mov [fp1],eax ; store result
Listing One
// Data.c: This module instantiates all the global data.
// Copyright (c) 1994 by Mark Seminatore, all rights reserved.
#include "pcx.h"
#include "ray3d.h"
char *face[NUM_FACES];
int MapWidth, // run-time map width
MapHeight; // run-time map height
char *map, // raw map data
*xzWalls, // xz wall boundary map
*yzWalls; // yz wall boundary map
unsigned VideoRow[200]; // Video row memory offsets
unsigned char volatile ScanCode; // modified by int 9 handler
unsigned int volatile FaceIndex; // current face bitmap to display
unsigned int Frames=0; // number of frames generated
Pov_t Pov; // current Point-of-View
long CosAngle, // movement trig values
SinAngle;
int HeightScale=10000; // The wall-height aspect ratio
PcxImage pcx; // structure for loading bitmaps
int HeightTable[MAX_DISTANCE+1]; // table of height vs. distance
long ScaleTable[MAX_DISTANCE+1]; // table of bitmap scaling factors
char *bitmap[NUM_BITMAPS+1]; // array of bitmap tiles
char *ScreenBuffer, // pointer to frame memory buffer
*screen; // pointer to video memory
long RayX,RayY; // last coordinates of x and y
int iRayX,iRayY; // ...raycasts
int FarthestWall; // distance to farthest wall in frame
View_t View[VIEW_WIDTH]; // raycasting data
long *Sine; // pre-calc'd trig tables
long *Tangent;
long *InvTangent;
long *InvCosine;
long *InvSine;
long *Cosine;
long *dxTable; // pre-calc'd table of 64*tan(angle)
long *dyTable; // pre-calc'd table of 64/tan(angle)
Listing Two
// Ray3d.h: Declares all global data structures and constants
// Copyright (c) 1994 by Mark Seminatore, all rights reserved.
#ifndef __RAY3D_H
#define __RAY3D_H
#define FP16 16
#define FP20 20
#define ROUND20 ((1L<<20)/2)
#define CHECK_DIST 48L // minimum distance from wall
enum {NO_WALL,X_WALL,Y_WALL}; // wall struck codes
#define MAX_DISTANCE 2048
#define MIN_DISTANCE 10
#define BITMAP_WIDTH 64
#define BITMAP_HEIGHT 64
#define MAP_SIZE 64
#define MAP_WIDTH 64
#define MAP_HEIGHT 64
#define MAP_MAX MAP_WIDTH*MAP_HEIGHT
#define MAP_XMAX MAP_WIDTH*MAP_SIZE
#define MAP_YMAX MAP_HEIGHT*MAP_SIZE
#define MAP_XMAXLONG (MAP_XMAX*(1L<<FP16))
#define MAP_YMAXLONG (MAP_YMAX*(1L<<FP16))
enum {FALSE,TRUE};
#define CEILING_COLOR 0
#define FLOOR_COLOR 8
#define SCREEN_WIDTH 320
#define VIEW_WIDTH 240
#define VIEW_HEIGHT 120
#define VIEW_ANGLE 60
#define NUM_BITMAPS 16
#define NUM_FACES 5
#define MAX_HEIGHT 1024
#define MIN_HEIGHT 16
#define BITMAP_BYTES (BITMAP_WIDTH*BITMAP_HEIGHT)
#define Degree_6 (int)(6L*VIEW_WIDTH/VIEW_ANGLE)
#define Degree_30 (int)(30L*VIEW_WIDTH/VIEW_ANGLE)
#define Degree_45 (int)(45L*VIEW_WIDTH/VIEW_ANGLE)
#define Degree_90 (int)(90L*VIEW_WIDTH/VIEW_ANGLE)
#define Degree_135 (int)(135L*VIEW_WIDTH/VIEW_ANGLE)
#define Degree_180 (int)(180L*VIEW_WIDTH/VIEW_ANGLE)
#define Degree_225 (int)(225L*VIEW_WIDTH/VIEW_ANGLE)
#define Degree_270 (int)(270L*VIEW_WIDTH/VIEW_ANGLE)
#define Degree_315 (int)(315L*VIEW_WIDTH/VIEW_ANGLE)
#define Degree_360 (int)(360L*VIEW_WIDTH/VIEW_ANGLE)
typedef struct
{
int Style,
Column,
Distance;
} View_t;
typedef struct
{
int x,
y,
angle;
} Pov_t;
#ifdef __cplusplus
extern "C" {
#endif
void Blit(char*,int,int,int,int);
void DrawFrame(void);
void GetSineAndCosine(int);
char RayXZ(int,int,int);
char RayYZ(int,int,int);
int InitTables(void);
void FreeMem(void);
void InitTimer(void);
void RestoreTimer(void);
void CleanUp(void);
void InitMap(char*);
void InitBitmaps(char*);
void InitHeightTable(void);
void Initialize(void);
void GenView(void);
int HitTest(int,int,int);
void EventLoop(void);
void InstallKeybd(void);
void RestoreKeybd(void);
#ifdef __cplusplus
}
#endif
#endif
Listing Three
// Main.c: This module calls the initialzation routines, then jumps into
// the animation loop, and lastly shows some statistics.
// Copyright (c) 1994 by Mark Seminatore, all rights reserved.
#include <stdlib.h>
#include <time.h>
#include <alloc.h>
#include <stdio.h>
#include <limits.h>
#include "ray3d.h"
#include "globals.h"
#include "keys.h"
void GenView(void)
{
int i,distance,RayAngle;
unsigned char yStyleCode,StyleCode;
unsigned long RayLength,yRayLength;
unsigned int WallColumn,yWallColumn;
long dx,dy;
RayAngle=Pov.angle-Degree_30; // Start by looking 30 to our left
if(RayAngle < 0) // Keep angle within table bounds
RayAngle+=Degree_360;
FarthestWall=0; // No farthest wall yet
for(i=0;i<VIEW_WIDTH;i++) // Cast two rays for each video column
{
RayLength=LONG_MAX; // Start with an impossible distance
// Don't cast an XZ ray if it is not possible to hit an XZ wall
if(RayAngle!=Degree_90 && RayAngle!=Degree_270)
{
StyleCode=RayXZ(Pov.x,Pov.y,RayAngle); // cast XZ ray
if(StyleCode) // Did we hit a wall?
{
// Use y-intercept to find the wall-slice column
WallColumn=(unsigned)(RayY>>FP16)&0x3F;
if(iRayX < Pov.x) // if looking west
WallColumn=63-WallColumn; // reverse column
dx=iRayX-Pov.x; // get ray x-component
RayLength=(dx*InvCosine[RayAngle]) >> 14; // calc distance
}
}
// Don't cast a YZ ray if its not possible to hit any YZ walls
if(RayAngle!=0 && RayAngle!=Degree_180)
{
yStyleCode=RayYZ(Pov.x,Pov.y,RayAngle); // cast YZ ray
if(yStyleCode) // was a wall found?
{
// Use the x-intercept to find the wall-slice column
yWallColumn=(unsigned)(RayX >> FP16)&0x3F;
if(iRayY > Pov.y) // if looking south
yWallColumn=63-yWallColumn; // reverse column
dy=iRayY-Pov.y;
yRayLength=(dy*InvSine[RayAngle]) >> 14;
if(yRayLength < RayLength) // use the shorter ray
{
RayLength=yRayLength;
StyleCode=yStyleCode;
WallColumn=yWallColumn;
}
}
}
if(WallColumn < 64) // A wall was found (either X or Y)
{
RayLength*=Cosine[i]; // cosine correct distance
RayLength+=ROUND20; // round up fixed point
distance=(int)(RayLength>>FP20); // convert to an int
if(distance < MIN_DISTANCE) // check for min distance
distance=MIN_DISTANCE;
if(distance >= MAX_DISTANCE) // check for max distance
distance=MAX_DISTANCE-1;
// Save the wall-slice data
View[i].Distance=distance;
View[i].Style=(StyleCode&0x3F);
View[i].Column=WallColumn;
if(distance > FarthestWall) // is new distance the farthest?
FarthestWall=distance; // update farthest distance
}
RayAngle++; // get next ray angle
if(RayAngle >= Degree_360) // keep angle within tables
RayAngle-=Degree_360;
}
DrawFrame(); // Use View[] to draw the entire screen
Blit(face[FaceIndex],266,133,50,45); // draw the new bitmap
} // end GenView()
// This routine validates player moves
int HitTest(int x,int y,int angle)
{
int wall;
unsigned long distance,ydistance;
int dx,dy;
distance=LONG_MAX; // Set to a very large distance
// Correct for angles where sin=cos, push the angle to one side otherwise
// we can get false readings.
if(angle==Degree_45 || angle==Degree_135 ||
angle==Degree_225 || angle==Degree_315)
angle++;
// Don't cast a XZ ray if it can't possibly hit an XZ wall!
if(angle!=Degree_90 && angle!=Degree_270)
{
if(RayXZ(x,y,angle)) // cast an xz-ray
{ // we hit something
dx=iRayX-x; // get ray x-component
distance=(dx*InvCosine[angle])>>14; // calc distance
wall=X_WALL; // set wall struck code
}
}
// Don't cast a YZ ray if it can't possibly hit a YZ wall!
if(angle!=0 && angle!=Degree_180)
{
if(RayYZ(x,y,angle)) // can a yz-ray
{ // we hit something
dy=iRayY-y; // get ray y-component
ydistance=(dy*InvSine[angle])>>14; // calc distance
if(ydistance < distance) // use the shorter ray
{
distance=ydistance; // use y ray distance
wall=Y_WALL; // set wall struck code
}
}
}
if(wall) // a wall was hit
{
distance*=Cosine[VIEW_WIDTH/2]; // cosine correct distance
distance+=ROUND20; // round-up fixed point
distance>>=FP20; // convert back to integer
if(distance > CHECK_DIST) // check distance
wall=NO_WALL; // its OK to move
}
return wall; // return wall code
}
// This is the main program loop.
void EventLoop(void)
{
int dx,dy,RevAngle,result;
while(ScanCode!=ESC_PRESSED) // do until ESC is hit
{
switch(ScanCode)
{
case 0: // no keys pressed
break;
case PLUS_PRESSED:
HeightScale+=1000; // increase aspect ratio
if(HeightScale > 20000) HeightScale=20000;
InitHeightTable(); // recalc height/distance table
break;
case MINUS_PRESSED:
HeightScale-=1000; // decrease aspect ratio
if(HeightScale < 1000) HeightScale=1000;
InitHeightTable(); // recalc height/distance table
break;
case LEFT_ARROW_PRESSED:
Pov.angle-=Degree_6; // turn left 6 degrees
if(Pov.angle < 0) // keep angle within tables
Pov.angle+=Degree_360;
GetSineAndCosine(Pov.angle); // get trig values
break;
case RIGHT_ARROW_PRESSED:
Pov.angle+=Degree_6; // turn right 6 degrees
if(Pov.angle >= Degree_360) // keep angle within tables
Pov.angle-=Degree_360;
GetSineAndCosine(Pov.angle); // get trig values
break;
case UP_ARROW_PRESSED: // move forward
dx=(int)(CosAngle >> 12); // get x-component
dy=(int)(SinAngle >> 12); // get y-component
result=HitTest(Pov.x,Pov.y,Pov.angle); // check for all walls
if(result == X_WALL)
{ // we hit an x-wall
dx=0; // can't move in x-dir
if(Pov.angle < Degree_180) // so check for y-walls
RevAngle=Degree_90; // check north
else
RevAngle=Degree_270; // check south
result=HitTest(Pov.x,Pov.y,RevAngle); // check for y-walls
}
if(result == Y_WALL)
{ // we hit a y-wall
dy=0; // can't move in y-dir
if(Pov.angle>Degree_270 || Pov.angle<Degree_90)
RevAngle=0; // look east
else
RevAngle=Degree_180; // look west
result=HitTest(Pov.x,Pov.y,RevAngle); // check for x-walls
}
if(result==NO_WALL)
{
Pov.x+=dx; // no walls were hit so we can move
Pov.y+=dy; // ...update POV
}
break;
case DOWN_ARROW_PRESSED: // move backward
dx=(int)(CosAngle >> 12); // get x-component
dy=(int)(SinAngle >> 12); // get y-component
RevAngle=Pov.angle+Degree_180;
if(RevAngle >= Degree_360)
RevAngle-=Degree_360;
result=HitTest(Pov.x,Pov.y,RevAngle);
if(result == X_WALL)
{
dx=0; // can't move in x-dir
if(RevAngle < Degree_180)
RevAngle=Degree_90;
else
RevAngle=Degree_270;
result=HitTest(Pov.x,Pov.y,RevAngle);
}
if(result == Y_WALL)
{
dy=0;
if(RevAngle > Degree_270 || RevAngle < Degree_90)
RevAngle=0;
else
RevAngle=Degree_180;
result=HitTest(Pov.x,Pov.y,RevAngle);
}
if(result == NO_WALL)
{
Pov.x-=dx;
Pov.y-=dy;
}
break;
default: // handle unrecognized keys
break;
}
GenView(); // cast some rays and draw screen
Frames++; // update frame counter
}
}
// This is the main program start
void main(void)
{
clock_t begin,fini;
unsigned memleft;
Initialize(); // setup video, read bitmaps, etc.
Pov.x=Pov.y=96; // set starting player location
Pov.angle=0;
GetSineAndCosine(Pov.angle); // get sin/cos values
GenView(); // draw initial view
memleft=(unsigned)(coreleft()>>10); // get free memory kbytes
begin=clock(); // start timing the program
EventLoop(); // the animation loop
fini=clock(); // finish timing the program
CleanUp(); // free up all memory
printf("Raycasting Performance\n----------------------\n");
printf("Memory: %uk\n",memleft); // show free memory
printf("Frames: %u\n",Frames); // show frame count
printf(" Ticks: %u\n",(fini-begin)); // show clock count
printf("Frame rate: %6.2f f/s\n",Frames*18.2/(fini-begin));
printf("\n\nRayCastr: A raycasting demo for Dr. Dobb's Journal.\n\n");
printf("Written by: Mark Seminatore\n");
printf(" CIS: [72040,145]\n");
}
Listing Four
// Draw.c: This module contains the DrawFrame() routine which texture
// maps the wall-slices in View[].
// Copyright (c) 1994 by Mark Seminatore, all rights reserved.
#include <stdio.h>
#include <mem.h>
#include "ray3d.h"
#include "globals.h"
void DrawFrame(void)
{
unsigned hceil;
long ScaleFactor,BmpOffset;
char *Display,*p,*Bmp;
int i,j,h,row;
h=HeightTable[FarthestWall]; // lookup smallest height on screen
if(h < VIEW_HEIGHT) // is it still > viewport height?
{ // if not then draw floors/ceilings
row=(VIEW_HEIGHT-h)>>1;
hceil=VideoRow[row+h];
Display=ScreenBuffer;
for(i=0;i<row;i++)
{
memset(Display,CEILING_COLOR,VIEW_WIDTH); // draw ceiling row
memset(Display+hceil,FLOOR_COLOR,VIEW_WIDTH); // draw floor row
Display+=VIEW_WIDTH;
}
}
for(i=0;i<VIEW_WIDTH;i++)
{
Display=ScreenBuffer+i; // init display pointer
Bmp=bitmap[View[i].Style]; // get ptr to bitmap
Bmp+=(View[i].Column<<6); // add in column offset
h=HeightTable[View[i].Distance]; // find wall-slice height
row=(VIEW_HEIGHT-h)>>1; // calc starting row
BmpOffset=0; // start with first pixel
ScaleFactor=ScaleTable[View[i].Distance]; // get scaling factor
if(row < 0)
{
h+=row<<1; // adjust wall-slice height
for(j=row;j<0;j++) // loop until on screen
BmpOffset+=ScaleFactor; // update position in bitmap
}
else
Display+=VideoRow[row]; // point to starting row
for(j=0;j<h;j++) // texture mapping loop
{
// Copy bitmap pixel to ScreenBuffer
*Display=*(Bmp+(unsigned)(((0xffff0000L&BmpOffset)>>(16))));
Display+=VIEW_WIDTH; // get next ScreenBuffer row
BmpOffset+=ScaleFactor; // update position in bitmap
}
}
Display=ScreenBuffer; // get source pointer
p=screen+20+SCREEN_WIDTH*40; // get destination pointer
for(i=0;i<VIEW_HEIGHT;i++) // copy each row
{
memcpy(p,Display,VIEW_WIDTH); // copy row to screen
p+=SCREEN_WIDTH;
Display+=VIEW_WIDTH;
}
}
Listing Five
; Xray.asm: This module implements the optimized assembly version of the
; RayXZ() function.
; Copyright (c) 1994 by Mark Seminatore, all rights reserved.
ideal
smart
p386
model compact,c
dataseg
include "rayasm.inc"
codeseg
proc RayXZ
ARG x:word,y:word,angle:word
push si ; save Turbo C register vars
push di
mov ax,[x] ; get x
and ax,0ffc0h ; calc (int)(x/64) * 64
les di,[dyTable] ; get full ptr to table
mov bx,[angle] ; get ray angle
shl bx,2 ; multiply by the size of a long
mov edx,[es:di+bx] ; get value from table
cmp bx,4*Degree_270 ; compare angle with 270 degrees
jg short LookingRight ; jump if greater than
cmp bx,4*Degree_90 ; compare angle with 90 degrees
jl short LookingRight ; jump if less than
mov [word cs:dxPos-2],-MAP_SIZE ; dx=-MAP_SIZE
neg edx ; dy=-dy
jmp short Continue1
LookingRight:
add ax,MAP_SIZE ; xLocation=xBeg+MAP_SIZE
mov [word cs:dxPos-2],MAP_SIZE ; dx=MAP_SIZE
Continue1:
mov [dword cs:dyPos-4],edx ; store dy inside loop
mov cx,ax ; cx contains xLocation
sub ax,[word x] ; xLocation - x
cwde ; xLocation - x
lfs di,[dword Tangent] ; get ptr to table
imul [dword fs:di+bx] ; (xLocation-x)*Tangent[angle]
movsx edx,[word y]
shl edx,16
add edx,eax
lgs di,[xzWalls] ; get ptr to xzWalls
GiantLoop:
cmp cx,0 ; is xLocation < 0?
jl short OuttaHere
cmp cx,MAP_MAX ; is xLocation > MAP_MAX?
jg short OuttaHere
cmp edx,large 0 ; is yLocation < 0?
jl short OuttaHere
cmp edx,large MAP_YMAXLONG ; is yLocation > MAP_YMAXLONG?
jg short OuttaHere
mov eax,edx
shr eax,16
and eax,0ffc0h
movsx ebx,cx
shr ebx,6
add ebx,eax
mov al,[byte gs:di+bx] ; get xzWalls[bx]
cmp al,0 ; was a wall found?
jne short Continue2 ; then quit
add cx,100h ; add imm16 value
dxPos:
add edx,1000000h ; add imm32 value
dyPos:
jmp GiantLoop
Continue2:
; and cx,0ffffh
mov [iRayX],cx ; iRayX=xLocation
mov [RayY],edx ; RayY=yLocation
pop di ; clean up and return
pop si
ret
OuttaHere:
xor al,al ; no wall was found
pop di ; clean up and return
pop si
ret
endp RayXZ
end