Computing the Hough Transform
By Andrew Queisser
The Hough Transform (HT) is an algorithm for finding shapes in images or sets of data points. Unlike regression, algorithms that find one line or curve, the HT can be used to find several lines in one set of data points. Although well known in the field of image processing, HT was originally published in 1962 as part of a patent for a device to track high-energy particles [1]. In this article, I introduce the HT for line and circle detection, and discusses implementation details.
Imagine you have a set of x-y data points that represent several lines. Traditional regression algorithms can only give you one line that best fits the data points. If the data points are edge pixels from an image, then it is likely that you want to find several lines at once. HT is well suited for edge searches like this because it is a global transform, which considers all likely edge pixels. The HT does not require any a priori information other than the edge shape (line, circle, ellipse, or the like) to be found.
Line Detection
A straight line in x-y space is often written as y=mx+b, but can also be written in the normal form:
<IMG src="http://twimgs.com/ddj/cuj/images/cuj0312queisser/rho14.gif" WIDTH="6" HEIGHT="10">= x*cos(<IMG src="http://twimgs.com/ddj/cuj/images/cuj0312queisser/theta14.gif" WIDTH="5" HEIGHT="10">)+y*sin(<IMG src="http://twimgs.com/ddj/cuj/images/cuj0312queisser/theta14.gif" WIDTH="5" HEIGHT="10">),
where is the angle of the normal vector and is the distance of the line to the origin. Figure 1 illustrates values and for a line L1 going through two points P1 and P2. The normal form is more suitable for the HT because it avoids the problem of the slope m becoming infinite for lines parallel to the y-axis.
Inserting coordinates (x1,y1) of a specific edge pixel P1 into the normal form yields a curve =f(), which describes all possible straight lines through P1. In - coordinates, also known as "Hough space," is a sinusoidal curve. A point in x-y space maps to a sinusoidal curve in Hough space, while a line in x-y maps to a point in Hough space. Figure 2 shows the two sinusoidal curves corresponding to points P1 and P2 in Figure 1. The two curves intersect at the and for line L1.
The basic idea of the HT is to calculate the curves that represent all possible lines through each edge pixel and to accumulate the results in Hough space. Each edge pixel contributes one curve through the Hough space accumulator; and each accumulator cell through which the curve passes gets incremented. If each of N accumulated edge pixels lies on a perfect line L1, the Hough cell corresponding to L1 ends up with a value of N. In most real images, the points will be close to but not perfectly on L1, and the Hough cell for L1 will contain a local maximum smaller than N. The cells around the maximum cell drop down to smaller values. When the accumulation for all edge pixels in the image is finished, the Hough accumulator contains a peak for each line in the image. A peak selection algorithm, (in the simplest case, just a search for the maximum value) finds the strongest line in the image.
Implementation
A naive implementation of HT would simply loop through all values of from 0 to PI (the range of PI to 2*PI would accumulate the same lines again), calculate the value, and increment the cell in question. Listing 1, which shows the loop required to calculate the Hough accumulator, illustrates some important implementation choices:
- The angle resolution is determined by the value of numAngleCells. A resolution of half a degree requires a value of numAngleCells=360.
- The value of rhoMax is sqrt(dx^2+dy^2), where dx and dy are the maximum coordinates of the image. In effect, rhoMax is the length of the diagonal of the original image. The full range for rho is [-rhoMax,rhoMax] so the accumulator has to be large enough to hold numAngleCells*2*rhoMax values unless the resolution of is scaled down to a lower resolution
For a typical image with dx=640, dy=480 and an angular resolution of 0.5 degrees, the accumulator has to be at least [360][sqrt(640*640 + 480*480) * 2] or [360][1600]. As you can see, the angular resolution and the distance resolution is quite different dividing the rhoValue by two results in a more balanced overall resolution with 360 values and +/-400 values. In general, a suitable rhoDivisor value should be chosen, but a division by two is efficient because it can be accomplished by a right-shift of one.
Many implementations of the Hough Transform use the approach in Listing 1 more or less directly. Unfortunately, the loop as shown only increments one cell per value instead of all cells through which the sinusoidal curve passes. To get complete coverage of all cells, you need to consider parts of the curve where the curve passes through several cells for one value of . To get complete coverage, a nested loop that fills in the skipped cells is needed. Listing 2 is the complete accumulation loop.
Optimization
The first optimization that speeds up the accumulation is to replace the calls to cos() and sin() with lookup tables. The values can be precalculated because they depend only on the step size. The second optimization is moving the origin to the center of the image by subtracting dx/2,dy/2 from each point. Centering the origin halves the size of the accumulator because rhoMax is now only half as large as before. The accumulation loop can be rewritten, as in Listing 3. (For clarity, I have omitted the nested loop that fills in the skipped cells.)
The lookup tables cosTable and sinTable have been precalculated once when the size of the original image is known. Further optimization is possible by replacing cosTable and sinTable with scaled integer values by scaling up the sine and cosine from the original [-1,1] range to [-1024,1024] and dividing the end result by 1024, which is a right-shift by 10 bits.
With the sine and cosine functions replaced by a lookup table, it is tempting to go one step further and replace the multiplications x*cosTable[a] and y*sinTable[a] with two-dimensional lookup tables xTable and yTable. However, the big lookup tables did not speed-up the algorithm in my tests, presumably because the accesses over large memory areas are slower than smaller lookups combined with multiplications.
Table 1 lists some times measured on a not-so-fast PC for a 640×480 image with 13,801 edge pixel candidates. These times will vary significantly depending on processor type and available cache memory.
An HT Accumulator Class
Listing 4 is the class definition of an HT accumulator base class HoughAccum, which encapsulates the management of the accumulator memory. I use std::vector<int> for the accumulator. There is no appreciable performance penalty for using vector instead of a plain array because the vector is presized during initialization. If the maximum values for dx and dy are known beforehand, m_accum can be defined as a true two-dimensional array or a boost::multi_array. Using a two-dimensional array makes the implementation of GetCell/SetCell simpler and would also simplify further analysis of the accumulator. For now, the std::vector is enough.
The AddPoint function is implemented as a template method [2]. The nonvirtual public AddPoint function checks the bounds of x and y and calls the virtual DoAddPoint function, which has to be defined by the derived class for the specific accumulator class.
The GetCell/SetCell and other accessor functions for the accumulator array are also nonvirtual. For these frequently called functions, the overhead of a virtual function is best avoided.
The motivation behind creating an abstract base class is that the HT can be used for shapes other than straight lines. HoughAccum itself can not be instantiated, so I define a derived class LineHoughAccum, which defines the functionality for the line detection. LineHoughAccum adds the lookup tables for sine and cosine and defines the necessary Init and DoAddPoint overrides. LineHoughAccum uses the row index for the angle and the column index for the distance . Listing 4 is the definition of LineHoughAccum. To use the LineHoughAccum class, follow these steps:
- Create a LineHoughAccum object with the desired angle resolution.
- Initialize the accumulator with the size of the image.
- Call the AddPoint function with the coordinates for each of the points in the image that might be an edge point.
- Analyze the accumulator for cells with values that are high enough to be significant using the GetCell function.
- Call Clear() and repeat steps 2 and 3 for additional images with the same size, or start at step 1 if the image size changes.
Step 3 deserves special attention because the interpretation of the Hough accumulator is not trivial.
Peak Detection Algorithms
The simplest peak detection algorithm simply searches for the cell with the maximum value in the accumulator and calculates the angle and distance values from the cell index. Here's how to calculate the angle and distance for the cell at row,col (avoid integer division by multiplying with PI first):
<IMG src="http://twimgs.com/ddj/cuj/images/cuj0312queisser/theta14.gif" WIDTH="5" HEIGHT="10"> = (PI * row)/numRows <IMG src="http://twimgs.com/ddj/cuj/images/cuj0312queisser/rho14.gif" WIDTH="6" HEIGHT="10"> = col-numCols/2
where numRows and numCols was obtained from the call to HoughAccumulator::GetAccumSize().
Usually, an image contains more than one line, though. You will need a way of successively finding peaks that are far enough away from previous peaks. To make things more challenging, each peak in the Hough accumulator is spread across several cells because the edge pixels in a typical image will not be on perfectly straight lines. An iterative peak detection algorithm looks like this:
1. Find the maximum cell in the accumulator.
2. Find the area around the maximum cell that belongs to the same line.
3. Mark or zero out the area.
4. Continue at step 1 until all cells are below some preselected sensitivity threshold.
Step 2 is the tricky part. How can you determine how many neighboring cells belong to the same maximum? You have to choose which range of and values should be merged with the maximum line. The extreme cases include the following:
- The detection of a single polygonal shape requires a peak detection that finds local maxima and doesn't return multiple lines that are clustered close to each other. The minimum values of and that separate two peaks in the accumulator can be directly determined from the polygon.
- The interpretation of natural scenes requires a detection algorithm that returns many lines, which are then further interpreted. In natural scenes, several lines may actually lie close to each other and shouldn't be lumped together.
The source code for this article, available at http://www.cuj.com/code/, includes examples of peak detection algorithms.
Experimental Results
Figure 3 shows an example image of two brackets. The image was processed with a Sobel edge detection filter and made binary. Figure 4 shows the lines extracted with the HT overlayed with the original image. Only peaks with accumulator cells greater than 50 were extracted. Lines within +/- five degrees were merged into one line.
Further Refinements
The HT can be used for shapes other than straight lines. The basic idea of accumulating votes in a parameter space can be applied to any shape that can be represented by a small number of parameters. A circle of known radius can be represented by two parameters: The two coordinates xc,yc of the center point. If you accumulate votes along a circle with radius r around each edge pixel, you will accumulate a maximum at xc,yc. In some ways, the HT for circles is easier than for lines because the Hough space for circle detection is actually the image space itself. If the accumulator results are viewed as an image, you will see bright spots at the center points of circles in the source image.
For grayscale images, several ways to make the Hough Transform more efficient or robust have been proposed:
- The basic Hough Transform ignores information about edge direction that can be extracted from the grayscale source images. Each edge pixel contributes a complete curve through the accumulator. The approximate edge direction can be estimated from the image so it is sufficient to accumulate only the portion of the curve around the parameter estimated from the image.
- The edge strength of the edge pixel can be used to make the HT more robust. Instead of simply giving each edge pixel one vote, the edge strength is added to the accumulator. Strong edges contribute more to the accumulator than weak edges, which helps reduce noise effects.
- The Hough Transform has also been expanded to detect shapes other than circles and lines. The "Generalized Hough Transform" can be used to detect arbitrary shapes although the accumulation becomes much more involved than the simple schemes introduced here. The machine-vision literature discusses many other ways of applying the HT [3].
References
[1] Hough, P.V.C (1962). Method and means for recognizing complex patterns. U.S. Patent 3069654.
[2] Gamma, Erich, et al. Design Patterns: Elements of Reusable Object-oriented Software. Addison-Wesley, 1995.
[3] Davies, E.R. Machine Vision: Theory, Algorithms, Practicalities, Second Edition. Academic Press, 1997.
Andrew is a software engineer for Hewlett-Packard. He can be contacted at [email protected].