The Alpha-Beta Filter
Robert Penoyer
Robert Penoyer has an M.S. in electrical engineering from the University of Southern California. He has 17 years experience in electronic circuit and system design and simulation. He has been programming in C for five years and working with alpha-beta filters for four years. Bob can be reached on GEnie at RPENOYER. or on Compuserve at 71603,1335.
Data collected or received in the real world is nearly always corrupted by errors, or noise. With relatively consistent data information that varies more slowly than the noise you can look at more than one measurement at a time and more accurately separate the data from the noise.
To minimize the effects of noise, the measured data can be filtered, or smoothed. In an ideal world you could remove all the noise and be left with an exact duplicate of the original information. (But in an ideal world, the information would not have been corrupted in the first place.) The best you can hope to accomplish in the real world is to remove as much noise as possible while retaining as much of the original information as possible. The a-b filter is a classic option for smoothing a corrupted signal and tracking discrete data.
Child of the Kalman Filter
The a-b filter is a derivative of the Kalman filter. The Kalman filter, a so-called optimal filter, performs better than any other linear filter, under specific conditions (Gelb 1974, Meditch 1969, Mendel 1974). You pay for such excellent performance with extreme complexity and computationally-intensive software. The a-b filter resembles the Kalman filter in its nature, structure, and performance, but without as much complexity.
The Kalman filter automatically varies its coefficients to optimize its performance as a function of the statistics of the incoming corrupted data. The a-b filter, on the other hand, uses either fixed coefficients or coefficients that vary as the programmer requires them to vary. Thus, the a-b filter avoids entirely, or almost entirely, the computational overhead necessary for the Kalman filter to perform properly.
The a-b filter is called suboptimal. But don't interpret that to mean it performs poorly. On the contrary, the a-b filter can be a highly effective smoother. By varying the coefficients of this filter, you can control the length of the data history used to estimate the data's true value from the most recent measurement.
The Classic a-b Filter
The a-b filter has about as many forms as there are sources that describe it (Benedict and Bordner 1962, Lewis 1984, Morris 1986). The following equation set is based on one presented by Morris.
where:
s = position
v = velocity
a = multiplier on the position measurement error (0<=a<=1)
b = multiplier on the velocity measurement error (0<=b<=1)
T = time between measurements
n parameter value for this time increment
pn predicted parameter value for this time increment
p(n+1) predicted parameter value for the next time increment
m measured parameter
^ estimated value (corrected prediction)
Though these equations use the terms position and velocity, any parameters that are related by the derivative operator (e.g., d/dx) can be substituted. The equations form a recursive relationship and are used formally as follows:
Given a new position measurement, an estimate is made of the true position and velocity for the current time increment using Equation 1 and Equation 2. The estimates from Equation 1 and Equation 2 are used to predict the position and velocity for the next time increment using Equation 3 and Equation 4. When the next time increment occurs, i.e., when a new position measurement is made, the process is repeated. This method requires that T be constant or nearly so.
The a and b terms give the filter its name. These fixed coefficients replace the optimized coefficients of the Kalman filter.
Simplifying the a-b Equations
While fine as given, Equation 1 through Equation 4 can be massaged to make the programming effort simpler. However, beware that any time a derivative is used on noisy measurements, the derivative tends to multiply the effects of the noise. Because the velocity term in the a-b filter is related to the position term by the derivative operator, the velocity estimate can be very poor. Therefore, I assumed that the velocity estimate of this a-b filter would not be used to provide a velocity estimate outside the equations. That is, I used the a-b filter only to estimate position from some set of position measurements. (If you should want to use the velocity term as an estimator of true velocity, consider smoothing it with a second a-b filter.)
Because I was not going to use the velocity term outside the equations, I could afford to modify it. First, I rearranged Equation 2 by multiplying through by T:
I further modified Equation 5 such that T was absorbed into v. This is a valid simplification since I do not use v directly. Consider these facts about T:
- T operates only on v
- Every place v occurs, T occurs (with the understanding that Equation 4 is also multiplied by T)
- T is the time between position measurements
- T depends upon the operating speed of the computer system (assuming its speed determines the time between measurements)
As noted above, Equation 6 simply causes v to be scaled by T.
To begin the recursive estimation sequence, the initial estimate of position, , is taken to be the first measured position, sm0. Set the initial velocity estimate, , to 0 unless you have a good reason for doing otherwise. Given these starting conditions, the equations can be rearranged, placing Equation 3 and Equation 4 first. The indices in these two equations then become pn rather than p(n+1) and n-1 rather than n so that these equations are used to predict the parameters for the current time increment based on the previous time increment. (Note that, as originally formulated, the equations were used to predict the parameter value for the next time increment based on the current time increment.) Also, since Equation 4 provides no new information, it can be ignored the old estimate of velocity becomes the new predicted velocity.
Following the elimination of Equation 4 and the replacement of vpn with in Equation 6, the reformulation of the original set of equations and their operating sequence now becomes:
Equation 1:
When the above sequence begins, the entry point is Equation 7 with set equal to the first value of smn (i.e., sm0) and set equal to 0. This set of equations and conditions are the basic smoothing algorithm. The actual programming implementation adds one more step.
The parenthetical term in both Equation 1 and Equation 8 is the difference between the measured position and the predicted position. This difference is the measurement error. Instead of calculating it twice, once for each equation, a single measurement error term can be calculated following Equation 7:
emn = smn - spn
Thus, the actual implementation of the a-b filter is:
Equation 7:
Equation 9:
emn = smn - spn
Equation 1:
Equation 8:
This formulation is very simple. It lends itself to simple coding and to a simple geometric explanation of the a-b filter's operation.
A Geometric Explanation of the a-b Filter
Assume that the filter has been running. That is, the starting procedure has been completed and, when a new measurement is made, the filter runs through the sequence of equations. The equations use the estimates from the previous time increment and the current position measurement to estimate the true current position.
Figure 1 shows that, given the starting point, , and the velocity, , a straight line predicts the new position, spn. This is the operation performed by Equation 7. Since includes the T multiplier, the units of are distance, not velocity.
In general, the measured position will not coincide with spn. Equation 9 quantifies the measurement error. This error is due primarily to noise. However, some of the measurement error legitimately may be due to a change in velocity. Therefore, since some of the difference may be valid, all the difference cannot be ignored. Thus, a, a positive multiplier that generally has a magnitude smaller than 1, is used to add a fraction of the measurement error to the predicted position to improve the position estimate. Equation 1 yields the final estimate of position. The function of Equation 1 is illustrated in Figure 2.
From the geometry of Figure 1 and Figure 2, it can be seen that if v is varied inversely as T is varied, then none of the details in the figures (i.e., positions or slopes) will change. That is, if the data and measurements are speeded up equally, the details of the figures will not be affected. Hence, variations in T affect only the filter's speed; damping is not affected.
Relating b to a
Just as a determines the fraction of the measurement error applied to the position prediction, b determines the fraction of the measurement error applied to the velocity prediction. Equation 8 uses b for this purpose. Benedict and Bordner have derived a relationship between a and b that optimizes the filter's performance:
This relationship is based on the condition that the best estimate be provided for a system that is tracking noisy information that is changing at a constant velocity. This may or may not be satisfactory for your particular application.
Equation 10 yields a slightly underdamped system. Being underdamped implies that the system will "ring" somewhat. That is, if the data makes a sudden step in position, the estimate of position will overshoot the true position at least once before settling down to an accurate estimate. A more heavily damped system might perform better in your application. Benedict and Bordner offer an a-b relationship for a critically damped system. If their relationship is rearranged to solve for b, it will be found that damping is not critical. To get closer to critical damping, Equation 11 multiplies the rearranged relationship by 0.8.
Equation 11 yields a system that is nearly critically damped. (In a critically-damped system, the output agrees with the input as quickly as possible with no ringing.) By using Equation 11 instead of Equation 10, you can reduce but not eliminate the ringing response. Because a is never permitted to exceed the range 0a1, the quantity under the radical is never negative.
Consider Equation 10 and Equation 11 merely as reference points. As long as the limits 07lt;= a<=1 and 0<b<=1 are not violated, you are free to select the values of a and b to suit your needs.
A Programming Example in C
I begin by estimating measured data uncorrupted by noise. Figure 3 contains a profile of the data that begins at a steady state, ramps to a second steady state, then steps to a third steady state. (Normally, you would not filter clean, uncorrupted data like this for an estimated value, but I am using it to demonstrate the performance of the a-b filter against a ramp and a step.) The profile might represent a sequence of measured voltages, currents, distances, altitudes, volumes of storage tank contents, disk-drive head positions, highway traffic volumes, percentage of x in a vat full of y, etc.
I wrote and compiled ALPHABET.C (Listing 1) using the Borland C++, Version 3.1 compiler configured for MS-DOS. I verified it on a system running MS-DOS, version 5.0. The plotting functions assume a standard VGA monitor (640 x 480 pixels). If your monitor is different, modify the #define statements for PLOTx and STEADY_STATEx by simple scaling. profile (Listing 1) generates the data profile of Figure 3.
ALPHABET .C is a simple demonstration of the performance of the a-b filter for two sets of a-b values. In each case, I give a an arbitrary value of 0.25; thus, 25% of each measurement error is used as a correction to the predicted position. Over the first part of the program, b is determined by Equation 10, the so-called optimal value. (It may not be optimal for your application.) The second part of the program uses the b value determined by Equation 11, the so-called near-critically damped value.
First, the program plots the data profile. Once it plots the profile, the program pauses until you press a key. After you press a key, the program plots the response of the a-b filter to the data profile over the profile. You will see slight overshoots as the profile begins and ends the ramp. You will notice a significant overshoot following the step. Pressing another key plots the filter's output by itself.
Press another key to produce the signal corrupted by noise. gauss_noise generates a mean-zero, gaussian (normal) noise function. To do so, it uses the Box-Muller method (Press, et al. 1988, MathSoft, Inc. 1988). The noise is given an arbitrary standard deviation of s = 10. The mean is set to m = 0 to comply with the requirement that the mean of the corrupting noise be zero. If the mean were non-zero, the a-b filter's response would be offset by the mean value of the noise. In gauss_noise the variable mean is retained, despite being set to zero, to clarify its part in the computation of the noise function.
Continue pressing a key to run through all the combinations of clean signal, noisy signal, and smoothing provided by Listing 1.
Filter Performance
Several figures illustrate the filter's performance. Figure 4 compares the original data profile with the resultant output due to the a-b relationship of Equation 10. The filter tracks the ramp well. It tracks the step well, too, except for the overshoot following the step. The ringing of the underdamped filter is apparent here. Figure 5 illustrates the data profile corrupted by noise. The noisy data remains the same throughout a single run of the program so that each step of the program operates on the same noise. The noise is different each time the program runs so that the noise you will see if you run it will be different from that shown in Figure 5. Figure 6 compares the filtered noisy data with the original, uncorrupted data. Keep in mind that the intent is to recover the original, uncorrupted data as accurately as possible, thus the comparison. Figure 7 and Figure 8 use the more heavily damped filter due to the a-b relationship of Equation 11.
You might find it useful to modify Listing 1 by supplying your own values for a and b. Just be sure that neither is permitted to be negative or to exceed a value of 1.0.
alpha_beta performs the actual a-b filter algorithm. Three things should be said about this function. First, if it were used in real time, all variables could be scalar. No arrays would be needed. As each estimated position (or whatever) value was calculated it could be returned to the calling function. Only the most recent values of variables s_est and v_est need to be preserved. Therefore, if the algorithm were to be modified to filter measurements one call at a time, s_est and v_est should be made static variables.
Second, keep in mind that the method used here has absorbed the term Tv into the single variable, v. If you want to make use of v as well as s, be sure to divide v (actually, Tv) by T, the fixed period over which each measurement is made. As implemented, the b/T term of Equation 2 has been change to simply b, thus eliminating any need for a division. If you would rather preserve the true scaling of v, you can absorb T into the calculation of b, avoiding repeated divisions by T.
Finally, you will have to make your choice of a based on the nature of the data you are attempting to smooth. A large a (near 1.0) results in very little smoothing and very little noise reduction. A small a (near 0.0) results in a lot of smoothing, i.e., a lot of noise reduction and very rounded estimates of data steps. Your choice must be based as much on your good judgment as it is on the data to be smoothed. Equation 10 or Equation 11 will usually provide a suitable b value, though you might want to experiment with this too. Let experience and testing be your guide.
Tracking
Tracking, another important application of the a-b filter, differs from smoothing by centering a tracking window on the predicted position of the measured data as determined by Equation 7. When the measured data falls inside the window, that data is taken as a valid measurement and used to update the filter's position estimate. Data that falls outside the window is ignored.
When the data is ignored for failing to fall inside the track window, the calculation of Equation 9 cannot be completed. Instead, ignore Equation 9 and assign 0 as the error value. Then carry out the other calculations. Because the a-b filter incorporates a velocity term, you use it to predict where the data should have been. When the error value is zero, the velocity estimate of Equation 8 is not changed so that the velocity estimate remains constant. That is, the velocity estimate, based on the history of calculations up to the data that failed to fall within the track window, is used to update the position prediction of Equation 7 and, hence, the position of the track window. If the data falls inside the track window once again, the normal calculation of measurement error, Equation 9, is resumed and tracking updates continue. If the system goes through too many iterations without the data falling inside the track window, the program must declare loss of track. You must determine for your application what constitutes too many misses.
Once you determine that track has been lost, you need a search-and-acquire algorithm to reacquire the data. During acquisition, you might find it useful to scale a (and b) from a larger value (near 1.0) when the target is first acquired, down to a more suitable value (e.g., 0.25 or so) to maintain track. (Note that initially setting is equivalent to starting with an alpha symbol value of 1.0.) The down scaling can be accomplished over a few iterations. This strategy can speed the convergence of the position estimate to the true data value.
You will have to determine how wide the track window should be for your application. You will also have to figure out how to determine that track has been lost and how the data is to be reacquired.
References
Benedict, T.R., and G.W. Bordner. July 1962. "Synthesis of an Optimal Set of Radar Track-While-Scan Smoothing Equations," IRE Transactions On Automatic Control, pp. 27-32. New York, NY: The Institute of Radio Engineers.
Gelb, Arthur. 1974. Applied Optimal Estimation, Chapter 4. Cambridge, Massachusetts: The M.I.T. Press.
Lewis, Frank L. 1986. Optimal Estimation: With an Introduction to Stochastic Control Theory, pp. 88-89. New York, NY: John Wiley & Sons.
Meditch, J.S. 1969. Stochastic Optimal Linear Estimation and Control, Chapter 5. New York, NY: McGraw-Hill, Inc.
Mendel, Jerry M. 1973. Discrete Techniques of Parameter Estimation, pp. 159-162. New York, NY: Marcel Dekker, Inc.
Morris, Guy V. 1988. Airborne Pulsed Doppler Radar, Section 12.7.1. Norwood, MA: Artech House, Inc.
Press, William H., et al. 1988. Numerical Recipes in C, pp. 216-217. New York, NY: Cambridge University Press.
MathSoft, Inc. Summer 1988. "Industrial Engineering Prof. Uses MathCAD for Simulations, Statistics, and Teaching," MathCAD User's Journal, Vol. 2, No. 3, p. 4. Cambridge, MA: MathSoft, Inc.