Recently, our source code analyzer was used to find flaws in several open source applications that are widely used in Internet communication. The latest releases of Apache, OpenSSL, and sendmail were analyzed. An overview of each application follows.
Open Source Software Under Test
APACHE. According to apache.org, the Apache open source hypertext transfer protocol (HTTP) server is the most popular web server in the world, powering more than 70% of the web sites on the Internet. Given the ubiquity of Apache and the world's dependence on the Internet, the reliability and security of Apache represent an important concern for all of us.
The Apache web server consists of approximately 200,000 lines of code, 80,000 individual executable statements, and 2,000 functions. The version under test is 2.2.3.
OPENSSL. OpenSSL is an open source implementation of Secure Sockets Layer (SSL) and Transport Layer Security (TLS). TLS is the modern successor to SSL, although SSL is often used as a general term covering both protocols. SSL forms the basis of much of the secure communication on the Internet.
For example, SSL is what enables users to send private credit card information securely from their browsers to an online merchant’s remote server. In addition to being intimately involved with data communication, OpenSSL contains implementations of a variety of cryptographic algorithms used to secure the data in transit.
OpenSSL is available for Windows; however, OpenSSL is the standard SSL implementation for Linux and UNIX worldwide. In addition, because of its liberal licensing terms (not GPL), OpenSSL has been used as a basis for a number of commercial offerings. Like Apache, OpenSSL is a keystone of worldwide secure Internet communication. Flaws in this software could have widespread deleterious consequences.
OpenSSL consists of approximately 175,000 lines of code, 85,000 individual executable statements, and 5,000 functions. The version under test is 0.9.8b.
SENDMAIL. According to wikipedia.org, sendmail is the most popular electronic mail server software used on the Internet. Sendmail has been the de facto electronic mail transfer agent for UNIX (and now Linux) systems since the early 1980s.
Given the dependence on electronic mail, the stability and security of sendmail are certainly an important concern for many. The name "sendmail" might lead one to think that this application is not very complicated. Anyone who has ever tried to configure a sendmail server knows otherwise.
Sendmail consists of approximately 70,000 lines of code, 32,000 individual executable statements, and 750 functions. The version under test is 8.13.8.
How Source Analysis Works
A source code analyzer is usually run as a separate tool, independent of the compiler used to build application code. Sometimes the analyzer is built into the same compiler used to build production code (as is the case with the Green Hills analyzer).
The analyzer takes advantage of compiler-style dataflow algorithms in order to perform its bug-finding mission. One advantage of using a single tool for both compiling and analyzing is that the source code parsing need only be done once instead of twice. In addition, source analysis can be configured to cause build errors when flaws are detected so that developers will be encouraged to find and fix them quickly.
A typical compiler will issue warnings and errors for basic code problems, such as violations of the language standard or use of implementation-defined constructs. In contrast, the analyzer performs a full program analysis, finding bugs caused by complex interactions between pieces of code that may not even be in the same source file. The analyzer determines potential execution paths through code, including paths into and across subroutine calls, and how the values of program objects (such as standalone variables or fields within aggregates) could change across these paths. The objects could reside in memory or in machine registers.
The analyzer looks for many types of flaws, including bugs that would normally compile without error or warning. The following is a list of some of the more common errors that the analyzer will detect; a short illustrative fragment follows the list:
* Potential NULL pointer dereferences
* Access beyond an allocated area (e.g. array or dynamically allocated buffer), otherwise known as a buffer overflow
* Writes to potentially read-only memory
* Reads of potentially uninitialized objects
* Resource leaks (e.g. memory leaks and file descriptor leaks)
* Use of memory that has already been deallocated
* Out-of-scope memory usage (e.g. returning the address of an automatic variable from a subroutine)
* Failure to set a return value from a subroutine
* Buffer and array underflows
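To make these categories concrete, here is a small hypothetical C fragment (all names invented; none of this code comes from the applications under test) that a typical compiler accepts with at most a warning or two, yet which contains several of the flaw classes listed above:

#include <stdlib.h>

/* Hypothetical fragment: compiles, yet exhibits several flaw classes. */
char *make_label(int n)
{
    char buf[8];
    char *p = malloc(16);

    if (n > 0)
        p[0] = 'A';     /* potential NULL dereference: malloc result unchecked */

    buf[8] = '\0';      /* access beyond the 8-byte array (buffer overflow) */

    free(p);
    p[1] = 'B';         /* use of memory that has already been deallocated */

    return buf;         /* out-of-scope memory: address of an automatic variable */
}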
The analyzer understands the behavior of many standard runtime library subroutines. For example, it knows that subroutines like free should be passed pointers to memory allocated by subroutines like malloc. The analyzer uses this information to detect errors in code that calls, or uses the result of a call to, these subroutines.
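As an invented illustration (again, not drawn from the three packages), the following routine shows exactly the kind of misuse this knowledge exposes: the pointer handed to free is no longer the pointer that malloc returned.

#include <stdlib.h>
#include <string.h>

/* Invented example: free() must receive the exact pointer malloc() returned. */
void print_trimmed(const char *s)
{
    char *copy = malloc(strlen(s) + 1);

    if (copy == NULL)
        return;
    strcpy(copy, s);

    while (*copy == ' ')
        copy++;          /* copy no longer points at the start of the block */

    /* ... use the trimmed string ... */

    free(copy);          /* flagged: pointer was not returned by malloc() */
}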
Limiting False Positives
The analyzer can also be taught about the properties of user-defined subroutines. For example, if a custom memory allocation system is used, the analyzer can be taught to look for misuses of this system.
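For instance, a project might route all allocation through its own pool routines. The pair below is purely hypothetical, but once the analyzer is told that pool_put should only receive pointers obtained from pool_get, the same classes of misuse it finds for malloc and free can be reported for the custom system:

#include <stdlib.h>

/* Hypothetical project-specific allocator pair. */
void *pool_get(size_t n) { return malloc(n); }
void  pool_put(void *p)  { free(p); }

void demo(void)
{
    char *buf = pool_get(64);

    if (buf == NULL)
        return;

    buf[0] = '\0';
    pool_put(buf);
    buf[1] = 'x';        /* use after the block was returned to the pool */
}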
By teaching the analyzer about the properties of subroutines, users can reduce the number of false positives. A false positive is a potential flaw identified by the analyzer that could not actually occur during program execution. One of the major design goals of a source code analyzer is to limit false positives so that developers spend as little time as possible reviewing them.
If an analyzer generates too many false positives, it will become irrelevant because the output will be ignored by engineers. The analyzer is much better at limiting false positives than traditional UNIX programming tools like lint. However, since an analyzer is not able to understand complete program semantics, it is not possible to totally eliminate false positives.
In some cases, a flaw found by the analyzer may not result in a fatal program fault, but could point to a questionable construct that should be fixed to improve code clarity. A good example of this is a write to a variable that is never subsequently read.
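A minimal invented example of that construct:

/* The final write to "status" is never read.  Not fatal, but it suggests
 * either dead code or a check the author meant to make and did not. */
int process(int input)
{
    int status = 0;

    if (input < 0)
        status = -1;     /* flagged: no later code reads this value */

    return input * 2;
}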
Complexity Control
Much has been published regarding the benefits of reducing complexity at the subroutine level. Breaking up a software module into smaller subroutines makes each subroutine easier to understand, maintain, and test. A complexity limitation coding rule is easily enforced at build time by calculating a complexity metric and generating a build-time error when the complexity metric is exceeded.
The source analyzer can optionally check code complexity. Once again, since the analyzer is already traversing the code tree, it does not require significant additional time to apply a simple complexity computation, such as the popular McCabe complexity metric. Some analyzers can be configured to generate a build error pointing out the offending subroutine.
Thus, the developer is unable to accidentally create code that violates the rule. In general, a good analyzer can be used to help enforce coding standards that would otherwise need to be done with manual human reviews or non-integrated third party products.
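As a rough sketch of the metric itself (the routine is invented and the limit of 20 is only an example threshold), the McCabe value is, in its simplest form, the number of decision points in a subroutine plus one:

/* Two if statements and one loop: three decisions, McCabe complexity 4.
 * With an example limit of 20, this passes; a routine full of nested
 * switches and loops would instead stop the build. */
int count_positive(const int *v, int n)
{
    int count = 0;

    if (v == NULL)
        return 0;

    for (int i = 0; i < n; i++) {
        if (v[i] > 0)
            count++;
    }
    return count;
}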
Figure 1: Analyzer summary report
Output Of The Analyzer
Output format differs amongst analyzers, but a common mechanism is to generate an intuitive set of web pages, hosted by an integrated web server. The user can browse high level summaries of the different flaws found by the analyzer (Figure 1, above) and then click on hyperlinks to investigate specific problems. Within a specific problem display, the flaw is displayed inline with the surrounding code, making it easy to understand (Figure 2, below).
Function names and other objects are hyperlinked for convenient browsing of the source code. Since the web pages are running under a web server, the results can easily be shared and browsed by any member of the development team on the network.
Figure 2: In-context display of flaw
Analysis Time
Analysis time will obviously be a gating factor in the widespread adoption of these tools. We performed some build and analysis time comparisons using the Green Hills compiler and source code analyzer to determine the added overhead of using the analyzer on a regular basis.
The build time for the Apache web server using a single desktop PC running Linux was 1.5 minutes. The source analysis time on the same PC was 3.5 minutes. Build time using distributed PCs was 30 seconds.
With the Green Hills distributed build system, source code processing is automatically parallelized across worker PCs on the network. The system only uses PCs that have cycles to spare. In our test environment, 15 PCs were configured to act as workers, and approximately 10 of them were used at any one time for a build.
The source analysis time using distributed processing was 1.0 minutes, significantly less than the standard compile time using a single dedicated PC. It seems clear that when 200,000 lines of code can be analyzed in a minute using commonly available PC resources, there really is no reason not to have all developers using these tools all the time.
Flaws Found
The following sections provide examples of actual flaws in Apache, OpenSSL, and sendmail that were discovered by the Green Hills source code analyzer. The results are grouped by error type, with one or more examples of each error type per section.
Potential Null Pointer Dereference
This was by far the most common flaw found by the analyzer in all three suites under test. Some cases involved calls to memory allocation subroutines that were followed by accesses of the returned pointer without first checking for a NULL return. This is a robustness issue. Ideally, all memory allocation failures are handled gracefully.
If there is a temporary memory exhaustion condition, service may falter, but it should not terminate. This is of particular importance to server programs such as Apache and sendmail. Algorithms can be introduced that prevent denial of service in overload conditions, such as those caused by a malicious attack.
The Apache web server, sendmail, and OpenSSL all make abundant use of C runtime library dynamic memory allocation. Unlike Java, which performs automatic garbage collection, memory allocation using the standard C runtime requires that the application itself handle memory exhaustion failures. If a memory allocation call fails and returns a NULL pointer, a subsequent unguarded reference of the result pointer is all but guaranteed to cause a fatal crash.
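A generic sketch of the guarded pattern the analyzer expects (this is not code from any of the three packages):

#include <stdlib.h>
#include <string.h>

/* Every C runtime allocation is checked before use; the failure path
 * degrades gracefully instead of crashing the server. */
char *duplicate_request(const char *req, size_t len)
{
    char *copy = malloc(len + 1);

    if (copy == NULL)
        return NULL;      /* caller can drop this request, not the service */

    memcpy(copy, req, len);
    copy[len] = '\0';
    return copy;
}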
On line 120 in the Apache source file scoreboard.c, we have the following memory allocation statement:
ap_scoreboard_image = calloc(1,
    sizeof(scoreboard) + server_limit * sizeof(worker_score *) +
    server_limit * lb_limit * sizeof(lb_score *));
Clearly, this allocation of memory could be substantial. It would be a good idea to make sure that the allocation succeeds before referencing the contents of ap_scoreboard_image. However, soon after the allocation statement, we have this use:
ap_scoreboard_image->global = (global_score *)more_storage;
The dereference is unguarded, making the application susceptible to a fatal crash. Another example from Apache can be found at line 765 in the file mod_auth_digest.c:
entry = client_list->table[idx];
prev = NULL;
while (entry->next) {    /* find last entry */
    prev = entry;
    entry = entry->next;
    ...
}
Note that the variable entry is unconditionally dereferenced at the beginning of the loop. This alone would not cause the analyzer to report an error. At this point in the execution path, the analyzer has no specific evidence or hint that entry could be NULL or otherwise invalid. However, the following statement occurs after the loop:
if (entry) {
    ...
}
By checking for a NULL entry pointer, the programmer has indicated that entry could be NULL. Tracing backwards, the analyzer now sees that the previous dereference to entry at the top of the loop is a possible NULL reference. The following similar example was detected in the sendmail application, at line 8547 of file queue.c, where the code unconditionally dereferences the pointer variable tempqfp:
errno = sm_io_error(tempqfp);
sm_io_error is a macro which resolves to a read of the tempqfp->f_flags field. Later, at line 8737, we have this NULL check:
if (tempqfp != NULL)
    sm_io_close(tempqfp, SM_TIME_DEFAULT);
with no intervening writes to tempqfp after the previously noted dereference. The NULL check, of course, implies that tempqfp could be NULL; if that were ever the case, the code would fault. If the pointer can never in practice be NULL, then the extra check is unnecessary and misleading. What may seem to some as harmless sloppiness can translate into catastrophic failure given the right (wrong) conditions.
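A generic sketch of the consistent form, using standard stdio calls in place of sendmail's sm_io wrappers: if the pointer can be NULL, every use of it sits behind the guard; if it cannot, the late NULL test is dead code and should be removed.

#include <stdio.h>

/* Sketch only: the error-flag read is guarded along with the close. */
static int close_temp(FILE *tempfp)
{
    int err = 0;

    if (tempfp != NULL) {
        err = ferror(tempfp);
        fclose(tempfp);
    }
    return err;
}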
In sendmail, there are many other examples of unguarded pointer dereferences that are either preceded or followed by NULL checks of the same pointer. One more example in this category comes from OpenSSL, at line 914 in file ssl_lib.c:
if (s->handshake_func == 0) {
    SSLerr(SSL_F_SSL_SHUTDOWN, SSL_R_UNINITIALIZED);
}
Shortly thereafter, we have a NULL check of the pointer s:
if ((s != NULL) && !SSL_in_init(s))
Again, the programmer is telling us that s could be NULL, yet the preceding dereference is not guarded.
Buffer Underflow
A buffer underflow is defined as an attempt to access memory before an allocated buffer or array. Similar to buffer overflows, buffer underflows cause insidious problems due to the unexpected corruption of memory. The following flaw was discovered at line 5208 of file queue.c in sendmail:
if ((qd == -1 || qg == -1) && type != 120)
    ...
else {
    switch (type) {
        ...
        case 120:
            if (bitset(QP_SUBXF, Queue[qg]->qg_qpaths[qd].qp_subdirs))
    }
}
The if statement implies that it is possible for qd or qg to be -1 when type is 120. But in the subsequent switch statement, always executed when type is 120, the Queue array is unconditionally indexed through the variable qg.
If qg were -1, this would be an underflow. The code was not studied exhaustively to determine whether qg can indeed be -1 when type is 120 and hence reach the fault. However, if qg cannot be -1 when type is 120, then the initial if check is incorrect, misleading, and/or unnecessary.
Another example of buffer underflow is found at line 1213 of file ssl_lib.c in OpenSSL:
p = buf;
sk = s->session->ciphers;
for (i = 0; i < sk_SSL_CIPHER_num(sk); i++) {
    ...
    *(p++) = ':';
}
p[-1] = '\0';
The analyzer informs us that the underflow occurs when this code is called from line 1522 in file s_server.c.
From a look at the call site in s_server.c, you can see that the analyzer has detected that buf points to the beginning of a statically allocated buffer. Therefore, in the ssl_lib.c code, if there are no ciphers in the cipher stack sk, then the access p[-1] is an underflow.
This demonstrates the need for an intermodule analysis, since there would be no way of knowing what buf referenced without examining the caller.
If it is the case that the number of ciphers cannot actually be 0 in practice, then the for loop should be converted to a do-while loop in order to make it clear that the loop is always executed at least once (ensuring that p[-1] does not underflow).
Another problem is a potential buffer overflow. No check is made in the ssl_lib.c code to ensure that the number of ciphers does not exceed the size of the buf parameter. Instead of relying on convention, a better programming practice would be to pass in the length of buf and then add code to check that overflow does not occur.
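A sketch of both suggestions combined, using an invented helper rather than OpenSSL's actual interface: the destination length is passed in and checked, and the empty-list case is handled explicitly so that p[-1] can never underflow.

#include <stddef.h>
#include <string.h>

/* Invented helper: build a ':'-separated list into a caller-supplied buffer. */
static void join_names(char *buf, size_t buflen,
                       const char *const *names, size_t count)
{
    char *p = buf;
    char *end = buf + buflen;
    size_t i;

    if (buflen == 0)
        return;

    for (i = 0; i < count; i++) {
        size_t n = strlen(names[i]);

        if (p + n + 1 >= end)      /* no room for the name plus ':' */
            break;
        memcpy(p, names[i], n);
        p += n;
        *p++ = ':';
    }

    if (p > buf)
        p[-1] = '\0';              /* overwrite the trailing ':' */
    else
        buf[0] = '\0';             /* empty list: no underflow */
}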
Resource Leaks
In line 2564 of file speed.c in OpenSSL:
fds = malloc(multi * sizeof *fds);
fds is a local pointer that is never used to free the allocated memory prior to return from the subroutine. Furthermore, fds is not saved in another variable where it could be freed later. Clearly, this is a memory leak. A simple denial-of-service attack on OpenSSL would be to invoke, or cause to be invoked, the speed command until all memory is exhausted.
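A generic sketch of the pattern that avoids the leak (not a patch to speed.c): the buffer is released on every path out of the routine, including the early error return.

#include <stdlib.h>

int run_jobs(unsigned multi)
{
    int *fds = malloc(multi * sizeof *fds);

    if (fds == NULL)
        return -1;       /* nothing allocated, nothing to release */

    /* ... use fds ... */

    free(fds);           /* released before the successful return */
    return 0;
}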
Conclusion
What better applications to pick for demonstrating the importance of automated source code analyzers than popular open source Internet communications software? Not only is this software of tremendous importance to the world at large, but the fact that it is open source would, as many would argue, indicate that the code quality is expected to be relatively high.
According to the book Open Sources: Voices from the Open Source Revolution by DiBona, Ockman, and Stone (O'Reilly 1999): "By sharing source code, Open Source developers make software more robust. Programs get used and tested in a wider variety of contexts than one programmer could generate, and bugs get uncovered that otherwise would not be found."
Unfortunately, in a complex software application such as Apache, it is simply not feasible for all flaws to be found by manual inspection. There are a number of mechanisms available to help in the struggle to improve software, including improved testing and design paradigms.
But automated source code analyzers are one of the most promising technologies. Using a source code analyzer should be a required part of every software organization’s development process. Not only is it effective at locating anomalies, but it can be used with little or no impact on build times and easily integrated with the regular software development environment.
David N. Kleidermacher is Vice President of Engineering at Green Hills Software, Inc.