Perspectives
Using the ASM library, COJAC was not really hard to implement. It nevertheless opens some interesting perspectives:
- It is straightforward to implement a new classloader that performs on-the-fly instrumentation.
- Java is not the only programming language that produces code for the JVM. Thus, COJAC is expected to be suitable for other languages equipped with a bytecode translator, like Python or Ruby.
- COJAC could be packaged as an Eclipse plug-in that automatically performs instrumentation on entire projects. Similarly, it would of course be possible to integrate the functionality as a Java compiler option (maybe considered in a next release of Java?).
- Floating-point operations do not suffer from wraparound overflow, but they have pitfalls of their own; sometimes it would be useful to discover that a number did not change after the addition of a nonzero term or the multiplication by a factor other than one. It would be easy to extend COJAC to report such situations.
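To make the floating-point pitfall in the last item concrete, here is a minimal sketch of the kind of check such an extension might perform. The class AbsorptionCheck and its methods are hypothetical names chosen for illustration; they are not part of COJAC, and a real extension would inject equivalent tests into the bytecode and log a source location rather than collect strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of COJAC: reports operands that are
// "absorbed", i.e. that leave the other operand unchanged.
final class AbsorptionCheck {
    // Collected reports; a real tool would log the source location instead.
    static final List<String> reports = new ArrayList<>();

    // Adds a and b, and reports when a nonzero b leaves a unchanged.
    static double checkedAdd(double a, double b) {
        double r = a + b;
        if (b != 0.0 && r == a) {
            reports.add("absorbed addition: " + a + " + " + b);
        }
        return r;
    }

    // Multiplies a by b, and reports when a factor other than one
    // leaves a nonzero a unchanged.
    static double checkedMul(double a, double b) {
        double r = a * b;
        if (b != 1.0 && a != 0.0 && r == a) {
            reports.add("absorbed multiplication: " + a + " * " + b);
        }
        return r;
    }
}
```

With these definitions, checkedAdd(1e17, 1.0) returns 1e17 and records a report, because 1.0 is smaller than half the spacing between adjacent doubles at that magnitude, whereas checkedAdd(1.0, 2.0) records nothing.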
COJAC is designed to be simple and robust, and this causes some inherent limitations:
- Compile-time evaluated expressions are not detected. The compiler evaluates expressions such as (3*Integer.MAX_VALUE), and only the result is stored in the bytecode.
- The instrumentation could potentially be confusing for the debugger and other tools that exploit the correspondence between source code and bytecode (e.g. a profiler).
- Because we didn't want to add a new class in the instrumented code, the default logging will repeatedly report the same location every time it causes an overflow. It is straightforward to define a better logging callback that filters already-seen locations, as in Example 4.
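The first limitation can be illustrated with a short sketch. The class and member names below are hypothetical; the point is only that a constant expression is evaluated by the compiler, so no multiplication instruction remains in the bytecode for an instrumenter to wrap, whereas the same product on a method parameter is computed at run time.

```java
// Hypothetical illustration of the compile-time evaluation limitation.
final class ConstantFolding {
    // A constant expression: the compiler evaluates it, and only the
    // already-wrapped result ends up in the class file.
    static final int FOLDED = 3 * Integer.MAX_VALUE;

    // x is not a compile-time constant, so an integer multiplication
    // instruction is emitted here and can be instrumented.
    static int tripled(int x) {
        return 3 * x;
    }
}
```

Both ConstantFolding.FOLDED and ConstantFolding.tripled(Integer.MAX_VALUE) yield the same wrapped value, 2147483645, but only the second computation leaves an instruction behind in the bytecode.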
Our approach of bytecode instrumentation is not the only way of attacking the overflow problem. An idea to reduce the slowdown would be a probabilistic technique that decides at runtime whether an operation has to be checked or not. An alternative would be to write nonportable C code to consult the CPU overflow flag, and bind this code with JNI. Another interesting approach is static analysis: We could, for instance, extend JML tools (like ESC/Java) to report at compile time some of the potential overflows, as is already done for bad array indices. It can also be argued that reporting overflows is a poor goal and that we should aim at suppressing the overflow risk completely, for example, by automatically converting int/long to BigInteger objects; this is not trivial because it would cause deep transformations in the code (e.g. converting arrays to ArrayLists).
Finally, a totally different approach would be to rely on virtualization: Is it difficult to adapt QEMU or Xen, for example, so that the virtual machine signals to the host OS any overflow occurring in any running process of the guest OS?
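As a rough sketch of the probabilistic idea, the following class checks an addition only with a given probability. Everything here is an assumption made for illustration: the name SampledOverflowCheck, the sampling rate parameter, and the counter are hypothetical, and COJAC does not work this way. Whether drawing a random number is actually cheaper than the check itself would have to be measured; a real implementation might prefer a cheap counter.

```java
import java.util.Random;

// Hypothetical sketch of sampled overflow checking; not COJAC's mechanism.
final class SampledOverflowCheck {
    private final Random rng;
    private final double rate;   // probability that an operation is checked
    int detected = 0;            // number of overflows actually observed

    SampledOverflowCheck(double rate, long seed) {
        this.rate = rate;
        this.rng = new Random(seed);
    }

    // Performs the usual wrapping int addition; with probability `rate`
    // it also tests whether the addition overflowed.
    int add(int a, int b) {
        int r = a + b;
        if (rng.nextDouble() < rate) {
            // Overflow occurred iff both operands have a sign that
            // differs from the sign of the result.
            if (((a ^ r) & (b ^ r)) < 0) {
                detected++;
            }
        }
        return r;
    }
}
```

With a rate of 1.0 every operation is checked, and with 0.0 none is; intermediate values trade detection probability against overhead, so an overflow that occurs repeatedly at the same location is still likely to be caught eventually.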
Conclusion
Arithmetic overflow is an annoying feature of most programming languages: On rare occasions, programmers exploit it as a computation property; but most of the time, it is simply a nasty source of bugs. Any attempt to help discover bugs and deliver robust code is worth trying. That is why Java developers should keep an eye on recent developments. To name a few free new tools arbitrarily, look at Eclipse TPTP, JML verifiers (www.cs.ucf.edu/~leavens/JML), delta debugging support (www.st.cs.uni-sb.de/eclipse), the ODB "omniscient debugger" (www.lambdacs.com/debugger), or COJAC.
What About C/C++?
Java and C/C++ differ in their models of arithmetic expression evaluation, but the problem of arithmetic overflow is essentially the same. Some C/C++ compilers offer an option to generate code that checks integer operations and reports overflows. For instance, GCC comes with the -ftrapv flag, but it used to be implemented in a way that does not cover every situation: It doesn't signal overflows on shorts, and it is known to miss some overflows on int multiplications. Maybe this is why it doesn't seem to be popular. We found another attempt to provide instrumentation in GCC as a separate module using GEM, but the available information says little about the current state of this project (www.ecsl.cs.sunysb.edu/iop/index.html).