Programming Excel COM Objects in C++
Philippe Lacoude and Grum Ketema
Excel is a powerful tool that you can use from a C++ program, once you get the protocol right.
Introduction
The Microsoft Office suite includes a number of popular applications. With the advent and growing popularity of the COM standard, these applications have been developed as a set of COM objects. Thus, a user may use applications like Excel directly as stand-alone applications the most popular choice or these applications may be programmed using a programming language like C++.
Consider an application that runs every night and prints a report summarizing the previous day's business activity. Wouldn't it be nice if, instead of wasting thousands of pieces of paper, this application could produce its results in a nice Excel spreadsheet and send copies via email to all the sales department executives, the CFO, and the CEO? Consider a website, which allows users to query large databases. It might also be very effective to enable users to get the results via email as an Excel spreadsheet. In both cases, users not only get the information, but they also gain the ability to manipulate it with Excel, certainly one of the most popular corporate data-crunching tools.
An application written in C++ can manipulate Excel COM objects directly using the COM interfaces exposed by the Excel application. This article describes how to program Excel applications using COM objects and Visual C++ native support to COM. The example code creates a small table, which contains three columns with text titles and some layout. The three main rows of the table contain a sequence of integers from one to nine. The last row of the table contains Excel formulas that compute the content of each column, as shown in Figure 1.
COM Background
COM is a standard used for object interoperability. Unlike other standards for object interoperability, such as CORBA, COM guarantees object interoperability at both interface- and object-layout (binary) level. Because COM objects are designed to be binary compatible, manipulating COM objects requires the use of not-so-friendly interfaces, methods on objects, and object identifiers (e.g., GUIDs, or Globally Unique Identifiers). Starting with Visual C++ v5.0, Microsoft introduced native COM compiler support by including new language extensions and new compiler directives. When these features are used, the compiler generates wrapper C++ classes and smart pointers for COM objects. The wrapper classes are regular C++ classes, which are easier to use than the native COM objects. The compiler generates these wrapper classes by reading the COM type libraries. The type libraries are generated when the COM objects are compiled from the object description. COM object descriptions are written in ODL/IDL (Object Description Language/Interface Description Language). Another compiler called MIDL generates the type libraries by compiling the ODL/IDL source files. Type information is stored in type libraries (.OLB files), dynamically linked libraries (.DLL files), or executables (.EXE files).
C++ Implementation
The implementation of our example application is shown in Listing 1. In this section we walk through the code, explaining what the various parts do.
The #import Directive
The #import directive is a non-portable construct specific to Microsoft compilers only. It tells the Visual C++ compiler to read the type libraries and generate the wrapper classes (Listing 1, lines 6-8). Assuming Microsoft Office 97 has been installed in C:\Program Files, the following statements (broken to fit magazine layout) generate a total of six files, three header files, and three implementation files:
#import <C:\Program Files\Microsoft \ Office\Office\mso97.dll> \ no_namespace rename("DocumentProperties", "DocumentPropertiesXL") #import <C:\Program Files\Common Files\\ MicrosoftShared\VBA\ \ vbeext1.olb> no_namespace #import <C:\Program Files\Microsoft \ Office\Office\excel8.olb> \ rename("DialogBox", "DialogBoxXL") \ rename("RGB", "RBGXL") \ rename("DocumentProperties", "DocumentPropertiesXL") \ no_dual_interfaces
The files generated are as follows:
mso97.tli, mso97.tlh vbeext1.tli, vbeext1.tlh excel8.tli, excel8.tlh
These header (.tlh) and implementation (.tli) files include the wrapper classes for each of the type libraries. The generated header files are read and compiled as if they have been named in an #include directive. The #import directive can optionally include one or more attributes, which tell the compiler to modify the contents of the generated header files. Here, the no_namespace and rename attributes are used. By default, the generated wrapper classes are wrapped within a namespace. If the no_namespace attribute is specified, the namespace is omitted. To use the namespace wrapper, but give it some other name, use the rename_namespace attribute. In the example, since the no_namespace attribute is used with all #import directives, the namespace wrapper is turned off in all generated .tlh files.
To avoid type-name collision problems, the rename attribute replaces all uses of a specific type name with another name. In the example, the DocumentProperties type name in mso97.tlh is replaced with DocumentPropertiesXL.
The details of the generated files are beyond the scope of this paper. The excel8.tlh and excel8.tli files contain the wrapper classes and class methods for the Excel application.
COM Initialization
In Listing 1, lines 23-26, a StartOle structure is defined and a variable named instStartOle is declared. The StartOle constructor calls CoInitialize, and the destructor calls CoUninitialize. CoInitialize must be called before calling any COM function or before using any COM object. CoInitialize initializes the COM libraries. CoUninitialize releases any resources COM uses. For every call to CoInitialize, there should be a corresponding call to CoUninitialize.
Wrapping the calls to CoInitialize and CoUninitialize in this StartOle struct guarantees that the two functions are called in matched pairs. The other nice advantage is that when an exception is thrown and the code is terminated, the destructor calls CoUninitialize when the instStartOle object is destroyed.
The Cells Utility Function
Listing 1, lines 27-38 shows the listing for the Cells utility function. To access an actual value stored in an Excel spreadsheet cell, the address of that cell needs to be formulated. In Excel, a cell is referred to by its column letter and its row number. Because there are not enough letters in the alphabet to cover the 256 columns available in an Excel spreadsheet, starting at the 27th column, Excel uses double lettering. Hence, column 27 is labeled AA, column 28 is AB, column 53 is BA, and so on until IV, the last and 256th column.
The Range method, a typical Excel object method, uses this numbering scheme. It takes a string composed of the column letter(s) and the row number. The Cells function simplifies use of methods like Range by converting column and row numbers to a string suitable for use as an argument. For example, Cells takes column and row numbers 28 and 28 and returns the formatted string "AB28". This string can be used with Range and other functions to access an Excel spreadsheet cell.
Creating the Excel Object Instance
Like every Microsoft Office application, the Excel COM objects set exposes an Application object that acts as the parent layer. The Application object's most important children are the Workbooks, Charts, AddIns, and VBProject. This article discusses the first two. A C++ program can insert values into an Excel spreadsheet and leave the layout or data analysis to a set of macros that are automatically triggered the first time the spreadsheet is opened.
A Workbook is a set of spreadsheets (called Worksheets), such as the one displayed in Figure 1. Each Worksheet contains cells. The cells are accessed through the Range method of the Worksheet object. In Listing 1, line 49, the Excel Application object is created. The visible attribute (Listing 1, line 50) controls the visibility of the Excel application window. In lines 51-54, a Workbook is added to the Excel application, and a title is assigned to the active sheet.
Building a Table
Populating the Excel spreadsheet (Listing 1, lines 55-85) is pretty straightforward. The ->Borders->Item[]->LineStyle method (Listing 1, lines 59-60) gives cells a layout (horizontal lines). The value method (Listing 1, lines 70-71, 75-76) gives each cell a value or a title. The only noticeable difficulty resides in the addressing of the xlEdgeTop and xlEdgeBottom borders. In VBA (Visual Basic for Applications, the interpreted language for Excel macros), the xlEdgeTop border can be referred to as ->Borders[xlEdgeTop]. This is not possible in Visual C++. The interface to Excel through Visual C++ has numerous subtle differences from VBA.
The Excel object's nicest feature is its capacity to perform statistical computations through built-in functions. In the given example, you store an Excel function in the last cell of each column, which computes the sum of that column (Listing 1, lines 80-81).
True, the computations could have been done by the C++ code itself. But in our example, the user of the generated spreadsheet could later ask: "What would have been the total for column B if instead of a 1, I had had a 10 in row 2?" Upon replacement of the value, the sum would be recomputed automatically. Not a single report generator matches Excel's aptitude for scenario testing.
No one would ever think of using Excel for computing or sorting values. However, thanks to Excel's built-in functions, a programmer could save precious development time by using Excel for complex tasks. For example, without using Excel, how would you generate a multi-variable linear regression, display the results in a 3-D chart, and save it as a Postscript document?
Building a Chart
Building a chart (Figure 2) is equally simple. All the cells except the last row of the table constitute the range of values to be charted. The variable pRange (Listing 1, line 87) stores the range to be charted. A chart object is then added to the Workbook (Listing 1, line 88), and the chart wizard (Listing 1, line 89) sets its range, its 3-D aspect, its layout, and its title.
Saving and Quitting Excel
As the last step, you save the spreadsheet as file Report.xls in the current directory. The program prompts the user with a warning if a file of the same name already exists (Listing 1, line 91). You can override this warning by setting the DisplayAlerts property of the Application object to false. This might be necessary when automating the report generation.
pXL->Quit() releases the Excel COM object pointer (Listing 1, line 93). This is necessary to ensure that repeated executions of the code do not lead to an accumulation of Excel instances in memory. Otherwise, Excel may continue to live after the reporting program has been executed.
Conclusion
Excel offers a wonderful set of objects to the C++ programmer. Because Excel has been out for so long, it is one of most stable COM components of the Microsoft Office suite. It shares with Access the privilege of having a very large set of objects and methods. However, unlike Access, Excel is still vastly unused, being replaced by numerous tools that would have to be combined at great cost to match the broad scope of Excel.
Philippe Lacoude is a Ph.D. candidate in Applied Mathematics. For two years, he was an assistant professor at the University of Paris, where he authored several books and scientific articles and completed his Ph.D. of Economics. He has four years of experience in VB, C/C++, and numerical analysis. He currently works in Washington, D.C., as a Senior Database Programmer. He can be reached at [email protected].
Grum Ketema is a consultant with SWIFT Inc.'s technology management group. He has more than 15 years of experience in software development with C/C++, Unix, Windows, Oracle, CORBA, and COM. He has extensive experience in telecommunications, GIS, and transaction management systems. Grum was a former MTS at AT&T Bell Laboratories and has worked for MIT and TASC. Grum holds a MSc from University of Colorado in EE and a MSc from the University of Denver in CS. He can be reached at [email protected].