To ask Pete a question about C or C++, send e-mail to [email protected], use subject line: Questions and Answers, or write to Pete Becker, C/C++ Users Journal, 1601 W. 23rd St., Ste. 200, Lawrence, KS 66046.
Q: I have a struct that must contain different amounts of data depending on how I use it. I can do this with malloc, putting a pointer in my struct, but is there a way to do the same thing without that second allocation?
A: No and yes. There's an old trick for doing this that isn't, strictly speaking, valid in C, but it works correctly with every compiler that I know of. There's also a new trick in the proposed revision to the C language definition.
The old trick is to declare an array with one element as the last member in your struct. When you allocate space for the struct, you allocate as much space as you actually need and pretend that the array is larger than you said it was. For example:
    #include <stdlib.h>

    struct datapacket
    {
        int size;
        int data[1];
    };

    void f()
    {
        struct datapacket *pkt =
            malloc( sizeof(struct datapacket) + sizeof(int)*9 );
        /* further code . . . */
    }
After the call to malloc, pkt points to a struct that has room for the integer field size and an array of 10 ints [1]. You access these fields in the obvious way: pkt->size, pkt->data[3], and so on.
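As a rough sketch of how the old trick tends to be wrapped in practice, the allocation and the "obvious" field access can be put together as below. The helper name packet_new is hypothetical and not part of the original example, and the code leans on the same not-strictly-valid indexing that the trick itself depends on.

```c
#include <assert.h>
#include <stdlib.h>

/* Same layout as in the text: a one-element array as the last member. */
struct datapacket
{
    int size;
    int data[1];
};

/* Hypothetical helper (an assumption, not from the column): allocate a
   packet with room for n ints, n >= 1. The struct already holds one int,
   so space is added for n - 1 more. Returns NULL if malloc fails. */
struct datapacket *packet_new(int n)
{
    struct datapacket *pkt;

    assert(n >= 1);
    pkt = malloc(sizeof(struct datapacket) + sizeof(int) * (size_t)(n - 1));
    if (pkt != NULL)
    {
        int i;
        pkt->size = n;
        for (i = 0; i < n; ++i)
            pkt->data[i] = 0;   /* indices above 0 rely on the trick */
    }
    return pkt;
}
```

A caller would write something like pkt = packet_new(10), use pkt->data[0] through pkt->data[9], and release the whole thing with a single free(pkt).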
The theoretical problem with this approach is that it uses array indices that are too far past the end of the array. In both C and C++ you are allowed to use an index that points into an array or one past the end of the array. Any other index value results in undefined behavior. So it's valid to use pkt->data[0], and it's valid to talk about the address of pkt->data[1], but index values beyond 1 aren't required to work sensibly. The practical problem with this approach is that you might run into a compiler that checks array bounds, and it would complain about using indices that are beyond those allowed by the language definition.
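To make the "one past the end" rule concrete, here is a small sketch with an ordinary array. The function names sum_range and sum_three are made up for illustration. Forming the address &a[3] of a three-element array is allowed; reading a[3] is not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: add up the ints in the half-open range [begin, end). */
int sum_range(const int *begin, const int *end)
{
    int total = 0;
    const int *p;

    assert(begin != NULL && end != NULL);
    for (p = begin; p != end; ++p)   /* end is compared against, never dereferenced */
        total += *p;
    return total;
}

/* Hypothetical helper: sum a three-element array using a
   one-past-the-end pointer as the upper bound. */
int sum_three(void)
{
    int a[3] = { 1, 2, 3 };
    return sum_range(&a[0], &a[3]);   /* &a[3] may be formed; a[3] may not be read */
}
```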
There's a variation on this old trick that gets around the problem with array bounds checking. Instead of declaring the array to have one element, declare it to have a larger number of elements than you would ever actually use. As with the other version, allocate what you actually need:
    #include <stdlib.h>

    struct datapacket
    {
        int size;
        int data[1000];
    };

    void f()
    {
        struct datapacket *pkt =
            malloc( sizeof(struct datapacket)
                    - sizeof(int)*1000 + sizeof(int)*10 );
        /* further code . . . */
    }
Once again you've allocated space for the integer field size and an array of 10 ints, and once again you've lied to the compiler, so you can't be sure that this will work. However, just like the previous version, I don't know of a compiler that doesn't handle this the way you'd expect.
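One way to make the size arithmetic for the oversized-array variation less error-prone is to compute it with the standard offsetof macro from <stddef.h>, sketched below. The helpers packet_bytes and packet_new are hypothetical names. For this particular struct the result should match the subtraction shown earlier; for structs with trailing padding the two formulas can differ. It is still the same lie to the compiler, just spelled differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct datapacket
{
    int size;
    int data[1000];   /* deliberately larger than any packet actually used */
};

/* Hypothetical helper: bytes needed for the header plus n ints. */
size_t packet_bytes(int n)
{
    assert(n >= 0 && n <= 1000);
    return offsetof(struct datapacket, data) + sizeof(int) * (size_t)n;
}

/* Hypothetical helper: allocate only what n ints require.
   Returns NULL if malloc fails. */
struct datapacket *packet_new(int n)
{
    struct datapacket *pkt = malloc(packet_bytes(n));

    if (pkt != NULL)
        pkt->size = n;   /* only data[0] .. data[n-1] may be touched */
    return pkt;
}
```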
Ultimately, though, this is a common enough problem that the language definition ought to provide a solution. The new trick that's allowed under the current proposal has actually been around for several years [2], and will now receive official blessing. Syntactically it looks a lot like the previous examples: you declare an array at the end of your struct. The difference is that you don't specify the size of the array:
    #include <stdlib.h>

    struct datapacket
    {
        int size;
        int data[];
    };

    void f()
    {
        struct datapacket *pkt =
            malloc( sizeof(struct datapacket) + sizeof(int)*10 );
        /* further code . . . */
    }
This, of course, is invalid in Standard C as it exists today: you cannot have an array of unspecified size as a member of a struct. That's a big advantage over the old trick, because the old trick used a valid construct in a way that meant something different from its obvious meaning. The new syntax has no other meaning to be confused with, so you don't need to worry that some maintenance programmer will think that this array really has only one element in it.
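For completeness, here is a sketch of the new trick wrapped in a pair of helpers. It assumes a compiler that accepts the unsized-array syntax, and the names packet_new and packet_sum are invented for this illustration. Because the array contributes nothing to sizeof(struct datapacket), the allocation adds space for all n elements rather than n - 1.

```c
#include <assert.h>
#include <stdlib.h>

struct datapacket
{
    int size;
    int data[];   /* no size given: the array uses whatever was allocated past the header */
};

/* Hypothetical helper: allocate a packet holding n ints and fill
   data[i] with i. Returns NULL if malloc fails. */
struct datapacket *packet_new(int n)
{
    struct datapacket *pkt;

    assert(n >= 0);
    pkt = malloc(sizeof(struct datapacket) + sizeof(int) * (size_t)n);
    if (pkt != NULL)
    {
        int i;
        pkt->size = n;
        for (i = 0; i < n; ++i)
            pkt->data[i] = i;
    }
    return pkt;
}

/* Hypothetical helper: add up the stored values, using the size
   field as the only record of how long the array really is. */
int packet_sum(const struct datapacket *pkt)
{
    int total = 0;
    int i;

    for (i = 0; i < pkt->size; ++i)
        total += pkt->data[i];
    return total;
}
```

Note that the struct itself has to carry the element count, as size does here; nothing in the language remembers how much space was allocated for data.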
Notes
1. It declares an array of one int, and we added space for nine more.
2. All Borland compilers have allowed this, but only in C code and not in C++. I haven't investigated whether other compilers allow it, but I wouldn't be at all surprised if many of them do.
Pete Becker is Technical Project Leader for Dinkumware, Ltd. He spent eight years in the C++ group at Borland International, both as a developer and manager. He is a member of the ANSI/ISO C++ standardization committee. He can be reached by email at [email protected].