Kyle Dawkins works for New York-based consulting firm Central Park Software.
All object-oriented programming languages share a set of fundamental concepts that every developer has come to expect. Each language provides an abstraction of some kind of data and the code that acts on that data. Most provide the concept of classes and some notion of inheritance, and many also provide visibility levels for class or instance data. These concepts form a kind of "Universal Grammar" of object-oriented programming that allows a Python programmer to read some C# code and be able to get the gist. But often, what makes a language interesting is the way it offers features above and beyond these fundamental, shared concepts. In this article, we'll be taking a look at Objective-C, a language that takes the raw, unrestricted power of ANSI C and melds it with the elegance of Smalltalk, producing an amazingly dynamic, flexible (and fun) language. In fact, given the advanced nature of many of its properties, it may surprise you to learn that the language is pretty old.
Objective-C sprang into existence over 20 years ago, during the exciting personal-computer revolution of the 1980s. At the time, most programmers used procedural languages such as C, Fortran, and Pascal for the lion's share of engineering projects. The limitations of these languages began to show in large projects, especially with regard to the reuse of large portions of code. At the time, object-oriented programming was considered fairly new and quite academic, but object-oriented techniques were gradually filtering into the mainstream. Languages such as Smalltalk and Simula (and later, C++) were making waves with their new ideas of encapsulation and inheritance.
A programmer named Brad Cox, working for his own company, The Stepstone Corporation, saw that these powerful concepts could be applied to a language such as C. He was very familiar with both Smalltalk and C, so he set about grafting Smalltalk's unique syntax onto Standard C, and the result became what we now know as Objective-C.
The new language was first implemented as a front end to the UNIX cc compiler, running after the preprocessor and generating Standard C code that was then passed to cc. In the late 1980s, Steve Jobs's startup company, NeXT, licensed the Objective-C compiler from Stepstone. NeXT served as a kind of incubator for Objective-C; the language suited the needs of the company very well, and in return, NeXT engineers enhanced the language, extended the GCC compiler to support Objective-C, and built a framework that is still the de facto standard API for the language. In 1994, NeXT made the API public under the name "OpenStep," and in 1995, the company acquired all rights to the language from The Stepstone Corporation.
But even with all this activity, Objective-C was still very much a fringe language, used mainly in university computing labs and financial institutions, in part because of the extremely high price of the NeXT hardware. In 1997, however, everything changed when Apple purchased NeXT and used NeXT's technology, including Objective-C, to build the newest incarnation of the Macintosh operating system. Objective-C had found a new audience: Mac developers and hobbyists. Now, with the barrier to entry smashed aside, interest in Objective-C has been growing steadily. Apple provides the compiler and its entire set of rich development tools for free, so every Macintosh owner can dabble if he or she feels so inclined.
Note: Because Objective-C is now used so heavily for Mac OS X development and with the GNUStep libraries, all the examples shown use classes that are found in both. However, it is entirely possible to use Objective-C without the Apple or GNUStep Foundation classes.
The Language and the Runtime
One of the chief design goals of the language was to ensure that Objective-C could be intermingled within projects and files with C in order to fit well with existing C code. As a consequence of this, Objective-C is a strict superset of C, unlike C++. This means that any C code can be passed through the Objective-C compiler, and it will be compiled as expected.
Furthermore, the number of additions to Standard C syntax is minimal: There is some new syntax for declarations (both of classes and of objects) and one syntactic addition for all interaction with classes and instances. That's it! All the syntax changes are clearly delineated from Standard C syntax, making it easy to read the code and also easy for an IDE to manipulate.
First, let's look at how you would create a class in Objective-C. A class needs two things to be defined: an interface that describes the class (its methods, class and instance data, and so on) and an implementation, which is the actual code for the class. This conceptual division is nice because it very clearly illustrates the encapsulation properties of a class, and to make the division even clearer, it is common practice in Objective-C to place the interface in a .h header file, and the implementation in a .m code file. For example, Listing 1 and Listing 2 are the .h and .m files for a simple class, respectively.
First, note the #import statement. This statement works exactly like an #include statement, except that it won't include a file that's already been included, saving you a lot of #ifdefs.
Next, you can see the definition of the class itself, giving its name and its parent class in the form:
@interface Class : Parent
Following that come the declarations of instance variables. In this case, we have only one, isRude. Instance variables that are declared @public can be accessed from outside the object using the C pointer dereference operator ->, but this is generally frowned upon in favor of using accessor methods. Note the @private directive, which instructs the compiler that the isRude variable is private, meaning it can be accessed only by methods of Diner and not by those of its subclasses. There are two other levels of visibility: @protected (the default), and @public, which should be used only in extreme circumstances because it breaks encapsulation.
Next, you can see the declarations (in the .h file) and the definitions (in the .m file) of the instance methods. These all begin with a "-" sign. The standard format for an instance method is:
- (return type) method { ... method body ... }
The return type is optional, and if it's not specified, it defaults to (id), the Objective-C pseudotype meaning any Objective-C object.
Class methods are defined using exactly the same format, but with a "+" instead of a "-."
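To make the interface/implementation split concrete, here is a minimal sketch of what the two halves of such a class might look like. It is a hypothetical reconstruction, not the article's Listing 1 and Listing 2: only the Diner class name and the isRude variable come from the discussion above, while the accessor names and the maximumPartySize class method are assumptions made for illustration. It also assumes the Foundation classes and manual reference counting, and it places both halves in one file for brevity.

```objective-c
// --- Diner.h (hypothetical sketch) ---
#import <Foundation/Foundation.h>

@interface Diner : NSObject
{
@private
    BOOL isRude;                 // reachable only from Diner's own methods
}
+ (int) maximumPartySize;        // class method: note the "+" (assumed name)
- (BOOL) isRude;                 // accessor for reading the variable
- (void) setIsRude:(BOOL)flag;   // accessor for changing the variable
@end

// --- Diner.m (hypothetical sketch) ---
@implementation Diner
+ (int) maximumPartySize
{
    return 8;                    // arbitrary value chosen for the example
}

- (BOOL) isRude
{
    return isRude;
}

- (void) setIsRude:(BOOL)flag
{
    isRude = flag;
}
@end
```

A caller would create an instance with [[Diner alloc] init], change its state with [diner setIsRude:YES], and message the class itself with [Diner maximumPartySize].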
Messaging
Most object-oriented languages have syntax that enables the programmer to invoke methods on instances or classes directly. For example, C++ uses the "." operator:
MyObject.throwWeightAround();
This is a logical and consistent extension to the "." operator in Standard C, and makes a great deal of sense. Objective-C, however, takes a very different approach. Instead of calling methods on instances or classes, you send messages. This distinction is important, both conceptually and in practice, and it is one of the chief concepts of Objective-C that differentiates it from many other languages.
To invoke a method on an object, you send that object a message using the messaging syntax [receiver message]:
[MyObject throwWeightAround];
True, it doesn't seem to be that different from the C++ syntax, but what happens behind the scenes is important. The most crucial aspect of messaging, and one that differentiates Objective-C from languages such as Java, C++, and C, is that everything happens at runtime using dynamic binding. The decision of where to route a message is made entirely at the moment the message is sent, not by the compiler.
Moreover, as all messages are routed exclusively at runtime, it is trivial to route them to objects in other applications or on other computers; indeed, distributed messaging is built into the runtime framework, requiring very little work on the part of the programmer. This is in stark contrast to statically bound languages such as C++ and Java, which need to use stub classes and RPC libraries to achieve the same effect.
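A small sketch can show what deciding "at the moment the message is sent" looks like in practice. The Heavyweight and Featherweight classes and the askToThrowWeight function are hypothetical names invented for this example; respondsToSelector: is the standard Foundation method for asking an object whether it understands a message. The sketch assumes the Foundation classes are available.

```objective-c
#import <Foundation/Foundation.h>

// A class that understands the throwWeightAround message.
@interface Heavyweight : NSObject
- (NSString *) throwWeightAround;
@end

@implementation Heavyweight
- (NSString *) throwWeightAround
{
    return @"thud";
}
@end

// A class that does not.
@interface Featherweight : NSObject
@end

@implementation Featherweight
@end

// The receiver is typed as id, so the compiler knows nothing about its class.
// Whether the message can be delivered is checked when this code runs.
static NSString *askToThrowWeight(id receiver)
{
    if ([receiver respondsToSelector:@selector(throwWeightAround)]) {
        return [receiver throwWeightAround];
    }
    return nil;   // the receiver does not understand the message
}
```

Passing a Heavyweight instance yields the string "thud", while passing a Featherweight instance yields nil. The same function body handles both without any compile-time knowledge of the receiver's class.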
Overloading, and Selectors Versus Message Signatures
Many languages allow overloading of methods, where a method with a given name can be defined multiple times with different arguments, and the compiler chooses the correct version of the method based on the signature of the invocation. Objective-C, on the other hand, does not allow this at all, and the reason is very simple: Overloading promotes unreadable code. For example, what does this snippet of Java do?
if (t) {
    theCache.put(foo, bar, t);
} else {
    theCache.put(foo, bar);
}
It's not entirely obvious from the code because there are two invocations of the put() method. You'd need to either look up the documentation for the many incarnations of the put() method, or dredge through the source code (if you have it). Programmers working in languages such as C++, C#, or Java often rely on fancy IDEs that pop up lists of method names as you type, but this won't always help, especially if the method signatures are similar. For example, you might type:
theDate.se
and the IDE will helpfully pop up a list of methods that start with "se" on your object. And perhaps it lists the "set" methods like this:
theDate.
    set(Number)
    set(String, String)
    set(String, String, String)
    set(String, String, Number)
    set(String, Number, Number)
which won't really help at all; you have no idea what each argument actually is.
Objective-C instead solves this problem by combining arguments and method names into the concept of a selector. This roughly corresponds to a method signature in C++ or Java, but goes one step further in that the expected arguments to the method are embedded within the signature. For example, an Objective-C version of the previous Java code might read:
if (t) {
    [theCache setValue:bar forKey:foo withTimeout:t];
} else {
    [theCache setValue:bar forKey:foo];
}
Interleaving the argument labels with the arguments themselves in the selector encourages good programming practice: Objective-C often just reads like plain old English.
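Because the labels are part of the selector's name, the two messages in the snippet above are simply two different selectors rather than two overloads of one name. The following sketch makes that visible. The argumentCount helper is a hypothetical function written for this example; @selector and NSStringFromSelector are standard. It assumes the Foundation classes and, under manual reference counting, an autorelease pool in the calling code.

```objective-c
#import <Foundation/Foundation.h>

// Counts the arguments a selector expects. Each argument is introduced by a
// colon in the selector's name, so counting colons is enough.
static unsigned argumentCount(SEL selector)
{
    NSString *name = NSStringFromSelector(selector);
    unsigned count = 0;
    unsigned i;

    for (i = 0; i < [name length]; i++) {
        if ([name characterAtIndex:i] == ':') {
            count++;
        }
    }
    return count;
}
```

For instance, argumentCount(@selector(setValue:forKey:)) evaluates to 2 and argumentCount(@selector(setValue:forKey:withTimeout:)) evaluates to 3, and NSStringFromSelector returns the full labeled name in each case.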
Protocols
A key concept introduced into Objective-C during its incubation at NeXT was the idea of a "protocol." This is closely analogous to Java interfaces or to C++ abstract base classes made up of pure virtual methods. A protocol essentially comprises a set of methods that an object must implement in order to conform. You can then write consuming code that knows nothing about an object other than that it conforms to a given protocol.
In the example application, you can see a protocol in use. The Waiter and Busser classes both conform to the ServesTables protocol. The Busser class is shown in Listing 3, and the ServesTables protocol is shown in Listing 4. The full example application is available online at http://www.cuj.com/code/.
Protocols are used extensively throughout the entire Cocoa and GNUStep frameworks.
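As a rough illustration of the idea, the sketch below declares a protocol, a conforming class, and a function that relies on nothing but conformance. It is not the article's Listing 3 or Listing 4: the protocol and class names are borrowed from the text, but the serveTableNumber: method and the sendToTable function are assumptions invented for this example. It assumes the Foundation classes and an autorelease pool in the calling code.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical stand-in for the ServesTables protocol.
@protocol ServesTables
- (NSString *) serveTableNumber:(int)table;
@end

// Busser declares conformance in angle brackets after its parent class.
@interface Busser : NSObject <ServesTables>
@end

@implementation Busser
- (NSString *) serveTableNumber:(int)table
{
    return [NSString stringWithFormat:@"Busser clears table %d", table];
}
@end

// Consuming code: the parameter type says only "any object that conforms
// to ServesTables", with no mention of a concrete class.
static NSString *sendToTable(id <ServesTables> server, int table)
{
    return [server serveTableNumber:table];
}
```

Conformance can also be checked at runtime with [object conformsToProtocol:@protocol(ServesTables)], which is useful when an object arrives typed only as id.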
Categories
Buggy frameworks and libraries are a common programming problem. Often, these bugs are hard to find, and once they are found, you need to come up with workarounds that fix them until the library itself can be fixed. In statically bound languages, this often involves writing a wrapper around a method or methods, but this is problematic if the bug is in an instance method, particularly if you don't have access to the source code. The idea of "Categories" offers one possible solution to this problem.
The concept, as added to Objective-C by NeXT, is unusual among object-oriented languages, and demands a close look. Categories, like protocols, are families of methods, grouped together conveniently. However, categories also include the implementation of those methods and are grafted onto a class by the Objective-C runtime. This might not seem that important at first, but actually it's crucial. You are given the power to write methods and inject those methods into objects of another class, even if you don't have the source code to that class. You don't need to subclass anything: You can just tell the runtime to add a category to a class, and presto, all objects of that class (and any of its subclasses) now have those methods. The methods can even manipulate private data of the class they are attached to, and from the point of view of the runtime, they are absolutely no different from any other methods.
In the example program, we can see that the implementation of the Waiter class incorrectly inflates by $20 the amount on a check that the waiter gives to certain diners in the restaurant; see Example 1. If this were my restaurant, I would definitely want to fix this problem. If we didn't have access to the source code for the Waiter class, what could we do? Categories provide one possible (and very simple) solution.
Defining the interface and implementation of a category on a class is the same as declaring and implementing a class, except that instead of naming a parent class, you follow the class name with the category name in parentheses:
@interface FooClass (BarCategory) ... @end
So by simply implementing a category called Honesty on the Waiter class, we can fix the broken method:
// Honesty.m
#import "Honesty.h"

@implementation Waiter (Honesty)
- (double) bringCheckOfAmount: (double)amount toDiner: (Diner *)diner {
    NSLog(@"Waiter %@ has decided to be honest and not inflate the check", name);
    return amount;
}
@end
That is all that has to be done. If this is compiled and linked with the application, the runtime will find the category's method and use it instead of the implementation on the Waiter class:
Diner number 3 is having the Chicken
Waiter Walter has decided to be honest and not inflate the check
Also note that the new method has access to the instance variable name. In fact, it has access to all the instance variables, even the private ones.
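Categories are just as useful for adding a brand-new method to a class whose source you do not have. The sketch below grafts one onto NSString from the Foundation framework. The category name CPSShouting and the method name cps_shouted are hypothetical, chosen with a prefix to reduce the chance of clashing with existing methods; uppercaseString and stringByAppendingString: are standard NSString methods. It assumes an autorelease pool in the calling code.

```objective-c
#import <Foundation/Foundation.h>

// Interface of the category: class name, then category name in parentheses.
@interface NSString (CPSShouting)
- (NSString *) cps_shouted;
@end

// Implementation of the category. Once this is linked into the program,
// every NSString instance responds to cps_shouted.
@implementation NSString (CPSShouting)
- (NSString *) cps_shouted
{
    return [[self uppercaseString] stringByAppendingString:@"!"];
}
@end
```

With the category linked in, [@"check please" cps_shouted] returns @"CHECK PLEASE!" even though NSString itself was never subclassed or recompiled.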
Posing
The last major feature of the Objective-C runtime that we'll look at here is called "posing." It is a simple concept: You can tell the runtime that you would like it to use one class instead of another. From that point onward, even in code you didn't write, the runtime will use your class instead. There are some restrictions; your replacement class must be a direct subclass of the class it is replacing. For example, we could use this feature to deal with the problem in our example by creating a subclass of Waiter called HonestWaiter, and overriding the method that's dishonest:
// HonestWaiter.m
#import "HonestWaiter.h"

@implementation HonestWaiter
- (double) bringCheckOfAmount: (double)amount toDiner: (Diner *)diner {
    NSLog(@"Waiter %@ is an honest waiter", name);
    return amount;
}
@end
Then, all it takes is to simply tell the runtime to use your new class instead of the buggy Waiter class:
[HonestWaiter poseAsClass:[Waiter class]];
From that point onward, whenever the code attempts to instantiate a Waiter instance, it will actually be an HonestWaiter instance:
Diner number 3 is having the Chicken
Waiter Walter is an honest waiter
Walter gave the check to Diner number 1 asking for 30.00 dollars
These examples demonstrate both the flexibility of the runtime and its very simple API, but there is much more that is worth mentioning about Objective-C that is beyond the scope of this article, including exception handling, reflection, delegation of message-handling responsibility, distributed messaging, and other message-handling features.
Pitfalls
Objective-C is a powerful and elegant language, but like all languages, it has its pitfalls. Depending on your experience, you may be surprised by some of them when you first start to write Objective-C code.
Garbage Collection. For programmers of VM-based languages such as Java and C#, it often comes as a shock that Objective-C does not explicitly implement garbage collection in the expected way; it is left up to the programmer to free objects after they have been used. However, objects keep a reference count, and when that reference count reaches zero, the object is automatically deallocated. When you instantiate an object, it has a reference count of 1, and you must be careful to increment or decrement that as your references to it change. You do this using the [object retain] and [object release] messages. For example, it is common to write a setter method like this:
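- (void) setCheese:(Cheese *)value {
    [value retain];
    [cheese release];
    cheese = value;
}
so that the system increments the reference count of the new cheese, and then decrements the reference count of a cheese object that it might already have. Retaining before releasing matters when the new value is the very object already held: releasing first could drop its reference count to zero and deallocate it before it is retained again.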
Numerous more-powerful methods of automatic release of objects have been developed over the years, including the Cocoa system of "Auto-Release Pools." This takes some of the effort out of the object management, but it's still not quite as trivial as it is in the higher-level languages.
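The reference-counting rules can be watched in action with a small sketch. The Sandwich class and the deallocatedCheeses counter are hypothetical and exist only for this example; the Cheese class name and the setCheese: method come from the discussion above. The setter retains the new value before releasing the old one, so that passing in the object already held is safe. The sketch assumes the Foundation classes and manual reference counting, as the article does.

```objective-c
#import <Foundation/Foundation.h>

static int deallocatedCheeses = 0;   // counts Cheese deallocations for the demo

@interface Cheese : NSObject
@end

@implementation Cheese
- (void) dealloc
{
    deallocatedCheeses++;            // runs when the reference count hits zero
    [super dealloc];
}
@end

@interface Sandwich : NSObject
{
    Cheese *cheese;
}
- (void) setCheese:(Cheese *)value;
@end

@implementation Sandwich
- (void) setCheese:(Cheese *)value
{
    [value retain];                  // claim the new object first
    [cheese release];                // then give up the old one (no-op if nil)
    cheese = value;
}

- (void) dealloc
{
    [cheese release];                // give up our reference on the way out
    [super dealloc];
}
@end
```

A typical sequence is to create a Cheese (count 1), hand it to a Sandwich with setCheese: (count 2), and release the original reference (count 1). The cheese then stays alive until the sandwich itself is released, at which point the counter above becomes 1.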
No operator overloading. Operator overloading is a popular feature of some languages, from C++ to Perl, but it does not exist in any way at all in Objective-C. The jury is still out as to whether this is a bad thing, because although operator overloading hides the implementation of basic operations on objects, it also can tempt the programmer into creating unclear code:
theDate = theDate + 5;
is harder to read than
theDate = [theDate dateByAddingDays:5];
No enforced namespaces. Objective-C doesn't have the concept of a namespace, the way Java has packages and Perl has modules, for example. It is left entirely up to you to make sure your class names don't clash with any others. To make namespace clashes infrequent, most developers adopt consistent prefixes for all their related code: Apple uses "NS" for all its Cocoa classes (remember, Cocoa evolved from NeXTStep).
No multiple inheritance. Although this would be expected by Java developers, programmers versed in C++ might be surprised by this. But multiple inheritance is a language feature that is hotly debated, and the designers of Objective-C have chosen not to implement it. Although it has potential uses, it can also create some extremely convoluted class hierarchies, and consequently some very tricky bugs.
Summary
Objective-C comes from a rich engineering background, and was created specifically to deal with problems of encapsulation and code reuse. Although both the language and the runtime have features that are unique and unfamiliar, it won't be long before an Objective-C rookie is reveling in the freedom the language brings to development. Moreover, after becoming accustomed to many of its features, developers often find it difficult to use any other language.
Those of you who are interested in trying it out can find a rich community of developers and resources available for your every question, whether you're using Cocoa on Mac OS X or GNUStep on Windows or Linux. All of the tools are available for free.
Further Information
Apple's Cocoa: http://developer.apple.com/documentation/Cocoa/Conceptual/ObjectiveC/index.html.
GNUStep: http://www.gnustep.org/developers/documentation.html.
CUJ