The creator of Perl talks about language design and Perl
In both language size and popularity, the Practical Extraction and Reporting Language (Perl) is probably the biggest of the little languages. While its text- and file-processing capabilities have made it the language of choice for system administrators and web developers, Perl's virtues as a rapid prototyping tool and full-fledged application language have become apparent as well.
Given Perl's idiomatic nature, it isn't surprising that Larry Wall, Perl's creator, has a background in linguistics. Larry is no stranger to useful, free software, having written and given away programs like metaconfig, rn, and patch. His active involvement and ever-present sense of humor -- which permeates his newsgroup posts, source code, and Programming Perl (the book he coauthored with Randal Schwartz) -- is largely responsible for the community that has grown and flourished around the language.
A recipient of Dr. Dobb's 1996 Excellence in Programming Award and currently a developer at O'Reilly & Associates, Larry recently sat down with DDJ technical editor Eugene Eric Kim to discuss the state of Perl, scripting language design, and the importance of culture to a language.
DDJ: What's new in the upcoming version 5.005 of Perl?
LW: Well, other than the usual round of bug fixes, there will be support for threading and compilation down to C code. Those are the main things.
DDJ: Is Perl 5.005 what you envisioned Perl to be when you set out to do it?
LW: That assumes that I'm smart enough to envision something as complicated as Perl. I knew that Perl would be good at some things, and would be good at more things as time went on. So, in a sense, I'm sort of blessed with natural stupidity -- as opposed to artificial intelligence -- in the sense that I know what my intellectual limits are.
I'm not one of these people who can sit down and design an entire system from scratch and figure out how everything relates to everything else, so I knew from the start that I had to take the bear-of-very-little-brain approach, and design the thing to evolve. But that fit in with my background in linguistics, because natural languages evolve over time.
You can apply biological metaphors to languages. They move into niches, and as new needs arise, languages change over time. It's actually a practical way to design a computer language. Not all computer programs can be designed that way, but I think more can be designed that way than have been. A lot of the majestic failures that have occurred in computer science have been because people thought they could design the whole thing in advance.
DDJ: How do you design a language to evolve?
LW: There are several aspects to that, depending on whether you are talking about syntax or semantics. On a syntactic level, in the particular case of Perl, I placed variable names in a separate namespace from reserved words. That's one of the reasons there are funny characters on the front of variable names -- dollar signs and so forth. That allowed me to add new reserved words without breaking old programs.
DDJ: What is a scripting language? Does Perl fall into the category of a scripting language?
LW: Well, being a linguist, I tend to go back to the etymological meanings of "script" and "program," though, of course, that's fallacious in terms of what they mean nowadays. A script is what you hand to the actors, and a program is what you hand to the audience. Now hopefully, the program is already locked in by the time you hand that out, whereas the script is something you can tinker with. I think of phrases like "following the script," or "breaking from the script." The notion that you can evolve your script ties into the notion of rapid prototyping.
A script is something that is easy to tweak, and a program is something that is locked in. There are all sorts of metaphorical tie-ins that tend to make programs static and scripts dynamic, but of course, it's a continuum. You can write Perl programs, and you can write C scripts. People do talk more about Perl programs than C scripts. Maybe that just means Perl is more versatile.
DDJ: So, you don't think it's necessarily a language restriction? A language isn't necessarily a scripting language.
LW: Well, the typical usage is that you call something a scripting language if it's interpreted.
DDJ: Where's the line between an interpreted and a compiled language?
LW: There isn't a fixed one. You can take an interpreter and burn it into firmware, you can build a chip that interprets, and then, do you have a compiled language? You can take a language like C and build an interpreter for it -- people have. There really isn't a clear boundary between the two other than that interpreters do tend to defer decisions until run time -- late binding, if you will -- and compilers tend to make decisions earlier.
DDJ: Would that be a better distinction than interpreted versus compiled -- run-time versus compile-time binding?
LW: It's a more useful distinction in many ways because, with late-binding languages like Perl or Java, you cannot make up your mind about what the real meaning of it is until the last moment. But there are different definitions of what the last moment is. Computer scientists would say there are really different "latenesses" of binding.
A good language actually gives you a range, a wide dynamic range, of your level of discipline. We're starting to move in that direction with Perl. The initial Perl was lackadaisical about requiring things to be defined or declared or what have you. Perl 5 has some declarations that you can use if you want to increase your level of discipline. But it's optional. So you can say "use strict," or you can turn on warnings, or you can do various sorts of declarations.
DDJ: Would it be accurate to say that Perl doesn't enforce good design?
LW: No, it does not. It tries to give you some tools to help if you want to do that, but I'm a firm believer that a language -- whether it's a natural language or a computer language -- ought to be an amoral artistic medium.
You can write pretty poems or you can write ugly poems, but that doesn't say whether English is pretty or ugly. So, while I kind of like to see beautiful computer programs, I don't think the chief virtue of a language is beauty. That's like asking an artist whether they use beautiful paints and a beautiful canvas and a beautiful palette. A language should be a medium of expression, which does not restrict your feeling unless you ask it to.
DDJ: Where does the beauty of a program lie? In the underlying algorithms, in the syntax of the description?
LW: Well, there are many different definitions of artistic beauty. It can be argued that it's symmetry, which in a computer language might be considered orthogonality. It's also been argued that broken symmetry is what is considered most beautiful and most artistic and diverse. Symmetry breaking is the root of our whole universe according to physicists, so if God is an artist, then maybe that's his definition of what beauty is.
This actually ties back in with the built-to-evolve concept on the semantic level. A lot of computer languages were defined to be naturally orthogonal, or at least the computer scientists who designed them were giving lip service to orthogonality. And that's all very well if you're trying to define a position in a space. But that's not how people think. It's not how natural languages work. Natural languages are not orthogonal, they're diagonal. They give you hypotenuses.
Suppose you're flying from California to Quebec. You don't fly due east, and take a left turn over Nashville, and then go due north. You fly straight, more or less, from here to there. And it's a network. And it's actually sort of a fractal network, where your big link is straight, and you have little "fractally" things at the end for your taxi and bicycle and whatever the mode of transport you use. Languages work the same way. And they're designed to get you most of the way here, and then have ways of refining the additional shades of meaning.
When they first built the University of California at Irvine campus, they just put the buildings in. They did not put any sidewalks, they just planted grass. The next year, they came back and built the sidewalks where the trails were in the grass. Perl is that kind of a language. It is not designed from first principles. Perl is those sidewalks in the grass. Those trails that were there before were the previous computer languages that Perl has borrowed ideas from. And Perl has unashamedly borrowed ideas from many, many different languages. Those paths can go diagonally. We want shortcuts. Sometimes we want to be able to do the orthogonal thing, so Perl generally allows the orthogonal approach also. But it also allows a certain number of shortcuts, and being able to insert those shortcuts is part of that evolutionary thing.
I don't want to claim that this is the only way to design a computer language, or that everyone is going to actually enjoy a computer language that is designed in this way. Obviously, some people speak other languages. But Perl was an experiment in trying to come up with not a large language -- not as large as English -- but a medium-sized language, and to try to see if, by adding certain kinds of complexity from natural language, the expressiveness of the language grew faster than the pain of using it. And, by and large, I think that experiment has been successful.
DDJ: Give an example of one of the things you think is expressive about Perl that you wouldn't find in other languages.
LW: The fact that regular-expression parsing and the use of regular expressions is built right into the language. If you used the regular expression in a list context, it will pass back a list of the various subexpressions that it matched. A different computer language may add regular expressions, even have a module that's called Perl 5 regular expressions, but it won't be integrated into the language. You'll have to jump through an extra hoop, take that right angle turn, in order to say, "Okay, well here, now apply the regular expression, now let's pull the things out of the regular expression," rather than being able to use the thing in a particular context and have it do something meaningful.
The school of linguistics I happened to come up through is called tagmemics, and it makes a big deal about context. In a real language -- this is a tagmemic idea -- you can distinguish between what the conventional meaning of the "thing" is and how it's being used. You think of "dog" primarily as a noun, but you can use it as a verb. That's the prototypical example, but the "thing" applies at many different levels. You think of a sentence as a sentence. Transformational grammar was built on the notion of analyzing a sentence. And they had all their cute rules, and they eventually ended up throwing most of them back out again.
But in the tagmemic view, you can take a sentence as a unit and use it differently. You can say a sentence like, "I don't like your I-can-use-anything-like-a-sentence attitude." There, I've used the sentence as an adjective. The sentence isn't an adjective if you analyze it, any way you want to analyze it. But this is the way people think. If there's a way to make sense of something in a particular context, they'll do so. And Perl is just trying to make those things make sense. There's the basic distinction in Perl between singular and plural context -- call it list context and scalar context, if you will. But you can use a particular construct in a singular context that has one meaning that sort of makes sense using the list context, and it may have a different meaning that makes sense in the plural context.
That is where the expressiveness comes from. In English, you read essays by people who say, "Well, how does this metaphor thing work?" Owen Barfield talks about this. You say one thing and mean another. That's how metaphors arise. Or you take two things and jam them together. I think it was Owen Barfield, or maybe it was C.S. Lewis, who talked about "a piercing sweetness." And we know what "piercing" is, and we know what "sweetness" is, but you put those two together, and you've created a new meaning. And that's how languages ought to work.
DDJ: Is a more expressive language more difficult to learn?
LW: Yes. It was a conscious tradeoff at the beginning of Perl that it would be more difficult to master the whole language. However, taking another clue from a natural language, we do not require 5-year olds to speak with the same diction as 50-year olds. It is okay for you to use the subset of a language that you are comfortable with, and to learn as you go. This is not true of so many computer-science languages. If you program C++ in a subset that corresponds to C, you get laughed out of the office.
There's a whole subject that we haven't touched here. A language is not a set of syntax rules. It is not just a set of semantics. It's the entire culture surrounding the language itself. So part of the cultural context in which you analyze a language includes all the personalities and people involved -- how everybody sees the language, how they propagate the language to other people, how it gets taught, the attitudes of people who are helping each other learn the language -- all of this goes into the pot of context.
Because I had already put out other freeware projects (rn and patch), I realized before I ever wrote Perl that a great deal of the value of those things was from collaboration. Many of the really good ideas in rn and Perl came from other people.
I think that Perl is in its adolescence right now. There are places where it is grown up, and places where it's still throwing tantrums. I have a couple of teenagers, and the thing you notice about teenagers is that they're always plus or minus ten years from their real age. So if you've got a 15-year old, they're either acting 25 or they're acting 5. Sometimes simultaneously! And Perl is a little that way, but that's okay.
DDJ: What part of Perl isn't quite grown up?
LW: Well, I think that the part of Perl, which has not been realistic up until now has been on the order of how you enable people in certain business situations to actually use it properly. There are a lot of people who cannot use freeware because it is, you know, schlocky. Their bosses won't let them, their government won't let them, or they think their government won't let them. There are a lot of people who, unknown to their bosses or their government, are using Perl.
DDJ: So these aren't technical issues.
LW: I suppose it depends on how you define technology. Some of it is perceptions, some of it is business models, and things like that. I'm trying to generate a new symbiosis between the commercial and the freeware interests. I think there's an artificial dividing line between those groups and that they could be more collaborative.
As a linguist, the generation of a linguistic culture is a technical issue. So, these adjustments we might make in people's attitudes toward commercial operations or in how Perl is being supported, distributed, advertised, and marketed -- not in terms of trying to make bucks, but just how we propagate the culture -- these are technical ideas in the psychological and the linguistic sense. They are, of course, not technical in the computer-science sense. But I think that's where Perl has really excelled -- its growth has not been driven solely by technical merits.
DDJ: What are the things that you do when you set out to create a culture around the software that you write?
LW: In the beginning, I just tried to help everybody. Particularly being on USENET. You know, there are even some sneaky things in there -- like looking for people's Perl questions in many different newsgroups. For a long time, I resisted creating a newsgroup for Perl, specifically because I did not want it to be ghettoized. You know, if someone can say, "Oh, this is a discussion about Perl, take it over to the Perl newsgroup," then they shut off the discussion in the shell newsgroup. If there are only the shell newsgroups, and someone says, "Oh, by the way, in Perl, you can solve it like this," that's free advertising. So, it's fuzzy. We had proposed Perl as a newsgroup probably a year or two before we actually created it. It eventually came to the point where the time was right for it, and we did that.
DDJ: Perl has really been pigeonholed as a language of the Web. One result is that people mistakenly try to compare Perl to Java. Why do you think people make the comparison in the first place? Is there anything to compare?
LW: Well, people always compare everything.
DDJ: Do you agree that Perl has been pigeonholed?
LW: Yes, but I'm not sure that it bothers me. Before it was pigeonholed as a web language, it was pigeonholed as a system-administration language, and I think that -- this goes counter to what I was saying earlier about marketing Perl -- if the abilities are there to do a particular job, there will be somebody there to apply it, generally speaking. So I'm not too worried about Perl moving into new ecological niches, as long as it has the capability of surviving in there.
Perl is actually a scrappy language for surviving in a particular ecological niche. (Can you tell I like biological metaphors?) You've got to understand that it first went up against C and against shell, both of which were much loved in the UNIX community, and it succeeded against them. So that early competition actually makes it quite a fit competitor in many other realms, too.
For most web applications, Perl is severely underutilized. Your typical CGI script says print, print, print, print, print, print, print. But in a sense, it's the dynamic range of Perl that allows for that. You don't have to say a whole lot to write a simple Perl script, whereas your minimal Java program is, you know, eight or ten lines long anyway. Many of the features that made it competitive in the UNIX space will make it competitive in other spaces.
Now, there are things that Perl can't do. One of the things that you can't do with Perl right now is compile it down to Java bytecode. And if that, in the long run, becomes a large ecological niche (and this is not yet a sure thing), then that is a capability I want to be certain that Perl has.
DDJ: There's been a movement to merge the two development paths between the ActiveWare Perl for Windows and the main distribution of Perl. You were talking about ecological niches earlier, and how Perl started off as a text-processing language. The scripting languages that are dominant on the Microsoft platforms -- like VB -- tend to be more visual than textual. Given Perl's UNIX origins -- awk, sed, and C, for that matter -- do you think that Perl, as it currently stands, has the tools to fit into a Windows niche?
LW: Yes and no. It depends on your problem domain and who's trying to solve the problem. There are problems that only need a textual solution or don't need a visual solution. Automation things of certain sorts don't need to interact with the desktop, so for those sorts of things -- and for the programmers who aren't really all that interested in visual programming -- it's already good for that. And people are already using it for that. Certainly, there is a group of people who would be enabled to use Perl if it had more of a visual interface, and one of the things we're talking about doing for the O'Reilly NT Perl Resource Kit is some sort of a visual interface.
A lot of what Windows is designed to do is to get mere mortals from 0 to 60, and there are some people who want to get from 60 to 100. We are not really interested in being in Microsoft's crosshairs. We're not actually interested in competing head-to-head with Visual Basic, and to the extent that we do compete with them, it's going to be kind of subtle. There has to be some way to get people from the slow lane to the fast lane. It's one thing to give them a way to get from 60 to 100, but if they have to spin out to get from the slow lane to the fast lane, then that's not going to work either.
Over the years, much of the work of making Perl work for people has been in designing ways for people to come to Perl. I actually delayed the first version of Perl for a couple of months until I had a sed-to-Perl and an awk-to-Perl translator. One of the benefits of borrowing features from various other languages is that those subsets of Perl that use those features are familiar to people coming from that other culture. What would be best, in my book, is if someone had a way of saying, "Well, I've got this thing in Visual Basic. Now, can I just rewrite some of these things in Perl?"
We're already doing this with Java. On our UNIX Perl Resource Kit, I've got a hybrid language called "jpl" -- that's partly a pun on my old alma mater, Jet Propulsion Laboratory, and partly for Java, Perl...Lingo, there we go! That's good. "Java Perl Lingo." You've heard it first here! jpl lets you take a Java program and magically turn one of the methods into a chunk of Perl right there inline. It turns Perl code into a native method, and automates the linkage so that when you pull in the Java code, it also pulls in the Perl code, and the interpreter, and everything else. It's actually calling out from Java's Virtual Machine into Perl's virtual machine. And we can call in the other direction, too. You can embed Java in Perl, except that there's a bug in JDK having to do with threads that prevents us from doing any I/O. But that's Java's problem.
It's a way of letting somebody evolve from a purely Java solution into, at least partly, a Perl solution. It's important not only to make Perl evolve, but to make it so that people can evolve their own programs. It's how I program, and I think a lot of people program that way. Most of us are too stupid to know what we want at the beginning.
DDJ: Is there hope down the line to present Perl to a standardization body?
LW: Well, I have said in jest that people will be free to standardize Perl when I'm dead. There may come a time when that is the right thing to do, but it doesn't seem appropriate yet.
DDJ: When would that time be?
LW: Oh, maybe when the federal government declares that we can't export Perl unless it's standardized or something.
DDJ: Only when you're forced to, basically.
LW: Yeah. To me, once things get to a standards body, it's not very interesting anymore. The most efficient form of government is a benevolent dictatorship. I remember walking into some BOF that USENIX held six or seven years ago, and John Quarterman was running it, and he saw me sneak in, sit in the back corner, and he said, "Oh, here comes Larry Wall! He's a standards committee all of his own!"
A great deal of the success of Perl so far has been based on some of my own idiosyncrasies. And I recognize that they are idiosyncrasies, and I try to let people argue me out of them whenever appropriate. But there are still ways of looking at things that I seem to do differently than anybody else. It may well be that perl5-porters will one day degenerate into a standards committee. So far, I have not abused my authority to the point that people have written me off, and so I am still allowed to exercise a certain amount of absolute power over the Perl core.
I just think headless standards committees tend to reduce everything to mush. There is a conservatism that committees have that individuals don't, and there are times when you want to have that conservatism and times you don't. I try to exercise my authority where we don't want that conservatism. And I try not to exercise it at other times.
DDJ: How did you get involved in computer science? You're a linguist by background?
LW: Because I talk to computer scientists more than I talk to linguists, I wear the linguistics mantle more than I wear the computer-science mantle, but they actually came along in parallel, and I'm probably a 50/50 hybrid. You know, basically, I'm no good at either linguistics or computer science.
DDJ: So you took computer-science courses in college?
LW: In college, yeah. In college, I had various majors, but what I eventually graduated in -- I'm one of those people that packed four years into eight -- what I eventually graduated in was a self-constructed major, and it was Natural and Artificial Languages, which seems positively prescient considering where I ended up.
DDJ: When did you join O'Reilly as a salaried employee? And how did that come about?
LW: A year-and-a-half ago. It was partly because my previous job was kind of winding down.
DDJ: What was your previous job?
LW: I was working for Seagate Software. They were shutting down that branch of operations there. So, I was just starting to look around a little bit, and Tim noticed me looking around and said, "Well, you know, I've wanted to hire you for a long time," so we talked. And Gina Blaber (O'Reilly's software director) and I met. So, they more or less offered to pay me to mess around with Perl.
So it's sort of my dream job. I get to work from home, and if I feel like taking a nap in the afternoon, I can take a nap in the afternoon and work all night.
DDJ: Do you have any final comments, or tips for aspiring programmers? Or aspiring Perl programmers?
LW: Assume that your first idea is wrong, and try to think through the various options. I think that the biggest mistake people make is latching onto the first idea that comes to them and trying to do that. It really comes to a thing that my folks taught me about money. Don't buy something unless you've wanted it three times. Similarly, don't throw in a feature when you first think of it. Think if there's a way to generalize it, think if it should be generalized. Sometimes you can generalize things too much. I think like the things in Scheme were generalized too much. There is a level of abstraction beyond which people don't want to go. Take a good look at what you want to do, and try to come up with the long-term lazy way, not the short-term lazy way.
DDJ
Copyright © 1998, Dr. Dobb's Journal