This month I thought I'd share my
experience in looking for a good HTML book. In addition, there's the third
installment in my survey of alternative programming paradigms, or "Languages
That Are Not C."
I got roped into a project to develop a one-day, intensive course in HTML for raw
beginners. My task was to find a text for the course, a book on which we could
base the lectures. I suspected right from the start that this was either
impossible or unwise. Any book that could be covered in a day would have to be
pretty skimpy. A book that we could give the students for further study, sure,
that might make sense, but a text for the course? I was skeptical.
Over a dozen books later, my suspicion had turned into a conviction. No book was
going to work. I made the obvious proposal: that we write the class materials
from scratch and not try to cut corners. Having done the research, though, I
found that I was in a good position to recommend HTML books to the students. Or
maybe to others. Because it occurred to me that, while some of the books I read
were written for the raw beginner, some assume a level of sophistication that you
would only find--well, among readers of this magazine.
So. If you sometimes get questions from people hoping to write HTML, if you
occasionally write or expect to write HTML, if you have responsibility for
maintaining a Web site or for maintaining in-over-their-heads Web site
maintainers, or if you have noticed that everybody in the world seems to want to
put up a Web page and you figure you might as well put together a course or
seminar and make a buck off this madness, here's my take on the HTML books I
examined.
Although I read more, I've whittled the list down to five books. There are
probably good books I've left off the list, but every book on the list is worth
buying. Some of these books were first published in 1994, which makes them
ancient by Web standards. They're all good enough that their publishers should be
updating them regularly, but I haven't tried to project which ones will have new
1996 editions out by the time you read this.
HTML Manual of Style (Ziff-Davis Press, 1994, ISBN 1-56276-300-8) is Larry
Aronson's first book, and he's to be commended. The book is short (132 pages),
simple, and clearly organized. By the end of the first 30 pages, he's introduced
and exemplified most of the vocabulary of HTML 2.0.
Aronson gets purity points for recommending that links be incorporated into the
flow of a paragraph rather than laid out in lists or detached from context
("Click HERE for my resume"). In this way he honors the hypertext
intent of the Web. Incorporating links into the flow of the text works great for
budding hyperfiction writers and researchers trying to smooth the footnote bumps
out of their reports. I imagine, though, that we're going to see more and more
violations of this design ideal as more and more people bend the Web to uses for
which it was not originally intended. Aronson has good advice on the quirks of
particular browsers, especially in how they handle partial URLs.
HTML Manual of Style lacks some reference material that I'd like to see.
There's no complete reference on URLs, nothing on CGI or multimedia or server
issues. And the HTML tag reference could be more complete; it lists, but doesn't
explain, the values for tag attributes.
Aronson uses real Web pages as examples and does real critiques on them. There
are two schools of thought on examples: Some people think that only made-up
examples can get across their pedagogical points, while others think that
real-world examples are the best way to point out real problems. I'm with the
latter group. Maybe it's because I get a perverse pleasure out of watching real
people's work being picked apart. Aronson isn't cruel, but he is honest. He cites
errors and evaluates their seriousness. Each of the pages he's chosen exemplifies
some virtue, so most of what he has to say is positive (and useful). One of the
best pages he presents is by John December. The first 30 pages of this book are
the closest I got in my research to what I was looking for: a truly short course
in HTML. The rest of this book is style advice and reference. A clear,
well-organized HTML manual.
John December and Neil Randall have produced a huge book entitled The World
Wide Web Unleashed (Sams Publishing, 1994, ISBN 0-672-30617-4). It attempts
to cover everything about the Web: how to connect to the Web as a user, reviews
of browsers, a tour of Web sites, HTML, and the future of electronic commerce.
The book is 1058 pages long; nearly 100 pages of that is appendices, and these
are nearly all links to useful sites. These sites are the most important part of
the book. John December, well known on the Web for his lists of resources and
informed commentary, wrote the HTML and Web-development sections. Randall and
several others joined December in writing the less technical material.
I guess I'll give the authors purity points for wondering why Web browsers make
it possible to print out pages. Larry Aronson knows one answer that should be
obvious to December and Randall: so that you can produce decent examples of Web
pages in a book on Web-page design.
HTML is covered in the broader context of designing Web pages. It's not concise,
but it's solid. There is nothing substantial on CGI or multimedia, and there is
no comprehensive URL reference--surprising in a book this fat.
The book is way fatter than it needs to be: The authors are not concise, the very
useful links in the appendices belong on a disk or a disc or a Web site, and some of
the chapters are merely informed or interesting speculation. Nevertheless, there
is a lot of good information here, and if you are your company's Web guru, you
probably will appreciate having it on your shelf.
Mary E.S. Morris takes a no-nonsense approach to her subject. Her HTML for Fun
and Profit (Prentice Hall, 1995, ISBN 0-13-359290-1) jumps right into the
details of setting up a server. By the time she gets through with that, she has
probably weeded out the people who just want to mark up documents and has
narrowed down her readership to people capable of and willing to take on the
whole process of creating and publishing Web pages.
This 264-page book has more information on the specific topic of relative URLs
than any book I've looked at. I'd picked this topic deliberately as one benchmark
of completeness. The coverage Morris provides would be very useful if you were
setting up a site that you might later need to move to another machine, or that
you might want to mirror.
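
To see why relative URLs matter when a site moves or is mirrored, here is a minimal sketch in Python using the standard urllib.parse module. The host names, paths, and the resolve_all helper are all made up for illustration; nothing here comes from Morris's book.

```python
from urllib.parse import urljoin

# Hypothetical locations of the same page before and after a move.
OLD_BASE = "http://www.example.com/docs/guide/index.html"
NEW_BASE = "http://mirror.example.org/pub/www/guide/index.html"

# Links as they might appear inside index.html (illustrative only).
RELATIVE_LINKS = ["chapter1.html", "../images/logo.gif"]
ABSOLUTE_LINK = "http://www.example.com/docs/guide/chapter1.html"


def resolve_all(base):
    """Resolve each relative link against the URL of the page containing it."""
    return [urljoin(base, link) for link in RELATIVE_LINKS]


old_links = resolve_all(OLD_BASE)
new_links = resolve_all(NEW_BASE)

# An absolute link ignores the base, so it still points at the old machine.
moved_absolute = urljoin(NEW_BASE, ABSOLUTE_LINK)

print(old_links)
print(new_links)
print(moved_absolute)
```

The relative links follow the page to its new home without being edited, while the absolute link has to be found and rewritten by hand.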
Morris's coverage of server includes and CGI scripting is strong. She gives a
good introduction for UNIX, Mac, and NT systems and provides a number of useful
Perl scripts. As for HTML specifically, the coverage is good. The reference table
on HTML tags doesn't indicate permissible nesting, which some of the other books
are clear about. Her discussion of forms is clear and seems exhaustive. This is a
good book for people setting up and running Web servers.
Teach Yourself Web Publishing with HTML in a Week (Sams Publishing, 1995,
ISBN 0-672-30667-0) is the first of two books on HTML by Laura LeMay. The second,
which I haven't seen, is Teach Yourself More Web Publishing with HTML in a
Week. Now, here is a book (two, actually) obviously designed for a course.
And, equally obviously, not for a one-day course. LeMay's 403-page book
implicitly makes the case that the one-day intensive course is a bad idea. If it
takes her a week (or two) and she's bragging about it, well.... I found several
of the things that I was looking for specifically. Her HTML reference indicates
the permissible nesting of tags, and hers is the only book that contained what I
considered a sufficiently thoughtful discussion of the pros and cons of lumping
apparently separate pages into one file, linked using named anchors.
LeMay gives a lot of design advice, including a discussion of storyboarding. Her
URL reference includes the specification for the URL for nonanonymous ftp, which
is rare in HTML books. She has good advice on URLs, such as when not to use File
URLs. She gives an overview of CGI scripting and imagemaps. Possibly she gets
into these subjects more deeply in week two.
LeMay is of the made-up-examples school. It works well for her. This is a good
course book and a good reference, although I don't think I'd call it "the
most complete HTML reference I have seen," as the cover blurb does.
It's probably foolish to talk about "the most complete HTML reference I have
seen," since I'm sure I'll see three or four more by the time this column
sees print. Nonetheless, if I were to nominate a most complete, it would probably
be Ian Graham's HTML Sourcebook (John Wiley & Sons, 1995, ISBN
0-471-11849-4). This 416-page book is an introduction to, and reference for,
HTML, URLs, HTTP, and CGI scripting.
The HTML coverage is clear and readable. I found most of the things I was looking
for. The coverage of URLs is the best I've seen. Like LeMay, Graham gives the URL
for nonanonymous ftp; his discussion of Gopher URLs suggests, as the other books
don't, that you might actually want to support the Gopher protocol. He also
discusses the rlogin URL, personal directories using the tilde character, and
fragment reference using the # character. His is the only book in which I
could find out what to do with a filename that includes a forward slash.
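
The URL forms mentioned above can be picked apart with a short Python sketch using the standard urllib.parse module. The user name, password, hosts, and file names are invented. The column doesn't say what Graham's answer to the forward-slash question is; the sketch assumes the usual percent-encoding rule from the URL specification, under which a literal slash in a name is written %2F.

```python
from urllib.parse import quote, urlsplit

# Nonanonymous ftp: the user name and password ride along in the URL.
# (Hypothetical account and host.)
ftp = urlsplit("ftp://jdoe:secret@ftp.example.com/pub/report.txt")

# A personal directory (tilde) plus a fragment reference (#).
page = urlsplit("http://www.example.com/~jdoe/notes.html#section2")

# A file whose name really contains a slash: escape the slash so it
# is not read as a directory separator. safe="" tells quote() not to
# leave "/" alone, which is its default behavior.
encoded_name = quote("a/b.txt", safe="")

print(ftp.username, ftp.password, ftp.hostname, ftp.path)
print(page.path, page.fragment)
print(encoded_name)
```

The fragment is split off from the path because it names a spot inside the document, and the browser, not the server, acts on it.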
The discussion of CGI scripting is mostly an overview with links to tools. There
is a good chapter on the HTTP protocol, and an appendix on MIME.
The rest of this column is the third installment in my ongoing look at
alternative programming paradigms.
These tend to be embodied in little languages, sometimes in the Jon Bentley sense
and sometimes in the sense of a pared-down, single-author implementation of a
paradigm that typically tromps a bigger footprint on the disk. Tiny Ada, as it
were. As it is, in fact, although not this month. What I'm looking for when I
look at these little languages is what makes the paradigm distinctive, and the
state of its health. My interest is not in the merits of the paradigms or of the
implementation so much as in their distinctiveness and their chances for survival
in the Darwinian struggle. I confess that this comes out of a blind, a priori
faith in the value of diversity. Save the paradigmatic rain forest, that's my
motto.
Roy Ward ([email protected]) has written an interesting little language
called "ReWrite." It runs only on the Macintosh, requiring at least a
68020 and System 7.0, and will run in emulation on a PowerMac. It is a compiled
language, and the ReWrite compiler is written in ReWrite. ReWrite is interesting
specifically as a testbed for exploring the rewrite-rule programming paradigm.
Programming using rewrite-rule syntax is usually encountered in functional
languages like Haskell, ML, Miranda, Clean, or in the functional mode of
Mathematica. When you use ReWrite, you feel as though you're using one of these
functional languages. You define functions with rewrite rules. Example 1(a) is the definition of the factorial
function in ReWrite. The basic syntax is a list of rules specifying a
transformation; see Example 1(b). You can add
conditions (or guards) to these rules; Example
1(c) is that factorial function in a more-robust form. The modification
inside the brackets specifies a condition for the match to take place, and the
one outside the brackets specifies an additional condition (beyond the match)
that must be satisfied for the rule to apply. More precisely, these two forms of
rule syntax are as in Example 1(d), where
name is a token, patterns is zero or more patterns separated by
commas, condition is an expression, and results is zero or more
expressions separated by commas. A pattern can be a constant (optionally
coerced to a type), a token (which matches any single value, optionally
conditioned to a type), or a list or portion of a list of values. The list
representation is somewhat Lisp-like, and allows ReWrite to define core Lisp
functions, as in Example 1(e).
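
For readers who don't have a Mac handy, here is a rough Python analogue of the guarded factorial in Example 1(c). It is only a sketch of the rule-plus-guard idea, not a description of how ReWrite works inside. The RULES table, the try-rules-in-order strategy, and the ValueError raised when nothing matches are all assumptions made for illustration.

```python
# Each rule is (pattern test, guard, result), mirroring
#   name [pattern] :: condition -> result;
# The pattern test plays the part of what sits inside the brackets
# (including the ": int" type condition); the guard plays the part of
# the "::" condition outside them.
RULES = [
    # factorial [0] -> 1;
    (lambda n: n == 0, lambda n: True, lambda n: 1),
    # factorial [n : int] :: n>0 -> n * factorial [n-1];
    (lambda n: isinstance(n, int), lambda n: n > 0,
     lambda n: n * factorial(n - 1)),
]


def factorial(n):
    """Apply the first rule whose pattern matches and whose guard holds."""
    for matches, guard, result in RULES:
        if matches(n) and guard(n):
            return result(n)
    # Assumption: with no applicable rule we simply report an error.
    raise ValueError(f"no rule applies to factorial [{n!r}]")


print(factorial(5))
```

Calling factorial(-1) falls through both rules and raises the error. Without the guard, as in Example 1(a), a negative argument would keep matching the second rule and never reach zero.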
As is the case with proper functional languages, these functions don't have side
effects (other than obvious ones like screen output).
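
The list functions of Example 1(e) show the same side-effect-free character. As a hedged illustration in Python (using tuples to stand in for ReWrite lists, which is an assumption of this sketch), each function returns a value and leaves its arguments alone:

```python
def cons(x, rest):
    """cons [x, rest] -> {x, . rest}: build a new list, don't modify rest."""
    return (x,) + tuple(rest)


def car(items):
    """car [{x, . rest}] -> x: the first element of a non-empty list."""
    x, *rest = items  # fails on an empty list, as the pattern would not match
    return x


def cdr(items):
    """cdr [{x, . rest}] -> rest: everything after the first element."""
    x, *rest = items
    return tuple(rest)


original = (2, 3)
extended = cons(1, original)

print(extended, car(extended), cdr(extended))
print(original)  # unchanged by the calls above
```

Because nothing is modified in place, original can be used again after the calls with no surprises.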
But underneath the rewrite syntax, ReWrite is working in an applicative way. That
is, the code is fully compiled and there is no eval mechanism, no garbage
collection, and no other complicated memory management. Functions clean up after
themselves, including all the list-processing functions. And code compiles to
"moderately efficient" 68020/68030 machine code. Ward presents an
example program in ReWrite and Pascal to find the nth prime. In one test,
nprime [2000] takes 32 ticks in Pascal and 134 ticks in ReWrite, about four
times as long. This isn't bad, surely, for a language implementation designed for
exploratory purposes and only in an early rev, but getting that speed does require
a lot of explicit typing of variables. A naive ReWrite version of this program is
a lot slower.
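
To give a feel for the kind of computation being timed, here is a small Python sketch of an nth-prime function. It is not Ward's program, in either its ReWrite or Pascal form; the trial-division method and the helper names are assumptions chosen for brevity.

```python
def is_prime(candidate, primes):
    """Trial division by the primes found so far, up to sqrt(candidate)."""
    for p in primes:
        if p * p > candidate:
            break
        if candidate % p == 0:
            return False
    return True


def nprime(n):
    """Return the nth prime, counting 2 as the first."""
    if n < 1:
        raise ValueError("n must be at least 1")
    primes = []
    candidate = 2
    while len(primes) < n:
        if is_prime(candidate, primes):
            primes.append(candidate)
        candidate += 1
    return primes[-1]


print(nprime(10))
```

If the ticks quoted above are Macintosh ticks of about 1/60 second, then 32 and 134 ticks come to roughly half a second and a bit over two seconds. To time this sketch on your own machine, wrap the call to nprime(2000) in time.perf_counter() readings.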
On the other hand, ReWrite is a lot faster than Mathematica, which has a similar
syntax. The reason is that Mathematica is interpreted, and ReWrite is strictly
compiled. And Mathematica costs money, while ReWrite is freeware.
On the third through fifth hands, Mathematica is a robust, professional,
supported product that runs on many platforms, and a major new version of
Mathematica is due out imminently. I hope to write about it soon.
Example 1: Using ReWrite.
(a)
factorial [0] -> 1;
factorial [n] -> n * factorial [n-1];
(b)
rule [pattern] -> result;
(c)
factorial [0] -> 1;
factorial [n : int] :: n>0 -> n * factorial [n-1];
(d)
name [patterns] -> results;
name [patterns] :: condition -> results;
(e)
car [ {x, . rest} ] -> x;
cdr [ {x, . rest} ] -> rest;
cons [ x, rest ] -> {x, . rest};