Tool Abuse
It's said that a craftsman can abuse a tool in at least five different ways.
XML confirms this adage: I've witnessed my fair share of XML bloopers and compiled
a list of annoying, pointless or inefficient ways to employ the language. While
some of these designs may allow the architect to claim "buzzword compliance,"
they're unlikely to assist in the development effort.
XML + Notepad = IDE
I can't count the number of vendor presentations I've attended where the answer
to flexibility and configurability was "This is all driven by an XML file."
We know that one of the strong points of XML is human readability, but we may forget
that human readability does not imply human writability! Even the W3C reminds
us that "XML files are text files that people shouldn't have to read, but
may when the need arises." (www.w3.org/XML/1999/XML-in-10-points.html.en)
We're starting to see some improvement in metadata-aware XML editors (as long as you have a DTD or a Schema), but in most cases, the desperate attempts to make any meaningful updates to such files still result in the copy-paste methodology: Find a piece of XML that looks similar to what you want to do, copy it, make a few changes and see what happens.
Extra, Extra Large
XML-based configuration files do have advantages. They integrate well into source-control
systems, can be converted into HTML documentation via XSL and, if need be, can
be inspected by humans. It's a shame, then, that many XML configuration files
come only in one size: Extra, Extra Large. A single XML file consisting of thousands
of lines eliminates most of the potential benefits by making version control
and concurrent editing by multiple developers impossible. Do we store all our
Java code in a single file?
Loose Cannons
Loose coupling is a valuable architectural principle. Loosely coupled systems
make fewer assumptions about each other and can be implemented in different
languages or on different platforms. Due to its platform and language independence,
XML data exchange supports loosely coupled architectures. However, the various
architectural advantages of loose coupling can turn into development disadvantages.
If all my methods simply take a string argument, which is supposed to contain
an XML document, it's clear that I won't ever have to change the syntax of my
function calls. Sounds good, right? Well, maybe not. The reason I won't have
to change the syntax of my methods is that I decoupled syntax and semantics.
The method signature (syntax) tells me nothing about the meaning of the method
(semantics) or what data I'm supposed to pass. If I'm lucky, I can look up the
data format in some cryptic DTD or I get an example XML document. Either way,
I lose any compile-time validation: if I pass an invalid document, I won't find
out until runtime, when I get an obscure error message or things simply don't
work.
With explicit, strongly typed value (or transfer) objects, on the other hand, my IDE can offer me a drop-down list of all the properties and methods as I type (nice!). Better yet, I can define custom types and constraints for each field (the semantics!). If I try to pass invalid data types, my compiler will warn me before my code goes into testing.
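To make the contrast concrete, here is a minimal sketch of such a value object. The names OrderRequest and OrderService, the fields and the constraints are all hypothetical, invented purely for illustration; the point is only that the signature and the constructor carry the semantics that a bare String parameter hides.

```java
import java.math.BigDecimal;
import java.util.Objects;

final class OrderService {
    // The loosely typed alternative would be: static String placeOrder(String xml)
    // Here the signature itself documents what data the method expects.
    static BigDecimal placeOrder(OrderRequest request) {
        return request.unitPrice().multiply(BigDecimal.valueOf(request.quantity()));
    }

    public static void main(String[] args) {
        OrderRequest request = new OrderRequest("C-42", 3, new BigDecimal("19.99"));
        System.out.println("Order total: " + placeOrder(request));
        // placeOrder("<order>...</order>") would not compile: wrong argument type.
    }
}

// Hypothetical value object; the field names and rules are illustrative only.
final class OrderRequest {
    private final String customerId;
    private final int quantity;
    private final BigDecimal unitPrice;

    OrderRequest(String customerId, int quantity, BigDecimal unitPrice) {
        // The constraints live with the type, so every caller gets them for free.
        this.customerId = Objects.requireNonNull(customerId, "customerId");
        this.unitPrice = Objects.requireNonNull(unitPrice, "unitPrice");
        if (quantity <= 0) {
            throw new IllegalArgumentException("quantity must be positive: " + quantity);
        }
        if (unitPrice.signum() < 0) {
            throw new IllegalArgumentException("unitPrice must not be negative: " + unitPrice);
        }
        this.quantity = quantity;
    }

    String customerId() { return customerId; }
    int quantity() { return quantity; }
    BigDecimal unitPrice() { return unitPrice; }
}
```

A wrong argument type is rejected by the compiler, and a wrong value (a quantity of zero, say) is rejected at construction time, long before the request reaches any business logic.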
Loose coupling has its place in enterprise architecture. However, consider the trade-offs: I don't have to loosely couple every object I call within my application. If you're working with an XML data interchange across systems, consider using XML data binding frameworks such as JAXB (http://java.sun.com/xml/jaxb) or Castor (http://castor.exolab.org) to create strongly typed objects that represent the XML document.
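The idea behind data binding can be sketched without any framework. The following is not JAXB or Castor; it's a hand-rolled stand-in that uses only the DOM parser shipped with the JDK, and the shipment document format, the Shipment class and the ShipmentBinder class are assumptions made up for this example. A binding framework would generate or map the equivalent for you.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class ShipmentBinder {
    // Parse the XML once, at the system boundary; code behind it sees only Shipment.
    static Shipment fromXml(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Element root = builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();
            if (!"shipment".equals(root.getTagName())) {
                throw new IllegalArgumentException("expected <shipment>, got <" + root.getTagName() + ">");
            }
            String trackingId = root.getAttribute("trackingId"); // "" when the attribute is absent
            if (trackingId.isEmpty()) {
                throw new IllegalArgumentException("missing trackingId attribute");
            }
            NodeList parcels = root.getElementsByTagName("parcels");
            if (parcels.getLength() == 0) {
                throw new IllegalArgumentException("missing <parcels> element");
            }
            return new Shipment(trackingId, Integer.parseInt(parcels.item(0).getTextContent().trim()));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("not a readable shipment document", e);
        }
    }

    public static void main(String[] args) {
        Shipment s = fromXml("<shipment trackingId=\"T-1\"><parcels>2</parcels></shipment>");
        System.out.println(s.trackingId + " carries " + s.parcels + " parcel(s)");
    }
}

// Hypothetical typed representation of the <shipment> document.
final class Shipment {
    final String trackingId;
    final int parcels;

    Shipment(String trackingId, int parcels) {
        this.trackingId = trackingId;
        this.parcels = parcels;
    }
}
```

The XML stays at the edge, where loose coupling earns its keep, while the rest of the application works with a type the compiler understands.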
Desperanto
Integration is a hot topic these days. Most applications that want to sport
the label of an "enterprise" application must offer some form of integration.
All too often, this integration takes the form of "We can receive XML data, so
we can interface with any system." I liken this to the use of the Roman
alphabet: Just because I type this article in characters of that script doesn't
mean that every person in the Western hemisphere can actually read and understand
what I write; many languages use the same characters. XML is similar: It solves
many issues related to data representations, but some of the stickiest problems
in integration are structural transformations and semantic transformations (comparable
to having to translate this article into Danish). To be fair, XML wasn't meant
to address all these problems, so let's stop pretending that it does. On the
upside, XSL helps us quite a bit in implementing transformations, but there's
little doubt that integration and transformation remain a difficult problem
that's usually solved not by XML magic, but by plain, old hard work.
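As a rough illustration of where XSL helps and where it stops, here is a small sketch using the javax.xml.transform API from the JDK. The customer and party formats, the stylesheet and the class name CustomerTransform are hypothetical. The stylesheet handles the structural part mechanically; deciding that one system's firstName and lastName together mean another system's fullName is the semantic part, and somebody still has to work that out by hand.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class CustomerTransform {
    // Hypothetical mapping: two child elements in the source format
    // become a single attribute in the target format.
    static final String STYLESHEET =
          "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:template match=\"/customer\">"
        + "<party fullName=\"{concat(firstName, ' ', lastName)}\"/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String transform(String sourceXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(sourceXml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String source = "<customer><firstName>Ada</firstName><lastName>Lovelace</lastName></customer>";
        System.out.println(transform(source));
    }
}
```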
Metadata = More Than Data
One of the great boons of modern programming languages such as Java, C# or Smalltalk
is the ability to use reflection, allowing programs access to other objects'
and classes' metadata, such as a list of all methods and the parameters they
accept. Many development environments, compilers and linkers (yep, my age is
showing) use this feature to make the programmer's life easier. In the case
of XML data structures, the metadata is defined separately in the form of Document
Type Definitions (DTDs) or XML Schemas.
Metadata is a critical part of XML. XML documents without an associated DTD or Schema are not very useful for developers: we can't be sure whether our documents are valid or which constraints apply. Again, too often, we rely on the "interface contract by example" method: If your XML looks like this, it's probably valid.
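With a schema in hand, "probably valid" becomes a question the machine can answer. The sketch below uses the javax.xml.validation API from the JDK; the order schema and the class name OrderValidator are invented for this example, and a real contract would of course live in its own file rather than in a string constant.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class OrderValidator {
    // Hypothetical contract: an <order> holds exactly one <quantity>,
    // and that quantity must be a positive integer.
    static final String SCHEMA =
          "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"order\">"
        + "<xs:complexType><xs:sequence>"
        + "<xs:element name=\"quantity\" type=\"xs:positiveInteger\"/>"
        + "</xs:sequence></xs:complexType>"
        + "</xs:element>"
        + "</xs:schema>";

    static boolean isValid(String xml) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(SCHEMA)));
        } catch (SAXException e) {
            throw new IllegalStateException("the schema itself is broken", e);
        }
        try {
            // With no ErrorHandler installed, validation errors surface as SAXException.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<order><quantity>3</quantity></order>"));
        System.out.println(isValid("<order><quantity>0</quantity></order>"));
    }
}
```

The second document looks just like the first, which is exactly why contract by example falls short: only the schema knows that a quantity of zero is not allowed.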
Bruiser Interface
The UI-agnostic application is one of the most publicized uses of XML. As long
as our data is represented as XML documents, we'll be able to support new user
interfaces such as cell phones, PDAs, speech synthesis, brain waves, what have
you. I actually render my website (see www.enterpriseintegrationpatterns.com)
from XML source documents and am working on rendering PDF files from the same
source. XML does a great job there, but then I'm dealing with written documents,
not an interactive online application.
I also render to quite similar presentation media (HTML and PDF). Rendering a complete, interactive user interface to a variety of devices generally requires a user interface redesign to ensure ease of use, plus some heavy-duty transformation work. Once the requirement for multiple access paths becomes real (for example, via Web services support), it makes sense to evaluate the use of XML. Until then, carefully weigh the trade-offs between XML and native data representation.
Not Everything Is a Nail
With a tool as widely applicable as XML, let's focus on using it where it makes
sense. In most cases, "XML everywhere" is not the best choice. And
when the next software application vendor tells you that "We have an XML
file," tell them to start reading Software Development!
Gregor Hohpe is a senior architect with ThoughtWorks, an Internet systems integrator and consulting company. His current interests include agile methodologies and patterns of enterprise integration. Reach him at [email protected].