Grid Languages
Regardless of when transactional applications ultimately wind up on a grid, IT is already undergoing a subtle paradigm shift, moving away from large SMP boxes running proprietary flavors of UNIX and toward large grids of one- or two-processor x86 machines running Linux. These machines already dominate the front-tier web server market. Now they are starting to appear on the back end with products like Oracle RAC, the grid-enabled version of Oracle. The transition to grid will soon affect the middle tier, but it is held back by J2EE implementations, which were built to run on small clusters of multiprocessor machines rather than large clusters of one- or two-processor machines.
Unlike earlier architectures, grid has no pressing requirement for portability. Companies are no longer locked in by a vendor when they run Linux on x86 white boxes. Consequently, they have no problem with applications that only run on Linux/x86. The footnote to this portability rule concerns corporations that require applications be developed on Windows-based machines. For these companies, the only portability requirement is the ability to develop on Windows and deploy on Linux.
Basically, today's corporate applications all produce text, whether HTML for web browsers or XML for other applications. With the onslaught of web services, all back-end resources will soon be providing XML rather than binary data. The average corporate application will be a big text pump, taking in XML from the back end, transforming it somewhat, and producing either HTML or XML.
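To make the "text pump" idea concrete, here is a minimal sketch in Python, one of the scripting languages discussed later in this section. It assumes a hypothetical back-end payload with an `<orders>` root and `<order>` children; the element names, the `orders_to_html` function, and the sample data are illustrative only and are not taken from any real system.

```python
import xml.etree.ElementTree as ET
from html import escape

def orders_to_html(xml_text: str) -> str:
    """Take XML from the back end and produce an HTML table fragment."""
    root = ET.fromstring(xml_text)
    rows = []
    for order in root.findall("order"):
        # Pull a few fields out of each record; missing fields become "".
        order_id = escape(order.get("id", ""))
        customer = escape(order.findtext("customer", default=""))
        total = escape(order.findtext("total", default=""))
        rows.append(
            f"<tr><td>{order_id}</td><td>{customer}</td><td>{total}</td></tr>"
        )
    return "<table>\n" + "\n".join(rows) + "\n</table>"

# Hypothetical payload standing in for a back-end web service response.
SAMPLE = """<orders>
  <order id="17"><customer>Acme &amp; Sons</customer><total>42.50</total></order>
  <order id="18"><customer>Globex</customer><total>7.00</total></order>
</orders>"""

if __name__ == "__main__":
    print(orders_to_html(SAMPLE))
```

Nearly all of the work here is parsing, light transformation, and string assembly, which is the workload the requirements that follow are meant to serve.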
With this in mind, clear requirements emerge for a programming language best suited to support corporate applications in a grid environment:
- Fast handling of XML (dynamic data with fluctuating types).
- Fast processing of text into objects and out of objects.
- Optimal handling of control flow, which is the bulk of most applications' limited logic.
- Minimal portability (Linux/x86 and Windows/x86).
- Minimal abstraction (very thin veneer over the operating system for system services).
- Specific tuning for one- or two-processor x86 machines.
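The first two requirements can be illustrated with a short Python sketch that maps XML onto loosely typed objects and back again. The helper names (`coerce`, `to_object`, `to_element`) and the sample document are hypothetical, and the sketch deliberately ignores attributes, namespaces, and mixed content.

```python
import xml.etree.ElementTree as ET

def coerce(text):
    """Best-effort scalar conversion: try int, then float, else keep the string."""
    text = (text or "").strip()
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text

def to_object(elem):
    """Leaf elements become scalars; others become dicts.
    A tag that repeats under the same parent becomes a list."""
    children = list(elem)
    if not children:
        return coerce(elem.text)
    obj = {}
    for child in children:
        value = to_object(child)
        if child.tag in obj:
            if not isinstance(obj[child.tag], list):
                obj[child.tag] = [obj[child.tag]]
            obj[child.tag].append(value)
        else:
            obj[child.tag] = value
    return obj

def to_element(tag, obj):
    """Inverse direction: turn a scalar, dict, or list-valued dict back into XML."""
    elem = ET.Element(tag)
    if isinstance(obj, dict):
        for key, value in obj.items():
            items = value if isinstance(value, list) else [value]
            for item in items:
                elem.append(to_element(key, item))
    else:
        elem.text = str(obj)
    return elem

if __name__ == "__main__":
    doc = ("<item><sku>A-100</sku><qty>3</qty><price>9.95</price>"
           "<tag>new</tag><tag>sale</tag></item>")
    item = to_object(ET.fromstring(doc))
    print(item)
    print(ET.tostring(to_element("item", item), encoding="unicode"))
```

No schema or class definition is declared up front: the shape and types of `item` are whatever the incoming document happens to contain, which is the kind of "dynamic data with fluctuating types" the list refers to.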
Considering these requirements, Java does not fare well:
- Java is a strongly typed language that does not easily handle XML data, which is inherently unstructured.
- Java is painfully slow at processing text because its strings are immutable objects that cannot be manipulated in place.
- Java is great for complicated applications but not ideally suited for specifying control flow.
- Java provides maximum portability, which is overkill for grid apps.
- Java provides maximum abstraction with a huge virtual machine that sits between the application and the operating system and is overkill for grid applications.
- Most J2EE implementations are tuned for 4-16 processor SMP boxes.
For applications deployed on grid architectures, Java does not suffice. What developers need is a scripting language that is loosely typed to facilitate XML encapsulation and that can efficiently process text. The language should be well suited for specifying control flow, and it should be a thin veneer over the operating system.
Most Linux distributions already bundle three such languages: PHP, Python, and Perl. PHP is by far the most popular. Python is considered the most elegant, if somewhat odd. Perl is the tried-and-true workhorse. All three languages are open source and free. As Figure 3 illustrates, PHP use has skyrocketed over the past few years.