Dr. Dobb's | PHP: The Power Behind Web 2.0

PHP: The Power Behind Web 2.0

Andy and Cal use PHP on the back end and JavaScript on the front end to build the "Flickr News Network" which lets you show pictures from Flickr to augment an article from a given news feed.

December 10, 2007
URL:http://drdobbs.com/web-development/php-the-power-behind-web-20/204800652

Andi is cofounder and cochief technology officer of Zend Technologies. Cal is editor of Zend's DevZone. They can be contacted at www.zend.com.

To understand why websites and applications might be considered Web 2.0, we first need to agree on a definition of the term. While there is no formal definition for it, we believe there are three essential components to Web 2.0:

Delivering a richer desktop-like user experience (Ajax).
Exposing functionality as easily consumable services (web services).
Leveraging the user-base to create, enhance, and categorize information.

The notion of Rich Internet Applications is all about delivering desktop-like experiences in the browser. JavaScript and Ajax are commonly used on the client side to deliver this experience. The main advantage of these technologies is their presence in popular web browsers, and that they can be implemented incrementally without requiring rewrites of existing web applications. In this model, the client depends on the server for orchestrating and delivering its data. PHP has been popular with Ajax applications because it's exceptionally well suited for this task. Not only is PHP a rich integration platform that includes connectivity to all popular databases, languages, file formats, and services, but it also has user-friendly and robust support for both the XML and JSON formats, typically used in client/server communications.

The second component of Web 2.0 is exposing functionality via web services. Although Service-Oriented Architectures (SOA) have been around for a few years, no one has been exactly sure how to bring them to life. Web services, with their ubiquity and ease of implementation, deliver on the promise of SOA. Likewise, an even more popular buzzword for Web 2.0 is "mash-up." Strictly speaking, mash-ups are any web service that combines two or more services to create a new service. Mash-ups are the poster-children for what the Web can be. Again, PHP has excellent support for web services, with its built-in support for SOAP, XML, and HTTP.

Finally, user-generated content is where these applications leverage users not only for creating content, but also for enhancing it by categorization.

In this article, we show the power of these communities by leveraging Flickr's photo collection and the categorization its users have contributed. For this project, we use pieces of the Zend Framework (framework.zend.com) on the back end and the Prototype.js (www.prototypejs.org) and Script.aculo.us libraries on the front end. The services we mash up are:

A news feed.
The Yahoo Content Analyzer service (developer.yahoo.com/ search/content/V1/termExtraction.html).
Flickr's API (www.flickr.com/services/api).

We call this mash-up the "Flickr News Network" (FNN) (the complete source code for FNN is available at www.ddj.com/code/). As Figure 1 illustrates, the concept behind FNN is simple: We want to show pictures from Flickr to augment an article from a given news feed. The implementation is straightforward—take the articles from a news feed and select pictures for it.

[Click image to view at full size]

Figure 1: The Flicker News Network architecture.

Because Flickr lets people tag photos, all we need to do is identify the keywords in the article and ask Flickr to search the tags for those keywords. Luckily, Yahoo has an API, the Yahoo Content Analyzer, which does just this. It takes in as a parameter the content you want analyzed, then returns a list of what it feels are the keywords.

When searching Flickr for keywords, you usually get multiple hits. For simplicity, we take the one with the highest "interestingness" rating and display it with the article.

In the process, we take advantage of all three Web 2.0 themes: We make heavy use of Ajax and JavaScript in the client, leverage web services (including RSS feeds and services from Yahoo), and take advantage of the tagging contributed by Flickr's user-base.

As you see in Listing One, there's not much HTML. Typical of many mash-ups, the heavy lifting is done by JavaScript, not HTML. As a disclaimer, the sample code we present illustrates one way the code could be written, but it's not the only way. In many cases, we omit important steps (filtering input, escaping output, error checking, and other important security measures) that production applications would include.

<?php
 session_start();
 $_SESSION['sak'] = md5(mktime());
?>
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
    <title>Flickr News Network</title>
    <script src="/js/prototype.js"></script>
    <script src="/js/scriptaculous.js"></script>
    <script src="/js/EventBus.js"></script>
    <script src="/js/Articles.js"></script>
    <script src="/js/NewsArticle.js"></script>
    <script src="/js/RequestWatcher.js"></script>

<link rel="stylesheet" type="text/css" href="/css/main.css" media="all" />
<script>
function pageLoad()
{
    Articles.sak = '<?PHP echo $_SESSION['sak'];?>';
    Articles.getEventBus().addObserver(new RequestWatcher('requestCount'),'requests');
    Articles.fetchArticleList();
} // function pageLoad()

</script>
</head>
<body onLoad="pageLoad();">

<center>
<div class="welcome">
<h2>Welcome to the Flickr News Network.</h2>
// On FNN, we mash-up a newsfeeds with Flickr to bring you pictures from Flickr 
// that may or may not be relevant depending on how good a job people do tagging them.
</div>
</center>
<br /><br />
<div class="monitor" >
    Requests Active: <span id="requestCount">0</span>
</div>
<br /><br />
<div id="stories"></div>
</body>
</html>

Listing One

Other than the list of included JavaScript libraries in the <head>, the only JavaScript we actually show in the HTML is the function called on page load. pageLoad() handles the things that we can't do inside the JavaScript files or that have to be done at runtime. Specifically, since the web server's PHP interpreter does not interpret the JavaScript files and we need to set a value from PHP, we set that value in the pageLoad(). The other thing we do is set up a RequestWatcher so that whenever anything changes, we can update our outstanding requests display.

The PHP

All communication with the necessary service partners is through Ajax. Because of security restrictions, Ajax calls can only be made to the same server that served the original page.

The heart of service.php (available at www.ddj.com/code/) are the two lines that get a proxy from the factory and call its main() function. Proxy:;factory() takes the action passed in. It then looks to see if a class named Proxy/ACTION.php is anywhere in the include path.

Once we have the proxy class, we call its main() function. The factory handed our list of unfiltered options into the constructor. The generic options that apply to all Proxy classes were filtered in the base class, Proxy_Abstract. Each proxy class can filter additional options by overriding _filterOptions(). Make sure that if you override _filterOptions(), you call the parent method.

Caching

As for caching, the first thing we do is see if we've got a recent cache of this feed (see feed.php, available online). So that we don't unduly burden service providers, we cache everything from them. Thanks to Zend_Cache, caching is simple.

Not all caches are created equal, though. Because our feed can change frequently, we cache it for 15 minutes. Once we've fetched the keywords for a given article, chances are that Yahoo isn't going to change its mind about them, so we cache them for 24 hours. Flickr, however, may have new pictures uploaded that are more relevant than the ones we currently have, so we cache Flickr responses for two hours.

Caching does two important things for our service:

It speeds up the application as calls to other services are expensive, time wise.
By caching the responses for a reasonable time, we protect ourselves from the failures of our partners. Because we are reading from the cache most of the time, if a partner's site becomes unresponsive or unavailable, our application still operates normally. If this were a serious problem, we could turn off the automatic cleaning of the Zend_Cache and only remove old cache entries once it has expired and we know we have a new entry to replace it. This adds a few more lines of code to our Proxy classes, but nothing that is terribly difficult and could make the difference between showing users results and a blank page.

If we do not have a valid cache, the code inside the IF statement triggers and we fetch the current feed. Using the Zend_Feed_RSS class, we instantiate it with the URI of the feed. Since we don't need everything in the feed, we create an array of the relevant pieces of it. We are fetching the feed because we didn't have a valid cache, so before we finish, we save this array in the cache for the next request.

In all the Proxy-based classes for this project, we use an md5() hash as the cache identifier. In the case of the Feed cache, we use an md5() of the URL. For keywords, we hash the content used to generate the keywords; for photos, we use a hash of the keyword we are requesting a photo for. The Zend_Cache class requires that the cache identifier only be alphanumeric. Using md5() is a quick way to ensure that the rule is followed and that we have a unique and retrievable identifier.

Finally, we load the response property of our Proxy class with the array and return so that the service can hand it back to our JavaScript.

Proxies Are All the Same

The three Proxy classes we use in this project behave the same. For this reason, we won't go over them each in detail, but we do want to highlight one significant difference.

The Keywords.php proxy uses the Zend_Client_Rest class to talk to Yahoo's Content Analyzer. The Yahoo Content Analyzer is different in that while we still use a Representational State Transfer (REST) API to talk to it, we use an HTTP POST instead of a GET. The reason is that the description of the article could potentially be longer than the browser can send via a GET. This is why, when preparing the call to the Proxy factory, we use the $_REQUEST array instead of either $_GET or $_POST.

With Keywords and Photos, you need to register with the services and get the proper API keys to gain access. Most news feeds do not require you to register first.

Using pieces of the Zend Framework simplified the programming. Also, by structuring service.php as a factory, this solution can be reused and additional service proxies can easily be added.

The JavaScript

Almost all of the JavaScript in FNN is based on Prototype.js. There are other good JavaScript libraries such as Dojo and YUI!. Prototype.js, however, is simple to understand and implement, yet advanced enough to handle the needs of FNN. (See "Ajax: Selecting the Framework that Fits," by Andrew Turner and Chao Wang, DDJ, June 2007.)

The bulk of the JavaScript code is contained in two classes (both available online). The first of these is a class that we treat as a Static class, even though JavaScript does not have the facility to declare it as such. The class Articles contains everything we need to get the ball rolling. Because of its static nature, Articles is the one class that is not a subclass of Prototype's Class object. Because it's a static, we never instantiate it, we simply make calls to its member methods. The first call you see is in the pageLoad() method where we add a watcher to the Event Bus.

The primary method of entry is Articles.fetchArticleList() (available online), which fires off the request to services.php to fetch the feed. It passes the necessary parameters to service.php, including the action name, to properly set up and execute the proxy object. Prototype.js has a unique feature. If you are returning JSON from your service, you can set an HTTP header of X-JSON. The contents of the header are automatically evaluated creating a JavaScript object for you to use. This object is passed in as the second parameter to the onComplete method.

The thought process behind the fetching of the articles, keywords, and photos is straightforward and so is the JavaScript implementation—get articles, get the keywords for each article, get a photo for each keyword.

The second important JavaScript class in this project is NewsArticle. Once Articles has fetched the feed, it instantiates one copy of NewsArticle for each article in the feed. Each NewsArticle then goes off on its own and fetches the keywords for its story and then the best photo for each keyword according to Flickr. Finally it displays each of these items.

While the NewsArticle class contains the bulk of the code that will be executed, the concepts behind it are all pretty mundane.

Buses and Observers

The other code of interest is the Event Bus and Observer classes previously mentioned. The Event Bus class lets you marshal observers for a given event. These are not browser-based events but code-based ones. You can create as many as you want and your observers will contain the code necessary to process these events when they happen. In the case of this sample application, there are only two events of interest. When we initiate a newAjax request, we want to increment the number of currently outstanding requests. The corollary to that action is that when an Ajax action completes, we want to decrement the number of outstanding events. To register the observer, we include this line in the pageLoad() method:

Articles.getEventBus().addObserver
  (newRequestWatcher('requestCount'),'requests');

Now let's take a look at the EventBus.js code (available online). Whenever we want to fire an event, we call EventBus.notify() with a message (or event name) and a payload. The message is used to determine which observers to notify.

When registering an observer, you specify two parameters—an instance of the observer itself and the message to subscribe to. Outside of its constructor, RequestWatcher.js (available online) implements the single method, notify(). Notify takes the payload, in this case the number of remaining requests, and updates the output container with it. When we instantiated it in pageLoad(), the parameter we handed it was requestCount. This corresponds with the ID of a span tag in the HTML to be updated.

Recap!

By selecting the right tools for the job, even a three-way mash-up like this can be thrown together quickly. In the case of FNN, we chose the Zend Framework as the base because it contains all the pieces we needed to get our service written quickly. Additionally, because of the way the Zend Framework is structured, we were able to easily pull out the pieces we wanted and use them independently.

The other piece we selected was Prototype.js. Prototype's light footprint and simple interface let us quickly put the JavaScript together.

Second, by using good object-oriented techniques and practices, the server-side piece is now a framework that can be extended to consume resources from other services by building new Proxy classes.

On the client side, the OO is a bit more difficult because of language limitations, but we adhere to OO design theory whenever possible. That lends itself to well-segregated code.

Finally, the one unspoken concept we covered—building web services, even services that consume other's services, is painless. FNN talks to a simple PHP page named "service.php," which consumes other's services, but also publishes its own service. It converts the data from our news feeds, Yahoo, and Flickr, into an easily digestible format.

With the tools we've examined, you can now easily find the unique data or service that you have and build a service around it.