The interplay between AJAX and federated search | Federated Search Blog
April 7, 2008

Even if you’ve never heard of AJAX, you’ve very likely experienced it. AJAX is the technology that makes pieces of web pages update smoothly. Google Maps is a prime example of AJAX in action. You give Google a street address, it draws a map, and you can drag the map in all directions. Without AJAX, if you wanted to “drag” the map, you’d have to click a button to shift the map east, west, north or south. Then, the whole web page would redraw itself. The map navigation process is much smoother thanks to AJAX.

AJAX stands for Asynchronous JavaScript and XML. “Asynchronous” refers to the fact that the page can send a request to the server in the background and keep responding to the user while it waits, so bits of a web page can be updated without reloading the entire page. JavaScript provides the mechanism that the web page uses to communicate with the HTTP (web) server. XML is the format that is sometimes, but certainly not always, used to encode the data exchanged with the web server. AJAX is basically a set of standards and techniques that a web programmer can use to create HTML-based web applications that are, in principle, browser-independent and in which parts of the page refresh smoothly without requiring entire page reloads.
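To make “asynchronous” concrete, here is a minimal sketch written in Python rather than browser JavaScript, purely for illustration. The names `fetch_map_tile` and `drag_map` are hypothetical, and the short sleep stands in for a real round trip to a web server. The point is only the ordering: the request is started, the page carries on, and when the response arrives a single part of the page is replaced.

```python
import asyncio


async def fetch_map_tile(delay_seconds: float) -> str:
    """Pretend to be a slow server request (hypothetical helper)."""
    await asyncio.sleep(delay_seconds)
    return "new map tile"


async def drag_map() -> tuple:
    """Simulate dragging the map: request a tile, update one part of the page."""
    page = {"header": "Street map", "map": "old map tile"}
    log = []

    # Start the request in the background instead of waiting for it.
    request = asyncio.create_task(fetch_map_tile(0.01))

    # The page is free to do other work while the request is in flight.
    log.append("page still responsive")

    # When the response arrives, only the map part of the page changes.
    page["map"] = await request
    log.append("map redrawn")
    return page, log
```

Running `asyncio.run(drag_map())` returns the updated page and the log; the header entry is never touched, which is the piecewise update that AJAX makes possible.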

So, what does AJAX have to do with federated search? Federated search is a kind of application that is perfectly suited for using AJAX. The display of incremental search results that I wrote about recently is handled by AJAX techniques. The web page that displays the search results polls (queries) the web server, via JavaScript, to determine how many results have been received for a particular search. A piece of the web page can then be updated to show the new results. The ability to display a progress bar showing how many sources have returned their results is also due to AJAX. In this case, JavaScript on the browser can query the web server, determine how many sources have completed their work, and redraw the progress bar.
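The polling pattern described above can be sketched as follows. This is an illustration in Python, not the JavaScript a real results page would run, and everything in it is assumed for the example: `FakeSearchServer` is a hypothetical stand-in for the web server, its `status` method simulates one more source finishing on each poll, and the progress bar is drawn as text.

```python
from dataclasses import dataclass, field


@dataclass
class FakeSearchServer:
    """Hypothetical stand-in for the federated search web server."""
    total_sources: int
    # Each entry holds the results one source delivers when it finishes.
    batches: list = field(default_factory=list)
    completed_sources: int = 0
    results: list = field(default_factory=list)

    def status(self) -> dict:
        """Simulate one poll: one more source finishes, then report state."""
        if self.batches:
            self.results.extend(self.batches.pop(0))
            self.completed_sources += 1
        return {
            "completed": self.completed_sources,
            "total": self.total_sources,
            "results": list(self.results),
        }


def progress_bar(completed: int, total: int, width: int = 10) -> str:
    """Render a text progress bar for the number of finished sources."""
    filled = width * completed // total
    return f"[{'#' * filled}{'-' * (width - filled)}] {completed}/{total} sources"


def poll_until_done(server: FakeSearchServer, max_polls: int = 100) -> list:
    """Poll the server, recording the redrawn bar and only the new results."""
    shown = 0
    frames = []
    for _ in range(max_polls):
        state = server.status()
        new_results = state["results"][shown:]
        shown = len(state["results"])
        frames.append((progress_bar(state["completed"], state["total"]), new_results))
        if state["completed"] >= state["total"]:
            break
        # A real page would wait here (for example on a timer) before polling again.
    return frames
```

Each frame pairs the redrawn progress bar with just the results that arrived since the previous poll, which mirrors how the page updates one piece at a time instead of reloading.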

Note that there are two things going on when AJAX is used. First, a piece of a web page needs to be updated. This is handled by rewriting what’s called the Document Object Model (DOM) of the web page. The DOM represents the HTML of the web page and the structure, or organization, of the page. The structure has to do with HTML tags, the text inside of tags, and which tags are embedded within which other tags. AJAX code manipulates the DOM, adding or removing rows from HTML tables, or editing HTML on the fly in other ways. Only isolated parts of the DOM are manipulated and only those affected parts of the HTML page are redrawn. The second thing that needs to happen when AJAX is employed in a web application is on the server side. The server must, in some cases, update relevant information. If a user is deleting a mail message in Gmail, for example, which utilizes AJAX heavily, then not only must the list of mail messages be redrawn to omit the just-deleted message, but the Gmail application must also update its database of mail messages to mark that message as deleted.
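The two halves of the mail example can be sketched together. This is a Python illustration under stated assumptions, not how Gmail is implemented: `mail_db` is a hypothetical in-memory store playing the server's database, and a small `xml.etree.ElementTree` tree stands in for the browser's DOM. The function names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical server-side store: message id -> record.
mail_db = {
    "m1": {"subject": "Hello", "deleted": False},
    "m2": {"subject": "Invoice", "deleted": False},
}

# A tiny tree standing in for the part of the page that lists messages.
page = ET.fromstring(
    "<html><body><table id='inbox'>"
    "<tr id='m1'><td>Hello</td></tr>"
    "<tr id='m2'><td>Invoice</td></tr>"
    "</table></body></html>"
)


def server_delete(message_id: str) -> bool:
    """Server side: mark the message as deleted and report success."""
    record = mail_db.get(message_id)
    if record is None:
        return False
    record["deleted"] = True
    return True


def delete_message(message_id: str) -> None:
    """Client side: ask the server to delete, then remove only that row."""
    if not server_delete(message_id):
        return  # Leave the page untouched if the server refused.
    table = page.find(".//table[@id='inbox']")
    row = table.find(f"tr[@id='{message_id}']")
    if row is not None:
        table.remove(row)
```

Calling `delete_message("m1")` flips the flag in the store and removes the one matching row; the rest of the tree, including the other row, is left exactly as it was.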

While AJAX is a wonderful way to enhance a user’s federated search experience, there are several things that developers need to consider. The first consideration is that AJAX programming is more difficult than the simpler style of programming where the entire HTML page is refreshed whenever anything on it changes. A second consideration is that while AJAX is, in theory, browser-independent, in reality no complex application is browser-independent. An AJAX programmer will need to be especially careful to test all AJAX-related functionality against all supported browsers and may need to code special handling of certain functions for different browsers.

A final consideration with AJAX, which is mainly of concern to those who want to scrape the HTML of an AJAX-intensive application, is that screen scraping can be extremely difficult, if not impossible, for such applications. Creators of connectors for federated search applications rely on search results being represented in HTML, XML, or some other format which they can parse. When AJAX is used, the search results are not always easily available. They might not be encoded in the HTML at all but might instead only be accessible to JavaScript that interacts with the server to get results as needed via AJAX. In extreme cases, the programmer might need to contact the content publisher to see if a more “machine-friendly” interface for searching and retrieving results is available.
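The scraping problem is easy to see in a small sketch. Everything here is hypothetical and chosen for illustration: `STATIC_HTML` imitates a results page as first delivered, with an empty container that a script would fill in later, and `AJAX_PAYLOAD` imitates the data that script would fetch in the background (JSON is assumed here; a real site might use XML or something else). The scraper looks for `<li class="result">` elements, which is an assumed markup convention.

```python
import json
from html.parser import HTMLParser

# Hypothetical HTML as first delivered: the results container is empty
# because a script fills it in later.
STATIC_HTML = """
<html><body>
  <div id="results"></div>
  <script src="search.js"></script>
</body></html>
"""

# Hypothetical payload that the page's script would fetch in the background.
AJAX_PAYLOAD = '{"results": [{"title": "Paper A"}, {"title": "Paper B"}]}'


class ResultTitleScraper(HTMLParser):
    """Collect the text of every <li class="result"> element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_result = False

    def handle_starttag(self, tag, attrs):
        if tag == "li" and ("class", "result") in attrs:
            self._in_result = True

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_result = False

    def handle_data(self, data):
        if self._in_result and data.strip():
            self.titles.append(data.strip())


def scrape_titles(html: str) -> list:
    """What a screen-scraping connector sees in the delivered HTML."""
    scraper = ResultTitleScraper()
    scraper.feed(html)
    return scraper.titles


def titles_from_payload(payload: str) -> list:
    """What a connector gets if it can read the background data directly."""
    return [item["title"] for item in json.loads(payload)["results"]]
```

Scraping `STATIC_HTML` yields no titles at all, while the payload yields both. A connector builder therefore has to find and call the background endpoint, or ask the publisher for a programmatic interface, rather than parse the page.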

As Internet technology advances, and vendors build federated search applications in more user-friendly ways, the end user benefits, which is what matters the most. The programmer works harder and the connector builder faces the reality that he may not be able to get past the AJAX to build an interface to a source. Hopefully, the growing influence of AJAX will encourage more content providers to provide programmatic interfaces to their content that bypass all of the HTML, JavaScript, and everything else that makes their content hard to access.

Note: I wrote a four-part series on content access basics. You can find links to these, and other, articles in the Articles page of this blog.


This entry was posted on Monday, April 7th, 2008 at 9:52 am and is filed under basics.
