The May/June edition of D-Lib Magazine has an interesting article about a prototype tool that helps writers to find relevant content and references by watching what they type and performing searches based on the context of their writing.

The article’s abstract gives the motivation and a general description of the tool:

Information awareness is distinct from explicit information seeking, such as searching. In this article we describe an information awareness tool that supports text composition by providing awareness of relevant content and references proactively and non-intrusively. As a user composes text, the tool automatically searches multiple sources, retrieves results, and displays links to the results. A working prototype of the tool has been implemented using Web 2.0 and Digital Library 2.0 technologies, and is flexible and highly configurable for both Web search engines and deep web targets.

This is an innovative tool, especially for college students who need to write research papers and don’t enjoy using the federated search applications available to them. While the tool isn’t specific to federated search, the more scholarly the writing, the more value there is in federating deep web sources.

The innovative part, of course, isn’t performing the search. It’s the construction of the search. How does the Content Awareness Tool (CAT) create searches from user text?

Awareness tools can typically function without user intervention. The CAT uses a Javascript client, which captures user keystrokes while users compose text. Phrases or sentences, as denoted in English language text by punctuation such as commas, semicolons and periods, cause the CAT to perform an AJAX request to its server component. The server component may simply omit stop words and use the remaining terms as the basis for a query, or call an intermediate service for term extraction, such as Yahoo’s Term Extraction Web Service or OpenCalais from Reuters, and use the response to formulate queries against the defined search targets.
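To make the simplest path in that description concrete, here is a minimal sketch of how typed text might be cut into phrases at punctuation and reduced to query terms by dropping stop words. This is not the CAT's actual code: the function names `completed_phrases` and `phrase_to_query`, the tiny stop-word list, and the sample text are all assumptions made for illustration, and the real tool does the capturing in a JavaScript client and may hand term extraction to an external service instead.

```python
import re

# Small illustrative stop-word list (an assumption; a real tool would use a fuller one).
STOP_WORDS = {
    "a", "an", "and", "the", "of", "to", "in", "is", "are", "for",
    "on", "that", "this", "with", "by", "as", "it", "be", "or",
}

# Phrase boundaries named in the article: commas, semicolons and periods.
BOUNDARY = re.compile(r"[,;.]")


def completed_phrases(typed_text):
    """Return the phrases that punctuation has closed off so far."""
    parts = BOUNDARY.split(typed_text)
    # The last part has no closing punctuation yet, so it is still being typed.
    return [p.strip() for p in parts[:-1] if p.strip()]


def phrase_to_query(phrase):
    """Drop stop words and join the remaining terms into a query string."""
    words = re.findall(r"[A-Za-z0-9']+", phrase.lower())
    return " ".join(w for w in words if w not in STOP_WORDS)


# Hypothetical text a writer might have typed so far.
typed = "Federated search is useful for the deep web, and students are writing"
queries = [phrase_to_query(p) for p in completed_phrases(typed)]
print(queries)  # ['federated search useful deep web']
```

Splitting on every period is deliberately naive: abbreviations and decimal numbers would also trigger a search, which is one reason a term-extraction service could be the better choice for real text.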

While I like the idea of this and other JITTR (just-in-time text retrieval) tools, I do think there’s a chicken-and-egg problem here. If a student is staring at a blank document and doesn’t know where to start, how does he or she get ideas from the tool? That problem aside, I imagine students will really like this twist on search. They don’t have to create queries and they don’t have to select sources. Interesting information just comes to them.

This entry was posted on Monday, June 1st, 2009 at 9:03 pm and is filed under viewpoints.

2 Responses so far to “Query-free” federated search

  1. Andy
    July 2nd, 2009 at 9:11 am  

    I would love to add this tool to our library. Any idea if Los Alamos plans to release the code?

  2. Peter Noerr
    July 3rd, 2009 at 1:59 pm  

    As a historical note, I would point readers to a product now called “ClickSurge” from http://www.mediariver.com. It started life in 2006 as a tool called Watson; the company began as Open Road, morphed into Intellext, and then took its current name.

    The technology in this product does exactly what is described in Sol’s post: it creates searches from what the user types.

    Another product performing the same functionality, though in a different use case, is gClick from DowJones. This provides contextually relevant company and people information for a retrieved web page.

    Of course both of these are commercial products, so Andy would need to ask about how they can be used in a library web site.
