[ Editor's Note: I'm republishing this article, by Brian DeSpain, from the Deep Web Technologies Blog. It does a great job of explaining how their clustering solution adds value to federated search. ]
Clusters that think
One of the most interesting features of our Explorit search product is our clustering engine, which analyzes results and produces “clusters” that represent a new and powerful way to navigate search results. The true power of these clusters is often overlooked, for they superficially resemble the output generated by the keyword-based systems and fixed taxonomies of other search engines. Our clustering technology, however, is more akin to a document-discovery engine, which provides a significant improvement over the alternatives in the library world.
The Explorit engine takes a unique approach to clustering, drawn from Latent Semantic Analysis (LSA). We looked at some of the traditional methods of taxonomy generation (i.e., learning approaches, semantic knowledge bases, and word nets) and, after carefully examining their advantages and shortcomings, chose latent semantic analysis and a “description comes first” approach to provide a rich result analysis tool for customers. LSA is a fully automatic mathematical/statistical technique for extracting and inferring relations of contextual usage of words in search results. This technology provides a concept-based approach to analyzing and clustering the results in a result set. Applying the LSA approach, our clustering engine analyzes the relationships between a set of documents and the terms they contain to produce a set of concepts related to the results. In other words, our search engine can generate more sophisticated and nuanced result clusters, which help cut down on the time and number of tries it takes users to find the desired information.
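To make the idea of concept-based clustering more concrete, here is a minimal sketch of textbook LSA in plain Python. It is not the Explorit implementation, whose details are not given here. It builds a term-document matrix from a handful of made-up result snippets, finds the strongest latent concepts by power iteration on the document Gram matrix (a simple stand-in for a full SVD), assigns each document to its dominant concept, and labels each concept with its highest-loading terms. The names `DOCS`, `STOPWORDS`, `top_eigenpairs`, and `lsa_clusters` are all hypothetical, and the sketch assumes the top concepts are well separated.

```python
import math
from collections import defaultdict

# Made-up result snippets and a tiny stopword list, for illustration only.
DOCS = [
    "solar energy panels convert sunlight",
    "solar panels and wind turbines produce renewable energy",
    "wind turbines generate energy",
    "gene expression in cancer cells",
    "cancer cells and gene mutation",
    "gene therapy targets cancer",
]
STOPWORDS = {"and", "in", "the", "of", "a"}


def term_document_matrix(docs):
    """Return (vocab, matrix) where matrix[i][j] counts term i in document j."""
    tokenized = [[w for w in d.lower().split() if w not in STOPWORDS] for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    index = {w: i for i, w in enumerate(vocab)}
    matrix = [[0.0] * len(docs) for _ in vocab]
    for j, toks in enumerate(tokenized):
        for w in toks:
            matrix[index[w]][j] += 1.0
    return vocab, matrix


def top_eigenpairs(gram, k, iterations=500):
    """Top-k eigenpairs of a symmetric PSD matrix: power iteration plus deflation."""
    n = len(gram)
    work = [row[:] for row in gram]
    pairs = []
    for _ in range(k):
        v = [1.0 / math.sqrt(n)] * n
        for _ in range(iterations):
            w = [sum(work[r][c] * v[c] for c in range(n)) for r in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(v[r] * sum(work[r][c] * v[c] for c in range(n)) for r in range(n))
        pairs.append((lam, v))
        # Deflate: subtract the component just found so the next pass finds a new one.
        for r in range(n):
            for c in range(n):
                work[r][c] -= lam * v[r] * v[c]
    return pairs


def lsa_clusters(docs, k=2, label_size=2):
    """Group documents by dominant latent concept and label each concept."""
    vocab, a = term_document_matrix(docs)
    n = len(docs)
    # Gram matrix A^T A; its eigenvectors are the right singular vectors of A.
    gram = [[sum(row[i] * row[j] for row in a) for j in range(n)] for i in range(n)]
    concepts = top_eigenpairs(gram, k)

    def coord(i, j):
        # Coordinate of document j on concept i: singular value times loading.
        lam, v = concepts[i]
        return abs(math.sqrt(max(lam, 0.0)) * v[j])

    clusters = defaultdict(list)
    for j in range(n):
        clusters[max(range(k), key=lambda i: coord(i, j))].append(j)

    labels = {}
    for i, (_, v) in enumerate(concepts):
        # Term loadings are proportional to A v; the scale does not affect ranking.
        loading = [abs(sum(row[j] * v[j] for j in range(n))) for row in a]
        ranked = sorted(range(len(vocab)), key=lambda t: -loading[t])
        labels[i] = [vocab[t] for t in ranked[:label_size]]
    return dict(clusters), labels


if __name__ == "__main__":
    clusters, labels = lsa_clusters(DOCS, k=2)
    for concept, members in sorted(clusters.items()):
        print(labels[concept], [DOCS[j] for j in members])
```

In this toy data the two topics share no vocabulary, so the two strongest concepts line up with them cleanly. Real result sets overlap far more, and a production system would typically weight terms (for example with tf-idf) and use a proper SVD routine rather than power iteration.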
After the dust settled for Ken Varnum I had the opportunity to interview him about
Discovery services have begun to appear in the search landscape. Discovery services provide access to documents from publishers with which they have relationships by indexing the publishers’ metadata and/or full text. Discovery services are marketed to libraries where patrons appreciate near-instantaneous search results and where library staff is willing to restrict access to sources available from the service (and optionally the library’s own holdings). While these services tout themselves as improvements to federated search, the reality is that there is no alternative to federated search for a number of important applications.
When I think of real-time search and automated retrieval, I think of a federated search solution. Well, here’s a different kind of system.
Helen Mitchell, enterprise search consultant and one of our volunteer judges for this year’s Federated Search Blog contest, will be teaching a one-day course at SLA in New Orleans in June. Mitchell has over 30 years of experience in enterprise search. See her bio in 
