This is the third part in a series of articles that explore how federated search engines (FSEs), especially those that search the deep web, process the results returned by the search engines they query. Part I looked at screen scraping of search result data from search engines that only provide HTML intended for human consumption. Part II looked at the more pleasant situation of processing the XML that a growing number of search engines return. This article looks at the emerging OpenSearch standard and how FSEs can benefit from it.

Wikipedia summarizes OpenSearch pretty well:

OpenSearch is a collection of technologies that allow publishing of search results in a format suitable for syndication and aggregation. It is a way for websites and search engines to publish search results in a standard and accessible format. OpenSearch was developed by Amazon.com subsidiary A9 and the first version, OpenSearch 1.0, was unveiled by Jeff Bezos at the Web 2.0 in March, 2005. Draft versions of OpenSearch 1.1 were released during September and December 2005. The OpenSearch specification is licensed by A9 under the Creative Commons Attribution-ShareAlike 2.5 License.

The “format suitable for syndication and aggregation” mentioned above refers to two standards, RSS 2.0 and Atom 1.0, both of which present their data in XML.
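To make this concrete, here is a minimal sketch of how an FSE might read an OpenSearch response delivered as RSS 2.0. It assumes the OpenSearch 1.1 response elements (`totalResults`, `startIndex`, `itemsPerPage`) in the `http://a9.com/-/spec/opensearch/1.1/` namespace. The sample feed, the `example.com` URLs, and the `parse_opensearch_rss` function name are made up for illustration and do not come from any particular search engine.

```python
import xml.etree.ElementTree as ET

# Namespace used by the OpenSearch 1.1 response elements.
OPENSEARCH_NS = {"os": "http://a9.com/-/spec/opensearch/1.1/"}

# Hypothetical response; a real engine would return this over HTTP.
SAMPLE_RSS = """<rss version="2.0"
     xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <channel>
    <title>Example search: deep web</title>
    <link>http://example.com/search?q=deep+web</link>
    <description>Hypothetical search results</description>
    <opensearch:totalResults>2</opensearch:totalResults>
    <opensearch:startIndex>1</opensearch:startIndex>
    <opensearch:itemsPerPage>10</opensearch:itemsPerPage>
    <item>
      <title>First result</title>
      <link>http://example.com/1</link>
      <description>Snippet for the first result</description>
    </item>
    <item>
      <title>Second result</title>
      <link>http://example.com/2</link>
      <description>Snippet for the second result</description>
    </item>
  </channel>
</rss>"""


def parse_opensearch_rss(xml_text):
    """Extract paging metadata and result items from an RSS 2.0 feed."""
    root = ET.fromstring(xml_text)
    channel = root.find("channel")

    def paging_value(name):
        # OpenSearch elements are optional, so tolerate their absence.
        text = channel.findtext("os:" + name, namespaces=OPENSEARCH_NS)
        return int(text) if text is not None else None

    items = [
        {
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "description": item.findtext("description"),
        }
        for item in channel.findall("item")
    ]
    return {
        "total_results": paging_value("totalResults"),
        "start_index": paging_value("startIndex"),
        "items_per_page": paging_value("itemsPerPage"),
        "items": items,
    }


if __name__ == "__main__":
    results = parse_opensearch_rss(SAMPLE_RSS)
    print(results["total_results"], "total results")
    for entry in results["items"]:
        print(entry["title"], entry["link"])
```

Compared with screen scraping, nothing here depends on a particular site's page layout: the same function should work against any engine that returns RSS 2.0 with the OpenSearch elements, which is the main attraction for an FSE. An Atom 1.0 response would need a similar function that reads `entry` elements in the Atom namespace instead.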

