Discovery services have begun to spring up. This article is my attempt to catalog and characterize them. Consider this article to be an introduction that sets the stage for future analysis articles.
What is a discovery service?
A discovery service is a search interface to pre-indexed metadata and/or full-text documents. Discovery services differ from federated search applications in that discovery services don’t search live sources. By searching pre-indexed data, discovery services return search results very quickly. Discovery services are touted as an evolution beyond federated search, and in some ways they are. Some discovery services either provide integration with federated search or provide an API for others to do the integration. I believe that hybrid “federated discovery” services are likely to prevail over pure discovery services, and I will dedicate an article to them.
It’s useful to note that discovery services aren’t new. IngentaConnect makes 4.5 million documents from over 13,000 publishers searchable. Infotrieve provides a document search and delivery service. And there’s Thomson Reuters’ Web of Science. These are just three examples of discovery services that have existed for a long time. What is new about the recently introduced discovery services is the focus on integration with other content, typically the library’s OPAC. I’ll discuss integration in a separate article.
What is a unified search index?
The terms “unified index” and “unified search index” are associated with discovery services. As the terms imply, a discovery service searches content from all of the sources it has access to through a single index. To produce that index, the discovery service must reconcile differences in the structure of metadata (e.g., names and contents of fields) from different sources.
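To make the field-reconciliation step concrete, here is a minimal sketch in Python. It is not how any particular discovery service works; the source names (`source_a`, `source_b`), their field names, and the unified schema (`title`, `authors`, `year`) are all hypothetical and chosen only for illustration.

```python
# Hypothetical per-source mappings: source field name -> unified field name.
FIELD_MAPS = {
    "source_a": {"ti": "title", "au": "authors", "py": "year"},
    "source_b": {"Title": "title", "Creator": "authors", "Date": "year"},
}


def to_unified(source, record):
    """Map one source record onto the (assumed) unified schema."""
    mapping = FIELD_MAPS[source]
    unified = {"source": source}
    # Rename the fields we know about; silently drop the rest.
    for field, value in record.items():
        if field in mapping:
            unified[mapping[field]] = value
    # Normalize authors to a list, whether the source sent a
    # semicolon-separated string or a list.
    authors = unified.get("authors", [])
    if isinstance(authors, str):
        authors = [a.strip() for a in authors.split(";") if a.strip()]
    unified["authors"] = authors
    # Normalize the year to an integer taken from the first four characters.
    if "year" in unified:
        unified["year"] = int(str(unified["year"])[:4])
    return unified


record_a = to_unified("source_a", {"ti": "Unified Indexes", "au": "Smith, J.; Doe, A.", "py": "2009"})
record_b = to_unified("source_b", {"Title": "Unified Indexes", "Creator": ["Smith, J."], "Date": "2009-07-01"})
```

After mapping, both records expose the same field names and value types, which is what allows them to sit side by side in one index.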
What is the motivation for discovery services?
In a word, speed. It’s no surprise that users don’t like to wait tens of seconds for their search results. In terms of response time, live searching can’t compete with index searching. A second factor driving the creation of discovery services is the willingness of publishers and content aggregators to form partnerships with developers of the services. Given the pressure to deliver search results in “Google time,” publishers have an incentive to cooperate with one another and with discovery service providers.
Some people say that a third driving factor is cost. While it’s possible that libraries could save money by accessing sources via discovery services rather than via federated search, cost figures are very difficult to come by for either approach, so cost may or may not be a factor in reality.
Another reason for the big interest in discovery services is that the onerous task of building, monitoring, and repairing connectors disappears since there are no connectors.
Unified indexes provide benefits due to their “homogenization” of metadata. Duplicates should be much easier to remove via discovery services than by federated search engines. Discovery services will also produce more “complete” results, i.e., results with titles, authors, publication dates, and other fields of interest that federated search can’t reliably get. With better fielded results it will be easier to cluster and otherwise organize search results.
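As an illustration of why homogenized metadata makes duplicate removal easier, the following Python sketch deduplicates records on a normalized title plus year. The matching rule and the record layout are assumptions made for this example; real services likely use more careful matching (identifiers such as ISSN or DOI, fuzzy comparison, and so on).

```python
import re


def dedup_key(record):
    """Build a comparison key: lowercased title with punctuation collapsed, plus year."""
    title = re.sub(r"[^a-z0-9]+", " ", record["title"].lower()).strip()
    return (title, record.get("year"))


def remove_duplicates(records):
    """Keep the first record seen for each key, preserving input order."""
    seen = {}
    for record in records:
        key = dedup_key(record)
        if key not in seen:
            seen[key] = record
    return list(seen.values())


records = [
    {"title": "Discovery Services: An Overview", "year": 2009, "source": "source_a"},
    {"title": "discovery services - an overview", "year": 2009, "source": "source_b"},
    {"title": "Discovery Services: An Overview", "year": 2010, "source": "source_b"},
]
unique = remove_duplicates(records)
```

This only works because every record already has a `title` and a `year` in a known form. A federated search engine that receives inconsistently fielded results from live sources has a much harder time building such a key.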
A potential benefit, but also a potential concern, is relevance ranking. It may be better or worse with discovery services depending on how search is performed. See the next section for further discussion.
Are there downsides to discovery services?
Yes – source lock-in. I’ve written, perhaps ad nauseam, about my concern that discovery services, if not integrated with federated search, force organizations that want a single search tool to choose one service or the other. Federated search is very important for organizations that have particular sources they want to search that are not available from one of the discovery services.
Even if an organization is happy with the set of sources provided through a discovery service, the availability of sources is dependent on the relationship with the publishers (and/or aggregators). Discovery services are too new to know how publisher relationships will evolve, especially given the competition.
It’s also not clear how discovery services perform search. Let’s say that a particular discovery service has an index that’s built from the metadata of its documents and not from their full text. In that case, searching the index won’t produce results that are as relevant as results obtained by searching the native source, assuming the native source provides full-text search capability.
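The gap between a metadata-only index and a full-text index can be shown with a toy inverted index in Python. This is a sketch under simple assumptions (whitespace tokenization, no ranking, two made-up documents), and it illustrates only one part of the concern: a document that mentions a term solely in its body is invisible to a metadata-only index.

```python
from collections import defaultdict


def build_index(docs, fields):
    """Build a term -> set of document ids index over the given fields."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for field in fields:
            for term in doc.get(field, "").lower().split():
                index[term].add(doc_id)
    return index


def search(index, term):
    """Return the sorted ids of documents containing the term."""
    return sorted(index.get(term.lower(), set()))


# Hypothetical documents, invented for this example.
DOCS = {
    1: {"title": "Library search tools", "full_text": "We compare connectors and harvesting approaches"},
    2: {"title": "Harvesting metadata", "full_text": "An overview of protocols"},
}

metadata_index = build_index(DOCS, ["title"])
full_text_index = build_index(DOCS, ["title", "full_text"])

metadata_hits = search(metadata_index, "harvesting")    # finds document 2 only
full_text_hits = search(full_text_index, "harvesting")  # finds documents 1 and 2
```

Whether a given discovery service indexes full text, metadata, or a mix per source is exactly the kind of detail that is hard to find out today.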
Another concern with discovery services is how current their indexes are. When one searches a source via federated search, the content is current because it is searched live. It’s not clear how frequently the discovery service indexes are updated.
The Oregon State University (OSU) Libraries evaluated WorldCat Local and other discovery services and recommended further evaluation and testing. See the OSU Libraries report and the “New Discovery Tools” article for more information. Links are in the references section.
Who is providing discovery services?
Company/Organization: EBSCO
Product: EBSCOhost Discovery Service
Product Web-Site: http://ebscohost.com/thisTopic.php?marketID=1&topicID=1245
Comments: See Library Journal article. See also EBSCOhost Integrated Search.
Company/Organization: Ex Libris
Product: Primo Central
Product Web-Site: http://www.exlibrisgroup.com/category/PrimoCentral
Comments: See press release.
Company/Organization: Innovative Interfaces
Product: Encore Discovery
Product Web-Site: http://encoreforlibraries.com/products
Comments: See Library Technology Guides article.
Company/Organization: University of Virginia Library
Product: Blacklight
Product Web-Site: http://www.lib.virginia.edu/digital/resndev/blacklight.html
Comments: Uses Solr to index and search text and/or metadata, and it has a highly configurable Ruby on Rails front-end.
Company/Organization: OCLC
Product: WorldCat Local
Product Web-Site: http://www.oclc.org/us/en/worldcatlocal/default.htm
Comments: Partnered with EBSCO so that “whether a search begins in OCLC’s WorldCat Local or EBSCOhost Integrated Search, users will have access to the resources of the entire library since catalog data will be available alongside journal information.” See press release.
Company/Organization: Oregon State University Libraries
Product: LibraryFind
Product Web-Site: http://libraryfind.org/home
Comments: Built with Ruby on Rails. See requirements.
Company/Organization: Serials Solutions
Product: Summon
Product Web-Site: http://www.serialssolutions.com/summon/
Comments: See Information Today article.
Demos of Discovery Services
- Blacklight Beta at University of Virginia Library
- LibraryFind Demo at Oregon State University Libraries
- Summon Beta at Dartmouth College Library
- Summon Beta at University of Liverpool Library
If you know of other demos I’ll happily add them here.
References
- Beyond federated search?
- Beyond federated search? The conversation continues
- Beyond Federated Search – Winning the Battle and Losing the War?
- Extensible Catalog Project
- New Discovery Tools for Online Resources From OCLC and EBSCO
- Oregon State University Libraries WorldCat Local Task Force Report to LAMP
- ProQuest Proposes Pathway to New Platform
- SLA2009: Unified Discovery Services
- The difference between federated search and discovery services
- Top Technology Trends: July 2009
- Unified Discovery Services