I think it’s safe to say that, given the choice between searching a content source in real time and searching it from an index, we’d all opt for the index. This assumes, of course, that the index is as current as the live source it stands in for. I’ll be the first to admit that federated search is a necessary evil. But necessary it is. I’ve been hearing people talk about life beyond federated search and I just don’t get it. Until every single content provider makes the full text of all of its federatable documents available for harvesting and indexing, federated search isn’t going away.
Serials Solutions’ new Summon Unified Discovery Service is touted as going beyond federated search. The promotional video boasts that there are no connectors, no inconsistent metadata, and no waiting for results to come back. This is all well and good, but how do you deal with quality content sources that are not available through the service?
I need to say that I don’t have an objection to Summon. Serials Solutions has done a very impressive job of lining up a number of major publishers to make tons of content available to subscribers. And I’m not dissing their service just because Serials Solutions is a competitor to blog sponsor Deep Web Technologies. My one and only complaint is with the message that the service somehow eliminates the need for federated search.
I do think that harvesting and indexing technologies have a very important role in search solutions. In particular, when you have the full text of articles you can perform much better relevance ranking than when you’ve got only title, author, and abstract or snippet. But you can’t (or shouldn’t) ignore content that can only be federated. Hybrid systems make sense to me. You index as much as you possibly can and federate what you can’t.
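To make the hybrid idea concrete, here is a minimal sketch in Python of what "index what you can, federate what you can't" might look like. Everything in it is an assumption made for illustration: the `hybrid_search` function, the result shape (a dictionary with `title`, `url`, and `source` keys), the stub sources, and the choice to de-duplicate by URL. It is not how Summon or any particular federated search product works.

```python
from concurrent.futures import ThreadPoolExecutor, wait
from typing import Callable, Dict, List

# Hypothetical result shape: {"title": ..., "url": ..., "source": ...}
Result = Dict[str, str]
Searcher = Callable[[str], List[Result]]


def hybrid_search(
    query: str,
    index_search: Searcher,
    federated_sources: List[Searcher],
    timeout: float = 5.0,
) -> List[Result]:
    """Search the local index first, then add hits from live sources."""
    # Indexed content comes back immediately, so it goes first.
    results = list(index_search(query))
    if not federated_sources:
        return results

    # Query every federated source in parallel, waiting at most `timeout` seconds.
    pool = ThreadPoolExecutor(max_workers=len(federated_sources))
    futures = [pool.submit(source, query) for source in federated_sources]
    done, _ = wait(futures, timeout=timeout)
    pool.shutdown(wait=False)  # do not block on sources that are still running

    # Merge in source order, skipping slow or failed sources and duplicate URLs.
    seen = {r["url"] for r in results}
    for future in futures:
        if future not in done or future.exception() is not None:
            continue
        for r in future.result():
            if r["url"] not in seen:
                seen.add(r["url"])
                results.append(r)
    return results


# Stub sources standing in for a real index and real remote connectors.
def local_index(query: str) -> List[Result]:
    return [{"title": "Indexed article", "url": "http://example.org/a", "source": "index"}]


def live_source(query: str) -> List[Result]:
    return [
        {"title": "Indexed article", "url": "http://example.org/a", "source": "live"},
        {"title": "Live report", "url": "http://example.org/b", "source": "live"},
    ]


def broken_source(query: str) -> List[Result]:
    raise ConnectionError("source unavailable")


hits = hybrid_search("federated search", local_index, [live_source, broken_source])
```

In this sketch the indexed copy of a document wins over a federated duplicate, and a source that fails or times out is simply left out rather than holding up the whole search. A real system would also have to re-rank the merged list, which is exactly where the full text in the index gives you an advantage.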
One of the critical roles that federated search plays is to provide access to the sources of a client’s choosing. I’ve written about the importance of federated search engines being composed of diverse content sources. WorldWideScience.org is an excellent example of a federated search engine that searches diverse sources, specifically global sources from national governments and from organizations that are blessed by their governments. These are quality sources that are providing their research results and other scientific documents to the public for free. How does access to free scholarly content fit into Summon’s business model?
The danger with relying on any one service to provide you with access to its indexed content is that the service’s criteria for source selection may not be yours. That’s why I recommend hybrid solutions to get the most out of indexed content and the freedom of including federated sources of your choosing as well.