I was absolutely delighted to read a recent article by Barbara Quint, editor-in-chief of Information Today’s Searcher magazine. “Federated Searching: Good Ideas Never Die, They Just Change Their Names” reminds us that federated search existed before the term became popular:
Even back in the days when only professional searchers accessed online databases, searchers wanted some way to find answers in multiple files without having to slog through each database one at a time. In those days, the solution was called multi-file or cross-file searching, e.g. Dialog OneSearch or files linked via Z39.50 (ANSI/NISO standard for data exchange).
A little sidebar: I heard about this article from my brother Abe (founder, President, and CTO of blog sponsor Deep Web Technologies) before it came onto my Google Alerts radar. Abe was at the NFAIS Conference where he had gone to deliver a presentation on multilingual federated search. At the conference, Abe had a conversation with Iris Hanney, President of Unlimited Priorities, a support services company for businesses. It turns out that Barbara Quint is a member of the Unlimited Priorities team and produced this article for one of their publications, DCLNews. And, that’s how Abe heard about the article, in which he’s mentioned. Small world!
The article goes on to explain what federated search is and then dives into a discussion of what makes federated search effective. I love this look at effectiveness because it reinforces the point that not all federated search systems are created equal. This article appears at the perfect time since I just wrote about how one university had a bad experience with federated search and turned its back on the technology.
Abe and I have frequent discussions about how many people bash federated search as a technology because of one bad experience with a vendor. Well, one vendor does not define the entire industry. Different vendors implement the technology differently. Here’s one example. Quint brings up the classic problem of searching for names. In a nutshell, the issue is how to search for “Albert Einstein” in a database. The answer will vary per database. Some databases expect “Albert Einstein”, some want “Einstein, Albert” and some will only match “Einstein, A.” Name searching, also known as author searching, is a great example of how messy federated search can be. It also shows how important it is for the human developing the search interface to the source (the connector) to think through the messy details and to figure out, for each source, what the expected name format is for searching.
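To make the name problem concrete, here is a minimal sketch in Python of what a per-source author transformation might look like. Everything in it is hypothetical: the source names (source_a, source_b, source_c), the function names, and the assumption that the last word of the name is the family name. It is not how Deep Web Technologies or any other vendor builds connectors; it only illustrates the idea that each source gets its own rule.

```python
from typing import Callable, Dict, Tuple


def split_name(full_name: str) -> Tuple[str, str]:
    """Split a name into (given, family).

    Assumption for this sketch: the last word is the family name.
    Real names are messier than that.
    """
    parts = full_name.strip().split()
    if len(parts) < 2:
        return "", full_name.strip()
    return " ".join(parts[:-1]), parts[-1]


def natural_order(full_name: str) -> str:
    """'Albert Einstein' stays 'Albert Einstein'."""
    given, family = split_name(full_name)
    return f"{given} {family}".strip()


def family_first(full_name: str) -> str:
    """'Albert Einstein' becomes 'Einstein, Albert'."""
    given, family = split_name(full_name)
    return f"{family}, {given}" if given else family


def family_initial(full_name: str) -> str:
    """'Albert Einstein' becomes 'Einstein, A.'."""
    given, family = split_name(full_name)
    return f"{family}, {given[0]}." if given else family


# Hypothetical sources, each mapped to the name format it expects.
AUTHOR_FORMATS: Dict[str, Callable[[str], str]] = {
    "source_a": natural_order,
    "source_b": family_first,
    "source_c": family_initial,
}


def author_query(source_id: str, full_name: str) -> str:
    """Rewrite the user's author entry into the format the source expects."""
    return AUTHOR_FORMATS[source_id](full_name)


if __name__ == "__main__":
    for source_id in AUTHOR_FORMATS:
        print(source_id, "->", author_query(source_id, "Albert Einstein"))
```

The table of formats is the part a human has to fill in, one source at a time, by reading documentation and running test searches. The code itself is the easy bit.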
Quint gives an example of the wrong way to do author searching:
In an article written by Miriam Drake that appeared in the July-August 2008 issue of Searcher entitled “Federated Search: One Simple Query or Simply Wishful Thinking,” a leading executive of a federated service selling to library vendors was quoted as saying, “We simply search for a text string in the metadata that is provided by the content providers - if the patron’s entry doesn’t match that of the content provider, they may not find that result.” Ah, the tough luck approach!
Yep, that approach works great except when it doesn’t work.
Then Quint gives the alternative approach: use human intelligence (and sweat) to get it right:
In contrast, Abe Lederman, founder and president of Deep Web Technologies (www.deepwebtech.com), a leading supplier of federated search technology, responded about his company’s work with Scitopia, a federated service for scientific scholarly society publishers, “We spend a significant amount of effort to get it as close to being right as possible for Scitopia where we had much better access to the scientific societies that are content providers. It is not perfect and is still a challenge. The best we can do is transformation.”
Yes, I’m paid by Deep Web Technologies to blog so I do toot their horn from time to time. But, loyalty aside, I’m proud that Barbara Quint is publicly acknowledging what I believe to be Deep Web Technologies’ greatest strength - their connectors. Plus, I was a full-time staff member at Deep Web Technologies when Scitopia was being developed. I took part in a number of meetings with the Scitopia partners where we discussed, at great length, how to get author searching right. As Abe points out in the quote above, we had a relationship with the Scitopia partners and could influence changes in their search engines. So Scitopia isn’t a common case, but it does illustrate what is possible in a partnership, especially when the vendor really cares about getting it right.
Author searching is just one example of where human intelligence is the secret sauce that differentiates the really bland federated search from the, umm, potent ones. Great metaphor, huh?! Getting Boolean, phrase, and wildcard searching right is also important. And, the only way to get these elements right is to pore through the source’s documentation and to experiment, experiment, experiment. The work may be difficult and tedious but it’s the only way to get outstanding results!
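The same per-source thinking applies to Boolean, phrase, and wildcard syntax. Here is a rough Python sketch, under some loud assumptions: the user types "*" for truncation and every term is ANDed together, and the three sources and their syntaxes (SOURCE_X, SOURCE_Y, SOURCE_Z) are invented for illustration. No real database's query language is being described here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SourceSyntax:
    """How one (hypothetical) source spells common query features."""
    and_op: str        # Boolean AND keyword; "" means AND is implicit
    phrase_open: str   # delimiter that opens a phrase
    phrase_close: str  # delimiter that closes a phrase
    wildcard: str      # truncation character; "" means unsupported


def render_term(term: str, syntax: SourceSyntax) -> str:
    """Rewrite one user term into the source's syntax."""
    if term.endswith("*"):
        # Swap the user's "*" for the source's truncation character.
        # If the source has none, fall back to the bare stem.
        term = term[:-1] + syntax.wildcard
    if " " in term:
        # Multi-word terms are treated as phrases.
        return f"{syntax.phrase_open}{term}{syntax.phrase_close}"
    return term


def render_query(terms: List[str], syntax: SourceSyntax) -> str:
    """AND all terms together the way this source expects."""
    separator = f" {syntax.and_op} " if syntax.and_op else " "
    return separator.join(render_term(t, syntax) for t in terms)


# Invented syntaxes, standing in for what a connector author would
# learn from each source's documentation and from experimenting.
SOURCE_X = SourceSyntax(and_op="AND", phrase_open='"', phrase_close='"', wildcard="*")
SOURCE_Y = SourceSyntax(and_op="", phrase_open="{", phrase_close="}", wildcard="$")
SOURCE_Z = SourceSyntax(and_op="AND", phrase_open='"', phrase_close='"', wildcard="")

if __name__ == "__main__":
    user_terms = ["federated search", "connector*"]
    for syntax in (SOURCE_X, SOURCE_Y, SOURCE_Z):
        print(render_query(user_terms, syntax))
```

Notice that SOURCE_Z quietly drops the wildcard because it has no truncation character. Whether that fallback is acceptable, or whether the connector should do something smarter, is exactly the kind of decision that takes a human.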
Tags: federated search
2 Responses so far to "Human intelligence: the secret sauce of federated search"
March 2nd, 2010 at 4:55 pm
Bashing federated search is like bashing cake. You can have bad cake, but who thinks cake is a bad idea? You can quote me on that.
March 3rd, 2010 at 7:04 am
I don’t think you are tooting DWT’s horn too loudly here, Sol. Really just pointing out what any of the vendors who have been in this business for any length of time (such as both DWT and MuseGlobal) have found out by the sweat of their collective brows.
The greatest value of any federated search engine lies in its Connectors (by any name) and the degree to which they are matched to the peculiarities of each individual Source.
Everyone recognizes that field mapping is the level one requirement, but the level two of matching search languages starts to separate the sheep from the goats. Handling complex elements in queries (like names) and dealing with logic, limiters, filters, languages, and the like is where the field narrows to the true professionals.
I agree with both Abe and Barbara that these levels of compatibility between Connector and Source are not obtainable other than by the application of human intelligence. (Opinion - discussion welcome)