I wrote on Tuesday about those “things” that contain the content that your federated search application is searching. Several of you left comments and, while there’s no consensus from the vendor community, “source” seems to be in fairly common use.

Here’s another naming question. What is the difference between metasearch, federated search, and broadcast search? I know that I don’t have the full picture of how these names have been used historically and how they’ve evolved. A little while ago there was a discussion on code4lib. I believe it was Mike Taylor who launched the conversation with this question:

I, and most of the people I’ve worked with, have been using the terms “metasearch”, “federated search”, “broadcast search” and “distributed search” synonymously for years. Have they now settled down into having distinct meanings? If anyone could summarise, I’d be grateful.

I’ve always put “metasearch” into its own category. Peter Noerr expressed it well:

It seems in the broader web world we in the library world have lost “metasearch”. That has become the province of those systems (mamma, dogpile, etc.) which search the big web search engines (G,Y,M, etc.) primarily for shoppers and travelers (kayak, mobissimo, etc.) and so on. One of the original differences between these engines and the library/information world ones was that they presented results by Source – not combined. This is still evident in a fashion in the travel sites where you can start multiple search sessions on the individual sites.

The other terms I’d always lumped together.

I get that the term “federation” is not clear. Not everyone thinks of federation as real-time search. Eric Lease Morgan makes the excellent point that we now have discovery services being introduced:

But I believe we are also seeing a new type of index manifesting itself, and this new index has yet to be named. Specifically, I’m thinking of the index where various types of content is aggregated into a single index and then queried. For example, instead of providing a federated search against one or more library catalogs, a Z39.50 accessible journal article index, a local cache of harvested OAI content, etc., I think we are beginning to see all of these content silos (and others) brought together into a single (Solr/ Lucene) index and searched simultaneously. I’m not sure, but I think this is how Summon works.

Peter Noerr reminds us that not everyone agrees about what federated search is:

Fed Search has the problem of Ray’s definition of Federated, to mean “a bunch of things brought together”. It can be broadcast search (real time searching of remote Sources and aggregation of a virtual result set), or searching of a local (to the searcher) index which is composed of material federated from multiple Sources at some previous time. We tend to use the term “Aggregate Index” for this (and for the Summon-type index) Mixed content is almost a given, so that is not an issue. And Federated Search systems have to undertake in real time the normalization and other tasks that Summon will be (presumably) putting into its aggregate index.

Jonathan Rochkind made the same point in response to my blog article in which I accused another blogger of incorrectly using the term “federated search” in reference to Google Scholar:

Actually some people insist that “federated search” actually ONLY refers to an aggregated index like Google Scholar, and _shouldn’t_ be used to refer to broadcast search!
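For readers who think in code, here is a rough sketch of the two readings of “federated search” quoted above. It is only an illustration under loud assumptions: the two in-memory Sources, their titles, and every function name are hypothetical, and no vendor's product is claimed to work this way. Real Sources would be remote systems reached over the network.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "Sources" (assumption for illustration only).
# Real ones would be remote catalogs, article databases, and so on.
SOURCES = {
    "catalog": ["Federated search basics", "Library metadata handbook"],
    "articles": ["Broadcast search latency", "Federated identity management"],
}


def search_source(name, query):
    """Search one Source at query time and return (source, title) hits."""
    q = query.lower()
    return [(name, title) for title in SOURCES[name] if q in title.lower()]


def broadcast_search(query):
    """Reading 1: send the query to every Source in real time, then merge."""
    with ThreadPoolExecutor() as pool:
        per_source = list(pool.map(lambda name: search_source(name, query), SOURCES))
    return sorted(hit for hits in per_source for hit in hits)


def build_aggregate_index():
    """Reading 2, step 1: harvest every Source ahead of time into one index."""
    index = {}
    for name, titles in SOURCES.items():
        for title in titles:
            for word in title.lower().split():
                index.setdefault(word, set()).add((name, title))
    return index


def aggregate_search(index, query):
    """Reading 2, step 2: answer the query from the local index only."""
    return sorted(index.get(query.lower(), set()))


AGGREGATE_INDEX = build_aggregate_index()
print(broadcast_search("federated"))
print(aggregate_search(AGGREGATE_INDEX, "federated"))
```

Both calls return the same hits for this toy data. The difference is where the work happens: the broadcast version contacts each Source and merges results at query time, while the aggregate version does its gathering at harvest time and never touches the Sources when the user searches.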

Interestingly enough, I ran into this question of “What exactly is federated search?” a year and a half ago and blogged about it. My post was a response to an article in New Idea Engineering in which author Miles Kehoe wrote the following:

“… our customer – whose search team is staffed with equally bright folks – were of the opinion that a federated search meant that content from a number of different data stores would be indexed into a single search index, where users could enter queries and see results from all of the data stores would come from that one master index.”

So, this subject is a big mess. What’s the takeaway? Don’t expect others to know what you mean when you use any of these terms. Peter Noerr summed it up best:

“So we just do a lot of explaining – with pictures – to people.”

