Last week, federated search vendor MuseGlobal announced that it had partnered with consulting firm Adhere Solutions to provide federated search for the Google Search Appliance. The two companies have developed an extension to Google’s Search Appliance, dubbed the “All Access Connector.” The press release announcing the partnership lists a number of features:
- Access to hundreds of millions of pages of content from over 5,400 sources, all through the Google interface.
- Simple one-click entry to external sources with no additional log-in requirements through a powerful proxy server.
- Non-stop and instantaneous, 24-hour content retrieval through automatic updates to connectors as changes take place to target sources.
- A much lower cost than the manual acquisition and indexing of all desired sources.
- Compliance with current authentication and security policies with a role-based search access model.
- Pre-built and constantly monitored connectors that require no coding to implement.
- Easy navigation of search results by source, subject, date and other meta-data categories.
You may be wondering what the partnership means for the federated search industry. I have some thoughts.
First, I think that this kind of offering has been a long time coming and will increase awareness of federated search technology. This can only be a good thing for the growth of the federated search industry, much as Microsoft’s acquisition of FAST will, I imagine, increase enterprise search technology’s presence in the enterprise.
I don’t believe that the partnership is a threat to the federated search industry just yet because the majority of federated search vendors don’t sell to the enterprise. Microsoft and Autonomy (which acquired Verity) should be concerned. And, when hybrid federated search/enterprise search environments grow in popularity, then I think federated search vendors will have cause for concern. Right now, I don’t believe that most customers who are looking for a federated search solution (corporate, university, and public libraries, and research organizations) are going to be considering the Google Search Appliance, because a search appliance with a federated search extension is not the same as a federated search solution. Plus, hopefully customers will realize that relevance ranking is, or should be, a key criterion for evaluating a solution.
This brings me to my next, and most important, point: relevance ranking of billions of documents on the public internet is different from relevance ranking in the enterprise, which is different again from relevance ranking in a research-oriented environment. Notably missing from the press release is any mention of relevance ranking. Google has built a very successful business out of its PageRank algorithm, which ranks documents according to their popularity. This approach works exceptionally well when there are billions of documents and when users are looking for facts and figures, popular information, or lay content.
The popularity approach, as popular as it is, doesn’t work all that well in a research environment, where popularity doesn’t matter but authoritativeness does; Google isn’t designed to be authoritative. The enterprise has its own requirements for effective relevance ranking as well. For a very detailed discussion of how the requirements of enterprise search and federated search differ from those of search on the public internet, read “Federated search in the enterprise” and the article it references. Also, see “What determines quality of search results?” for more than you ever wanted to know about the complexity and importance of this subject.
Computerworld published an article in January raising the question of who does better relevance ranking – Microsoft’s FAST enterprise search or Google’s enterprise search appliance. While the article doesn’t provide a clear answer — it presents viewpoints from both sides — it does remind us that the environment really does matter when you’re ranking documents. One approach does not fit all.
I have a lingering question about how relevance ranking is going to be performed in the Google Search Appliance, especially if crawled and indexed content is going to be aggregated with federated content. This is an important question because I imagine the two parties’ ranking algorithms, and the results each would display to the user, are going to be very different. Google does have a web page where it speaks in very general terms about its relevance ranking algorithms, but this is for its enterprise search, where Google can crawl and index all content. Ranking in a federated search environment is a different beast.
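To make the difficulty concrete, here is a minimal sketch of one generic way to merge a ranked list from a local index with a ranked list returned by a federated source when the two sets of relevance scores are not comparable. It uses reciprocal rank fusion, a well-known rank-merging technique that looks only at each document's position in each list. To be clear, this is my illustration only: nothing in the press release says that Google, MuseGlobal, or Adhere use this method, and the function name and document identifiers are made up for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Hypothetical helper for illustration; not any vendor's actual algorithm.
    Each document earns 1 / (k + rank) from every list it appears in, so only
    positions matter and the sources' raw scores never need to be compared.
    k=60 is a conventional default for this technique.
    """
    scores = defaultdict(float)
    for results in ranked_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken alphabetically for stability.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Made-up result lists: one from a crawled index, one from a federated source.
indexed_results = ["intranet-policy", "wiki-faq", "hr-handbook"]
federated_results = ["journal-article", "wiki-faq", "patent-123"]

merged = reciprocal_rank_fusion([indexed_results, federated_results])
print(merged)
```

A document that both sources return ("wiki-faq" here) rises to the top because it collects a contribution from each list. Notice what the sketch ignores: authoritativeness, popularity, and the quality of each source, which are exactly the things that make ranking in this setting hard.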
For better or worse, I think this offering will get many potential customers to view federated search as a commodity. Thus, it will force the high-end federated search vendors to work even harder than they do now to differentiate themselves from their low-end competitors. I can see it now: prospective customers will start using Google as a reference for product comparisons and will expect vendors to provide cheap and simple solutions. This is not such a bad thing, since it’s not news to anyone that users want federated search to be as simple to use as Google.
I think, overall, that the partnership is good for the industry. It will allow (and force) vendors to differentiate themselves. It will give customers more choices. It may drive prices of federated search offerings down and lead to more commodity offerings from Google copycats. And, the partnership brings federated search users closer to the “Google experience” they all want.