New Idea Engineering has an in-depth article, “20+ Differences Between Internet vs. Enterprise Search – And Why You Should Care (Part 1)”, all about how the popular search engines find content and how enterprise search engines should do it differently. The article provides a rather lengthy analysis of a number of issues that those in the market for enterprise search would benefit from understanding. In a nutshell, a search appliance is not going to be very effective in any but the smallest enterprises. Of interest to readers of this blog is a discussion of areas of concern to potential customers of federated search.
As a side note, I’m impressed with New Idea Engineering’s depth of coverage of the enterprise search industry. A few years ago, Abe and I wrote introductory articles about the deep web that were published in New Idea Engineering’s newsletter. I highly recommend the articles for those unfamiliar with the deep web. Abe and I have also referenced New Idea Engineering in a couple of blog posts: Federated Search: True Enterprise Search, and Federated search by any other name…
The section of the article about federated search in the enterprise has much food for thought. The author, New Idea’s co-founder and Vice President, Mark Bennett, lists eight federated search features of importance in the enterprise:
- Flexible rules for combining results from all of the engines searched
- Maintaining User Security Credentials
- Mapping User Security Credentials to other security domains
- Advanced Duplicate Detection and Removal
- Combining results list Navigators, such as Faceted Search links and Taxonomy Nodes.
- Handling other results list links such as “next page” and sort order.
- Translating user searches into the different search syntaxes used by the disparate engines.
- Extracting hits from HTML results, AKA “scraping”, hopefully without the need to custom code.
A sidebar elaborates on each of the features.
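To make two of these features concrete, here is a minimal sketch of combining results from several engines while removing duplicates. It is not taken from the article: the round-robin merge rule, the URL-based duplicate key, and the names `normalize_url` and `merge_results` are all assumptions chosen for illustration, and the sample hits are made up.

```python
from itertools import zip_longest
from urllib.parse import urlsplit


def normalize_url(url):
    """Reduce a URL to a comparison key: lowercased host plus path,
    ignoring the scheme and any trailing slash (an assumed, simplistic rule)."""
    parts = urlsplit(url)
    return parts.netloc.lower() + parts.path.rstrip("/")


def merge_results(result_lists):
    """Round-robin interleave ranked result lists, keeping only the first
    hit seen for each normalized URL."""
    seen = set()
    merged = []
    # zip_longest yields the 1st hit from every engine, then the 2nd, and so on,
    # padding shorter lists with None.
    for tier in zip_longest(*result_lists):
        for hit in tier:
            if hit is None:
                continue
            key = normalize_url(hit["url"])
            if key not in seen:
                seen.add(key)
                merged.append(hit)
    return merged


# Hypothetical hits from two hypothetical engines.
intranet = [
    {"title": "Travel policy", "url": "http://wiki.example.com/travel/"},
    {"title": "Expense form", "url": "http://wiki.example.com/expenses"},
]
docs = [
    {"title": "Travel Policy (PDF)", "url": "https://WIKI.example.com/travel"},
    {"title": "Per diem rates", "url": "http://docs.example.com/perdiem"},
]

merged = merge_results([intranet, docs])
for hit in merged:
    print(hit["title"])
```

A real product would need far more than this: the "flexible rules" in the first bullet suggest that the merge policy should be configurable, and "advanced" duplicate detection would compare content, not just URLs.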
The article makes this important point: the popular search engines perform full-text searches of unstructured text, but enterprise content is much more structured than content on the Internet at large. It often contains fielded data in databases, and it is often hierarchically organized. Federated search vendors that want to sell into the enterprise need to consider this important difference.
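Fielded data is also where the query translation feature from the list above comes into play. As a rough illustration only, the sketch below turns one fielded query into a Lucene-style query string and into a parameterized SQL statement. The function names, the `documents` table, and the `body`, `department`, and `year` columns are hypothetical; nothing here comes from the article or from any particular product.

```python
def to_lucene(terms, fields):
    """Build a Lucene-style query string: free-text terms plus field:"value"
    clauses. Values are assumed not to contain double quotes."""
    clauses = list(terms)
    clauses += [f'{name}:"{value}"' for name, value in fields.items()]
    return " AND ".join(clauses)


def to_sql(terms, fields, table="documents"):
    """Build a parameterized SQL statement and its parameter list.
    Field and table names are assumed to come from a trusted schema;
    only the values are passed as parameters."""
    conditions = ["body LIKE ?" for _ in terms]
    conditions += [f"{name} = ?" for name in fields]
    params = [f"%{term}%" for term in terms] + list(fields.values())
    statement = f"SELECT id FROM {table} WHERE " + " AND ".join(conditions)
    return statement, params


# One user query: a free-text term plus two fielded constraints.
query_terms = ["budget"]
query_fields = {"department": "finance", "year": "2008"}

lucene_query = to_lucene(query_terms, query_fields)
sql_query, sql_params = to_sql(query_terms, query_fields)

print(lucene_query)
print(sql_query, sql_params)
```

The same user intent ends up in two quite different shapes, which is why a federated search product needs a translation layer for each engine it talks to.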
Another important consideration: while users of popular search engines expect a simple relationship with the search application (a single search yields a single set of results), enterprise users expect more interactivity. These users will want features like faceted searching and clustering, which I’ve written about previously, and other mechanisms to exploit, organize, and navigate the enterprise data.
Read the New Idea article yourself, as it discusses a number of other differences between the Internet and the enterprise. Even the bulk of the discussion, which doesn’t directly mention federated search, raises important issues that federated search vendors need to consider when designing products for the enterprise. This line from the article sums up the subject best:
The Enterprise is not just “a small Internet”