Yesterday I received an email from a reader of this blog asking if I knew of any performance benchmarks for federated search engines. I replied, telling the reader that I would address the question in this blog. Here’s my reply:
I’m not aware of any benchmark studies and, even if I were, I’d be very suspicious of their findings because too many variables determine the performance of federated search engines. These questions come to mind:
- How do you measure speed? Some federated search engines return results incrementally as they get results from individual sources. Are you measuring the time it takes to get results from the first (fastest) source or from all of the sources?
- Response time from a source will vary depending on the source, the time of day, and the load on the content provider’s server. The server load will be influenced by how many other searches are running. How can you baseline the response time to benchmark against?
- Response time is also heavily influenced by the load on all of the networks between the client browser, the federated search application server, and the content provider’s server. How will you get a baseline for that response time?
- A particular federated search engine may take much longer to process results if it is searching five sources rather than one. That’s yet another variable to consider. Will you be able to isolate performance of the federated search application from that of the sources being searched?
- Performance from content sources will be affected by the nature of a query and by the number of results returned, and that will vary by source. Broad queries that return many results are often processed slowly by the remote search engine. How will you be able to factor that effect into your performance testing?
- Are you comparing apples to apples? If the federated search engine you’re evaluating returns results quickly but evaluates them only superficially, performing poor (or no) relevance ranking, then quick response time is a consolation prize that masks a real problem.
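To make the first question concrete, here is a rough Python sketch of how the same federated query can yield two very different "speed" numbers. It is only an illustration: the source names, the latencies, and the `query_source` and `measure` helpers are all made up, and the sources are simulated with sleeps rather than real network calls.

```python
import asyncio
import time

# Hypothetical per-source latencies in seconds (assumed values for illustration).
# Real sources vary with time of day, server load, and network conditions.
SIMULATED_LATENCY = {"source_a": 0.05, "source_b": 0.15, "source_c": 0.30}


async def query_source(name, delay):
    """Stand-in for a real source query: waits, then returns the source name."""
    await asyncio.sleep(delay)
    return name


async def measure(latencies):
    """Return (time to first result, time to all results) for one federated query."""
    start = time.perf_counter()
    tasks = [asyncio.create_task(query_source(n, d)) for n, d in latencies.items()]
    first = None
    # as_completed yields results in the order the sources finish.
    for finished in asyncio.as_completed(tasks):
        await finished
        if first is None:
            first = time.perf_counter() - start
    total = time.perf_counter() - start
    return first, total


if __name__ == "__main__":
    first, total = asyncio.run(measure(SIMULATED_LATENCY))
    print(f"first result after {first:.2f}s, all results after {total:.2f}s")
```

With these made-up latencies, the time to the first result is set by the fastest source and the time to all results by the slowest, so a benchmark that quotes one number without saying which it is tells you very little.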
In a future post I will go into much greater depth about how you should evaluate federated search vendors for performance. In the meantime, if you find a vendor who gives you performance benchmarks against its competition, you may want to show them this post.