<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: The &#8220;lowest common denominator&#8221; myth</title>
	<atom:link href="http://federatedsearchblog.com/2009/06/22/the-lowest-common-denominator-myth/feed/" rel="self" type="application/rss+xml" />
	<link>http://federatedsearchblog.com/2009/06/22/the-lowest-common-denominator-myth/</link>
	<description>Covers topics related to federated search and the deep web</description>
	<lastBuildDate>Mon, 30 Jan 2012 05:01:14 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: Peter Noerr</title>
		<link>http://federatedsearchblog.com/2009/06/22/the-lowest-common-denominator-myth/comment-page-1/#comment-28215</link>
		<dc:creator>Peter Noerr</dc:creator>
		<pubDate>Thu, 25 Jun 2009 01:39:21 +0000</pubDate>
		<guid isPermaLink="false">http://federatedsearchblog.com/?p=683#comment-28215</guid>
		<description>Since I started &quot;LCD&quot;, I&#039;ll have a go.

As an operation I agree with the comments by Dave, Sol and Jonathan. LCD to me is literally where the Fed Search system issues a search of the same functionality (but possibly different syntax) to each Source. That implies that the functionality of the search is limited to that available from the least functional Source. I have deliberately used &quot;functionality&quot; to be as inclusive in the definition as possible. Indices, operators, relations, limits, even vocabularies are all examples of &quot;functionality&quot;.

Thus Dave&#039;s A, B, C case is a perfect example of what I would call an LCD situation - in the second instance. Here the search is issued only as a full text search, because that is the only functionality supported by all three Sources. The other instance (where only A &amp; B are searched) is dealt with below.

A couple of comments need to attach to this. Firstly it is interesting to ponder that the Fed search system must know enough about the different Sources to be able to determine what the &quot;common&quot; functions are. If it can do this, it is halfway(-ish) towards being able to handle Source Specific Searches (SSS). So why dumb down? Who knows? I don&#039;t.

The second point is that the first of Dave&#039;s instances (search A&amp;B only) uses what we call a &quot;strict&quot; mode for the search. That is: if the Source can&#039;t handle the search in its entirety, then fail for that Source, and return 0 results. This is not LCD, but rather a strange form of &quot;HCF&quot; (if we want to stick with basic arithmetic acronyms) where only those Sources which meet ALL the requirements (support all the functionality) are searched. Different category, but a problem nonetheless for certain types of searches.

Where non-LCD searching is possible (SSS as used above) then it is possible to switch to a &quot;relaxed&quot; mode and allow some portion of the search to be processed and produce at least some results. (You want details - I knew you would. OK, the most obvious example is where a Source does not support a particular index and any terms for that Source are mapped to one it does support - in 99% of cases this is &quot;keyword&quot; or its functional equivalent.)

This mapped and relaxed operation does allow results from Sources which would be &quot;0&quot;s to a strict search and, in our experience, most people would rather get something back which is somewhat like what they asked for, rather than nothing. Often because they weren&#039;t too sure about the query in the first place. Again a good result for some cases, but not all. However, take the union of the two cases and you cover what virtually everybody would want - and what you would get from the native Sources - always an important touchstone. (And this whole comment raises a host more issues for Sol to tease out into other posts - like the issue of notifying users of what the search has done.)</description>
		<content:encoded><![CDATA[<p>Since I started &#8220;LCD&#8221;, I&#8217;ll have a go.</p>
<p>As an operation I agree with the comments by Dave, Sol and Jonathan. LCD to me is literally where the Fed Search system issues a search of the same functionality (but possibly different syntax) to each Source. That implies that the functionality of the search is limited to that available from the least functional Source. I have deliberately used &#8220;functionality&#8221; to be as inclusive in the definition as possible. Indices, operators, relations, limits, even vocabularies are all examples of &#8220;functionality&#8221;.</p>
<p>Thus Dave&#8217;s A, B, C case is a perfect example of what I would call an LCD situation &#8211; in the second instance. Here the search is issued only as a full text search, because that is the only functionality supported by all three Sources. The other instance (where only A &amp; B are searched) is dealt with below.</p>
<p>A couple of comments need to attach to this. Firstly it is interesting to ponder that the Fed search system must know enough about the different Sources to be able to determine what the &#8220;common&#8221; functions are. If it can do this, it is halfway(-ish) towards being able to handle Source Specific Searches (SSS). So why dumb down? Who knows? I don&#8217;t.</p>
<p>The second point is that the first of Dave&#8217;s instances (search A&amp;B only) uses what we call a &#8220;strict&#8221; mode for the search. That is: if the Source can&#8217;t handle the search in its entirety, then fail for that Source, and return 0 results. This is not LCD, but rather a strange form of &#8220;HCF&#8221; (if we want to stick with basic arithmetic acronyms) where only those Sources which meet ALL the requirements (support all the functionality) are searched. Different category, but a problem nonetheless for certain types of searches.</p>
<p>Where non-LCD searching is possible (SSS as used above) then it is possible to switch to a &#8220;relaxed&#8221; mode and allow some portion of the search to be processed and produce at least some results. (You want details &#8211; I knew you would. OK, the most obvious example is where a Source does not support a particular index and any terms for that Source are mapped to one it does support &#8211; in 99% of cases this is &#8220;keyword&#8221; or its functional equivalent.)</p>
<p>This mapped and relaxed operation does allow results from Sources which would be &#8220;0&#8221;s to a strict search and, in our experience, most people would rather get something back which is somewhat like what they asked for, rather than nothing. Often because they weren&#8217;t too sure about the query in the first place. Again a good result for some cases, but not all. However, take the union of the two cases and you cover what virtually everybody would want &#8211; and what you would get from the native Sources &#8211; always an important touchstone. (And this whole comment raises a host more issues for Sol to tease out into other posts &#8211; like the issue of notifying users of what the search has done.)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sol</title>
		<link>http://federatedsearchblog.com/2009/06/22/the-lowest-common-denominator-myth/comment-page-1/#comment-28203</link>
		<dc:creator>Sol</dc:creator>
		<pubDate>Wed, 24 Jun 2009 19:53:18 +0000</pubDate>
		<guid isPermaLink="false">http://federatedsearchblog.com/?p=683#comment-28203</guid>
		<description>Everyone,

I think of the myth as the blanket statement that federated search engines rollover/suck because some sources are not easy to search well. 

But, I&#039;m interested to know what you all think LCD means. Abe and I discussed it for a while and realized that we weren&#039;t quite sure what it means. So, please tell me.

At the same time there&#039;s a different discussion about how some federated search engines handle tricky sources better than others. I&#039;ll do what I can to find examples of smart connectors searching sources particularly well.</description>
		<content:encoded><![CDATA[<p>Everyone,</p>
<p>I think of the myth as the blanket statement that federated search engines rollover/suck because some sources are not easy to search well. </p>
<p>But, I&#8217;m interested to know what you all think LCD means. Abe and I discussed it for a while and realized that we weren&#8217;t quite sure what it means. So, please tell me.</p>
<p>At the same time there&#8217;s a different discussion about how some federated search engines handle tricky sources better than others. I&#8217;ll do what I can to find examples of smart connectors searching sources particularly well.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Daniel Tunkelang</title>
		<link>http://federatedsearchblog.com/2009/06/22/the-lowest-common-denominator-myth/comment-page-1/#comment-28199</link>
		<dc:creator>Daniel Tunkelang</dc:creator>
		<pubDate>Wed, 24 Jun 2009 16:49:39 +0000</pubDate>
		<guid isPermaLink="false">http://federatedsearchblog.com/?p=683#comment-28199</guid>
		<description>Not surprisingly, I agree with Dave and Jonathan. I’d love to see a live example of a federated search engine that addresses the LCD problem. I agree that it’s possible in theory, but I’d like to see it done in practice!</description>
		<content:encoded><![CDATA[<p>Not surprisingly, I agree with Dave and Jonathan. I’d love to see a live example of a federated search engine that addresses the LCD problem. I agree that it’s possible in theory, but I’d like to see it done in practice!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Rochkind</title>
		<link>http://federatedsearchblog.com/2009/06/22/the-lowest-common-denominator-myth/comment-page-1/#comment-28158</link>
		<dc:creator>Jonathan Rochkind</dc:creator>
		<pubDate>Tue, 23 Jun 2009 13:12:35 +0000</pubDate>
		<guid isPermaLink="false">http://federatedsearchblog.com/?p=683#comment-28158</guid>
		<description>I meant &quot;providing the kinds of advanced features you describe&quot;, not &quot;you require&quot;, above. 

I am also curious to see any particular examples you can point to of federated search providers that manage to do well what you describe in this post.</description>
		<content:encoded><![CDATA[<p>I meant &#8220;providing the kinds of advanced features you describe&#8221;, not &#8220;you require&#8221;, above. </p>
<p>I am also curious to see any particular examples you can point to of federated search providers that manage to do well what you describe in this post.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Rochkind</title>
		<link>http://federatedsearchblog.com/2009/06/22/the-lowest-common-denominator-myth/comment-page-1/#comment-28157</link>
		<dc:creator>Jonathan Rochkind</dc:creator>
		<pubDate>Tue, 23 Jun 2009 13:11:26 +0000</pubDate>
		<guid isPermaLink="false">http://federatedsearchblog.com/?p=683#comment-28157</guid>
		<description>Yeah, I&#039;m still a believer in some amount of LCD effect too. 

Providing the kind of advanced features you require is exceedingly difficult (meaning expensive), and if you want to cover a large range of sources will take constant (expensive) maintenance. 

And then, even when you&#039;ve done your best, searchers can still receive unexpected results -- even in the simplest case of the author search you give, both options are somewhat undesirable. If a source is left out, the searcher may not realize it and may think she has searched a source that in fact remains un-examined. Searching full text when the user asked for an &#039;author&#039; search may produce results that in fact don&#039;t meet the specifications of the search, frustrating the user. And then there are sophisticated search criteria like those Dave mentions; a more common example might be various kinds of controlled vocabularies, from MeSH terms to molecular designations.

I&#039;m a big believer in federated search as an important service to scholarly researchers, because the convenience justifies the flaws. But there are certain inherent flaws that are difficult or impossible to get around.</description>
		<content:encoded><![CDATA[<p>Yeah, I&#8217;m still a believer in some amount of LCD effect too. </p>
<p>Providing the kind of advanced features you require is exceedingly difficult (meaning expensive), and if you want to cover a large range of sources will take constant (expensive) maintenance. </p>
<p>And then, even when you&#8217;ve done your best, searchers can still receive unexpected results &#8212; even in the simplest case of the author search you give, both options are somewhat undesirable. If a source is left out, the searcher may not realize it and may think she has searched a source that in fact remains un-examined. Searching full text when the user asked for an &#8216;author&#8217; search may produce results that in fact don&#8217;t meet the specifications of the search, frustrating the user. And then there are sophisticated search criteria like those Dave mentions; a more common example might be various kinds of controlled vocabularies, from MeSH terms to molecular designations.</p>
<p>I&#8217;m a big believer in federated search as an important service to scholarly researchers, because the convenience justifies the flaws. But there are certain inherent flaws that are difficult or impossible to get around.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dave Lemen</title>
		<link>http://federatedsearchblog.com/2009/06/22/the-lowest-common-denominator-myth/comment-page-1/#comment-28137</link>
		<dc:creator>Dave Lemen</dc:creator>
		<pubDate>Tue, 23 Jun 2009 02:35:53 +0000</pubDate>
		<guid isPermaLink="false">http://federatedsearchblog.com/?p=683#comment-28137</guid>
		<description>But there is a real LCD problem with federated search. You can argue that it is reduced in some circumstances, but I don&#039;t think it helps to pretend it doesn&#039;t exist. 

Let&#039;s say I federate full-text queries to sources A, B, and C. Sources A and B can filter on a geographic bounding box, but source C cannot. If a searcher gives me a full-text query term and a bounding box, I can only include results from A and B. If the searcher needs to query all three, she&#039;s limited to full-text. 

Federated search may still be very useful in this scenario, but the LCD problem needs to be recognized and understood.</description>
		<content:encoded><![CDATA[<p>But there is a real LCD problem with federated search. You can argue that it is reduced in some circumstances, but I don&#8217;t think it helps to pretend it doesn&#8217;t exist. </p>
<p>Let&#8217;s say I federate full-text queries to sources A, B, and C. Sources A and B can filter on a geographic bounding box, but source C cannot. If a searcher gives me a full-text query term and a bounding box, I can only include results from A and B. If the searcher needs to query all three, she&#8217;s limited to full-text. </p>
<p>Federated search may still be very useful in this scenario, but the LCD problem needs to be recognized and understood.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

