<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Daniel Tunkelang on the problem with federated search</title>
	<atom:link href="http://federatedsearchblog.com/2009/06/12/daniel-tunkelang-on-the-problem-with-federated-search/feed/" rel="self" type="application/rss+xml" />
	<link>http://federatedsearchblog.com/2009/06/12/daniel-tunkelang-on-the-problem-with-federated-search/</link>
	<description>Covers topics related to federated search and the deep web</description>
	<lastBuildDate>Thu, 15 Mar 2012 12:26:10 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: linkdump</title>
		<link>http://federatedsearchblog.com/2009/06/12/daniel-tunkelang-on-the-problem-with-federated-search/comment-page-1/#comment-27908</link>
		<dc:creator>linkdump</dc:creator>
		<pubDate>Tue, 16 Jun 2009 23:27:08 +0000</pubDate>
		<guid isPermaLink="false">http://federatedsearchblog.com/?p=598#comment-27908</guid>
		<description>&lt;strong&gt;Daniel Tunkelang on the problem with federated search...&lt;/strong&gt;

 via Federated Search - (Daniel Tunkelang):[ Editor&#039;s Note: This is a guest article by Daniel Tunkelang. (See his bio below.) Daniel is passionate about designing search systems that improve users&#039; experience with information retrieval. This passion ...</description>
		<content:encoded><![CDATA[<p><strong>Daniel Tunkelang on the problem with federated search&#8230;</strong></p>
<p> via Federated Search &#8211; (Daniel Tunkelang):[ Editor&#8217;s Note: This is a guest article by Daniel Tunkelang. (See his bio below.) Daniel is passionate about designing search systems that improve users&#8217; experience with information retrieval. This passion &#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Daniel Tunkelang</title>
		<link>http://federatedsearchblog.com/2009/06/12/daniel-tunkelang-on-the-problem-with-federated-search/comment-page-1/#comment-27886</link>
		<dc:creator>Daniel Tunkelang</dc:creator>
		<pubDate>Tue, 16 Jun 2009 01:34:34 +0000</pubDate>
		<guid isPermaLink="false">http://federatedsearchblog.com/?p=598#comment-27886</guid>
		<description>Peter, point taken: it is certainly possible to integrate with each source on a case-by-case basis and thus get beyond the least-common-denominator approach--though even then there is still the issue of amalgamating structure from the different sources. I&#039;d love to see a live example of a federated search engine that does this well. I agree that it&#039;s possible--indeed, I&#039;d like to see it done!</description>
		<content:encoded><![CDATA[<p>Peter, point taken: it is certainly possible to integrate with each source on a case-by-case basis and thus get beyond the least-common-denominator approach&#8211;though even then there is still the issue of amalgamating structure from the different sources. I&#8217;d love to see a live example of a federated search engine that does this well. I agree that it&#8217;s possible&#8211;indeed, I&#8217;d like to see it done!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Peter Noerr</title>
		<link>http://federatedsearchblog.com/2009/06/12/daniel-tunkelang-on-the-problem-with-federated-search/comment-page-1/#comment-27862</link>
		<dc:creator>Peter Noerr</dc:creator>
		<pubDate>Mon, 15 Jun 2009 17:05:37 +0000</pubDate>
		<guid isPermaLink="false">http://federatedsearchblog.com/?p=598#comment-27862</guid>
		<description>The idea of Sources returning their faceted analysis of a set of results is interesting. And it of course happens in practice. The idea that they could return some of the &quot;reasoning&quot; behind the facets is very interesting. And is not happening so far as I know.

Facets are an attempt to extract the semantics of a set of results. And they work pretty well whether pre-coordinated against a vocabulary of some sort, or are just the natural outcome of the retrieved documents. Normalizing these semantics is the big problem as Daniel points out.

Returning not just the facet values, but also the terms which have been used to derive them, does indeed provide for the very useful possibility that the federated search system could derive common facet values across a number of Sources. They would of necessity be &quot;fuzzy&quot; in that the supplied terms would not be co-extensive across the facets across the Sources, but they would probably be fairly good approximations. And much better than nothing at all. 

Processing would be increased all round, but probably not by an unacceptable amount, and the result could be not only a &quot;normalized&quot; set of facet values for the user, but also a set of documents which could be de-duped or clustered on those values. This should lead to a richer set of documents for the user.

I await the data so we can process it and see.</description>
		<content:encoded><![CDATA[<p>The idea of Sources returning their faceted analysis of a set of results is interesting. And it of course happens in practice. The idea that they could return some of the &#8220;reasoning&#8221; behind the facets is very interesting. And is not happening so far as I know.</p>
<p>Facets are an attempt to extract the semantics of a set of results. And they work pretty well whether pre-coordinated against a vocabulary of some sort, or are just the natural outcome of the retrieved documents. Normalizing these semantics is the big problem as Daniel points out.</p>
<p>Returning not just the facet values, but also the terms which have been used to derive them, does indeed provide for the very useful possibility that the federated search system could derive common facet values across a number of Sources. They would of necessity be &#8220;fuzzy&#8221; in that the supplied terms would not be co-extensive across the facets across the Sources, but they would probably be fairly good approximations. And much better than nothing at all. </p>
<p>Processing would be increased all round, but probably not by an unacceptable amount, and the result could be not only a &#8220;normalized&#8221; set of facet values for the user, but also a set of documents which could be de-duped or clustered on those values. This should lead to a richer set of documents for the user.</p>
<p>I await the data so we can process it and see.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Peter Noerr</title>
		<link>http://federatedsearchblog.com/2009/06/12/daniel-tunkelang-on-the-problem-with-federated-search/comment-page-1/#comment-27858</link>
		<dc:creator>Peter Noerr</dc:creator>
		<pubDate>Mon, 15 Jun 2009 16:47:48 +0000</pubDate>
		<guid isPermaLink="false">http://federatedsearchblog.com/?p=598#comment-27858</guid>
		<description>On the topic of &quot;LCD&quot; searches.

Daniel states: &quot;In particular, this approach to federation necessarily assumes a lowest common denominator of search functionality–a consequence of the requirement to evenhandedly broker among a variety of search applications that vary in the richness of their APIs&quot; 

This assumption of LCD searching is taken as an obvious truism. But why? 

If federated search systems are presumed to be capable of handling multiple record formats for data extraction from retrieved records, then why should they not be considered capable of generating Source-specific search statements?

The reason (as for federated search itself) seems to be that most don&#039;t because it is yet another messy thing to deal with on a Source by Source basis. Note that here I am talking about more than adding blanks and quotes to a search statement.

The next step is to be more aware of the actual search syntax. Is the index for an author search represented by &quot;au=&quot; or &quot;/au&quot; or some other string? This is a fairly common, but not universal, capability - many systems restrict their interaction to the language of standards such as RPN for Z39.50, or a commonly implemented search language such as Open Search. 

Moving to proprietary APIs, and also to web search interfaces, provides the much larger variety Daniel posits as beyond the reach of federated search systems.

It ain&#039;t necessarily so. There are federated search systems which match the search to the capabilities of the Source engine. Applying limits where they exist, using indices where they exist, mapping to alternatives where the requested function is not available. All under user control for the desired strictness of the query. Acting very much like a user would in the same circumstances - adapting the query to what the Source can handle.

This addresses a second point in the quote above: that the desire is to have an identical query sent to all Sources, rather than the &quot;best&quot; one each can handle. Why? It is contrary to the way users act. They attempt to get the best results from each Source adapting to what it offers in the way of search tools. Surely federated search systems should try to do no less.

Admittedly, very few federated search systems do go to these lengths, but some do and, of course, we believe that Muse provides one of the more advanced capabilities of this type, or I wouldn&#039;t be mentioning it. 

The bottom line is that some federated search systems do adapt the searches to the richness, or otherwise, of the Sources they access. The LCD approach is not a necessary evil, but a chosen one.</description>
		<content:encoded><![CDATA[<p>On the topic of &#8220;LCD&#8221; searches.</p>
<p>Daniel states: &#8220;In particular, this approach to federation necessarily assumes a lowest common denominator of search functionality–a consequence of the requirement to evenhandedly broker among a variety of search applications that vary in the richness of their APIs&#8221; </p>
<p>This assumption of LCD searching is taken as an obvious truism. But why? </p>
<p>If federated search systems are presumed to be capable of handling multiple record formats for data extraction from retrieved records, then why should they not be considered capable of generating Source-specific search statements?</p>
<p>The reason (as for federated search itself) seems to be that most don&#8217;t because it is yet another messy thing to deal with on a Source by Source basis. Note that here I am talking about more than adding blanks and quotes to a search statement.</p>
<p>The next step is to be more aware of the actual search syntax. Is the index for an author search represented by &#8220;au=&#8221; or &#8220;/au&#8221; or some other string? This is a fairly common, but not universal, capability &#8211; many systems restrict their interaction to the language of standards such as RPN for Z39.50, or a commonly implemented search language such as Open Search. </p>
<p>Moving to proprietary APIs, and also to web search interfaces, provides the much larger variety Daniel posits as beyond the reach of federated search systems.</p>
<p>It ain&#8217;t necessarily so. There are federated search systems which match the search to the capabilities of the Source engine. Applying limits where they exist, using indices where they exist, mapping to alternatives where the requested function is not available. All under user control for the desired strictness of the query. Acting very much like a user would in the same circumstances &#8211; adapting the query to what the Source can handle.</p>
<p>This addresses a second point in the quote above: that the desire is to have an identical query sent to all Sources, rather than the &#8220;best&#8221; one each can handle. Why? It is contrary to the way users act. They attempt to get the best results from each Source adapting to what it offers in the way of search tools. Surely federated search systems should try to do no less.</p>
<p>Admittedly, very few federated search systems do go to these lengths, but some do and, of course, we believe that Muse provides one of the more advanced capabilities of this type, or I wouldn&#8217;t be mentioning it. </p>
<p>The bottom line is that some federated search systems do adapt the searches to the richness, or otherwise, of the Sources they access. The LCD approach is not a necessary evil, but a chosen one.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Guest Post at the Federated Search Blog &#124; The Noisy Channel</title>
		<link>http://federatedsearchblog.com/2009/06/12/daniel-tunkelang-on-the-problem-with-federated-search/comment-page-1/#comment-27716</link>
		<dc:creator>Guest Post at the Federated Search Blog &#124; The Noisy Channel</dc:creator>
		<pubDate>Fri, 12 Jun 2009 21:06:34 +0000</pubDate>
		<guid isPermaLink="false">http://federatedsearchblog.com/?p=598#comment-27716</guid>
		<description>[...] wrote a guest post at Sol Lederman&#8217;s Federated Search blog entitled &#8220;The Problem with Federated Search&#8221;. Here&#8217;s an excerpt: The case for federated search is straightforward: no single [...]</description>
		<content:encoded><![CDATA[<p>[...] wrote a guest post at Sol Lederman&#8217;s Federated Search blog entitled &#8220;The Problem with Federated Search&#8221;. Here&#8217;s an excerpt: The case for federated search is straightforward: no single [...]</p>
]]></content:encoded>
	</item>
</channel>
</rss>

