2009 March | Federated Search Blog

Archive for March, 2009

30
Mar

[ Editor’s note: In this guest article, Carl Grant adds his contribution to the discussion I started in Beyond Federated Search? and continued in Beyond federated search? The conversation continues. Be sure to read those two articles before reading Carl’s response. Also, check out the comments on the two articles.

Carl Grant is President of Ex Libris North America and has more than a quarter century of experience in the library-automation industry. I’m grateful for his periodic and very popular contributions to this blog. ]

Read the rest of this entry »

27
Mar

Blog sponsor Deep Web Technologies asked me to produce a white paper on why the quality of search results matters, so I wrote a four-pager titled “Quality, Not Quantity: The danger of overlooking quality of search results.” I wrote the paper to be easy to read while still packing in a good amount of information.

The white paper is available from Deep Web Technologies’ website as a PDF document. The paper is divided into short sections, some with pithy titles:

  • Why quality of results matters
  • What does “quality of results” mean anyway?
  • Too many results, not enough time
  • It’s not a popularity contest: the dirty little secret of the search engine industry
  • The need for speed and the price you pay
  • The myopic focus on features
  • What really matters
  • How Federated Search fits the bill
  • Not all federated search engines are created equal

Read the rest of this entry »

26
Mar

Computers in Libraries 2009 will be here in just a few days. It runs from March 30 through April 1, in Arlington, Virginia.

I was excited to see the April edition of Computers in Libraries Magazine (in printed form). In it, there’s a three-page spread about the federated search writing contest. It includes the full text of Rich Turner’s first place essay plus acknowledgement of second and third place winners Steven Bell and Lee LeBlanc. I’ve already published Mr. Bell’s and Mr. LeBlanc’s essays. Mr. Turner’s essay should be available online, at the Computers in Libraries Magazine website, in the April edition, in about a week. I’ll let you know as soon as it’s up.

Read the rest of this entry »

23
Mar

This is one of my occasional off-topic posts.

One of my clients, the Office of Scientific and Technical Information (OSTI), has an innovative program to help digitize some of their technical reports that are currently available only in paper format, and I thought I’d spread the word.

Adopt-A-Doc is a service focused on getting full text technical reports from OSTI’s Energy Citations Database digitized. Here’s a description of the Energy Citations Database:

The Energy Citations Database (ECD) provides free access to over 2.3 million science research citations with continued growth through regular updates. There are over 209,000 electronic documents, primarily from 1943 forward, available via the database. Citations and documents are made publicly available by the U.S. Department of Energy (DOE).

ECD includes scientific and technical research results in disciplines of interest to DOE such as chemistry, physics, materials, environmental science, geology, engineering, mathematics, climatology, oceanography, computer science and related disciplines. It includes bibliographic citations to report literature, conference papers, journal articles, books, dissertations, and patents.

Read the rest of this entry »

20
Mar

Yesterday I wrote “Beyond federated search?” where I raised the concern about using services that provide indexed content as a way to bypass federated search and its associated challenges.

Jonathan Rochkind left two thoughtful comments which I’d like to respond to.

Read the rest of this entry »

19
Mar

I think it’s safe to say that, given the choice between searching a content source in real time vs. searching it from an index, we’d all opt for searching the index. This assumes, of course, that the index is as current as the content that might be federated. I’ll be the first to admit that federated search is a necessary evil. But necessary it is. I’ve been hearing people talk about life beyond federated search and I just don’t get it. Until every single content provider makes the full text of every document that could otherwise be federated available for harvesting and indexing, federated search isn’t going away.

Serials Solutions’ new Summon Unified Discovery Service is touted as going beyond federated search. The promotional video boasts how there are no connectors, no inconsistent metadata, and no waiting for results to come back. This is all well and good but how do you deal with quality content sources that are not available through the service?
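To make the trade-off concrete, here is a minimal sketch contrasting the two approaches. Everything in it is an illustrative assumption: the source names, the titles, and the helper functions are hypothetical stand-ins, not any vendor's actual connectors or index.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for remote content sources; a real connector
# would send a live query to each provider.
SOURCES = {
    "source_a": ["Federated search basics", "Deep web crawling"],
    "source_b": ["Federated search basics", "Metadata harvesting"],
    "source_c": ["Library discovery layers"],
}

def search_source(name, query):
    """Simulate a real-time query against a single source."""
    return [(name, t) for t in SOURCES[name] if query.lower() in t.lower()]

def federated_search(query):
    """Query every source in parallel, then merge and de-duplicate by title."""
    with ThreadPoolExecutor() as pool:
        batches = list(pool.map(lambda n: search_source(n, query), sorted(SOURCES)))
    seen, merged = set(), []
    for batch in batches:
        for source, title in batch:
            if title not in seen:
                seen.add(title)
                merged.append((source, title))
    return merged

def build_index(harvestable):
    """Build a title index from only the sources that allow harvesting."""
    return {title: name for name in harvestable for title in SOURCES[name]}

def index_search(index, query):
    """Look up matches in the prebuilt index; no live queries are made."""
    return [(src, t) for t, src in index.items() if query.lower() in t.lower()]
```

If `source_c` does not allow harvesting, an index built from `source_a` and `source_b` returns nothing for the query "library", while the federated search still finds the title in `source_c`. That gap is the concern raised above.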

Read the rest of this entry »

17
Mar

I’ve been noticing that my Biznar alerts for the quoted phrase “federated search” were finding me articles that Google Alerts wasn’t finding. I wanted to quantify this experience — i.e. I wanted to know exactly how many results were exclusive to Biznar. So, I performed a simple experiment. I compared my Biznar alert results to my Google alert results. Specifically, I searched my Gmail account for the title of each of the 21 Biznar alert results that I received for “federated search” at 1:11 this morning to see if Google Alerts had ever (not only today) found me an article with the same title. Note that the most recent Google Alert email for “federated search” came at 11:34 this morning, a bit over 9 1/2 hours later than the Biznar alert.

What was shocking to me was that, of the 21 Biznar alerts I got today, only 4 of them were ever presented by Google Alerts. At the end of this article are the titles (with links), authors, and snippets of the 21 Biznar alerts.
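The comparison above boils down to a set difference on titles. Here is a minimal sketch of that check, assuming the titles from each alert service have already been collected into lists. The function name and the case-insensitive matching are my own assumptions; the actual experiment was done by hand with Gmail searches.

```python
def exclusive_share(alert_titles, other_titles):
    """Return the titles found only in alert_titles, plus their share of the total."""
    # Normalize case and surrounding whitespace so trivial differences do not count.
    other = {t.strip().lower() for t in other_titles}
    exclusive = [t for t in alert_titles if t.strip().lower() not in other]
    return exclusive, len(exclusive) / len(alert_titles)

# Placeholder titles standing in for the 21 real alert results.
biznar = [f"Article {i}" for i in range(1, 22)]
# Suppose 4 of them also turned up in the other service, as in the experiment.
google = ["Article 1", "article 2", "Article 3 ", "Article 4"]

exclusive, share = exclusive_share(biznar, google)
print(len(exclusive), round(share, 2))
```

With 21 results and 4 overlaps, 17 titles are exclusive, or roughly 81 percent of the total.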

Read the rest of this entry »

13
Mar

I’m always on the lookout for academic articles related to federated search or the deep web to review. I’m embarrassed to not have heard about OAIster until Abe turned me on to it.

If you’re also new to OAIster, here’s a snippet from their About page:

OAIster is a union catalog of digital resources. We provide access to these digital resources by “harvesting” their descriptive metadata (records) using OAI-PMH (the Open Archives Initiative Protocol for Metadata Harvesting). The Open Archives Initiative is not the same thing as the Open Access movement.

The About page goes on to say:

These resources, often hidden from search engine users behind web scripts, are known as the “deep web.” The owners of these resources share them with the world using OAI-PMH.
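For readers who have not seen OAI-PMH in action, here is a minimal sketch of what harvesting involves: building a `ListRecords` request and pulling identifiers and titles out of the XML that comes back. The base URL and the sample response are made-up placeholders, and the code parses a canned string rather than contacting a real repository. It is not OAIster's actual harvester.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces used by OAI-PMH responses carrying Dublin Core metadata.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for a repository's OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)

# Hypothetical, trimmed-down response; real responses carry more fields.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example technical report</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

def parse_records(xml_text):
    """Extract (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall("./oai:ListRecords/oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", namespaces=NS)
        title = rec.findtext(".//dc:title", namespaces=NS)
        records.append((identifier, title))
    return records
```

A harvester repeats this request against each participating repository and stores the descriptive metadata, which is how a union catalog can expose resources that ordinary web crawlers never reach.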

Read the rest of this entry »

11
Mar

In the fall of 2006, the University of Wyoming Libraries began implementing Serials Solutions’ Central Search (now 360 Search). Reference librarians Michael Nelson, Mary Ann Harlow, and Cassandra Kvenild chronicle their experience (through completion in January 2007) at AllBusiness.com.

Read the rest of this entry »

9
Mar

The blog is 15 months old now. Not too long ago the blog (momentarily) hit 700 readers. Clearly enough of you find the blog to have sufficient value that you subscribe. But, I’m not satisfied to have a respectable reader base. I’d like to see the blog grow in a couple of areas: input from vendors and experiences of people implementing federated search solutions.

I’ll be the first to admit that I’ve been mostly exposed to one federated search product. You can read more about me on my About page. My experience with federated search is fairly deep, but it’s with a single vendor, blog sponsor Deep Web Technologies. I worked there for five years and have consulted with them for a year since starting a consulting business. I’m familiar with many of the technical issues that customers face because I wore enough hats at Deep Web Technologies to encounter many of them. I built (simple) connectors. I deployed their products. I did troubleshooting. I managed engineers who worked with the technology and customers.

Read the rest of this entry »