Archive for April, 2008


2collab is a social bookmarking site that was introduced late last year. If it were just another bookmarking site it wouldn’t be all that interesting. Peter Scott’s Library Blog has a succinct description of 2collab that should make readers of this blog curious:

“2collab is a social bookmarking site where you can store and organize your favorite internet resources - such as blogs, websites, research articles, and more. Then, in private or public groups you can decide to share your bookmarks with others - stimulating debate and discussion. Members of groups can evaluate these resources (by rating bookmarks, tagging and adding comments), or add their own bookmarks. You can browse public groups and bookmarks, but must register (your name and email address) to access the full functionality – such as creating groups, adding comments, and adding bookmarks.” 2collab is a free service from Elsevier, initiated by a collaboration between Scopus and ScienceDirect.



A statement in a blog post at Science Library Pad caught my attention. The post, titled “availability, discovery, and delivery - redux,” focuses on the question of how well researchers are able to access the full text of documents they find in search results. The author sees this as a major problem and makes this attention-getting statement:

I’m not convinced that we’re doing a particularly good job of addressing these fundamental challenges even after years of working on proxies, federated search, link resolvers, and “live in your environment” plugins and external website settings.

For those who aren’t familiar with proxies, I wrote about proxy servers and federated search in February. Link resolvers, also called URL resolvers, are worth dedicating an entire post to, but here’s the gist of what they do: A user performs a search, sees a result list, and clicks on a result to view a scholarly article. The federated search application intercepts the URL behind that link and possibly replaces it with a link to a version of the document that the library has licensed, rather than the original “for pay” link.
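To make the idea concrete, here is a minimal sketch in Python. It is only an illustration and not how any particular link resolver product works: the lookup table, the use of a DOI as the key, and all of the URLs are assumptions I made up for the example.

```python
from typing import Optional

# Hypothetical table of documents the library has licensed, keyed by DOI.
# A real resolver would consult the library's holdings, not a hard-coded dict.
LICENSED_COPIES = {
    "10.1000/example.123": "https://library.example.edu/licensed/example-123",
}


def resolve_link(original_url: str, doi: Optional[str]) -> str:
    """Return the licensed URL when the library holds one, else the original."""
    if doi is not None and doi in LICENSED_COPIES:
        return LICENSED_COPIES[doi]
    # No licensed copy is known, so the user keeps the original link.
    return original_url
```

Calling resolve_link with the publisher's "for pay" URL and a DOI found in the table returns the library's licensed link; with any other DOI, the original URL comes back unchanged.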



I wrote recently about Ellie’s review of Roy Tennant’s talk for the Texas Library Association. At the time, I couldn’t find a copy of Tennant’s presentation. So, I contacted Mr. Tennant, and he sent me a link to the presentation with a note that many of the slides are screenshots and, without the context of the verbal presentation, may be cryptic.

Cryptic slides aside, there is very useful material in a number of the slides.



Ellie of the Ellie <3 Libraries Blog recently wrote a remarkably comprehensive summary, with commentary, of Roy Tennant’s “The Future of Catalogs” presentation for the TLA (Texas Library Association). The gist of her review is that the monolithic library catalog (OPAC) is dying and is being replaced with tools that foster discovery, integration of disparate sources, and Web 2.0 elements such as sharing of information (for getting resource recommendations).

The world is changing. Library patrons are global citizens. It doesn’t serve the patrons for libraries to remain islands and to cling tightly to their piece of global content. The future, in my view, is Web 2.0 and beyond. More sharing, more collaboration, more mashups, more multimedia, and more global. And, at the same time, everything should become simpler for the user.



If you were to develop a federated search product for mobile phones, what characteristics would the product have?

I’ve been noticing quite a buzz on the web about the federated mobile search market so I’m curious to know how this niche is being served and how it could be better served. To be honest, I know very little about the intersection of mobile phones and federated search; I’ll be learning along with, or from, readers of this blog.



In January, I wrote a primer about clustering. I explained that:

… clustering is the automatic organization of search results into sets of results that have something in common. Some search engines and some federated search engines provide clustering features.

I also introduced faceted search, also known as faceted navigation:

This technology guides a user to relevant content by organizing search results in a hierarchical structure and providing labeled choices of paths in the hierarchy to follow. A faceted search system might have a series of pulldown menus that guide a user from the broad category of “Iraq” to “Iraq -> Geography”, to “Iraq -> Geography -> Maps” to “Iraq -> Geography -> Maps -> Baghdad.” Endeca is one vendor that provides faceted searching.
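Here is a rough sketch in Python of the drill-down idea. It is not how Endeca or any other vendor implements faceted search; the sample results, the "path" field, and the drill_down function are all made up for illustration.

```python
from collections import Counter

# Hypothetical search results, each tagged with a path in the facet hierarchy.
RESULTS = [
    {"title": "Map of central Baghdad", "path": ("Iraq", "Geography", "Maps", "Baghdad")},
    {"title": "Map of Basra", "path": ("Iraq", "Geography", "Maps", "Basra")},
    {"title": "Rivers of Iraq", "path": ("Iraq", "Geography", "Rivers")},
    {"title": "Iraq since 1958", "path": ("Iraq", "History")},
]


def drill_down(results, chosen):
    """Keep the results under the chosen path and count the next-level choices."""
    chosen = tuple(chosen)
    depth = len(chosen)
    # A result matches when its path starts with the categories chosen so far.
    matching = [r for r in results if r["path"][:depth] == chosen]
    # The labels one level deeper become the choices offered to the user.
    next_choices = Counter(r["path"][depth] for r in matching if len(r["path"]) > depth)
    return matching, dict(next_choices)
```

With this sample data, choosing "Iraq" and then "Geography" leaves three results and offers "Maps" (two results) and "Rivers" (one result) as the next choices, which is roughly what a pulldown menu in a faceted interface would show.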



Last month I gave away three copies of Christopher Cox’s book, Federated Search: Solution or Setback for Online Library Services, in exchange for reviews to be published on this blog. The books were kindly donated by Taylor & Francis.

Three volunteers stepped up and I have commitments to review these essays in the coming weeks:

  1. Build It (and Customize and Market It) and They Will Come
  2. Challenges for Federated Searching
  3. Integrating Library Services: A Proposal to Enable Federation of Information and User Services
  4. User Expectations in the Time of Google
  5. User Perceptions of MetaLib Combined Search
  6. Initiating the Learning Process
  7. Librarian Perspective on Teaching Metasearch and Federated Search Technologies
  8. Developing the right RFP for selecting your federated search product
  9. Planning and implementing a Federated Searching System
  10. SRU, Open Data and the future of Metasearch



New Idea Engineering has an in-depth article, “20+ Differences Between Internet vs. Enterprise Search - And Why You Should Care (Part 1)”, all about how the popular search engines find content and how enterprise search engines should do it differently. The article provides a rather lengthy analysis of a number of issues that those in the market for enterprise search would benefit from understanding. In a nutshell, a search appliance is not going to be very effective in any but the smallest enterprises. Of interest to readers of this blog is a discussion of areas of concern to potential customers of federated search.



The blogosphere is buzzing with posts about Google starting to look for content to index behind web forms. Google has an announcement about their experiment in the Google Webmaster Central Blog. The date of the announcement is April 11, so I guess this isn’t an April Fools’ joke. ComputerWorld wrote about the announcement, as did Search Engine Land, Google Blogoscoped, and others.

Google explains in their announcement that they look for HTML FORM tags in “a small number of particularly useful sites.” In a nutshell, here is what Google says it’s doing when it finds web pages with “FORM” tags:

… we might choose to do a small number of queries using the form. For text boxes, our computers automatically choose words from the site that has the form; for select menus, check boxes, and radio buttons on the form, we choose from among the values of the HTML. Having chosen the values for each input, we generate and then try to crawl URLs that correspond to a possible query a user may have made. If we ascertain that the web page resulting from our query is valid, interesting, and includes content not in our index, we may include it in our index much as we would include any other web page.
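To picture the URL generation step, here is a small sketch in Python. It is my own guess at the general idea and certainly not Google's code: the function name, its parameters, the limit of five queries, and the assumption that the form submits with GET (so that a query can be written as a URL) are all mine. The sketch also stops at building URLs; it does not crawl them or judge whether the resulting pages are worth indexing.

```python
from itertools import islice, product
from urllib.parse import urlencode


def candidate_urls(action, text_boxes, selects, site_words, limit=5):
    """Build a small number of URLs that a form submission could produce.

    action     -- URL the form submits to (assumed to accept GET)
    text_boxes -- names of the free-text inputs on the form
    selects    -- dict mapping each select/radio/checkbox name to its HTML values
    site_words -- words taken from the site, used to fill the text boxes
    """
    # Text boxes are filled with words from the site itself.
    fields = {name: site_words for name in text_boxes}
    # Menus and buttons can only take the values listed in the HTML.
    fields.update(selects)
    names = list(fields)
    combos = product(*(fields[name] for name in names))
    # Only try a small number of the possible combinations.
    return [
        f"{action}?{urlencode(dict(zip(names, combo)))}"
        for combo in islice(combos, limit)
    ]
```

For example, a form with one text box named q and one menu named type with the values book and article, filled from the site words library and catalog, yields URLs such as https://example.org/search?q=library&type=book.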



Lorcan Dempsey has written what I find to be a visionary blog article, “The two ways of Web 2.0.” He sees two major ways that Web 2.0 is being used, and he doesn’t see many of us making the distinction. He has coined the terms diffusion and concentration to describe these two ways.

Diffusion, Dempsey explains, is about connectivity among people, applications and data. Think blogs, RSS, and social networking sites. Concentration is about what Dempsey refers to as “major gravitational hubs” which include sites that contain or aggregate large volumes of content. Diffusion, Dempsey believes, is the more dominant of the two ways of Web 2.0.
