Federated Search Blog
10
Mar

Carl Grant recently published an article, Are librarians choosing to disappear from the information & knowledge delivery process?, at the CARE Affiliates Blog. It reads in part:

As librarians, we frequently strive to connect users to information as seamlessly as possible. A group of librarians said to me recently: “As librarian intermediation becomes less visible to our users/members, it seems less likely it is that our work will be recognized. How do we keep from becoming victims of our own success?”

This is certainly not an uncommon question or concern. As our library collections have become virtual and as we increasingly stop housing the collections we offer, there is a tendency to see us as intermediaries serving as little more than pipelines to our members. We have to think about where we’re adding value to that information so that when delivered to the user/member that value is recognized. Then we need to make that value part of our brand. Otherwise, as stated by this concern, librarians become invisible and that seems to be an almost assured way to make sure our funding does the same. As evidenced by this recently updated chart on the Association of Research Libraries website, this seems to be the track we are on currently:

The chart is not pretty if you’re a librarian trying to justify your existence. But, on a positive note, once you’ve gotten past the depressing chart, Carl Grant lists seven suggestions for products the library world should be providing to patrons.

I recommend this article as a sobering read with a positive spin.

3
Mar

[ This article was originally published in the Deep Web Technologies Blog. ]

The highly regarded Charleston Advisor, known for its “Critical reviews of Web products for Information Professionals,” has given Deep Web Technologies 4 3/8 of 5 possible stars for its Explorit federated search product. The individual scores forming the composite were:

  • Content: 4 1/2 stars
  • User Interface/Searchability: 4 1/2 stars
  • Pricing: 4 1/2 stars
  • Contract Options: 4 stars

The scores were assigned by two reviewers who played a key role in bringing Explorit to Stanford University:

  • Grace Baysinger, Head Librarian and Bibliographer at the Swain Chemistry and Chemical Engineering Library at Stanford University
  • Tom Cramer, Chief Technology Strategist at Stanford University Libraries and Academic Information Resources


18
Feb

The Harvard Library Innovation Laboratory at the Harvard Law School posted a link to a 23-minute podcast interview with Sebastian Hammer. Hammer is the president of Index Data, a company that works in information retrieval, including federated search.

Update 4/3/12: A transcript of the interview is here.

Hammer was interviewed about the challenges of federated search, which he addressed in a very balanced way. The gist of Hammer’s message is that, yes, there are challenges to the technology but they’re not insurmountable. And, without using the term “discovery service,” Hammer did a fine job of explaining that large indexes are an important component of a search solution but they’re not the entire solution, especially in organizations that need access to highly specialized sources.

I was delighted to hear Hammer mention the idea of “super nodes” to allow federated search to scale to thousands of sources. Blog sponsor Deep Web Technologies has used this idea, which they call hierarchical federated search, for several years. Several of their applications search other applications which can, in turn, search other applications. In 2009, Deep Web Technologies founder and president Abe Lederman delivered a talk and presented a paper at SLA, Science Research: Journey to Ten Thousand Sources, detailing his company’s proven “divide-and-conquer” approach to federating federations of sources.
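To make the “super node” idea concrete, here is a minimal Python sketch of a federation that looks like a single source to its parent. It is only an illustration of the general pattern, not how Index Data or Deep Web Technologies implement it; the `Source` and `SuperNode` classes and the substring matching are assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Source:
    """A leaf source. The substring match stands in for a real connector."""
    name: str
    docs: List[str]

    def search(self, query: str) -> List[str]:
        q = query.lower()
        return [d for d in self.docs if q in d.lower()]


@dataclass
class SuperNode:
    """A federation of sources (or of other federations).

    It exposes the same search() method as a leaf, so a parent node
    cannot tell whether a child is one source or a whole federation.
    """
    name: str
    children: List[Union["Source", "SuperNode"]] = field(default_factory=list)

    def search(self, query: str) -> List[str]:
        merged: List[str] = []
        seen = set()
        for child in self.children:
            for hit in child.search(query):
                if hit not in seen:  # drop duplicates across children
                    seen.add(hit)
                    merged.append(hit)
        return merged


# Hypothetical two-level hierarchy: the root searches a federation and a leaf.
physics = SuperNode("physics", [
    Source("preprints", ["Quantum search", "Dark matter"]),
    Source("lab-reports", ["Quantum search", "Laser cooling"]),
])
root = SuperNode("root", [physics, Source("chemistry", ["Quantum chemistry"])])
hits = root.search("quantum")
```

Because every node answers the same call, adding a thousand sources means adding more nodes to the tree rather than more connections to the top-level application.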

I was also happy to hear Hammer speak to the importance of hybrid solutions. Federation is appropriate for gaining access to some content and maintaining a local index works for other content. Neither alone is a complete solution. Deep Web Technologies figured this out some years ago. A good example of hybrid search technology is the E-print Network, a product of the U.S. Department of Energy’s Office of Scientific and Technical Information (OSTI). Deep Web Technologies built the search technology, which combines information about millions of documents crawled from over 30,000 sites with federated content. I have been involved with the crawl piece of the E-print Network for a number of years and can testify to the power of the right hybrid solution. In 2008 I wrote a three-part series of articles at OSTI’s blog explaining the technology behind the E-print Network. Part One is here.
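A rough Python sketch of the hybrid idea follows: one query is answered partly from a local index built from crawled documents and partly from live federated sources. This is not the E-print Network’s code; `build_index`, `hybrid_search`, and the sample document IDs are hypothetical names chosen for the illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set


def build_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Build a tiny inverted index (term -> document IDs) from crawled text."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return dict(index)


def hybrid_search(
    query: str,
    index: Dict[str, Set[str]],
    live_sources: List[Callable[[str], List[str]]],
) -> List[str]:
    """Combine local-index hits with hits fetched live from federated sources."""
    terms = query.lower().split()
    # Local part: documents that contain every query term.
    local: Set[str] = set()
    if terms:
        local = set.intersection(*(index.get(t, set()) for t in terms))
    results = sorted(local)
    # Federated part: ask each live source at query time and append new hits.
    for source in live_sources:
        for hit in source(query):
            if hit not in results:
                results.append(hit)
    return results


# Hypothetical data: two crawled documents and one live source.
crawled = {"crawl-1": "neutron scattering data", "crawl-2": "solar cell efficiency"}
index = build_index(crawled)


def live_source(query: str) -> List[str]:
    return ["live-7"] if "neutron" in query.lower() else []


hits = hybrid_search("neutron scattering", index, [live_source])
```

The crawled content answers instantly from the index, while the live source covers content that cannot be crawled; neither list alone would be complete.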

In conclusion, I highly recommend the podcast for a good reminder that federated search isn’t dead and that it’s an important part of search.

14
Feb

Deep Web Technologies president, founder, and CTO Abe Lederman shares some thoughts on discovery services at the Deep Web Technologies Blog.

5
Feb

Multilingual federated search, the ability to search and to view results from foreign language sources in your own language, may be just an interesting idea to some, but there is a strategic value to the technology. Consider this article published by the BBC in March of 2011: China ‘to overtake US on science’ in two years. If the prediction of the UK’s national science academy, the Royal Society, proves true, then sometime next year China will produce scientific research papers at a faster rate than the current leader, the U.S.

Researchers in the English-speaking world have mostly been restricted to searching English-language sources because tools for simultaneously searching foreign-language sources and translating the results did not exist until recently. Thus, opportunities to search scholarly journals in Chinese, Japanese, Portuguese, and other languages of countries producing a great volume of scientific output are being missed. In an economic climate where doing research and getting products to market quickly leads to greater profits, being able to scour the research Web quickly, effectively, efficiently, and on an ongoing basis is critical to developing and maintaining a competitive edge.

Blog sponsor Deep Web Technologies has developed a patent-pending multilingual search version of its Explorit federated search application that integrates the search and translation technologies, making for a seamless and productive research environment for scientists, engineers, and researchers in business, science, and technology.


1
Jul

[ Editor’s note: Blog sponsor Deep Web Technologies has announced important enhancements to its federated search technology that allow its Explorit Research Accelerator product to go deeper into the deep Web than ever before. ]

Researchers can now search text, audio, video and images in multiple languages

SANTA FE, N.M., June 21, 2011 /PRNewswire/ -- Deep Web Technologies, the leader in federated search of the Deep Web, today announced full integration of multilingual and multimedia search into the company’s market-leading Explorit Research Accelerator. The patent-pending multilingual search capability is the first such feature ever offered for Deep Web search.

Multilingual federated search, unveiled June 11, 2011 in Helsinki at the International Council for Scientific and Technical Information’s Summer Conference and originally only available as a beta release to users of the WorldWideScience.org gateway to global science, is now available to all Deep Web Technologies customers who require seamless access to foreign language documents. Explorit’s multilingual search capability translates a user’s search query into the native languages of the collections being searched, aggregates and ranks these results according to relevance, and translates result titles and snippets back to the user’s original language. The multilingual translation functionality, powered by Microsoft, makes it simple to search collections in multiple languages from a single search box in the user’s native language.
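The three steps described above (translate the query, aggregate and rank the results, translate titles back) can be sketched in a few lines of Python. This is a toy illustration and not Explorit’s implementation: the lookup-table translator stands in for a real translation service, and the function names, language codes, and relevance scores are all assumptions made up for the example.

```python
from typing import Callable, Dict, List, Tuple

# (text, source_language, target_language) -> translated text
Translator = Callable[[str, str, str], str]
# query -> list of (title, relevance score)
SearchFn = Callable[[str], List[Tuple[str, float]]]


def make_toy_translator(table: Dict[Tuple[str, str, str], str]) -> Translator:
    """Return a lookup-table translator; a stand-in for a real service."""
    def translate(text: str, src: str, dst: str) -> str:
        if src == dst:
            return text
        # Unknown phrases are passed through unchanged.
        return table.get((src, dst, text), text)
    return translate


def multilingual_search(
    query: str,
    user_lang: str,
    sources: List[Tuple[str, SearchFn]],
    translate: Translator,
) -> List[Tuple[str, str]]:
    """Search each source in its own language; return titles in the user's."""
    hits: List[Tuple[float, str, str]] = []
    for lang, search in sources:
        native_query = translate(query, user_lang, lang)    # step 1
        for title, score in search(native_query):
            hits.append((score, translate(title, lang, user_lang), lang))  # step 3
    hits.sort(key=lambda h: h[0], reverse=True)             # step 2: rank
    return [(title, lang) for _, title, lang in hits]


# Hypothetical sources: one English, one German.
table = {
    ("en", "de", "energy"): "Energie",
    ("de", "en", "Energie aus Wind"): "Energy from wind",
}
translate = make_toy_translator(table)


def english_source(q: str) -> List[Tuple[str, float]]:
    return [("Energy storage", 0.6)] if q == "energy" else []


def german_source(q: str) -> List[Tuple[str, float]]:
    return [("Energie aus Wind", 0.9)] if q == "Energie" else []


results = multilingual_search(
    "energy", "en", [("en", english_source), ("de", german_source)], translate
)
```

The user types one query in one language and gets a single ranked list back, with each result tagged by the language of the source it came from.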

Multimedia federated search, first introduced in the WorldWideScience.org and ScienceAccelerator.gov portals, allows for seamless integration of audio, video, and image content sources into Explorit. WorldWideScience.org searches seven multimedia sources: CDC Podcasts, CERN Multimedia, Medline Plus, NASA, NSF, NBII LIFE, and ScienceCinema. ScienceCinema is an exciting example of the ability to search speech-indexed multimedia content. The DOE Office of Scientific and Technical Information (OSTI) developed ScienceCinema in partnership with Microsoft. When multimedia sources are included in an Explorit search, images and links to multimedia content can be presented alongside text results or in a separate results tab.


15
Jun

[ Editor’s note: This article was first published in the Deep Web Technologies Blog. ]

WorldWideScience is a global science gateway that combines national and international scientific databases into a search engine. From a single search form, a scientist, researcher, or curious citizen can search over fifty databases in English and now 22 multilingual sources (with translation to the searcher’s native language) and seven multimedia sources. WorldWideScience is the brainchild of the director of the DOE Office of Scientific and Technical Information (OSTI), Dr. Walt Warnick. The gateway is maintained and hosted by OSTI and governed by the WorldWideScience Alliance.

Deep Web Technologies is proud to have developed the federated search technology behind WorldWideScience. And, with the cooperation of the Microsoft Translation services team, Deep Web Technologies also implemented the multilingual technology. It was a major undertaking but a worthwhile one for the science community, whose members can now greatly expand their reach to scientific papers in languages beyond their own.

Dr. Warnick was invited to deliver a presentation at the 14th session of the United Nations’ Commission on Science and Technology (CSTD). In a post at the OSTI Blog, Dr. Warnick shares the warm reception that WorldWideScience received.

I wish more of my OSTI colleagues could have been in Geneva to share the warm response from the attendees. Several country representatives offered up new sources for WorldWideScience (WWS). Another member of the audience searched mobile WWS for his own name and remarked that he found many of his papers. I received enthusiastic comments, so many that I couldn’t address all of them because of time constraints. Significantly, the Chair of CSTD volunteered to pay the costs of becoming a member of the WorldWideScience Alliance. There was great excitement about the possibilities for its use within the home countries of the attendees and how WWS advances the goals of CSTD.

The paper “Breaking down language barriers through multilingual federated search,” co-authored by Abe Lederman (founder and president of Deep Web Technologies) and by Dr. Warnick, Brian Hitson, and Lorrie Johnson from OSTI, explains the importance of the gateway:

“WorldWideScience.org (WWS) is a global science gateway developed by the US Department of Energy Office of Scientific and Technical Information (OSTI) in partnership with federated search vendor Deep Web Technologies. WWS provides a simultaneous live search of 69 databases from government and government-sanctioned organizations from 66 participating nations. The WWS portal plays a leading role in bringing together the world’s scientists to accelerate the discoveries needed to solve the planet’s most pressing problems. In this paper we present a brief history of the development of WWS and discuss how a new technology, multilingual federated search, greatly increases WWS’ ability to facilitate the advancement of science.”

Deep Web Technologies is delighted to be working with OSTI and other organizations to push the envelope of search technology and to make the world a smaller place.

1
Jun

On search neutrality

Author: Sol

Abe Lederman, founder and president of Deep Web Technologies and sponsor of this blog, wrote an article at the Deep Web Technologies blog: Preparing for ALA Panel and Federated Search Neutrality. Abe discovered this article at beerbrarian about the problem of net neutrality in federated search.

For those of you not familiar with net neutrality, Wikipedia explains it:

Network neutrality (also net neutrality, Internet neutrality) is a principle which advocates no restrictions by Internet service providers or governments on consumers’ access to networks that participate in the internet. Specifically, network neutrality would prevent restrictions on content, sites, platforms, the kinds of equipment that may be attached, or the modes of communication.
. . .
Neutrality proponents claim that telecom companies seek to impose a tiered service model in order to control the pipeline and thereby remove competition, create artificial scarcity, and oblige subscribers to buy their otherwise uncompetitive services. Many believe net neutrality to be primarily important as a preservation of current freedoms. Vinton Cerf, considered a “father of the Internet” and co-inventor of the Internet Protocol, Tim Berners-Lee, creator of the Web, and many others have spoken out in favor of network neutrality.

In the net neutrality battle, consumers worry about telecom companies unfairly biasing the delivery of some content (that which they have business interest in biasing) over the content of others. Add search to the equation and what you get are concerns over whether your search results are sorted by relevance or by the business needs of the search engine company.
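To see what is at stake, here is a small Python sketch contrasting the two orderings. It is purely hypothetical: the provider names, relevance scores, and boost value are invented for illustration and do not describe any real product’s ranking.

```python
from typing import List, Set, Tuple

# (title, provider, relevance score)
Result = Tuple[str, str, float]


def rank_neutral(results: List[Result]) -> List[str]:
    """Order results by relevance alone."""
    ordered = sorted(results, key=lambda r: r[2], reverse=True)
    return [title for title, _, _ in ordered]


def rank_biased(results: List[Result], favored: Set[str], boost: float = 0.5) -> List[str]:
    """Order results by relevance plus a boost for favored providers."""
    def adjusted(r: Result) -> float:
        return r[2] + (boost if r[1] in favored else 0.0)
    ordered = sorted(results, key=adjusted, reverse=True)
    return [title for title, _, _ in ordered]


# Hypothetical results from two content providers.
results = [
    ("Paper A", "provider-x", 0.9),
    ("Paper B", "provider-y", 0.6),
    ("Paper C", "provider-x", 0.3),
]
neutral = rank_neutral(results)
biased = rank_biased(results, favored={"provider-y"})
```

With the boost applied, the favored provider’s less relevant paper moves to the top, and nothing in the result list tells the searcher that this happened.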


20
May

Amusing anecdote

Author: Sol

Miles Kehoe at New Idea Engineering’s Enterprise Search Blog tells an entertaining anecdote.

The folks from Booz & Company, a spinoff from Booz Allen Hamilton, did a presentation on their experience comparing two well respected mainstream search products. They report that, at one point, one of the presenters was looking for a woman she knew named Sarah – but she was having trouble remembering Sarah’s last name. The presenter told of searching one of the engines under evaluation and finding that most of the top 60 people returned from the search were… men. None were named ‘Sue’; and apparently none were named Sarah either. The other engine returned records for a number of women named Sarah; and, as it turns out, for a few men as well.

After some frustration, they finally got to the root of the problem. It turns out that all of the Booz & Company employees have their resumes indexed as part of their profiles. Would you like to guess the name of the person who authored the original resume template? Yep – Sarah.

This is a great example of “garbage in, garbage out!” Metadata is only as good as the humans who curate it (or the machines that try to guess at it). Thanks for the Friday chuckle, Miles!

5
May

I’ve always thought of personalization as a good thing. If Google knows something about me then it can provide results that I’ll find more relevant, right?

Watch this TED talk by Eli Pariser and, like me, you might start having second thoughts.

Pariser is the former executive director of MoveOn and is now a senior fellow at the Roosevelt Institute. His book The Filter Bubble is set for release May 12, 2011. In it, he asks how modern search tools, the filter through which many of us see the wider world, are getting better and better at screening the wider world from us by returning only the search results they “think” we want to see.

Here’s the very thought-provoking first paragraph of the talk:

Mark Zuckerberg, a journalist was asking him a question about the news feed. And the journalist was asking him, “Why is this so important?” And Zuckerberg said, “A squirrel dying in your front yard may be more relevant to your interests right now than people dying in Africa.” And I want to talk about what a Web based on that idea of relevance might look like.
