Archive for the "Uncategorized" Category

15
Jun

[ Editor's note: This article was first published in the Deep Web Technologies Blog. ]

WorldWideScience is a global science gateway that combines national and international scientific databases into a search engine. From a single search form, a scientist, researcher, or curious citizen can search over fifty databases in English and now 22 multilingual sources (with translation to the searcher’s native language) and seven multimedia sources. WorldWideScience is the brainchild of the director of the DOE Office of Scientific and Technical Information (OSTI), Dr. Walt Warnick. The gateway is maintained and hosted by OSTI and governed by the WorldWideScience Alliance.

Deep Web Technologies is proud to have developed the federated search technology behind WorldWideScience. And, with the cooperation of the Microsoft Translation services team, Deep Web Technologies also implemented the multilingual technology. It was a major undertaking but a worthwhile one for the science community, whose members can now greatly expand their reach to scientific papers in languages beyond their own.

Dr. Warnick was invited to deliver a presentation at the 14th session of the United Nations Commission on Science and Technology for Development (CSTD). In a post at the OSTI Blog, Dr. Warnick describes the warm reception that WorldWideScience received.

I wish more of my OSTI colleagues could have been in Geneva to share the warm response from the attendees. Several country representatives offered up new sources for WorldWideScience (WWS). Another member of the audience searched mobile WWS for his own name and remarked that he found many of his papers. I received enthusiastic comments, so many that I couldn’t address all of them because of time constraints. Significantly, the Chair of CSTD volunteered to pay the costs of becoming a member of the WorldWideScience Alliance. There was great excitement about the possibilities for its use within the home countries of the attendees and how WWS advances the goals of CSTD.

The paper “Breaking down language barriers through multilingual federated search,” co-authored by Abe Lederman (founder and president of Deep Web Technologies) and by Dr. Warnick, Brian Hitson, and Lorrie Johnson of OSTI, explains the importance of the gateway:

“WorldWideScience.org (WWS) is a global science gateway developed by the US Department of Energy Office of Scientific and Technical Information (OSTI) in partnership with federated search vendor Deep Web Technologies. WWS provides a simultaneous live search of 69 databases from government and government-sanctioned organizations from 66 participating nations. The WWS portal plays a leading role in bringing together the world’s scientists to accelerate the discoveries needed to solve the planet’s most pressing problems. In this paper we present a brief history of the development of WWS and discuss how a new technology, multilingual federated search, greatly increases WWS’ ability to facilitate the advancement of science.”
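
For readers who have not seen federated search up close, the “simultaneous live search” idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the WorldWideScience implementation: the three in-memory sources, their titles, and the function names `search_source` and `federated_search` are all hypothetical stand-ins, and translation and relevance ranking are left out entirely.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for live sources; a real connector would query a remote database.
SOURCES = {
    "source_a": ["Solar cell efficiency", "Wind turbine design"],
    "source_b": ["solar cell efficiency", "Solar storms and power grids"],
    "source_c": ["Deep sea vents"],
}


def search_source(name, query):
    """Return (source, title) pairs whose title contains the query, ignoring case."""
    q = query.lower()
    return [(name, title) for title in SOURCES[name] if q in title.lower()]


def federated_search(query):
    """Send the query to every source at once, then merge and de-duplicate by title."""
    names = sorted(SOURCES)
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        # pool.map returns results in the same order as `names`.
        per_source = list(pool.map(lambda name: search_source(name, query), names))

    merged, seen = [], set()
    for hits in per_source:
        for source, title in hits:
            key = title.lower()
            if key not in seen:  # keep the first copy of a duplicate title
                seen.add(key)
                merged.append((source, title))
    return merged


if __name__ == "__main__":
    for source, title in federated_search("solar"):
        print(f"{source}: {title}")
```

The point of the sketch is that nothing is indexed ahead of time: every query goes out to each source when the user searches, and the results are merged on the way back.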

Deep Web Technologies is delighted to be working with OSTI and other organizations to push the envelope of search technology and to make the world a smaller place.

If you enjoyed this post, make sure you subscribe to the RSS feed!

1
Apr

This post might seem a bit off topic but it’s not really.

Google is getting on the Kinect bandwagon with the introduction of spatial tracking technology into Gmail.

How it works

Gmail Motion uses your computer’s built-in webcam and Google’s patented spatial tracking technology to detect your movements and translate them into meaningful characters and commands. Movements are designed to be simple and intuitive for people of all skill levels.

More information is at this URL and in this video:

Wouldn’t it be cool if federated search apps could integrate movement technology so seamlessly!?

20
Feb

I recently discovered an article, 5 Reasons Not to Use Google First, that sings my song. The article addresses this question:

Google is fast, clean and returns more results than any other search engine, but does it really find the information students need for quality academic research? The answer is often ‘no’. “While simply typing words into Google will work for many tasks, academic research demands more.” (Searching for and finding new information – tools, strategies and techniques)

The next paragraph gave me a chuckle.

As far back as 2004, James Morris, Dean of the School of Computer Science at Carnegie Mellon University, coined the term “infobesity,” to describe “the outcome of Google-izing research: a junk-information diet, consisting of overwhelming amounts of low-quality material that is hard to digest and leads to research papers of equally low quality.” (Is Google enough? Comparison of an internet search engine with academic library resources.)

The article continues with its list of five good reasons to not use Google first.

Note that the recommendation isn’t to skip Google altogether. There’s a balance that’s needed to get the best value when performing research. The findings in the “Is Google enough?” article summarize this point really well:

Google is superior for coverage and accessibility. Library systems are superior for quality of results. Precision is similar for both systems. Good coverage requires use of both, as both have many unique items. Improving the skills of the searcher is likely to give better results from the library systems, but not from Google.

31
Jan

[ This is a republication of the article, "Deep Web Tech in the News: Image Search" that was published in the Deep Web Technologies Blog. Note that Deep Web Technologies sponsors the Federated Search Blog and that I consult for the organization, OSTI, that stewards Science.gov. ]

Deep Web Tech in the News: Image Search

One small step for Science.gov, one giant leap for Federated Search.

“Science.gov is a gateway to more than 42 scientific databases and 200 million pages of science information with just one query, and is a gateway to more than 2,000 scientific websites from 18 organizations within 14 federal science agencies. These agencies represent 97% of the federal R&D budget. Science.gov is the USA.gov portal to science and the U.S. contribution to WorldWideScience.org. Science.gov is hosted by the Department of Energy Office of Scientific and Technical Information, within the Office of Science, and is supported by CENDI, an interagency working group of senior scientific and technical information managers.”

Science.gov received a pretty large upgrade in December. The new image search is located under “special collections” and works just like Science.gov, except that the results have thumbnails (www.science.gov/scigovimage/). A search query now quickly pulls back related images from multiple sources as thumbnail-size results. This is one of very few publicly available science image search portals. Cheryl LaGuardia, an industry critic, wrote:

For a free service this works mighty well: my test search for “tornedo” got the reply, “Did you mean “tornado”? with 151 results for the corrected spelling (a test, mind you, or perhaps I’m easing back into work slowly and may have inadvertently misspelled… no matter! The system works!). The resultant images are terrific, compelling enough to send Dorothy pedaling madly down the road away from them on her bicycle, with Toto in tow.

Deep Web Technologies powers the entire website, and we look forward to using this innovation on other projects in the future.

22
Dec

Andrew Pace, Executive Director for Networked Library Services at OCLC, penned this very catchy jingle for your Christmas singing pleasure:

My server got run over by a cloud app
Quicken, Word, and CRMs all grieve
I’m lovin’ Google, Mint-dot-com, and Salesforce
Like Zuckerberg and cnet, I believe

It’s like software as a service
It’s second nature to the kids
Any metaphor will work here
Clouds, architecture, rent, or power grids

I read ebooks on my handheld
And I can bank while in the loo
All my data’s on the network
At home, at work, in church, or Timbuktu

Click here for the whole song.

And if you can’t get enough of the catchy jingle, check out this iPhone app.

Hat tip to Roy Tennant.

7
Nov

Information Today columnist Don Hawkins recently published “A Blunt Assessment of Search Discovery Tools.” Hawkins highlights some concerns that the Montana State University library raised about two discovery services it experimented with, WorldCat Local and Summon.

WorldCat Local

  • Many records didn’t have OCLC numbers so did not show up in the database.
  • Some known items (mainly government documents) were not found.

Summon

  • The vendor promised a simple implementation, but loading one digital collection was slow.
  • Problems occurred in the details: deleted items continued to appear; known item searches may not work.
  • Database name searches may require an exact match.
  • One experiment resulted in a 29% failure rate for a subject search.
  • Sometimes discovery tools search the full text, but not always, and we don’t know when they do.
  • Relevance is not good yet.

Ex Libris Chief Librarian Carl Grant raised concerns of his own, but from a different slant. In “‘Gladiators’ to perform sleight-of-hand at Charleston Conference,” Grant makes a pretty strong assertion, referring to EBSCO and Serials Solutions:

These two particular firms are, as Library Journal says, in the “greatest competition” because they are, first and foremost, publishers/aggregators fighting head-to-head for their first line of business, which is content and content aggregation services. The discovery solution is secondary to them and it is shown in numerous ways by their actions.

He goes on to list questions to put to discovery service providers in order to understand their true motivations. These questions, of course, reflect the interests of discovery service provider Ex Libris. Nonetheless, those exploring discovery services need to ask these questions.

The upshot — federated search isn’t dead yet and discovery services are not the magic bullet the marketing material would have you believe.

15
Sep

[ Editor's note: The following is a guest article by Dr. Peter Noerr.

Peter Noerr’s background is in information retrieval, where his extensive design and development experience has culminated in the creation of successful information technology product lines. Dr. Noerr was educated in South Africa and the UK, completing a Doctorate in Information Science from The City University, London. He spent six years working for the British Library as Head of Systems Development. In 1980 he left the Library to co-found IME Ltd. Dr. Noerr designed and produced the Tinman/Information Navigator line of library automation software for the company, selling over 3,000 systems throughout the world by the time the company was sold in 1996. Since then, Dr. Noerr has consulted for a variety of organizations on information management and retrieval. Dr. Noerr has authored many articles and publications and is frequently invited to speak at international conferences. Dr. Noerr is co-founder of MuseGlobal, Inc. and chief architect of the Muse product line. Dr. Noerr currently serves as Chief Technology Officer of MuseGlobal, Inc. ]

Federated Search or federated search

A little while ago New York Law School unveiled its DRAGNET system, in which searchers use an application built on Google’s Custom Search Engine (CSE) to find answers to their questions from a stable of 72 legal websites. The announcement runs:

The New York Law School’s Mendik Library has recently developed DRAGNET, a search tool that allows the user to find a topic simultaneously in more than 80 legal web sites and databases. DRAGNET stands for “Database retrieval access using Google’s new electronic technology.”
It is located at http://www.nyls.edu/library/research_tools_and_sources/dragnet

Leaving aside the difference in the number of sources, it is a well-engineered and targeted system for its intended clientele. And it is intended for a particular purpose.

DRAGNET can be a good tool to begin a research project, giving you a sense of what kinds of materials can be found on your topic.

What is of interest to me is that it has been touted by commentators (on the web4lib listserv, for example) as a “federated search tool.” Now, admittedly this use of federated search (FS) does not include capital letters, and the actual phrase has something of an identity-crisis-laden history, but DRAGNET (which does not use the name) is not a federated search system, by whatever name you wish to call the technology.

31
Aug

In May, search consultant Avi Rappoport delivered a presentation at the Enterprise Search Summit: Federated vs. Aggregated Search Architectures.

Avi Rappoport is an enterprise search consultant, helping companies improve search engine functionality for websites and intranets. She has a degree from UC Berkeley’s (then) School of Library and Information Science and spent 10 years in software development before becoming a search consultant. She is the editor of SearchTools.com and a frequent speaker and author, providing a strong focus on search usability in the broadest sense and sharing her conviction that search engines can always be better.

Avi created a web page with a summary of and links to a couple of versions of her presentation.

I greatly appreciate Avi’s consideration of the pluses and minuses of federation versus aggregation (i.e., discovery services) in a world that is often polarized about one approach being better in all cases.

My research for this presentation indicated that each is useful in specific circumstances (I know, no surprise there). Many data sources are obviously best accessed by one or the other, but it’s the corner cases that are tricky. Aspects to consider include:

  • size of the content in the source
  • how often your users need that content
  • content change rate
  • importance of real-time access control permissions changes
  • content licensing rules
  • available tools for indexing / querying
  • difficulty of extracting and indexing
  • quality of the internal search engine
  • difficulty of sending queries and receiving results

The final slide has some sage advice:

Be open-minded, analyze the benefits of each approach for each data source.

One size does NOT fit all.

26
Aug

[ Editor's Note: This is a very touching article by Nena Moss first published in the OSTI Blog. My dad suffered with Alzheimer's for a number of years before he died so I can relate to Nena's experience. Disclaimer: I have been paid to support OSTI in a number of capacities for the past eight years. ]

My mother died in March 2010 after a 15-year battle with Alzheimer’s, so I pay particular attention to news about this dreadful disease. A recent New York Times article caught my eye: “Sharing of Data Leads to Progress on Alzheimer’s.”

How did sharing data lead to progress on Alzheimer’s? A collaborative effort, the Alzheimer’s Disease Neuroimaging Initiative, was formed to find the biological markers that show the progression of Alzheimer’s disease in the human brain. The key was to share all the data, making every finding public immediately – “available to anyone with a computer anywhere in the world.”

9
Aug


Early this year O’Reilly published Search Patterns, by Peter Morville and Jeffery Callender. This is Morville’s fourth information/search-related book. Search Patterns addresses the intersection of user interface and search.

Search Patterns is an absolutely outstanding book. I don’t get excited about search-related books very often but this one totally captivated me. O’Reilly sent me a review copy some months ago. It sat in a pile until I started seeing reviews and references to the book on the Web. The press prompted me to open the book.

The first thing I noticed in flipping through the book was the many high-quality color screen shots and illustrations. Plus, Search Patterns is printed on glossy paper to enhance the visual elements of the book.

At 173 pages (plus index), with a nice balance of text and images, Search Patterns is, on the surface, a quick read. But there are numerous gems throughout the book, so allow yourself plenty of time to read (and reread) the sections that draw you in.
