Archive for the "multilingual" Category


Multilingual federated search, the ability to search foreign-language sources and view the results in your own language, may strike some as merely an interesting idea, but the technology has strategic value. Consider this article published by the BBC in March of 2011: China ‘to overtake US on science’ in two years. If the prediction of the UK’s national science academy, the Royal Society, proves true, then sometime next year China will produce scientific research papers at a faster rate than the current leader, the U.S.

Researchers in the English-speaking world have mostly been restricted to searching English-language sources, since tools for simultaneously searching foreign-language sources and translating the results haven’t existed until recently. Thus, opportunities to search scholarly journals in Chinese, Japanese, Portuguese, and other languages associated with countries producing a great volume of scientific output are being missed. In an economic climate where performing research and getting products to market quickly translate into greater profits, being able to scour the research Web quickly, effectively, and on an ongoing basis is critical to developing and maintaining a competitive edge.

Blog sponsor Deep Web Technologies has developed a patent-pending multilingual version of its Explorit federated search application that integrates search and translation technologies, creating a seamless and productive research environment for scientists, engineers, and researchers in business, science, and technology.



[ Editor's note: Blog sponsor Deep Web Technologies has announced important enhancements to its federated search technology that allow its Explorit Research Accelerator product to go deeper into the deep Web than ever before. ]

Researchers can now search text, audio, video and images in multiple languages

SANTA FE, N.M., June 21, 2011 /PRNewswire/ -- Deep Web Technologies, the leader in federated search of the Deep Web, today announced full integration of multilingual and multimedia search into the company’s market-leading Explorit Research Accelerator. The patent-pending multilingual search capability is the first such feature ever offered for Deep Web search.

Multilingual federated search, unveiled June 11, 2010 in Helsinki at the International Council for Scientific and Technical Information’s Summer Conference and originally only available as a beta release to users of the gateway to global science, is now available to all Deep Web Technologies customers who require seamless access to foreign language documents. Explorit’s multilingual search capability translates a user’s search query into the native languages of the collections being searched, aggregates and ranks these results according to relevance, and translates result titles and snippets back to the user’s original language. The multilingual translation functionality, powered by Microsoft, makes it simple to search collections in multiple languages from a single search box in the user’s native language.
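The three steps described in the release (translate the query into each collection's language, aggregate and rank by relevance, translate titles and snippets back) can be sketched in a few lines of Python. This is only an illustration, not Explorit's implementation: the `Result` and `Source` types, the `multilingual_search` function, and the toy translator and demo sources are all hypothetical, and the sketch assumes relevance scores from different sources are directly comparable, which a production system would have to work for.

```python
from dataclasses import dataclass, replace
from typing import Callable, List

# Hypothetical translator signature: (text, source_lang, target_lang) -> text.
Translator = Callable[[str, str, str], str]


@dataclass(frozen=True)
class Result:
    title: str
    snippet: str
    score: float   # relevance score; assumed comparable across sources
    source: str
    lang: str


@dataclass(frozen=True)
class Source:
    name: str
    lang: str
    search: Callable[[str], List[Result]]  # hypothetical per-source search


def multilingual_search(query: str, user_lang: str,
                        sources: List[Source],
                        translate: Translator) -> List[Result]:
    merged: List[Result] = []
    # Step 1: translate the query into each collection's native language.
    for src in sources:
        native_query = (query if src.lang == user_lang
                        else translate(query, user_lang, src.lang))
        merged.extend(src.search(native_query))
    # Step 2: aggregate and rank all results by relevance.
    merged.sort(key=lambda r: r.score, reverse=True)
    # Step 3: translate titles and snippets back to the user's language.
    localized: List[Result] = []
    for r in merged:
        if r.lang != user_lang:
            r = replace(r,
                        title=translate(r.title, r.lang, user_lang),
                        snippet=translate(r.snippet, r.lang, user_lang),
                        lang=user_lang)
        localized.append(r)
    return localized


# Toy stand-ins for a real translation service and real collections.
_TOY_PHRASES = {
    ("en", "zh", "solar cell"): "太阳能电池",
    ("zh", "en", "太阳能电池研究"): "Solar cell research",
    ("zh", "en", "摘要"): "Abstract",
}


def toy_translate(text: str, src_lang: str, dst_lang: str) -> str:
    # Falls back to the untranslated text when the phrase is unknown.
    return _TOY_PHRASES.get((src_lang, dst_lang, text), text)


def _english_db(q: str) -> List[Result]:
    if q == "solar cell":
        return [Result("Solar cell efficiency", "Summary", 0.7, "EnglishDB", "en")]
    return []


def _chinese_db(q: str) -> List[Result]:
    if q == "太阳能电池":
        return [Result("太阳能电池研究", "摘要", 0.9, "ChineseDB", "zh")]
    return []


DEMO_SOURCES = [Source("EnglishDB", "en", _english_db),
                Source("ChineseDB", "zh", _chinese_db)]
```

Calling `multilingual_search("solar cell", "en", DEMO_SOURCES, toy_translate)` returns the Chinese-language hit first, because it carries the higher score, with its title and snippet rendered in English, followed by the English-language hit.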

Multimedia federated search, first introduced in the and portals, allows for seamless integration of audio, video, and image content sources into Explorit. WorldWideScience.org searches seven multimedia sources: CDC Podcasts, CERN Multimedia, Medline Plus, NASA, NSF, NBII LIFE, and ScienceCinema. ScienceCinema is an exciting example of the ability to search speech-indexed multimedia content. The DOE Office of Scientific and Technical Information (OSTI) developed ScienceCinema in partnership with Microsoft. When multimedia sources are included in an Explorit search, images and links to multimedia content can be presented alongside text results or in a separate results tab.
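The two presentation options mentioned above, multimedia alongside text results or in a separate tab, amount to a simple grouping step. The sketch below is an assumption-laden illustration rather than Explorit's behavior: the `group_into_tabs` function, the `media_type` field, and the tab names `results` and `multimedia` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical set of media types treated as multimedia.
MULTIMEDIA_TYPES = {"audio", "video", "image"}


def group_into_tabs(results: List[dict],
                    separate_multimedia: bool = True) -> Dict[str, List[dict]]:
    """Place each result in a tab, preserving the incoming rank order."""
    tabs: Dict[str, List[dict]] = {"results": []}
    if separate_multimedia:
        tabs["multimedia"] = []
    for item in results:
        is_media = item.get("media_type") in MULTIMEDIA_TYPES
        if is_media and separate_multimedia:
            tabs["multimedia"].append(item)
        else:
            # Either a text result, or multimedia shown alongside text.
            tabs["results"].append(item)
    return tabs
```

With `separate_multimedia=False`, every item stays in the single `results` tab in its original order, which corresponds to showing multimedia alongside text.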



[ Editor's note: This article was first published in the Deep Web Technologies Blog. ]

WorldWideScience is a global science gateway that combines national and international scientific databases into a search engine. From a single search form, a scientist, researcher, or curious citizen can search over fifty databases in English and now 22 multilingual sources (with translation to the searcher’s native language) and seven multimedia sources. WorldWideScience is the brainchild of the director of the DOE Office of Scientific and Technical Information (OSTI), Dr. Walt Warnick. The gateway is maintained and hosted by OSTI and governed by the WorldWideScience Alliance.

Deep Web Technologies is proud to have developed the federated search technology behind WorldWideScience. And, with the cooperation of the Microsoft Translation services team, Deep Web Technologies also implemented the multilingual technology. It was a major undertaking but a worthwhile one for the science community, whose members can now greatly expand their reach to scientific papers in languages beyond their own.

Dr. Warnick was invited to deliver a presentation at the 14th session of the United Nations’ Commission on Science and Technology for Development (CSTD). In a post at the OSTI Blog, Dr. Warnick describes the warm reception that WorldWideScience received.

I wish more of my OSTI colleagues could have been in Geneva to share the warm response from the attendees. Several country representatives offered up new sources for WorldWideScience (WWS). Another member of the audience searched mobile WWS for his own name and remarked that he found many of his papers. I received enthusiastic comments, so many that I couldn’t address all of them because of time constraints. Significantly, the Chair of CSTD volunteered to pay the costs of becoming a member of the WorldWideScience Alliance. There was great excitement about the possibilities for its use within the home countries of the attendees and how WWS advances the goals of CSTD.

The paper “Breaking down language barriers through multilingual federated search,” co-authored by Abe Lederman (founder and president of Deep Web Technologies) and by Dr. Warnick, Brian Hitson, and Lorrie Johnson from OSTI, explains the importance of the gateway:

“WorldWideScience.org (WWS) is a global science gateway developed by the US Department of Energy Office of Scientific and Technical Information (OSTI) in partnership with federated search vendor Deep Web Technologies. WWS provides a simultaneous live search of 69 databases from government and government-sanctioned organizations from 66 participating nations. The WWS portal plays a leading role in bringing together the world’s scientists to accelerate the discoveries needed to solve the planet’s most pressing problems. In this paper we present a brief history of the development of WWS and discuss how a new technology, multilingual federated search, greatly increases WWS’ ability to facilitate the advancement of science.”

Deep Web Technologies is delighted to be working with OSTI and other organizations to push the envelope of search technology and to make the world a smaller place.


Hope Leman is one of my favorite people. I know of very few individuals who are as passionate about anything as Hope is. Hope won second place in our second Federated Search Blog contest, and I commented on her passionate review of WorldWideScience.org in 2008.

Hope wrote again about WorldWideScience.org. Her article is at her blog, Significant Science. Hope is a research information technologist for a health network in Oregon. She is also Web administrator of the free online grants and scholarship listing service, ScanGrants, and of the free online search platform, ResearchRaven. From several conversations with Hope I know that ScanGrants is a labor of love and a good demonstration of Hope’s passion for helping researchers.

In Multilingual WorldWideScience: Accelerating Scientific Research, Empowering Researchers, Hope reminds us of the key role that search plays in research, especially in the world of free science and foreign-language science.

Hope’s message is personal, and I love that:

As someone who grew up in a family that housed students who had left home and family in China, Japan, Iran, Korea and other countries to study engineering, chemistry, physics, biochemistry and so on at Oregon State University here in my hometown of Corvallis, Oregon I know what brilliant people there are in many countries who have so much to offer and what a boon it will be that the work of researchers worldwide will become useable to each of them and benefit the rest of us.

This update on Hope’s friend who suffered from ALS is even more touching:

I have recently lost a friend to amyotrophic lateral sclerosis and I would often sadly reflect as I bicycled home from her house about the glacial pace of progress on research on that disease and others like it. That is why I find Dr. Warnick’s enthusiasm and practical accomplishments so very admirable and the best possible case for paying one’s taxes with a minimal amount of grumbling. He is putting federal funds to exemplary use.

Dr. Warnick, Director of OSTI, conceived WorldWideScience and his agency hosts and manages the search portal.

Databases and search engines aren’t about getting one’s job done. At the noblest level, they’re about solving important problems, and saving lives when we can.

[ Disclaimer: OSTI is one of my consulting clients. Deep Web Technologies, which built the single-language and multilingual search engines behind WorldWideScience.org and which sponsors this blog, is another of my clients. ]


[ Editor's note: This article is republished from the Deep Web Technologies Blog. It is Abe's perspective on the launch of Multilingual Federated Search in Helsinki last month. ]

Photo credit: Jakke Nikkarinen/STT Info Kuva. Pictured, from left: Dr. Walter Warnick, U.S. Department of Energy Office of Scientific and Technical Information (OSTI) Director; Yuri Arskiy, All-Russian Institute of Scientific and Technical Information (VINITI) Director; Tony Hey, Microsoft Research Corporate Vice-President; Richard Boulderstone of the British Library and the WorldWideScience Alliance Chairman; and Wu Yishan, Institute of Scientific and Technical Information of China (ISTIC) Chief Engineer.

It was an honor to attend, and for my company to have played a key role in, the launch of multilingual WorldWideScience.org in Helsinki this past June 11th. Beginning more than three years ago, the R&D effort that ultimately resulted in the launch of our ground-breaking multilingual federated search capability involved plenty of hard work by lots of folks at Deep Web Technologies. It certainly could not have been accomplished without our invaluable partnerships with the Department of Energy Office of Scientific and Technical Information (OSTI), the WorldWideScience Alliance, and Microsoft Research.



Last Friday blog sponsor Deep Web Technologies released its beta version of multilingual federated search, available at WorldWideScience.org. Deep Web Technologies and several government agencies key to the effort acknowledged the accomplishment via press releases.

Deep Web Technologies

HELSINKI, June 11 /PRNewswire/ -- Deep Web Technologies unveiled multilingual translation capability today for the WorldWideScience Alliance using its federated search application. WorldWideScience.org, the international science portal, is the first application to be deployed with this unique capability. Abe Lederman, President and CTO of Deep Web Technologies, demonstrated the new technology at the International Council for Scientific and Technical Information’s (ICSTI) 2010 Summer Conference in Helsinki. ICSTI is a primary sponsor of the Alliance, whose purpose is to provide “a geographically diverse, governance structure to promote and build upon the original vision of a global science gateway.”

Multilingual federated search translates a user’s search query into the native languages of the collections being searched, aggregates and ranks these results according to relevance, and translates result titles and snippets back to the user’s original language. The translation, powered by Microsoft, makes it simple to search collections in multiple languages from a single search box in the user’s native language. The Conference will include a keynote address by Tony Hey, Corporate Vice President of the External Research Division of Microsoft Research, as well as a presentation by Dr. Walter Warnick, Director of the Office of Scientific and Technical Information of the U.S. Department of Energy Office of Science. (More)

US Department of Energy Office of Science

Washington, D.C. - Scientific language barriers were broken today in Helsinki with the launch of Multilingual WorldWideScience.org. While a large share of scientific literature is published in English, vast quantities of high-quality science are not, and the pace of non-English scientific publishing is increasing. WorldWideScience.org will now enable the first-ever real-time searching and translation across globally-dispersed, multilingual scientific literature using complex translations technology.

“In an increasingly interconnected world, resolving the global challenges of science requires rapid communication of scientific knowledge,” said Dr. William F. Brinkman, Director of the Office of Science, U.S. Department of Energy. “Breaking the language barrier through WorldWideScience.org will help erode borders and build research networks across DOE, the nation, and around the globe.” (More)

DOE Office of Scientific and Technical Information

OAK RIDGE, TN - Now you can find non-English scientific literature from databases in China, Russia, France, and several Latin American countries and have your search results translated into one of nine languages. With the beta launch today (view the Office of Science announcement) of Multilingual WorldWideScience.org, real-time searching and translation of globally-dispersed collections of scientific literature is possible. This new capability is the result of an international public-private partnership between the WorldWideScience Alliance and Microsoft Research, whose translation technology has been paired with the federated searching technology of Deep Web Technologies.

Microsoft Research Corporate Vice-President Tony Hey said, “We are extremely pleased to have our Microsoft Translator technology used with WorldWideScience. Built at Microsoft Research, this translation technology already provides translations to millions of users. Partnering with WorldWideScience is an opportunity to advance science across language barriers and improve scientific discovery.” (More)

British Library

World Wide Science Alliance broadens access to global research with the launch of a new multilingual tool, enabling scientists to simultaneously search and translate over 400 million pages of scientific research published in 65 countries from around the world.

Although most scientific literature continues to be published in English, the pace of non-English scientific publishing is increasing rapidly, with vast quantities of high-quality science now being produced every year. Launched today at the International Council for Scientific and Technical Information (ICSTI) annual conference in Helsinki, Finland, a new beta version of WorldWideScience.org will enable scientists to break down the language barrier, facilitating greater global cooperation with regards to the pursuit of scientific research. (More)


Multilingual federated search is a big deal for a couple of reasons. First, no one has done it up to now. Yes, Google just added translation into its universal search. And, no doubt Bing will follow suit. But, being able to search the quality scientific and technical information that sometimes is only available via federated search, and doing it in foreign languages, is important.

The second reason that multilingual federated search is so important is that China, Japan, Russia, and other nations produce large volumes of research output. As the world shrinks, we can’t afford to ignore the non-English literature. In a blog article, the author noted that Thomson Reuters highlighted the importance of China’s research output on the basis of sheer volume:

According to citation analysis based on data from Web of Science, China is ranked second in the world by number of scientific papers published in 2007. Scientific’s World IP Today Report on Global Patent Activity 2007 reported that China almost doubled its volume of patents from 2003 to 2007, and looks set to become a strong rival to Japan and the United States in years to come.

The bottom line: federated search is about research and research is global.
