Six years ago I moved from the Bay Area to New Mexico to be closer to my brother Abe and to his family. For five of those years I was an employee of his at Deep Web Technologies. Now, I just write for this blog and do some project work for him. Ever since starting to work for Deep Web, and to this day, I’ve supported DOE OSTI (the US Department of Energy Office of Scientific and Technical Information) in a number of capacities. OSTI is chartered to disseminate scientific and technical information to the public, especially as it pertains to DOE’s interests. OSTI has built a number of highly visible applications for this purpose, and some of these perform federated search and use technology developed by Deep Web.
Early on in my employment with Deep Web I got to know Dr. Walt Warnick, OSTI’s director. Aside from the fact that he was a very important customer of ours, I developed a respect for Dr. Warnick beyond what the business relationship dictated. Dr. Warnick, I was to discover, had an unwavering vision of deploying bigger and better federated search applications to meet OSTI’s charter. In fact, Dr. Warnick popularized federated search in the federal government; his organization is responsible for creating Science.gov, WorldWideScience.org, and ScienceAccelerator.gov, among other popular applications. These applications serve researchers as well as the non-scientists that Dr. Warnick refers to as the “science attentive citizenry.”
Dr. Warnick just posted an article on the OSTI blog (for which I write from time to time): The Science Knowledge Imperative: Making non-Googleable Science Findable. I am very impressed with this article, as it clearly articulates both Dr. Warnick’s passion and his reasons for being an evangelist for federated search.
Google can’t be everywhere. Google does an outstanding job of crawling and indexing the surface web but much of the science information isn’t in the surface web; instead much of it lies in the deep web. It takes a very different kind of technology (i.e. federated search) to access scientific information in the deep web. (See Introduction to the deep web for background information.) So, Dr. Warnick has been driving OSTI to invest in advancing the state of federated search technology to find science information where it lives.
While Google can’t be everywhere, it certainly is in many places. So, the vision of getting more quality science information to those who need it doesn’t exclude Google. The vision includes delivering crawled and indexed content side by side with federated search content. You can (and should) read the OSTI Blog article for yourself. It builds momentum to make this bold final statement:
Thus, the combination of crawled indexes and federated searches is an extremely promising path to the future. A billion-page, high quality science search tool may soon be available to further accelerate the progress of science.
Two factors put Dr. Warnick’s vision within reach. First, OSTI already delivers crawled and indexed content together with federated content with its Eprint Network. Second, OSTI has already created hierarchical federated search applications, where one federated search application federates the search results of other federated search applications, which in turn may federate other such applications. Current barriers to Dr. Warnick’s vision are more political and administrative than technical, although implementing and deploying large-scale federated search systems isn’t without its technical challenges. Chief among them is scaling to a large number of simultaneous users.
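To make the hierarchical idea concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not OSTI’s or Deep Web’s implementation: the `Source` and `Federator` classes, the collection names, and the naive substring matching are all hypothetical. A real federated search engine would also query its sources in parallel and rank the merged results.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Searchable(Protocol):
    """Anything that can answer a query: a single source or a federator."""

    def search(self, query: str) -> list[str]: ...


@dataclass
class Source:
    """A leaf collection (hypothetical), e.g. one database or a crawled index."""

    name: str
    documents: list[str]

    def search(self, query: str) -> list[str]:
        # Naive case-insensitive substring match, for illustration only.
        q = query.lower()
        return [f"{self.name}: {d}" for d in self.documents if q in d.lower()]


@dataclass
class Federator:
    """Sends the query to every child and merges the results.

    A child may itself be a Federator, which is what makes the
    arrangement hierarchical.
    """

    name: str
    children: list[Searchable] = field(default_factory=list)

    def search(self, query: str) -> list[str]:
        merged: list[str] = []
        seen: set[str] = set()
        for child in self.children:
            for hit in child.search(query):
                if hit not in seen:  # drop duplicates, keep first-seen order
                    seen.add(hit)
                    merged.append(hit)
        return merged


# Hypothetical two-level hierarchy: a "world" portal federates a
# "national" portal, which in turn federates two individual sources.
lab_a = Source("lab_a", ["Solar cell efficiency", "Fusion plasma modeling"])
lab_b = Source("lab_b", ["Plasma diagnostics"])
national = Federator("national", [lab_a, lab_b])
world = Federator("world", [national, Source("other", ["Plasma physics review"])])

for hit in world.search("plasma"):
    print(hit)
```

Because a federator exposes the same `search` method as a single source, a higher-level federator does not need to know whether a child is one collection, a crawled index, or an entire portal.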
Closely related to the vision of making a billion pages of high quality science searchable from a single search page are the concepts of global discovery and diffusion of knowledge. How can information that is documented by one scientist in one discipline be quickly discovered by scientists in other disciplines throughout the world? How can knowledge be quickly spread? Mathematical models borrowed from the study of how diseases spread suggest that increasing access to scientific research accelerates discoveries. Global discovery was introduced at the 2006 AAAS Annual Meeting. Knowledge diffusion, an area in which OSTI is conducting research, was discussed as well:
“The spread of new ideas in science is mathematically similar to the spread of disease, even though one produces positive results, the other negative,” said Dr. Warnick. “Our goal is to foster epidemics of new knowledge by speeding the diffusion of new ideas.”
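For readers curious about what the disease analogy might look like in numbers, here is a toy Python sketch. It is my own illustration, not OSTI’s research model: it assumes the simplest textbook contagion model (everyone is either "informed" or not, and informed people pass the idea on at a fixed `contact_rate`), and the function name, population size, and rates are all made up.

```python
def steps_to_reach(
    fraction: float,
    contact_rate: float,
    population: int = 1000,
    initial: float = 1.0,
    dt: float = 0.1,
    max_steps: int = 100_000,
) -> int:
    """Count time steps until `fraction` of the population knows the idea.

    Uses simple Euler steps on dI/dt = contact_rate * I * (1 - I / population),
    where I is the number of informed people. All parameters are illustrative.
    """
    informed = initial
    for step in range(max_steps):
        if informed >= fraction * population:
            return step
        # New "infections" come from informed people meeting uninformed ones.
        informed += dt * contact_rate * informed * (1 - informed / population)
    raise RuntimeError("target fraction not reached within max_steps")


# Better access to research is modeled here, crudely, as a higher contact rate.
slow = steps_to_reach(0.5, contact_rate=0.5)
fast = steps_to_reach(0.5, contact_rate=1.0)
print(f"low access: {slow} steps, high access: {fast} steps")
```

In this toy model, raising the contact rate shortens the time it takes an idea to reach half the population, which is the intuition behind "speeding the diffusion of new ideas."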
I am always delighted to meet a person with passion and vision, especially when that person has the ability to influence the advancement of science. I believe that large-scale federated search applications and knowledge diffusion are the critical components that will accelerate global discovery. Further, I believe that all users, customers, and vendors of federated search, whether or not they are interested in science, have Dr. Warnick to thank for making the technology better known and more widely used than it would be otherwise.