Archive for the "viewpoints" Category

21
Aug

On federated fetching

Author: Sol

“Federated fetching” is a new term to me. I discovered it at Srinivas Reddy’s Weblog, referencing the O’Reilly book, Beautiful Data:

When we deal with web scale data ‘discoverability’ of information is key. While ‘web search’ provides a lot of value today what we really need is to enable ‘data find data’. I like the differentiation in the book between ‘federated search’ and ‘federated fetch’. The latter needs adaptive systems that can discover new data correlations based on user context and new data collected.

This reference got me curious. Was the Web buzzing with discussion of federated search vs. federated fetch? Not exactly, according to Google: there are 740 references to the phrase, but Google considers only 24 of them unique enough to display. Interestingly enough, the first reference is to Jeff Jonas’s “When Federated Search Bites” article, which I wrote about a month ago.

Once a directory reveals a pointer, you can go fetch it. Federated fetch does scale.

Google Books provides the term in the context of the Beautiful Data book:

So, federated fetch is the “end game,” if I understand the concept correctly. It’s what you get when, for example, a link resolver gets you to the full text copy of a book you can actually read.
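
For readers who think in code, here is a minimal Python sketch of the distinction as I understand it. Everything in it is made up for illustration: the source names, the records, and the directory are hypothetical, and real systems would query remote services rather than in-memory dictionaries. Federated search broadcasts the query to every source; federated fetch first asks a directory for pointers and then contacts only the sources those pointers name.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory "source systems": source name -> {record id: text}.
SOURCES: Dict[str, Dict[str, str]] = {
    "library_a": {"a1": "Beautiful Data, chapter on data finds data"},
    "library_b": {"b7": "When Federated Search Bites"},
    "library_c": {"c3": "Notes on link resolvers"},
}

# Hypothetical directory (an index of pointers): term -> [(source, record id)].
DIRECTORY: Dict[str, List[Tuple[str, str]]] = {
    "bites": [("library_b", "b7")],
    "data": [("library_a", "a1")],
}


def federated_search(term: str) -> Tuple[List[str], int]:
    """Broadcast the term to every source; return hits and sources contacted."""
    hits: List[str] = []
    for records in SOURCES.values():
        hits.extend(t for t in records.values() if term.lower() in t.lower())
    return hits, len(SOURCES)


def federated_fetch(term: str) -> Tuple[List[str], int]:
    """Look up pointers in the directory, then fetch only the named records."""
    pointers = DIRECTORY.get(term.lower(), [])
    hits = [SOURCES[source][record_id] for source, record_id in pointers]
    return hits, len({source for source, _ in pointers})
```

Both calls find the same record for the term "bites", but the broadcast touches all three sources while the fetch touches one. That difference is the scaling argument in a nutshell, under the assumption that someone has already built and maintained the directory.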

There you have it, a new phrase I learned today.

If you enjoyed this post, make sure you subscribe to the RSS feed!

26
Jul

Huh?

Author: Sol

Jeff Jonas recently published an article, “When Federated Search Bites.” If this article is meant to be link bait, I’m not biting. You can get a link from Google.

I certainly don’t know everything about federated search, but I know enough to recognize what’s not federated search, or at least not what most of us consider federated search.

The article, really a rant, starts off reasonably enough:

Federated search: conducting a search against “n” source systems via a broadcast mechanism without the benefit or guidance of an index.

I am speaking specifically about environments where the systems in the federation are heterogeneous, are physically dispersed, were not engineered for federation a priori, and are not managed by a common command and control system.

Here’s another reasonable statement:

Most organizations have some obligation to make sense of what they know. For example, the airline should know if the person added to the watch list is already an employee or already has a flight reservation. Ideally, the moment such facts become knowable, someone or some system should be notified. Think of this as “the data speaks to itself.” I call this data finds data.

Yes, having new data trigger analysis is a good idea. But, IT’S NOT FEDERATED SEARCH.

So, the entire basis of the rant is that federated search is not the advanced analysis system Jonas wants, and therefore it sucks. That’s like saying that my oven doesn’t analyze the food I put in it and automatically cook it perfectly, and therefore my oven “bites.”

There may be a discussion about the challenges of analyzing federated data vs. indexed data but that has nothing to do with what federated search does.

What do you think? Does the article make sense to you?


23
Jul

“Federated search as a transformational technology enabling knowledge discovery: the role of WorldWideScience.org” is by far the best historical paper I’ve read about DOE’s Office of Scientific and Technical Information (OSTI), and I consult for the agency.

OSTI has created a number of search portals (WorldWideScience.org, Science.gov, DOE ScienceAccelerator, DOE Energy Citations Database, and DOE Information Bridge, to name a few), but few know about the history of the agency that created them.

OSTI grew out of the post-World War II initiative to make the scientific research of the Manhattan Project as freely available to the public as possible. On November 17, 1944, President Roosevelt wrote Vannevar Bush, then the Director of the Office of Scientific Research and Development, to request his counsel on how to capitalize on the experience of the United States’ R&D war efforts — most of which was done in utter secrecy — in the days of peace to come.

OSTI Director Dr. Walter Warnick tells the story of the development of OSTI, its role in advancing science, and how federated search serves that role in ways that Google can’t.

The paper, at 23 pages, covers the subject with a good deal of depth.


15
Jul


Hope Leman is one of my favorite people. I know of very few individuals who are as passionate about anything as is Hope. Hope won second place in our second Federated Search Blog contest and I commented on her passionate review of WorldWideScience.org in 2008.

Hope wrote again about WorldWideScience.org. Her article is on her blog, Significant Science. Hope is a research information technologist for a health network in Oregon. She is also Web administrator of the free online grants and scholarship listing service, ScanGrants, and of the free online search platform, ResearchRaven. From several conversations with Hope I know that ScanGrants is a labor of love and a good demonstration of Hope’s passion about helping researchers.

In “Multilingual WorldWideScience: Accelerating Scientific Research, Empowering Researchers,” Hope reminds us of the key role that search plays in research, especially in the world of free science and foreign-language science.

Hope’s message is personal, and I love that:

As someone who grew up in a family that housed students who had left home and family in China, Japan, Iran, Korea and other countries to study engineering, chemistry, physics, biochemistry and so on at Oregon State University here in my hometown of Corvallis, Oregon I know what brilliant people there are in many countries who have so much to offer and what a boon it will be that the work of researchers worldwide will become useable to each of them and benefit the rest of us.

This update on Hope’s friend who suffered from ALS is even more touching:

I have recently lost a friend to amyotrophic lateral sclerosis and I would often sadly reflect as I bicycled home from her house about the glacial pace of progress on research on that disease and others like it. That is why I find Dr. Warnick’s enthusiasm and practical accomplishments so very admirable and the best possible case for paying one’s taxes with a minimal amount of grumbling. He is putting federal funds to exemplary use.

Dr. Warnick, Director of OSTI, conceived WorldWideScience and his agency hosts and manages the search portal.

Databases and search engines aren’t about getting one’s job done. At the noblest level, they’re about solving important problems, and saving lives when we can.

[ Disclaimer: OSTI is one of my consulting clients. Deep Web Technologies, which built the single-language and multilingual search engines behind WorldWideScience.org and which sponsors this blog, is another of my clients. ]


9
Jul

JISC, the UK-based education and research organization, commissioned a report from OCLC that brings together findings from twelve studies on how the way people look for information, in libraries and online, is changing. There is also a podcast, “What does the digital information seeker look like?”, on the JISC website that summarizes the findings.

What did the study find that would be of interest to the federated search community? Here are my thoughts:

  • “… there is an identifiable need for training, support and improved systems to help people find the information they need.” As we dream up more bells and whistles we need to consider whether users can effectively use the features we give them today. If they can’t then it’s the vendor’s responsibility to simplify the interface, easing the training burden of the library staff. After all, how many people take training classes in using Google? But, then again, how many people use Google’s advanced search?

  • “E-journals are increasingly important to the research process and the majority of professional researchers have embraced digital content”. Make sure you provide access to the journals your patrons need. Discovery services may provide access to some of them. Federated search can provide access to others.

  • “Immediate access to information from their own desktop computer is almost taken for granted and gaining access to the full-text journal article is seen as more of an issue than discovering the information sources.” This speaks to the importance of good link resolvers. It’s no surprise that users are not satisfied with an abstract and no way to get the full text of the article.


5
Jul

Tulane Reference Librarian Paul St-Pierre presents a compelling case for federated search technology in a 31-minute video.


While the video is largely about Tulane’s experience with Metalib, the first ten minutes or so articulate the problems Tulane was seeing that motivated the search for a technology solution. That part of the video is vendor-neutral.

St-Pierre explains that the problem at Tulane is “too much information.” Nothing new here. But with 500 indexes and databases and 30,000 e-journals, managing that information is a bigger challenge for Tulane than for many other organizations.

Before federated search Tulane had many search tools, many user interfaces, and it was complicated to navigate the different tools, especially with documents being in many formats. St-Pierre described the situation as there being many paths to get to text.

Tulane’s competition is Google. It’s easy and it brings back lots of information. But, as St-Pierre reveals in three graphs, things are not as simple as they appear on the surface.


1
Jul

[ Editor's note: This article is republished from the Deep Web Technologies Blog. It is Abe's perspective on the launch of Multilingual Federated Search in Helsinki last month. ]

Photo credit: Jakke Nikkarinen/STT Info Kuva. Pictured, from left: Dr. Walter Warnick, U.S. Department of Energy Office of Scientific and Technical Information (OSTI) Director; Yuri Arskiy, All-Russian Institute of Scientific and Technical Information (VINITI) Director; Tony Hey, Microsoft Research Corporate Vice-President; Richard Boulderstone of the British Library and the WorldWideScience Alliance Chairman; and Wu Yishan, Institute of Scientific and Technical Information of China (ISTIC) Chief Engineer.

It was an honor to attend the launch of multilingual WorldWideScience.org in Helsinki this past June 11th, and for my company to have played a key role in it. Beginning more than three years ago, the R&D effort that ultimately resulted in the launch of our ground-breaking multilingual federated search capability involved plenty of hard work by lots of folks at Deep Web Technologies. It certainly could not have been accomplished without our invaluable partnerships with the Department of Energy Office of Scientific and Technical Information (OSTI), the WorldWideScience Alliance, and Microsoft Research.


24
Jun

In March I reported on an article that Barbara Quint, editor-in-chief of Information Today’s Searcher Magazine, published for DCLnews: “Federated Searching: Good Ideas Never Die, They Just Change Their Names.” DCLnews is one of the publications of Iris Hanney’s business support services company, Unlimited Priorities.

Abe Lederman, Deep Web Technologies founder and president and sponsor of this blog, was quoted in this article regarding his experience with one particularly thorny aspect of federated search:

So how do federated search services handle [author searching] problems? In an article written by Miriam Drake that appeared in the July-August 2008 issue of Searcher entitled “Federated Search: One Simple Query or Simply Wishful Thinking,” a leading executive of a federated service selling to library vendors was quoted as saying, “We simply search for a text string in the metadata that is provided by the content providers - if the patron’s entry doesn’t match that of the content provider, they may not find that result.” Ah, the tough luck approach! In contrast, Abe Lederman, founder and president of Deep Web Technologies (www.deepwebtech.com), a leading supplier of federated search technology, responded about his companies work with Scitopia, a federated service for scientific scholarly society publishers, “We spend a significant amount of effort to get it as close to being right as possible for Scitopia where we had much better access to the scientific societies that are content providers. It is not perfect and is still a challenge. The best we can do is transformation.”

I reviewed Miriam Drake’s article last July.
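
To make the author-searching problem in the quote above concrete, here is a small Python sketch contrasting the “tough luck” text-string match with one possible kind of transformation. This is my own illustration, not a description of how Deep Web Technologies or any other vendor handles it. The function names are hypothetical, and reducing a name to surname plus first initial is an assumed, deliberately crude normalization that will over-match authors who share a surname and initial.

```python
from typing import Tuple


def naive_match(query: str, metadata_author: str) -> bool:
    """The 'tough luck' approach: plain substring match on the metadata."""
    return query.lower() in metadata_author.lower()


def author_key(name: str) -> Tuple[str, str]:
    """Reduce a name to (surname, first initial), both lowercased.

    Handles "Surname, Given" and "Given Surname" forms only (an assumption).
    """
    name = name.strip()
    if "," in name:
        surname, _, given = name.partition(",")
    else:
        parts = name.split()
        if not parts:
            return "", ""
        surname, given = parts[-1], " ".join(parts[:-1])
    given = given.strip()
    initial = given[0].lower() if given else ""
    return surname.strip().lower(), initial


def transformed_match(query: str, metadata_author: str) -> bool:
    """Compare normalized keys instead of raw strings."""
    return author_key(query) == author_key(metadata_author)
```

A patron who types "Abe Lederman" gets no hit from the naive match against a record that stores "Lederman, A.", while the normalized comparison finds it. The trade-off is precision: the cruder the key, the more unrelated authors it will lump together, which is presumably part of why the quote calls this “still a challenge.”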


22
Jun

Image and Data Manager (IDM) is an online magazine with a focus on information management for Australia and New Zealand. Today they published an article: Virtual aggregation trumps data migration.

The article starts with a couple of poignant examples of failures in knowledge management infrastructures:

In the United States, federal intelligence bodies failed to “connect the dots” they had been compiling when Al Qaeda terrorist Umar Farouk Abdul Mutallab attempted to blow up an airliner in late 2009.

In the United Kingdom, the cases of Khyra Ishaq and Baby P highlighted the all-too-common lack of early warning systems that could have saved the lives of young victims. Child protection services agencies possessed the information that could have protected Ishaq and Baby P but not the infrastructure necessary to alert them to potential problems.

The article argues that trying “to merge massive amounts of information from disparate data sources” has been a huge failure. The article continues with a good argument for staying with federated search:

With today’s heightened focus on risk, many CIOs are now recognising the outcomes that can be generated through federated search. The key premise being to avoid risky and costly data migration or physical aggregation exercises, and leave data in place. In today’s enterprise, data needs to live and breathe in different places.

The article is a fast and easy read and its arguments are worth serious consideration for those in the “federate or migrate” discussion.


17
Jun

I’ve read many articles about the Semantic Web. Most are very abstract. So, I was pleased to discover “Semantic Web: Your Web’s Smarter Younger Brother” by Tom Robinson. Robinson provides a list of nine ways the Semantic Web will positively affect us. The items are ones we can all understand and relate to. He cites semantic expert Tony Shaw as the source but doesn’t provide a reference to the list. Here are the first five items, listed in reverse order (of importance, I imagine):

9. Annoying ads that have nothing to do with your interests will disappear.

8. Your computer will understand you through natural language recognition. When you tell it you want directions to that restaurant on Main Street with the amazing French onion soup, it will know what you’re talking about.

7. All of your computers will become more intuitive and easier to use.

6. Your bank will implement semantically-driven fraud monitoring systems. And your credit card company won’t erroneously reject your charges for a meal in London ever again, because it will understand that you bought airline tickets to England.

5. New types of consumer products will emerge that allow you to connect to your doctors and other medical experts globally. That means you’ll get a faster diagnosis of your illness and a wider array of better treatment options available to you.

See the article for the full list.

Robinson provides his own list of implications for colleges and universities. Here are the first three:

6. Research libraries already use this technology to connect disparate scientific databases.

5. Students will be able to do class-related research faster and more comprehensively so they can spend more energy on data analysis and writing.

4. Matching technology will appear in new generation job boards to create intuitive profiles and match applicants and schools that are good fits for each other.

Interesting food for thought.
