Archive for the "viewpoints" Category

16 Apr

Abe Lederman, founder and CEO of blog sponsor Deep Web Technologies, was recently featured twice at MobileGroove, a site that provides analysis and commentary on mobile search, mobile advertising, and social media. Both MobileGroove articles cover Deep Web Technologies’ Biznar mobile federated search app.

More at the Deep Web Technologies Blog.

3 Mar

[ This article was originally published in the Deep Web Technologies Blog. ]

The highly regarded Charleston Advisor, known for its “Critical reviews of Web products for Information Professionals,” has given Deep Web Technologies 4 3/8 of 5 possible stars for its Explorit federated search product. The individual scores forming the composite were:

  • Content: 4 1/2 stars
  • User Interface/Searchability: 4 1/2 stars
  • Pricing: 4 1/2 stars
  • Contract Options: 4 stars

The scores were assigned by two reviewers who played a key role in bringing Explorit to Stanford University:

  • Grace Baysinger, Head Librarian and Bibliographer at the Swain Chemistry and Chemical Engineering Library at Stanford University
  • Tom Cramer, Chief Technology Strategist at Stanford University Libraries and Academic Information Resources

Read the rest of this entry »

18 Feb

The Harvard Library Innovation Laboratory at the Harvard Law School posted a link to a 23-minute podcast interview with Sebastian Hammer. Hammer is the president of Index Data, a company in the information retrieval space, including federated search.

Update 4/3/12: A transcript of the interview is here.

Hammer was interviewed about the challenges of federated search, which he addressed in a very balanced way. The gist of Hammer’s message is that, yes, there are challenges to the technology, but they’re not insurmountable. And, without using the term “discovery service,” Hammer did a fine job of explaining that large indexes are an important component of a search solution but not the entire solution, especially in organizations that need access to highly specialized sources.

I was delighted to hear Hammer mention the idea of “super nodes” to allow federated search to scale to thousands of sources. Blog sponsor Deep Web Technologies has used this idea, which it calls hierarchical federated search, for several years. Several of its applications search other applications which can, in turn, search other applications. In 2009, Deep Web Technologies founder and president Abe Lederman delivered a talk and presented a paper at SLA, Science Research: Journey to Ten Thousand Sources, detailing his company’s proven “divide-and-conquer” approach to federating federations of sources.
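The “super node” idea composes naturally in code. Here is a minimal sketch, in no way Deep Web Technologies’ actual implementation: every class, name, and data shape below is illustrative. The key point is that a federator’s children can themselves be federators, so a query fans out through a hierarchy rather than hitting thousands of sources from one node.

```python
from concurrent.futures import ThreadPoolExecutor

class SearchNode:
    """A leaf node that searches a single live source."""
    def __init__(self, name, search_fn):
        self.name = name
        self.search_fn = search_fn  # callable: query -> list of result dicts

    def search(self, query):
        return self.search_fn(query)

class FederatorNode:
    """A 'super node' that fans a query out to its children in parallel.
    Children may be leaf sources or other federators, so federations
    of federations compose without any special casing."""
    def __init__(self, name, children):
        self.name = name
        self.children = children

    def search(self, query):
        with ThreadPoolExecutor(max_workers=len(self.children)) as pool:
            batches = pool.map(lambda child: child.search(query), self.children)
        # Flatten the per-child result lists into one merged list.
        return [hit for batch in batches for hit in batch]

# Two leaf sources grouped under one federator, which is itself
# a child of a top-level federator (toy stand-in data).
physics = SearchNode("physics", lambda q: [{"source": "physics", "query": q}])
chemistry = SearchNode("chemistry", lambda q: [{"source": "chemistry", "query": q}])
science = FederatorNode("science", [physics, chemistry])
top = FederatorNode("top", [science])

print([hit["source"] for hit in top.search("dark matter")])  # ['physics', 'chemistry']
```

Because each super node only manages its own children, the fan-out at any one level stays small even as the total source count grows into the thousands.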

I was also happy to hear Hammer speak to the importance of hybrid solutions. Federation is appropriate for gaining access to some content, while maintaining a local index works for other content; neither alone is a complete solution. Deep Web Technologies figured this out some years ago. A good example of hybrid search technology is the E-print Network, a product of the U.S. Department of Energy’s Office of Scientific and Technical Information (OSTI). Deep Web Technologies built the search technology, which combines information about millions of documents crawled from over 30,000 sites with federated content. I have been involved with the crawl piece of the E-print Network for a number of years and can testify to the power of the right hybrid solution. In 2008 I wrote a three-part series of articles at OSTI’s blog explaining the technology behind the E-print Network. Part One is here.
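The hybrid approach can be sketched in a few lines. This is an assumption-laden toy, not the E-print Network’s real code: pre-crawled content lives in a local index that answers instantly, live federated sources are queried in real time, and the two result streams are merged and ranked together.

```python
def hybrid_search(query, local_index, federated_sources, limit=10):
    """Combine hits from a pre-built local index (crawled content)
    with live hits from federated sources, then rank them together."""
    hits = list(local_index.get(query, []))   # fast: pre-crawled and indexed
    for source in federated_sources:          # slower: live, real-time search
        hits.extend(source(query))
    # Rank the merged list by whatever relevance score each hit carries.
    return sorted(hits, key=lambda h: h["score"], reverse=True)[:limit]

# Toy data: a crawled index keyed by query, plus one live source.
index = {"nanotube": [{"title": "Crawled e-print", "score": 0.9}]}
live = [lambda q: [{"title": "Live repository hit", "score": 0.7}]]

for hit in hybrid_search("nanotube", index, live):
    print(hit["title"], hit["score"])
```

The design choice is the point: the index covers content that can be crawled ahead of time, while federation covers sources that can’t or won’t be crawled, and the user sees one ranked list.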

In conclusion, I highly recommend the podcast for a good reminder that federated search isn’t dead and that it’s an important part of search.

1 Jun

On search neutrality

Author: Sol

Abe Lederman, founder and president of blog sponsor Deep Web Technologies, wrote an article at the Deep Web Technologies blog: Preparing for ALA Panel and Federated Search Neutrality. In it, Abe responds to an article he discovered at beerbrarian about the problem of net neutrality in federated search.

For those of you not familiar with net neutrality, Wikipedia explains it:

Network neutrality (also net neutrality, Internet neutrality) is a principle which advocates no restrictions by Internet service providers or governments on consumers’ access to networks that participate in the internet. Specifically, network neutrality would prevent restrictions on content, sites, platforms, the kinds of equipment that may be attached, or the modes of communication.
. . .
Neutrality proponents claim that telecom companies seek to impose a tiered service model in order to control the pipeline and thereby remove competition, create artificial scarcity, and oblige subscribers to buy their otherwise uncompetitive services. Many believe net neutrality to be primarily important as a preservation of current freedoms. Vinton Cerf, considered a “father of the Internet” and co-inventor of the Internet Protocol, Tim Berners-Lee, creator of the Web, and many others have spoken out in favor of network neutrality.

In the net neutrality battle, consumers worry about telecom companies unfairly favoring the delivery of some content (content they have a business interest in promoting) over the content of others. Add search to the equation and what you get are concerns over whether your search results are sorted by relevance or by the business needs of the search engine company.

Read the rest of this entry »

20 May

Amusing anecdote

Author: Sol

Miles Kehoe at New Idea Engineering’s Enterprise Search Blog tells an entertaining anecdote.

The folks from Booz & Company, a spinoff from Booz Allen Hamilton, did a presentation on their experience comparing two well respected mainstream search products. They report that, at one point, one of the presenters was looking for a woman she knew named Sarah - but she was having trouble remembering Sarah’s last name. The presenter told of searching one of the engines under evaluation and finding that most of the top 60 people returned from the search were… men. None were named ‘Sue’; and apparently none were named Sarah either. The other engine returned records for a number of women named Sarah; and, as it turns out, for a few men as well.

After some frustration, they finally got to the root of the problem. It turns out that all of the Booz & Company employees have their resumes indexed as part of their profiles. Would you like to guess the name of the person who authored the original resume template? Yep - Sarah.

This is a great example of “garbage in, garbage out!” Metadata is only as good as the humans who curate it (or the machines that try to guess at it). Thanks for the Friday chuckle, Miles!

5 May

I’ve always thought of personalization as a good thing. If Google knows something about me then it can provide results that I’ll find more relevant, right?

Watch this TED talk by Eli Pariser and, like me, you might start having second thoughts.

Pariser is former executive director of MoveOn and is now a senior fellow at the Roosevelt Institute. His book The Filter Bubble is set for release May 12, 2011. In it, he asks how modern search tools, the filter through which many of us see the wider world, are getting better and better at screening that world from us by returning only the search results they “think” we want to see.

Here’s the very thought-provoking first paragraph of the talk:

Mark Zuckerberg, a journalist was asking him a question about the news feed. And the journalist was asking him, “Why is this so important?” And Zuckerberg said, “A squirrel dying in your front yard may be more relevant to your interests right now than people dying in Africa.” And I want to talk about what a Web based on that idea of relevance might look like.

Read the rest of this entry »

20 Apr

Here’s a chunk of an interesting article from TechEYE.net: Kids go cold turkey when you take their technology away — Like quitting heroin:

Boffins have found that taking a kid’s computer technology away for a day gives them similar symptoms as going cold turkey.

The study was carried out by the University of Maryland. It found that 79 percent of students subjected to a complete media blackout for just one day reported adverse reactions ranging from distress to confusion and isolation.

One of the things the kids spoke about was having overwhelming cravings while others reported symptoms such as ‘itching’.

The study focused on students aged between 17 and 23 in ten countries. Researchers banned them from using phones, social networking sites, the internet and TV for 24 hours.
The kids could use landline phones or read books and were asked to keep a diary.
One in five reported feelings of withdrawal like an addiction while 11 percent said they were confused. Over 19 percent said they were distressed and 11 percent felt isolated. Some students even reported stress from simply not being able to touch their phone.

I wonder what would happen if all the search engines were turned off for a day.

Hat tip to Stephen Arnold.

20 Feb

I recently discovered an article, 5 Reasons Not to Use Google First, that sings my song. The article addresses this question:

Google is fast, clean and returns more results than any other search engine, but does it really find the information students need for quality academic research? The answer is often ‘no’. “While simply typing words into Google will work for many tasks, academic research demands more.” (Searching for and finding new information – tools, strategies and techniques)

The next paragraph gave me a chuckle.

As far back as 2004, James Morris, Dean of the School of Computer Science at Carnegie Mellon University, coined the term “infobesity,” to describe “the outcome of Google-izing research: a junk-information diet, consisting of overwhelming amounts of low-quality material that is hard to digest and leads to research papers of equally low quality.” (Is Google enough? Comparison of an internet search engine with academic library resources.)

The article continues with its list of five good reasons to not use Google first.

Note that the recommendation isn’t to skip Google altogether. There’s a balance that’s needed to get the best value when performing research. The findings in the “Is Google enough?” article summarize this point really well:

Google is superior for coverage and accessibility. Library systems are superior for quality of results. Precision is similar for both systems. Good coverage requires use of both, as both have many unique items. Improving the skills of the searcher is likely to give better results from the library systems, but not from Google.

15 Feb

I learned, from Roy Tennant, about work that Microsoft and others are doing with natural user interfaces (NUIs). What’s an NUI? Here’s a piece of a Microsoft Blog article that gives you the gist:

One product that has gotten a lot of attention recently is our Kinect for Xbox 360, which incorporates facial recognition along with gesture-based and voice control. The device knows who you are, understands your voice or the wave of your hand and is changing the face of gaming as we know it. …

By combining sensory inputs with the knowledge of what you’re trying to do (contextual awareness), where you are and what is around you (environmental awareness), 3D simulation and anticipatory learning, we can foresee a future where technology becomes almost invisible. Imagine a world where interacting with technology becomes as easy as having a conversation with a friend.

I can’t quite fathom what search would look like in a world of NUIs, but I’m looking forward to it.

5 Jan

My brother Abe just started a LinkedIn discussion in the Enterprise Search Engine Professionals group. Here’s the post.

Happy New Year, Everyone!

I’m interested to know who is doing real-time federated search in the enterprise. By “real-time” I mean searching sources live, not building or searching an index. Have you implemented such a beast? Has it been successful? What have the challenges been? Access security and policy issues come to mind. What do you see as the advantages and disadvantages of federated search in the enterprise?

By way of disclosure, I co-founded Verity, and I founded and run the federated search company, Deep Web Technologies.

Here are a few links that might be of interest to participants in this discussion:

What are your thoughts?

That last link refers to a New Idea Engineering article that discusses a number of important features of federated search in the enterprise.

  • Flexible rules for combining results from all of the engines searched
  • Maintaining users’ security credentials
  • Mapping user security credentials to other security domains
  • Advanced duplicate detection and removal
  • Combining results-list navigators, such as faceted search links and taxonomy nodes
  • Handling other results-list links, such as “next page” and sort order
  • Translating user searches into the different search syntaxes used by the disparate engines
  • Extracting hits from HTML results (AKA “scraping”), ideally without the need for custom code
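To make one of those features concrete, here is a minimal sketch of duplicate detection and removal across merged result lists. It is not drawn from any vendor’s implementation; the normalization rule (lowercase, strip punctuation, collapse whitespace) and the data shape are assumptions chosen for illustration.

```python
import re

def normalize(title):
    """Crude normalization for near-duplicate matching:
    lowercase, strip punctuation, collapse runs of whitespace."""
    stripped = re.sub(r"[^\w\s]", "", title.lower())
    return re.sub(r"\s+", " ", stripped).strip()

def dedupe(results):
    """Keep the first occurrence of each normalized title across
    the merged result lists returned by all of the engines."""
    seen, unique = set(), []
    for hit in results:
        key = normalize(hit["title"])
        if key not in seen:
            seen.add(key)
            unique.append(hit)
    return unique

# Toy merged list: engines A and B both returned the same paper,
# differing only in case and punctuation.
merged = [
    {"title": "Federated Search: A Primer", "engine": "A"},
    {"title": "federated search: a primer!", "engine": "B"},
    {"title": "Enterprise Search Basics", "engine": "B"},
]
print([hit["engine"] for hit in dedupe(merged)])  # ['A', 'B']
```

Real systems go well beyond title matching (DOIs, fuzzy author/year comparison), but the shape of the problem is the same: pick a canonical key per record and keep one representative per key.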

If you know of any activity in the enterprise search world that intersects with federated search and that doesn’t involve building and maintaining indices, Abe and I would love it if you would join the conversation.

For those of you new to this blog, federated search vendor Deep Web Technologies is the sponsor.