Miles Kehoe at New Idea Engineering’s Enterprise Search Blog tells an entertaining anecdote.
The folks from Booz & Company, a spinoff from Booz Allen Hamilton, did a presentation on their experience comparing two well respected mainstream search products. They report that, at one point, one of the presenters was looking for a woman she knew named Sarah – but she was having trouble remembering Sarah’s last name. The presenter told of searching one of the engines under evaluation and finding that most of the top 60 people returned from the search were… men. None were named ‘Sue’; and apparently none were named Sarah either. The other engine returned records for a number of women named Sarah; and, as it turns out, for a few men as well.
After some frustration, they finally got to the root of the problem. It turns out that all of the Booz & Company employees have their resumes indexed as part of their profiles. Would you like to guess the name of the person who authored the original resume template? Yep – Sarah.
This is a great example of “garbage in, garbage out”! Metadata is only as good as the humans who curate it (or the machines that try to guess at it). Thanks for the Friday chuckle, Miles!
Here’s an entertaining article from DomainGang.
Yahoo to eliminate all Bots, Crawlers and Web Spiders
Posted by Lucius “Guns” Fabrice on September 20, 2010
Since the early 90’s the automated search and retrieval of the so-called “deep web” has become the ultimate goal of every search engine. While Google recently introduced the Instant Search feature that delivers results on the spot, trends might change soon.
After almost 15 years on the go, Yahoo announced that it’s pulling the plug on its bots, crawlers and web spiders.
“Our business model is simple: automation kills jobs”, said Matt Jiggerson, chief engineer at Yahoo.
“All this software, searching thousands of web sites and storing millions of gigabytes of information does not fit in the current state of bad economy we are facing”.
Read the whole article.
The U.S. Census Bureau has a federated search tool in development, DataFerrett.
The (Beta)DataFerrett helps you locate and retrieve the data you need across the Internet to your desktop or system, regardless of where the data resides.
DataFerrett is a unique data mining and extraction tool. (Beta)DataFerrett allows you to select a databasket full of variables and then recode those variables as you need. You can then develop and customize tables. Selecting your results in your table you can create a chart or graph for a visual presentation into an html page. Save your data in the databasket and save your table for continued reuse.
I have no idea how useful the tool is, but their mascot sure is cute!
Here are a couple of fun videos. The first is the Parisian Love Google search story that aired during the Super Bowl. The second is a great parody of a Google search story. Feel inspired? Create your own Google Search Story with YouTube’s Video Director, complete with your choice of 24 different music tracks.
Hat tip to Making Curriculum Pop
If you haven’t seen this parody of Lady Gaga’s Bad Romance, you’ll enjoy this great video made by students and faculty from the University of Washington’s Information School.
Hat tip to Jenny Luca.
[ Editor’s note: This article first appeared in the OSTI Blog. Dr. Walt Warnick, Director of the Office of Scientific and Technical Information, part of DOE, and I co-authored the article. For some important search applications there is no alternative to federated search.]
Discovery services have begun to appear in the search landscape. Discovery services provide access to documents from publishers with which they have relationships by indexing the publishers’ metadata and/or full text. Discovery services are marketed to libraries where patrons appreciate near-instantaneous search results and where library staff is willing to restrict access to sources available from the service (and optionally the library’s own holdings). While these services tout themselves as improvements to federated search, the reality is that there is no alternative to federated search for a number of important applications.
WorldWideScience.org is a global gateway to science. The federated search application was conceived and developed at OSTI and hosted by us. The portal performs live federated search of 70 databases from 66 countries. Participating members provide access to their national research databases. For a number of reasons this important gateway to millions of research documents does not lend itself to the discovery service model.
When I think of real-time search and automated retrieval, I think of a federated search solution. Well, here’s a different kind of retrieval system. Evanced Solutions has built a robot (yes, an actual physical robot) that works like those in manufacturing plants. (No, the robot doesn’t look like the image here. This image is from the Wikipedia robot article.)
The U.S. designed and manufactured system allows libraries to provide books and audiovisual materials in convenient locations without the space and cost associated with constructing a traditional library branch or building.
The new library vending system will be powered by an industrial multi-axis robot typically used in manufacturing plants. The robot will deliver library materials to patrons from storage shelves in the machine. It also re-shelves those same materials to the machine when returned by the patron for check-out by the next person.
The press release, Robot Extends Library Services, says the prototype of its new BranchAnywhere library vending system was to be unveiled last month at the Public Library Association Conference in Portland, Oregon.
A hat tip goes to Stan at the Library Blog Buzz.
Thursday, April 1st. This morning the international federated search watch group, Federation is Bad (FIB), released its findings of a comprehensive 20-year study of student perceptions of federated search. The news isn’t good. The study followed the federated search habits of 7 undergraduates who each spent 20 years (or longer) in one of a dozen American universities. While the test subjects initially liked federated search (during their first few years as freshmen) by years 18 and 19, they didn’t care much for the technology.
Immediately following the dismaying news, a new international federated search watch group, Federation is Good (FIG), was formed to contest the findings of FIB. FIG challenged the FIB findings on a couple of fronts. “FIB’s findings are nowhere near statistically significant and their results are not at all relevant,” claimed FIG spokeswoman Mata Serge. “They only included 7 [expletive deleted] students in their cheesy study,” noted Serge. “And, most students finish their undergraduate work in 8 or 10 years, not 20. Their study is seriously flawed.”
OK, so this post is completely off-topic. I ran into this great web page with fun internet trivia, so I thought I’d turn some of the items into questions and share them with all of you.
- Where did Google get its name?
- Where did Yahoo! get its name?
- In February 2004, what country led the world in Internet penetration, with 76.9 percent of people connected to the Internet?
- What is a ‘bastion host?’
- What are ‘browser safe colours?’
- It took 13 years for television to reach 50 million users. How long did it take the Internet to reach 50 million users?
- How many times per minute does the average computer user blink?
- What was the first computer company to register for a domain name?
- What is “6bone?”
- Domain registration was free until 1995. Who changed it?
Want more Internet trivia? Want the answers to these questions? Check out Complete Computer Solutions.
Federated search gets beaten up a lot, so it’s always nice to get some positive attention. If you want to read some over-the-top, cheerleader-quality positive writing about federated search, just check out this article at the Kansas State University “Talking in the Library” blog.
D.J. Beckley wrote this “Federated Search Engine = Awesome” “tips and tricks” article to entertain and educate. The article starts like this:
You might not know this little bit of information, but one of the coolest things to do in the library doesn’t involve acts that might get you banned from the building. It’s actually federated searching and you can do it in the future if you haven’t already! Seemingly contrary to its name, federated searching does not involve the Department of Homeland Security, illegal wiretaps, or raids on your personal belongings