25
May

Shortcomings of full-text searching

Author: Sol

Jeffrey Beall, from the University of Colorado Denver, has a nice slide presentation: The Shortcomings of Full-Text Searching.

The slide show lists 14 problems one encounters with search engines. Here’s the list:

The synonym problem. You search for “dentures” but don’t think to search for “false teeth.”

Obsolete terms. You’re researching the history of motion pictures and don’t think to search for “photoplay.”

The homonym problem. Your search engine doesn’t do clustering and you search for “conductor.” Or, you search for “Roger Morris” and find the wrong one. Or, you search for “red,” which means “network” in Spanish.

Spamming. There’s lots of junk in the indexes of the big search engines, and it makes your searches less effective.

Inability to narrow searches by facets. Clustering and search refinement don’t exist in all search engines.

Inability to sort search results. It can be hard to organize results.

The aboutness problem. Just because the result has your terms in it doesn’t mean the result is actually about the term.

Figurative language. You search for information about “drowning” and find a document about someone “drowning in birthday presents.”

Search words not in web page. There is supposedly a book about the French Revolution that does not use the term “French Revolution.”

Abstract topics. How do you find a useful document on “health,” “free will” or “ethics?”

Paired topics. Art and mental illness, architecture and philosophy, and movies and fascism are examples of paired topics. Often search engines find documents that contain both terms even though the terms are not related; they just happen to appear in the same document.

Word lists. You’re searching for a term. What you find is a word list that contains your term but has nothing to do with your term.

The Dark Web. That is, the Deep Web. Lots of quality information is in the Deep Web and not accessible to Google and the other crawlers.

Non-textual things. Without metadata or tagging, non-text data is very difficult to find.
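To make a couple of these concrete, here is a minimal Python sketch of a naive full-text matcher. It is not from Beall's slides: the three-document corpus and the function names are invented for illustration. It shows the synonym problem (a search for "dentures" misses the document about false teeth) and the figurative language problem (a search for "drowning" returns the birthday party).

```python
import re

# Hypothetical three-document corpus, invented for illustration.
DOCS = {
    "care-guide": "How to clean false teeth and keep them comfortable.",
    "party-report": "By noon the toddler was drowning in birthday presents.",
    "water-safety": "Lifeguards are trained to spot a swimmer who is drowning.",
}


def tokenize(text):
    """Lowercase the text and return the set of alphabetic words in it."""
    return set(re.findall(r"[a-z]+", text.lower()))


def full_text_search(query, docs=DOCS):
    """Return ids of documents whose text contains every word of the query."""
    wanted = tokenize(query)
    return sorted(
        doc_id for doc_id, text in docs.items() if wanted <= tokenize(text)
    )


# Synonym problem: the relevant document never uses the word "dentures".
print(full_text_search("dentures"))     # []
print(full_text_search("false teeth"))  # ['care-guide']

# Figurative language: the party report matches as well as the safety page.
print(full_text_search("drowning"))     # ['party-report', 'water-safety']
```

Real search engines do far more than this (stemming, ranking, query expansion), so treat the sketch as a picture of the failure mode rather than of any particular engine.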

What’s Beall’s conclusion? Search the library databases directly. I’m confused, because searching library databases IS performing full-text search. I think Beall is focusing on the Surface Web search engines (Google and Bing, for example) as the major source of the problem. To some extent, searching sources directly or via federated search can overcome these problems, depending on how scholarly the content is, how good the metadata is, and how good the underlying search engines are.
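As a rough illustration of why metadata quality matters, here is a small Python sketch of a search that matches against assigned subject headings instead of the document text. Everything in it is an assumption made for the example: the records, the headings, and the `SEE_REFERENCES` table are hypothetical and do not come from any real catalog or from Beall's slides.

```python
# Hypothetical records: each carries subject headings assigned by a person,
# independent of the words that happen to appear in the text.
RECORDS = [
    {"title": "Caring for False Teeth", "subjects": {"Dentures"}},
    {"title": "A Toddler's Birthday", "subjects": {"Birthday parties"}},
    {"title": "Pool Lifeguard Manual", "subjects": {"Drowning", "Lifesaving"}},
]

# Hypothetical cross-references from entry terms to the preferred heading.
SEE_REFERENCES = {
    "dentures": "Dentures",
    "false teeth": "Dentures",
    "drowning": "Drowning",
}


def subject_search(query, records=RECORDS):
    """Map the query to a preferred heading, then match records on it."""
    heading = SEE_REFERENCES.get(query.strip().lower())
    if heading is None:
        return []  # no heading known for this term
    return [r["title"] for r in records if heading in r["subjects"]]


# Both wordings reach the same record, and the figurative "drowning"
# in the birthday story is never considered.
print(subject_search("dentures"))     # ['Caring for False Teeth']
print(subject_search("False Teeth"))  # ['Caring for False Teeth']
print(subject_search("drowning"))     # ['Pool Lifeguard Manual']
```

The trade-off is visible in the last branch of `subject_search`: a query with no matching heading returns nothing, so the approach is only as good as the metadata behind it.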

Hat tip to the PurpleSearch Blog.


Tags: federated search

This entry was posted on Tuesday, May 25th, 2010 at 6:02 pm and is filed under viewpoints.

One Response to "Shortcomings of full-text searching"

1 D. Bonner
May 26th, 2010 at 6:16 am
“I’m confused because searching library databases IS performing full-text search.”

I believe he is referring to library catalogs, which do not perform full-text searching at all. Instead, they search metadata that includes controlled, pre-coordinated subject terms: the “deep web” data Jeffrey refers to as inaccessible to most FT interfaces.
