Introduction to the deep web

Author: Sol

In 2004 Abe and I wrote a three-part article series for New Idea Engineering, a software and service vendor specializing in enterprise search. The articles give a very general introduction to the deep web. Although this blog is about federated search, deep web searching is closely related: deep web content is often federated or aggregated, and federated content is often deep web content.

These are the articles:

Mining the Deep Web
This is a good introduction to what the deep web is, how it differs from the so-called surface web, and how Google and deep web search engines each acquire content.
Challenges of the Deep Web Explorers
In this article we discuss the pros and cons of harvesting deep web content vs. searching it in real time.
Beyond Information Clutter
This article introduces the problem of relevance ranking of search engine results and describes one way that Deep Web Technologies deals with it. We invite discussion of other approaches.

These articles are intended for readers completely new to concepts such as the deep web, the surface web, crawlers, and harvesting.

This entry was posted on Wednesday, December 5th, 2007 at 4:56 am and is filed under basics.
