resources | Federated Search Blog

Archive for the "resources" Category


Editor’s Note: This post is re-published with permission from the Deep Web Technologies Blog. This is a guest article by Lisa Brownlee. The 2015 edition of her book, “Intellectual Property Due Diligence in Corporate Transactions: Investment, Risk Assessment and Management”, originally published in 2000, will dive into discussions about using the Deep Web and the Dark Web for Intellectual Property research, emphasizing its importance and usefulness when performing legal due-diligence.

Lisa M. Brownlee is a private consultant and has become an authority on the Deep Web and the Dark Web, particularly as they apply to legal due-diligence. She writes and blogs for Thomson Reuters.  Lisa is an internationally-recognized pioneer on the intersection between digital technologies and law.


In this blog post I will delve in some detail into the Deep Web. This expedition will focus exclusively on that part of the Deep Web that excludes the Dark Web.  I cover both Deep Web and Dark Web legal due diligence in more detail in my blog and book, Intellectual Property Due Diligence in Corporate Transactions: Investment, Risk Assessment and Management. In particular, in this article I will discuss the Deep Web as a resource of information for legal due diligence.

When Deep Web Technologies invited me to write this post, I initially intended to delve primarily into the ongoing confusion regarding Deep Web and Dark Web terminology. The misuse of the terms Deep Web and Dark Web, among other related terms, is problematic from a legal perspective if confusion about those terms spills over into licenses and other contracts and into laws and legal decisions. The terms are so hopelessly intermingled that I decided it is not useful to even attempt untangling them here. In this post, as mentioned, I will specifically cover the Deep Web excluding the Dark Web. The definitions I use are provided in a blog post I wrote on the topic earlier this year, entitled The Deep Web and the Dark Web – Why Lawyers Need to Be Informed.

Deep Web: a treasure trove of data and other information

The Deep Web is populated with vast amounts of data and other information that are essential to investigate during a legal due diligence in order to find information about a company that is a target for possible licensing, merger or acquisition. A Deep Web (as well as Dark Web) due diligence should be conducted in order to ensure that information relevant to the subject transaction and target company is not missed or misrepresented. Lawyers and financiers conducting the due diligence have essentially two options: conduct the due diligence themselves by visiting each potentially-relevant database and conducting each search individually (potentially ad infinitum), or hire a specialized company such as Deep Web Technologies to design and set up such a search. Hiring an outside firm to conduct such a search saves time and money.

Deep Web data mining is a science that cannot be mastered by lawyers or financiers in a single transaction, or even a handful of transactions. Using a specialized firm such as DWT has the added benefit of being able to replicate the search on demand and/or have ongoing updated searches performed. Additionally, DWT can bring multilingual search capacities to investigations, a feature that very few, if any, other data mining companies provide and that would most likely be deficient or entirely missing in a search conducted entirely in-house.

What information is sought in a legal due diligence?

A legal due diligence will investigate a wide and deep variety of topics, from real estate to human resources, to basic corporate finance information, industry and company pricing policies, and environmental compliance. Due diligence nearly always also investigates intellectual property rights of the target company, in a level of detail that is tailored to specific transactions, based on the nature of the company’s goods and/or services. DWT’s Next Generation Federated Search is particularly well-suited for conducting intellectual property investigations.

In sum, the goal of a legal due diligence is to identify and confirm basic information about the target company and determine whether there are any undisclosed infirmities with the target company’s assets and information as presented. In view of these goals, the investing party will require the target company to produce a checklist full of items about the various aspects of the business (and more) discussed above. An abbreviated correlation between the information typically requested in a due diligence and the information that is available in the Deep Web is provided in the chart attached below. In the absence of assistance by Deep Web Technologies with the due diligence, either someone within the investor company or its outside counsel will need to search in each of the databases listed, in addition to others, in order to confirm that the information provided by the target company is correct and complete. While representations and warranties are typically given by the target company as to the accuracy and completeness of the information provided, it is also typical for the investing company to confirm all or part of that information, depending on the sensitivities of the transaction and the areas in which the values, and possible risks, might be uncovered.

Deep Web Legal Due-Diligence Resource List (PDF)


Articles on Discovery lists a number of categorized resources about discovery services. Categories include:

  • Basics
  • Historical
  • Presentations by vendors
  • Debates
  • Library experiences, evaluations & case studies
  • Misc comments on other issues
  • Wikis/rough notes
  • Hacking
  • Webcasts (not free)

This great resource page includes more than fifty links to articles on the subject.

I highly recommend that libraries interested in discovery services give extra attention to the articles in the debates and library experiences sections so that they can learn about the technology with their eyes wide open.

You can access my writings about discovery services here.


Blog sponsor Deep Web Technologies has just published a whitepaper, “Next-Generation” Federated Search: Critical for Intellectual Property Research.

The whitepaper explains why “Next-generation federated search technologies are quickly becoming an essential and indispensable tool for attorneys, paralegals, expert witnesses, and owners of IP to create, protect, monitor and litigate their intellectual property portfolios.”

Larry Donahue, Deep Web Technologies’ Chief Operating Officer and Corporate Counsel, authored the whitepaper. Mr. Donahue is licensed to practice law in New Mexico and Illinois and is a registered patent attorney, so he understands the information needs of the legal profession very well.

Intellectual property litigation is but one field of law in which missing important documentation in preparing a case can be a very costly mistake in court, to say nothing of the loss in credibility. The right federated search solution, configured to search all the relevant sources, can widen the net enough to avoid missing critical information without overwhelming the legal staff.

At just two pages, the paper is a quick yet impactful read. And, of course, there are many industries outside of law in which the cost of missing information is high.


I recently ran into this review of a new book, “Going Beyond Google.” The book is authored by Jane Devine and Francine Egger-Sider. The review got me curious so I contacted the publisher, Neal-Schuman, and got a review copy.

The book’s subtitle is “The Invisible Web in Learning and Teaching” and that’s what the book is about: educating people about the part of the web that many of us refer to as the deep Web. The book is targeted to students in LIS programs who are first learning about search technologies. The book aims to broaden their horizons and to wean them from the attitude that Google knows all. The book is a small paperback, just 156 pages, but it’s densely packed with information.

Read the rest of this entry »


Blog sponsor Deep Web Technologies asked me to produce a white paper on why the quality of search results matters so I wrote a four-pager — Quality, Not Quantity: The danger of overlooking quality of search results. I wrote the paper to be easy to read while packing a good amount of information.

The white paper is available from Deep Web Technologies’ web-site as a PDF document. The paper is divided into short sections, some with pithy titles:

  • Why quality of results matters
  • What does “quality of results” mean anyway?
  • Too many results, not enough time
  • It’s not a popularity contest: the dirty little secret of the search engine industry
  • The need for speed and the price you pay
  • The myopic focus on features
  • What really matters
  • How Federated Search fits the bill
  • Not all federated search engines are created equal

Read the rest of this entry »


Jill Hurst-Wahl of Digitization 101 yesterday published a list of blogs and presentations related to federated search. Here is Jill’s blog list:

Read the rest of this entry »


Alissa Miller has produced an impressive list of deep web-related resources for the Online College Blog. I’m particularly impressed at how much time Alissa must have spent researching resources for the list.

The list is divided into nine sections:

  1. Meta-Search Engines
  2. Semantic Search Tools and Databases
  3. General Search Engines and Databases
  4. Academic Search Engines and Databases
  5. Scientific Search Engines and Databases
  6. Custom Search Engines
  7. Collaborative Information and Databases
  8. Tips and Strategies
  9. Helpful Articles and Resources for Deep Searching

Read the rest of this entry »


In an effort to help customers to clarify their needs when considering federated search products and services, I’ve produced a list of over 100 questions to consider when you talk to vendors.

     100 Federated Search Requirements Questions To Ask Vendors

I’ve purposely published the document in Word format, rather than as a PDF file, so that you can edit the list, and copy and paste from the list, to meet your needs.

The checklist is categorized and includes questions pertinent to both self-hosted and vendor-hosted deployments. I will be soliciting input from a number of vendors to fill in any gaps in question or topic coverage.

Read the rest of this entry »


Computers in Libraries has published a set of four federated search vendor-sponsored white papers: Federated Search For Your Library and in Your Enterprise.

  1. Building a Better Search Query Through Content Mining, by Swets
  2. Discover: It’s Not About Federated Search, it’s About Discovery, by Gale
  3. Primo Discovery and Delivery – Beyond the OPAC: A unified interface for finding and getting all library resources, by ExLibris
  4. Taming Multiple Search Engines in Your Organization, by Jean Bedord

Read the rest of this entry »


There are a number of excellent federated search presentations, freely available for downloading, if you know where to find them. The list that follows is my attempt to identify a number of presentations that I consider to be outstanding, either because they provide an excellent overview of federated search, or because they cover some aspect of the industry exceptionally well. Is the list exhaustive? Of course not. I think of it as a start and I would love to hear your comments about which ones should be added to the list. You should be aware that I favor recent presentations as things change quickly in this industry.

Most of the presentations are in PowerPoint, a few are PDF files, and a couple are in web-based embedded slide show applications. One is a video!

Read the rest of this entry »