The ADL Librarian is pondering this question which I’m sure many of us have pondered as well:

When we list resources in a digital library, we have a need to create categories into which we place our lists of resources. The ongoing mystery for me is what is the best label for the database category. We all know that our users don’t know what a database is or does.

I love these kinds of questions because they force us all to get clear about what we mean when we name things. Recently, there was a conversation at code4lib about whether “federated search” means what I thought it meant. Here’s a link to one of those messages. But, I digress.

When you write or think about those “things” that hold content, what noun do you use? Here are some possibilities:

  • database
  • collection
  • source
  • information source
  • data source
  • resource

I’m looking at the broader question of not just what’s in databases but what you call a bucket of content that a federated search application accesses through a connector.

I was curious to see what federated search vendors were calling these things so I visited a number of their sites. What I found was that there is no consensus. Everybody uses whatever terms they want to use.

Here’s a table of the terms used at different vendor sites. I’m not saying that the terms in the table are the only ones a particular vendor uses. There are likely other terms; I looked just deeply enough at each vendor site to find one or two:

Vendor | Terms used | Page
Deep Web Technologies | source, collection | http://www.deepwebtech.com/product/index.html
Ex Libris | e-resource, e-material, database, resource, electronic resource, e-collection | http://www.exlibrisgroup.com/category/ElectronicResources
MuseGlobal | source of searchable content, content source, source | http://www.museglobal.com/solutions/index.html
Vivisimo | data source, source | http://vivisimo.com/products/demos
WebFeat Express | e-resource | http://www.webfeat.org/wfexpress.htm
WebFeat Enterprise | database, resource | http://www.webfeat.org/wfenterprise.htm

An interesting question is, “How are these terms different from one another?” It’s not particularly clear to me what differences there might be so I’d love to hear your thoughts on the matter. I did spend some time looking for an online glossary of information retrieval terms but didn’t find any that defined any of the terms in question.

I think that even the term “database” is getting harder to define. We talk about federated search connectors searching for content that lives behind web forms inside of databases but we don’t really know if there’s a “real” database, e.g. Oracle, MySQL, SQL Server, behind those forms or if the documents are stored in individual files and searched via an index. In my mind, a set of files could act just like a database; content in files can be fielded, searched, edited, and updated.
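
To make that last point concrete, here is a minimal sketch in Python of a handful of "files" being fielded and searched through an index, with no database engine involved. Everything in it is hypothetical and made up for illustration: the file names, the field names, and the `build_index` and `search` helpers do not come from any vendor's product.

```python
from collections import defaultdict

# Hypothetical "files": each one is a document with named fields.
# In practice these would be read from disk; they are inlined here
# so the sketch is self-contained.
documents = {
    "doc1.txt": {"title": "Federated search basics", "author": "Smith"},
    "doc2.txt": {"title": "Deep web connectors", "author": "Jones"},
    "doc3.txt": {"title": "Search connectors in libraries", "author": "Smith"},
}


def build_index(docs):
    """Map (field, word) pairs to the set of file names containing them."""
    index = defaultdict(set)
    for name, fields in docs.items():
        for field, value in fields.items():
            for word in value.lower().split():
                index[(field, word)].add(name)
    return index


def search(index, field, word):
    """Return the sorted file names whose given field contains the word."""
    return sorted(index.get((field, word.lower()), set()))


index = build_index(documents)
print(search(index, "title", "connectors"))
print(search(index, "author", "smith"))
```

From the point of view of whoever is searching, this behaves the way a small database would: you ask for a field and a value and get back matching records. That is part of why the word "database" is hard to pin down.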

I’d be interested to hear from those of you who have been in the information retrieval world for a long time. Can you educate us about how particular terms came into use? Can you identify differences in meaning among terms?



This entry was posted on Tuesday, May 5th, 2009 at 3:48 pm and is filed under viewpoints.

5 Responses so far to "What do you call that thing?"

  1. 1 Jodi Schneider
    May 5th, 2009 at 6:34 pm  

    Related: Library Terms that Users Understand

    Terms most often cited as being misunderstood or not understood by users include…
    Periodical or Serial

  2. 2 Peter Noerr
    May 7th, 2009 at 6:03 pm  

    We use the term “Source” (capitalised) very strictly to refer to this thing. The terms you picked up from our site were conversational use in running text.

    We use Source for precisely the reason that what we are searching is not necessarily a database. (And, in passing, I agree that a database can exist independently of the software used to manage it. But I am not trying to define “database” here.)

    One immediate disqualification for “database” is that, using its conventional conflation, it does not describe what a federated search system searches. There is a (very famous) database called “Medline”. But it is not a Source. We have to be more precise, because the Medline ‘thing’ is available from at least 6 different Hosts (notice capitalisation). Each of which spins it a different way (pun intended for those of you old enough to remember “database spinners” - here called Hosts). Thus to us a Source is a combination of the data(base) and the Host making it available. There are other nuances to allow for the actual functionality and the software involved, but this is the broad brush picture.

    Equally we needed something not overloaded as “database” is, because many of the things we connect to are not collections of data. They may be processing systems - think of an ILS circulation system. So we chose the more neutral “Source” (over “Resource” as it was shorter) and capitalised it for this defined meaning.

    And…. it seems a reasonably well understood term. Most people are familiar with a source as something from which things flow or where things are available, so it seemed a good fit.

  3. 3 Stephan Schmid
    May 8th, 2009 at 8:04 am  

    We call this abstract thing “data source”, since it delivers data from a specific source.
    Each data source contains attributes that describe it more concretely. For example, a data source of type “database” could use the protocol “JDBC” and the format “ResultSet”, but these are more technical details.

    At first glance, I associate the term “database” with a relational database (since I am a software engineer). I also like the term resource or simply source.

  4. 4 Paul R. Pival
    May 21st, 2009 at 11:32 am  

    Last summer we redesigned our website, and the single most common complaint was that people couldn’t find what we used to call “research databases” and now call “online resources”. Still considering re-naming it this summer, maybe to e-resources, but it sounds as though that won’t help universally either. Glad it’s not just me though!

  5. 5 The Distant Librarian
    May 21st, 2009 at 11:39 am  

    First Impressions…

    I finally got around to reading a couple of posts I’d squirreled away, and they turn out to be somewhat related. Brian Mathews posts about 5 next-gen library catalogs and 5 students: their initial impressions. Important to us here at……
