The ADL Librarian is pondering a question that I’m sure many of us have pondered as well:
When we list resources in a digital library, we have a need to create categories into which we place our lists of resources. The ongoing mystery for me is what is the best label for the database category. We all know that our users don’t know what a database is or does.
I love these kinds of questions because they force us all to get clear about what we mean when we name things. Recently, there was a conversation at code4lib about whether “federated search” means what I thought it meant. Here’s a link to one of those messages. But I digress.
When you write or think about those “things” that hold content, what noun do you use? Here are some possibilities:
- information source
- data source
I’m looking at the broader question of not just what’s in databases but what you call a bucket of content that a federated search application accesses through a connector.
I was curious to see what federated search vendors were calling these things, so I visited a number of their sites. What I found was that there is no consensus. Everybody uses whatever terms they want to use.
Here’s a table of the terms used at different vendor sites. I’m not saying that the terms in the table are the only ones a particular vendor uses. There are likely other terms; I looked just deeply enough at each vendor site to find one or two:
|Vendor|Terms used|Page consulted|
|---|---|---|
|Deep Web Technologies|source, collection|http://www.deepwebtech.com/product/index.html|
|Ex Libris|e-resource, e-material, database, resource, electronic resource, e-collection|http://www.exlibrisgroup.com/category/ElectronicResources|
|MuseGlobal|source of searchable content, content source, source|http://www.museglobal.com/solutions/index.html|
|Vivisimo|data source, source|http://vivisimo.com/products/demos|
|WebFeat Enterprise|database, resource|http://www.webfeat.org/wfenterprise.htm|
An interesting question is, “How are these terms different from one another?” It’s not particularly clear to me what differences there might be, so I’d love to hear your thoughts on the matter. I did spend some time looking for an online glossary of information retrieval terms but didn’t find one that defined any of the terms in question.
I think that even the term “database” is getting harder to define. We talk about federated search connectors searching for content that lives behind web forms inside of databases, but we don’t really know whether there’s a “real” database (e.g. Oracle, MySQL, SQL Server) behind those forms or whether the documents are stored in individual files and searched via an index. In my mind, a set of files could act just like a database; content in files can be fielded, searched, edited, and updated.
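To make that last point concrete, here is a rough Python sketch, not a description of how any vendor's connector actually works. The class names (`FileSource`, `SqlSource`), the `search_both` helper, and the two sample records are all made up for illustration. One source keeps each record in its own JSON file and scans the files on every search; the other keeps the same records in a SQLite table. Both answer the same fielded query, and the caller has no way of telling which kind of "bucket" it is talking to.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

# Made-up sample records; the field names are assumptions for this sketch.
RECORDS = [
    {"title": "Deep Web Basics", "author": "Smith"},
    {"title": "Federated Search Primer", "author": "Jones"},
]


class FileSource:
    """Hypothetical source: one JSON file per record, scanned on each search."""

    def __init__(self, directory, records):
        self.directory = Path(directory)
        for i, record in enumerate(records):
            (self.directory / f"{i}.json").write_text(json.dumps(record))

    def search(self, field, term):
        hits = []
        for path in sorted(self.directory.glob("*.json")):
            record = json.loads(path.read_text())
            # Case-insensitive substring match on a single field.
            if term.lower() in record.get(field, "").lower():
                hits.append(record)
        return hits


class SqlSource:
    """Hypothetical source: the same records in an in-memory SQLite table."""

    FIELDS = ("title", "author")

    def __init__(self, records):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE docs (title TEXT, author TEXT)")
        self.conn.executemany("INSERT INTO docs VALUES (:title, :author)", records)

    def search(self, field, term):
        # Column names cannot be bound as parameters, so check them first.
        if field not in self.FIELDS:
            return []
        rows = self.conn.execute(
            f"SELECT title, author FROM docs WHERE {field} LIKE ? ORDER BY rowid",
            (f"%{term}%",),
        )
        return [dict(zip(self.FIELDS, row)) for row in rows]


def search_both(field, term):
    """Run the same fielded query against both kinds of source."""
    with tempfile.TemporaryDirectory() as tmp:
        sources = [FileSource(tmp, RECORDS), SqlSource(RECORDS)]
        return [source.search(field, term) for source in sources]


if __name__ == "__main__":
    from_files, from_sql = search_both("title", "federated")
    print(from_files == from_sql)
```

From the outside, the only thing a connector sees is the `search` call and the records that come back, which is part of why a neutral word like "source" is tempting.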
I’d be interested to hear from those of you who have been in the information retrieval world for a long time. Can you educate us about how particular terms came into use? Can you identify differences in meaning among terms?