I am honored to have had the opportunity to interview Kate Noerr for the federated search luminary series. Kate is co-founder, Chairman, and CEO of MuseGlobal, a leading supplier of content integration software. Kate, through MuseGlobal as well as in prior businesses, has been developing innovative solutions to what are now termed “federated search” problems since the 1980s. Kate is not satisfied to address only the challenges of federation. Her company considers these interrelated areas to be critically important as well: harvesting, transformation, enhancement, security, source maintenance, and multiple delivery mechanisms.
Kate is the first luminary that I recognize in this blog. You can find future luminary interviews in the luminary category. I invite you to nominate people who deserve to hold the federated search luminary distinction.
- The MuseGlobal executive team page about you says that you have extensive experience in the library and information industries, that you’ve taught in library schools, that you ran a European information network, and that you were co-founder of the library automation company, IME Group.
How did these experiences prepare you to become a co-founder, Chairman, and CEO of MuseGlobal?
My experience over many years has focused on content and search (“information retrieval” in the olden days). I’ve dealt with document delivery (before the Internet), with indexing solutions, with extended networks providing content of various sorts, and with libraries. A word about libraries: libraries are extensive users of content of all sorts, whether that content is contained within a book or other medium, or comes through a wide variety of journals and other sources of information. As technology has advanced, so has the technology applied to providing automation solutions for libraries. Thus I (and most of the executive team at MuseGlobal) have broad and deep expertise in understanding and working with all forms of content, including access to that content – ‘search’.
The problem of content in multiple repositories has always been an obvious one, and we built MuseGlobal to solve not only that problem, but all the issues associated with it. These include integrating into third-party solutions, authentication and other rights management concerns, and, of course, processing the results, both iteratively and on a one-time basis, in both syntactic and semantic forms, and then delivering the results to a platform.
- You have quite a bit of exposure to the library automation space, with IME Group and other experiences. I’m curious, when did you first hear about federated search, and in what context? I assume it was prior to the proliferation of the World Wide Web, and prior to the use of the term “federated search.”
Well, the need for access to multiple content sources has always existed. The early online aggregators (Dialog, Orbit, BRS) all tackled this problem, and we (the founders of MuseGlobal) also tackled the problem in multiple ways. In fact, an early business of mine, well before the Internet, was to provide what we would now call ‘federated’ access to standards information, which was supplied in an updated form on a floppy disk and delivered by post. (Note: this was a great idea, and a precursor to much of what we do now, but it was the 1980s, and much too early to be widely accepted.) What is now called ‘federated search’ is, to me, only one aspect of what needs to be done to successfully gather, transform, and deliver content. For a while, in the late 1990s (and well before, in the library space), some of the aspects of federation were called cross-search, or broadcast search, and there was a concerted attempt to use the term metasearch, which, of course, is now used mainly to refer to web federation.
- What inspired you to start MuseGlobal?
The technology was finally there to solve the problems we (again, the founder/owners of MuseGlobal) had been working with for a long time.
- How did MuseGlobal get its name?
Available URL! Although initially Muse was an acronym, I’ve long since forgotten what it stood for.
[ Editor’s note: Those of you who are dying to know what the “Muse” acronym stands for are invited to do a little bit of research on the Wayback Machine! ]
- I understand that MuseGlobal uses the term “content integration” to mean something more than federation. Can you explain what you mean by the term?
Technically, federation is going out and gathering information from a variety of sources, then processing this information and delivering a single combined result set. Well, that’s bare bones. Content integration is a broader term, in our opinion, and carries with it the ability to integrate into a third-party application, such as a CRM, a portal, an enterprise search engine, a business intelligence application, etc. It also means that the results, whether queried or harvested from the sources of content, can be normalized and processed in a wide variety of ways, and tailored to the target application(s). For example, a front-end widget with timed and constantly varying news information needs certain mining techniques applied to the data to ensure that the freshest and most accurate results are displayed through the widget.
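[ Editor’s note: The “bare bones” definition of federation above (gather from several sources, process, deliver one combined result set) can be illustrated with a minimal Python sketch. This is not MuseGlobal’s implementation. The two sources, their record layouts, and every function name here are hypothetical stand-ins, with in-memory lists playing the role of remote search engines. ]

```python
from concurrent.futures import ThreadPoolExecutor

# Two hypothetical sources. Each returns records in its own native shape.
def search_catalog(query):
    records = [
        {"Title": "Federated Search Basics", "Yr": "2007"},
        {"Title": "Library Automation", "Yr": "1995"},
    ]
    return [r for r in records if query.lower() in r["Title"].lower()]

def search_news(query):
    records = [
        {"headline": "federated search basics", "published": 2007},
        {"headline": "Federated Search in the Enterprise", "published": 2008},
    ]
    return [r for r in records if query.lower() in r["headline"].lower()]

# One normalizer per source maps native fields onto a common record shape.
def normalize_catalog(r):
    return {"title": r["Title"], "year": int(r["Yr"]), "source": "catalog"}

def normalize_news(r):
    return {"title": r["headline"], "year": r["published"], "source": "news"}

SOURCES = [
    (search_catalog, normalize_catalog),
    (search_news, normalize_news),
]

def federated_search(query):
    """Query every source, normalize the records, and merge them."""
    def run(source):
        search, normalize = source
        return [normalize(r) for r in search(query)]

    # Sources are queried concurrently; map() keeps the SOURCES order.
    with ThreadPoolExecutor() as pool:
        per_source = list(pool.map(run, SOURCES))

    merged, seen = [], set()
    for records in per_source:
        for record in records:
            key = record["title"].casefold()  # naive duplicate detection
            if key not in seen:
                seen.add(key)
                merged.append(record)

    # Deliver a single combined result set, newest first.
    return sorted(merged, key=lambda r: (-r["year"], r["title"]))

if __name__ == "__main__":
    for hit in federated_search("federated"):
        print(hit)
```

Duplicate detection by case-folded title is deliberately crude. In this sketch, the wider notion of content integration described above would show up as richer normalization and enhancement steps, and as delivery into a target application rather than a printed list.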
- How is MuseGlobal unique in the marketplace?
We are unique primarily because we offer the most complete and flexible content integration solution in the marketplace. Our functionality encompasses seven areas: federation, harvesting, transformation, enhancement, security, source maintenance, and multiple delivery mechanisms. In addition, there are several other elements to our technology that are exclusive to MuseGlobal, including:
• Our solutions are designed to address integration at the back-end and, as such, have a very robust Information Connection Engine (ICE) underlying our architecture.
• We built the Source Factory, a constantly expanding library of more than 5,500 content sources, to handle building, fixing and delivering source connections (variously called connectors, adaptors, workers, etc., depending on the company). This is a highly scalable workflow system which automates the daily checking of all the many thousands of connectors we’ve built, as well as much of the building, diagnosis of fixes, and some fixing of the tools.
• We deliver ‘fixes’ in an automated broadcast way, similar to the way anti-virus definitions are sent out. As you know, it is anticipated that a connector will ‘break’ because of a simple thing such as a URL change, or a more complex thing, such as a switch-out of a search engine, a format change, an internal record structure change, etc.
• Early on we started what we call the Content Partner Program, which consists of non-commercial relationships with hundreds of content providers, to ensure that we have access to their most appropriate API or gateway, to ensure that results from their content are served up to the user in the way the content provider wants and the users expect, and to be proactively informed about forthcoming changes to their environment.
- Back in May, MuseGlobal and Adhere Solutions announced a partnership to deliver federated search via the Google Search Appliance. How has the response been to the announcement?
Wonderful! The first few sales are closing now, and Adhere and MuseGlobal anticipate a continued highly successful partnership.
- You blogged recently about content integration and cloud computing, where content commonly lives in “the network” in a “cloud” somewhere. How does the MuseGlobal content integration architecture support cloud computing?
We have dealt with the cloud from the beginning, and our ICE architecture (as above) is fundamental in supporting that. By the way, I love the term cloud computing – it conjures up all sorts of lovely images. And of course, the data center in the sky (cloud) is a variation on the ASP model. I see this as another opportunity for us, for those organizations using cloud computing, considering it, or offering it. We deal with a number of companies along these lines, and technically it’s not really different from what we’ve been doing on the web all along.
- What is your vision for MuseGlobal?
To continue to grow and be successful with multiple lines of business, which we have achieved to date, and expect to continue to achieve. We will continue to be an OEM provider, as it is my firm belief that we are but part of an overall solution, whether that’s a CRM system, a content management system, an integrated library system, or any other system or service.
- What do you see as the biggest challenges in federated search?
To my mind, the biggest challenge is education of the marketplace to understand what problems federated search (and I would say content integration more broadly) addresses, and how that marries with their total environment. There are a large number of activities underway in the marketplace in general, particularly for the enterprise, which is great news.
In considering your recent blog on challenges to federated search, here are my comments. One of the fundamentals of federated search is normalizing search queries across different search engines, which, for MuseGlobal, means mapping that search query in all its simplicity or complexity to the underlying (multiple) search engines. This involves a true understanding of search, and an ability to represent Boolean where that doesn’t exist, and to expand where needed, up to the limits of whatever the search engine accessing the content can handle or be ‘forced’ to handle. An even more important aspect is to map the content in as rich a form as is possible, and therefore to deliver up to the end-user a full experience, and not a ‘dumbed-down’ version of a search or a result. This process can be enhanced where standards are used and well-implemented, although, as is well known, standards use tends to vary. Standards themselves are critical, whether we’re talking SRU/SRW, XML and its variants, or even HTTP.
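[ Editor’s note: As a rough illustration of mapping one query onto engines with different capabilities, the Python sketch below holds a query in a neutral nested form and translates it two ways. It is an assumption-laden toy and not MuseGlobal’s method. The tuple representation, the `+term` syntax for the Boolean-less engine, and all function names are hypothetical. Where an engine has no OR, the query is expanded into several AND-only sub-queries whose results the caller would merge. ]

```python
from itertools import product

# Neutral query form (hypothetical): a term is a string, an operator
# node is a tuple such as ("AND", left, right) or ("OR", a, b, c).

def to_boolean_syntax(node):
    """Render the query for an engine that understands AND/OR natively."""
    if isinstance(node, str):
        return node
    op, *args = node
    return "(" + f" {op} ".join(to_boolean_syntax(a) for a in args) + ")"

def to_conjunctions(node):
    """Expand the query into a list of AND-only term lists.

    Their union is equivalent to the original Boolean query.
    """
    if isinstance(node, str):
        return [[node]]
    op, *args = node
    parts = [to_conjunctions(a) for a in args]
    if op == "OR":
        # OR: any conjunction from any branch satisfies the query.
        return [conj for part in parts for conj in part]
    if op == "AND":
        # AND: combine one conjunction from every branch.
        return [sum(combo, []) for combo in product(*parts)]
    raise ValueError(f"unsupported operator: {op}")

def to_plus_syntax(node):
    """Render sub-queries for an engine that only supports '+term' (all terms required)."""
    return [" ".join("+" + term for term in conj) for conj in to_conjunctions(node)]

if __name__ == "__main__":
    query = ("AND", "federated", ("OR", "search", "retrieval"))
    print(to_boolean_syntax(query))
    print(to_plus_syntax(query))
```

The expansion can grow quickly for deeply nested queries, which is one concrete sense in which a query can only be pushed “up to the limits” of what a given engine can handle.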
While it may be that some think the ‘ideal’ situation is everything in one repository, that will assuredly never happen, under any circumstances. I also don’t happen to think that’s an ideal situation. Why would anyone want everything in one vast repository? What happens to all the specialized information, to all your own unique information, to sensitive commercial (and other) information, and on and on? Google Scholar is one of many, many aggregations of collections of information. That’s fine, and a great thing to do, but it does not and will never contain all the world’s scholarly information, nor does it attempt to do so. It’s one of the thousands of sources to which we provide a very well-mapped connector, which, combined with many other sources, will provide a more complete results set to the users.
It’s also well known in the library world that libraries are way under-utilized, but my point on the biggest challenge being education of the marketplace is aimed at the non-library marketplace, whether it’s education (as in Blackboard, etc.), or enterprise search (as in FAST, Endeca, Oracle’s Secure Enterprise Search, IBM’s OmniFind), content management systems (EMC’s content suite, etc.) or CRM systems (SalesForce, etc.). The world at large, so to speak, is beginning to understand and to address the issues surrounding multiple silos, and I consider this fabulous news. This I believe is what the Federated Search blog refers to as the ‘new e-content business environment’. This gives enormous opportunity to technologies such as ours, and it’s what I find most satisfying and thrilling about building a company. The world is coming to our door!
- What do you see as major trends sweeping federated search in the next few years?
I think that federation itself will be an expected part of any environment. Take a look at various website messages, ranging from EMC to SalesForce to Oracle to IBM, and you’ll see that this issue is being incorporated into a broad series of offerings, all aimed at content integration. Again, the ‘e-content business environment’.
- What are you most proud of as Chairman and CEO?
Hard to say – perhaps that we’ve built a company and a technology which are both very scalable, and adaptable to a wide variety of lines of business.
- What’s the next big thing for your company?
We have several very large initiatives underway that will be revealed in the next two quarters.
Thank you, Kate, for a most informative interview that I’m sure blog readers will value.