Jan
Web 2.0 is a fascination of mine. I’m very community-oriented and I’ve watched the computer industry evolve over nearly thirty years. I’m very excited about the potential for people and computers to change the world and to help solve our most pressing problems.
Lorcan Dempsey took a look at O’Reilly’s “Programming Collective Intelligence” and inspired me to look at the book as well. I blogged about Lorcan’s article and was able to get a review copy of the book.
Tim O’Reilly, social media visionary and publisher of this and many other computer software books, wrote the preface to this book. O’Reilly is widely credited with popularizing the term “Web 2.0.” He writes in part:
I defined Web 2.0 as “the design of systems that harness network effects to get better the more people use them.” Getting users to participate is the first step. Learning from those users and shaping your site based on what they do and pay attention to is the second step.
In Programming Collective Intelligence, Toby Segaran teaches algorithms and techniques for extracting meaning from data, including user data. This is the programmer’s toolbox for Web 2.0. It’s no longer enough to know how to build a database-backed web-site. If you want to succeed, you need to know how to mine the data that users are adding, both explicitly and as a side effect of their activity on your site.
I’ve been following Tim O’Reilly on Twitter for a while and I value his thoughtful links. 23,721 other people apparently value his thoughts and follow him as well.
What does Web 2.0 have to do with federated search? I firmly believe that the future of scholarly research is in community, in collaborative discovery. While I believe that the technology behind federated search will always serve as a foundation for greater things, I also believe that the research process is quickly evolving: research has met social media. Tomorrow’s federated search applications need to move beyond just providing search to providing a community experience of searching together.
As a programmer, I was delighted to look at “Programming Collective Intelligence.” No, I’m not going to pretend to have read the book from cover to cover. I’ll enjoy it as a reference book; others might read it in its entirety. The book has 12 chapters and is chock-full of ideas, algorithms and Python code examples.
You can get a good sense of what the book covers by considering these questions:
- How can email systems effectively identify spam?
- How do sites like Amazon.com and Pandora make recommendations?
- How do search engine relevance ranking algorithms work?
- How does Google’s PageRank algorithm work?
- How do you discover and visualize groups of things, people, or ideas that are related?
- How can documents be automatically classified?
- How can a site predict who will become paying customers without asking them lots of questions?
- How can software model home prices?
- How can you build a clustering engine?
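To give a flavor of the recommendation question in the list above, here is a minimal Python sketch of similarity-based recommendations. It is my own illustration, not code from the book: the `ratings` data is invented, the function names are hypothetical, and the choice of cosine similarity is an assumption made to keep the example short.

```python
from math import sqrt

# Hypothetical ratings, invented for illustration: user -> {item: score}.
ratings = {
    "alice": {"A": 5.0, "B": 3.0, "C": 4.0},
    "bob": {"A": 4.0, "B": 3.0, "C": 5.0, "D": 4.0},
    "carol": {"A": 1.0, "B": 5.0, "D": 2.0},
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(user, ratings):
    """Rank items the user has not rated by a similarity-weighted average."""
    totals, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue
        for item, score in other_ratings.items():
            if item in ratings[user]:
                continue  # only suggest items the user has not rated yet
            totals[item] = totals.get(item, 0.0) + sim * score
            weights[item] = weights.get(item, 0.0) + sim
    ranked = [(totals[item] / weights[item], item) for item in totals]
    return sorted(ranked, reverse=True)


if __name__ == "__main__":
    for score, item in recommend("alice", ratings):
        print(item, round(score, 2))
```

Users whose past ratings resemble yours get more say in what is suggested to you, which is the basic idea behind "people like you also liked" features. A real site would need far more data and a more careful similarity measure.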
Some of the questions the book addresses do pertain to federated search. But more broadly, the questions all pertain to how software can effectively mine user data to create a better experience for all users. That’s the power of Web 2.0, and Toby Segaran does an excellent job of making programming approaches easy to understand. I highly recommend the book even if you never plan to implement any of the algorithms, just to develop a conceptual understanding of how they work.
Tags: federated search, tim o'reilly