Recommendation engines: How they work, why we need them

By IKU KAWACHI

Chances are you’ve used one, even if you haven’t realized it: recommendation engines, also known as recommender systems, platforms or frameworks. They use information drawn from user profiles, viewing history and survey results to predict the “rating” you’d give a particular item you haven’t seen yet and suggest ones you might like, and they’re everywhere these days.

Ever used Pandora Radio, Flixster, or Last.fm? How about Amazon.com or even Facebook? All of these Web sites implement the technology in some way, whether it be to entice you to buy a book based on your interests and viewing history, listen to another artist with a similar style, or point you to photos, Wall posts, and other updates that the engine thinks you’d like to see.

The methodology that powers recommendation engines can be loosely divided into two types: collaborative and content-based filtering. Engines based on the former “guess” what a particular user might be interested in by drawing on their data about other users who have looked at the same items or displayed the same tendencies. If, for example, users who watched the bank heist thriller Inside Man also often clicked on Déjà Vu, another Denzel Washington-starring thriller, the engine would suggest each film to users who had viewed only the other.
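To make the idea concrete, here is a minimal sketch in Python of co-view counting, the simplest form of the approach described above. It is an illustration only, not any site’s actual algorithm: the user names, the viewing histories, the title “Placeholder Film” and the function names are all made up for the example.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical viewing histories (user -> titles viewed). Only the two
# Denzel Washington films come from the article; the rest is invented.
histories = {
    "user_a": {"Inside Man", "Déjà Vu"},
    "user_b": {"Inside Man", "Déjà Vu", "Placeholder Film"},
    "user_c": {"Inside Man", "Placeholder Film"},
    "user_d": {"Déjà Vu"},
}


def co_view_counts(histories):
    """Count how many users viewed each pair of titles together."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in histories.values():
        for a, b in combinations(sorted(items), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(user, histories, top_n=3):
    """Suggest unseen titles most often co-viewed with the user's titles."""
    counts = co_view_counts(histories)
    seen = histories[user]
    scores = defaultdict(int)
    for item in seen:
        for other, n in counts[item].items():
            if other not in seen:
                scores[other] += n
    # Highest co-view score first; ties broken alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, _ in ranked[:top_n]]


print(recommend("user_d", histories))
```

Here user_d has only seen Déjà Vu, so the sketch ranks Inside Man first because two other users viewed both films. Notice that the code never looks at what the films are about; it only counts who viewed what.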

The algorithm behind Amazon.com’s various recommendation tools is perhaps the most prominent example of this “item-to-item” collaborative filtering approach, which has become increasingly common among online shopping sites in recent years. Don’t assume that collaborative filtering was perfected long ago, though. Shopping, rental, and music- or video-related sites are more eager than ever to gain a competitive edge by refining their engines, perhaps none more so than the DVD/Blu-ray rental and video streaming giant Netflix: the company launched a massive open competition called the “Netflix Prize” in 2006, offering $1 million to whoever developed the most accurate collaborative filtering algorithm.

Yet there is another method of producing recommendations: the content-based approach. These engines compare the information they’ve gathered on a particular user to a set of unique characteristics pertaining to each item and evaluate how well the two match, predicting a user’s “rating” of that item. The Music Genome Project, from the Internet radio and music sharing site Pandora Radio, is perhaps the best implementation of this more complex and labor-intensive approach, one that involves analyzing over 400 musical attributes per song and sorting them by some 2,000 focus traits. (Time’s May 27 feature “How Computers Know What We Want — Before We Do” provides an intriguing in-depth look at the service.) While a collaborative filtering-based engine has no actual “knowledge” of the products or services it’s recommending, which places an inherent limit on its accuracy, content-based filtering draws on data the engine holds about the item itself.
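For comparison, a content-based match might be sketched as follows. This is a toy example under stated assumptions, not how the Music Genome Project works: the three attributes, their scores, the song names and the function names are invented, and cosine similarity stands in for whatever scoring a real service uses to predict a “rating.”

```python
import math

# Hypothetical attribute scores per song, each from 0.0 to 1.0.
songs = {
    "song_a": {"acoustic": 0.9, "tempo": 0.2, "vocals": 0.8},
    "song_b": {"acoustic": 0.1, "tempo": 0.9, "vocals": 0.3},
    "song_c": {"acoustic": 0.8, "tempo": 0.3, "vocals": 0.7},
}


def cosine(u, v):
    """Cosine similarity between two attribute dictionaries."""
    keys = set(u) | set(v)
    dot = sum(u.get(k, 0.0) * v.get(k, 0.0) for k in keys)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_profile(liked, songs):
    """Average the attributes of the songs a user has liked."""
    profile = {}
    for title in liked:
        for attr, value in songs[title].items():
            profile[attr] = profile.get(attr, 0.0) + value / len(liked)
    return profile


def rank_unheard(liked, songs):
    """Rank songs the user hasn't liked yet by similarity to the profile."""
    profile = build_profile(liked, songs)
    scored = [
        (cosine(profile, attrs), title)
        for title, attrs in songs.items()
        if title not in liked
    ]
    return [title for _, title in sorted(scored, reverse=True)]


print(rank_unheard(["song_a"], songs))
```

A listener who liked song_a gets song_c ranked ahead of song_b, because song_c’s attributes sit much closer to the listener’s profile. No other users’ behavior is involved; the match depends entirely on what the engine knows about each song.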

Knowing how the ubiquitous recommendation engine works won’t necessarily make online shopping more stress-free or enjoyable. It might be something to think about, though, the next time you access a site like Amazon.com or Netflix. And, if you’re interested in using other such services to further enhance your Web experience, Freebase and Jinni are two worthy candidates.
