MIT prof: Netflix has its recommendations wrong
An MIT engineering professor says that absolute recommendation scales, such as star ratings, are not terribly helpful and that comparing two products is a far more reliable system.
How well does Netflix really know you? How far has Amazon crept under your pores in order to determine with arrant certainty that you would enjoy a little more Danielle Steel and a little less Tony Blair?
A vast-brained MIT professor insists that these brands know you about as well as your subway train driver.
Devavrat Shah, the school's Jamieson Career Development Associate Professor of Electrical Engineering and Computer Science, furrows his brow at the relative pointlessness of asking humble, subjective souls to rate books, movies, and even cars on an absolute scale, such as a five-star rating system.
Your five-star rating might, after all, be my three--because I am simply an innately more difficult, cruel, and cantankerous human being. The Netflix algorithm, Shah says, doesn't account for my being an inherently nasty piece of work.
Comparing two products against each other, though--at least according to Shah--neutralizes my unpleasantness. Moreover, Shah has a surprisingly simple way of explaining why, as human beings, we are always more likely to be accurate when merely comparing two things.
"If my mood is bad today, I might give four stars, but tomorrow I'd give five stars. But if you ask me to compare two movies, most likely I will remain true to that for a while," he explained.
You might imagine that Shah and his team are peculiarly attuned to improving the entertainment that humans get out of life. Indeed, they have built a Web site called Celect that intends to create a place where large groups of people (hullo, Congress) can make more harmonious and apposite decisions.
Much of Shah's work has been with car buyers. These are people who claim, for example, that they hate white cars and don't want an Audi but end up buying a white Audi. The professor says his algorithm was able to foresee car buyers' true preferences with 20 percent more accuracy than previously existing algorithms.
Those with mathematical innards will realize that the more products there are to compare, the more possible ways there are of ordering them. Not everyone is going to prefer, say, "La Cousine Bette" over "Uncle Vanya" or "Mommie Dearest."
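The arithmetic behind that explosion of orderings is easy to sketch. The snippet below is only an illustration using the three titles above: it lists every possible ordering, then shows how a single comparison narrows the field. The helper name `count_orderings` is hypothetical.

```python
from itertools import permutations
from math import factorial

titles = ["La Cousine Bette", "Uncle Vanya", "Mommie Dearest"]

# Every possible preference ordering of the three titles.
orderings = list(permutations(titles))

def count_orderings(n_items):
    """n items can be ranked in n! different ways."""
    return factorial(n_items)

# One comparison ("La Cousine Bette" beats "Uncle Vanya") keeps only the
# orderings that place the first title ahead of the second.
consistent = [
    o for o in orderings
    if o.index("La Cousine Bette") < o.index("Uncle Vanya")
]

print(len(orderings), len(consistent), count_orderings(10))
```

Three titles give six orderings, and one comparison halves that to three. Ten titles already allow 3,628,800 orderings, which suggests why some assumptions are needed to keep the problem manageable.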
So, like all the finest rationalists, he makes assumptions. His press release uses the example of Robin Williams' "Patch Adams." This is a movie that unaccountably escaped my eyes until now, but is apparently the worst-rated movie on Netflix (of those that have a statistically significant number of ratings).
Shah believes that "Patch Adams" is, as Malcolm Gladwell might call it, an outlier. It therefore has to be ignored in all orderings of preference, in order not to mess up the purity of the results.
The assuming doesn't stop there. The next is to "choose the smallest group of orderings that match the available data."
There's still one more step before an actual score is computed. This involves using "a movie's rank in each of the orderings, combined with the probability of that ordering." Have you got that? There will be a test at the end of this post.
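For anyone revising for that test, here is one way the last step could look. This is a sketch under stated assumptions, not Shah's published formula: it assumes a title's score is its probability-weighted rank across the chosen orderings, with the top spot worth the most points. The orderings and probabilities are made up for illustration.

```python
def weighted_scores(weighted_orderings):
    """Combine each title's rank in every ordering with that ordering's probability.

    Assumption for illustration: in an ordering of n titles, first place
    earns n points, second place n - 1, and so on down to 1.
    """
    totals = {}
    for ordering, probability in weighted_orderings:
        n = len(ordering)
        for position, title in enumerate(ordering):
            totals[title] = totals.get(title, 0.0) + probability * (n - position)
    return totals

# Hypothetical small set of orderings with made-up probabilities (sum to 1).
weighted = [
    (("Uncle Vanya", "La Cousine Bette", "Mommie Dearest"), 0.5),
    (("La Cousine Bette", "Uncle Vanya", "Mommie Dearest"), 0.25),
    (("Uncle Vanya", "Mommie Dearest", "La Cousine Bette"), 0.25),
]

scores = weighted_scores(weighted)
ranking = sorted(scores, key=scores.get, reverse=True)
```

With these invented numbers, "Uncle Vanya" scores 2.75, "La Cousine Bette" 2.0, and "Mommie Dearest" 1.25, so the final ranking follows that order.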
The lay mind can see what Shah is trying to do. Just. He wants the numbers to reflect more accurately how humans judge one work of art (or Pontiac) against the alternatives.
Now, as he admits himself, he must put his algorithm into action in the real world, the world in which we get an e-mail from Netflix, try to remember what the movie was like and, usually, click "four stars" out of sympathy.
I am sure that most of us crave the day when a computer can understand us--truly understand us. This nasty piece of work (who is hoping to be nicer) therefore gives the professor's efforts at life improvement a hearty three stars.