Algorithm ranks world's top soccer talent

Researchers at Northwestern University say they have devised a way to scientifically name not only the top 20 footballers at EuroCup 2008, but their exact ranking as well.

Elizabeth Armstrong Moore
This diagram looks at soccer players as nodes on a network during the three knockout-phase matches for Spain's team in the 2008 EuroCup tournament. Node position is a player's field position and node number refers to the player's jersey number. (Credit: Amaral/PLoS)

These days, in pretty much every sport, there is no hiding from statistics. Coaches, team owners, fantasy leaguers, and fans are tracking and analyzing a player's every move, fitness level, and more.

And now, thanks to a chemical and biological engineer at Northwestern University who is also a self-proclaimed football fanatic, we can compare our number-crunching with a much-touted new algorithm.

Professor Luís Amaral's rating system, unveiled Wednesday in the online journal PLoS ONE, was first put to the test after the 2008 European Championship, when it ranked the 20 best footballers who played, a list that lines up pretty well with general analysis. (Sergio Ramos and Xavi Hernandez, both of Spain, tied for first.)

Soon we should be able to see if the model's best-rated players for the 2010 World Cup in South Africa are lining up with the real-time results.

Amaral goes out on a limb by practically ignoring a variety of measures often considered crucial in understanding a player's power, including penalty cards, shots, misses, and assists.

Instead, he and his colleagues treat teams as networks and individual players as nodes, analyzing not so much individual play as the flow of passes between players.

"We looked at the way in which the ball can travel and finish on a shot," Amaral says in Northwestern's news release. "The more ways a team has for a ball to travel and finish on a shot, the better that team is. And the more times the ball goes through a given player to finish in a shot, the better that player performed."
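For readers who want to see the network idea in concrete terms, here is a minimal Python sketch. It is not the researchers' code: the possession data, jersey numbers, and function names are made up for illustration. It only shows how passes can be turned into links between players and how the distinct routes that ended in a shot can be collected.

```python
from collections import Counter

# Hypothetical data: each possession is (jersey numbers in passing order,
# whether the sequence finished in a shot). Not taken from the study.
possessions = [
    ([8, 6, 9], True),
    ([15, 8, 6], False),
    ([8, 6, 9], True),
    ([15, 8, 7], True),
]


def pass_network(possessions):
    """Count passes between each ordered pair of players (directed links)."""
    edges = Counter()
    for players, _ in possessions:
        for passer, receiver in zip(players, players[1:]):
            edges[(passer, receiver)] += 1
    return edges


def routes_to_shot(possessions):
    """Collect the distinct passing routes that finished in a shot."""
    return {tuple(players) for players, shot in possessions if shot}


edges = pass_network(possessions)
routes = routes_to_shot(possessions)
print(edges[(8, 6)], len(routes))
```

In this toy data, player 8 passes to player 6 in three possessions, and the team has two distinct routes that end in a shot.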

Luís Amaral (Credit: Northwestern University)

The Northwestern model assigns a point to each player involved in a sequence of passes that ends in a shot and finds the average point total for that network of players. The average player is rated 0; better-than-average players get positive ratings and below-average players get negative ones.
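The scoring described above can be approximated in a few lines of Python. This is a rough sketch under stated assumptions, not the published formula: the article does not give the exact math, so dividing by the standard deviation to scale the ratings is a guess, and the squad, sequences, and the `rate_players` function are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def rate_players(shot_sequences, squad):
    """Give one point per shot-ending sequence a player took part in,
    then center the totals so the average player is rated 0."""
    points = Counter()
    for sequence in shot_sequences:
        for player in set(sequence):  # count a player once per sequence
            points[player] += 1

    totals = [points[p] for p in squad]
    average = mean(totals)
    spread = pstdev(totals)  # assumption: scale by standard deviation
    if spread == 0:
        return {p: 0.0 for p in squad}
    return {p: (points[p] - average) / spread for p in squad}


# Hypothetical jersey numbers and shot-ending pass sequences.
squad = [6, 7, 8, 9]
shot_sequences = [[8, 6, 9], [8, 7], [8, 6]]
ratings = rate_players(shot_sequences, squad)
print(ratings)
```

Here player 8 appears in every shot-ending sequence and gets the highest rating, while players involved less often than the squad average come out negative.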

In both the computer model and the EuroCup, the Spanish team came out on top, with a first-place tie between Hernandez (rated 3.0), who scored highest for individual match performance, and Ramos (rated 2.1), who scored highest for overall tournament performance.

After Argentina played Nigeria this past weekend, the computer identified Argentina's Lionel Messi as the top performer. Because Messi is widely considered one of the top players in the world, the program did not embarrass its makers.

By analyzing tens of thousands of matches, Amaral's algorithm could potentially identify the best player of all time, perhaps finally settling the great Pele vs. Maradona debate: "If you ask people to compare a performance today with a performance from 10 years ago, you start to romanticize performances. There are always biases, but our algorithm has no biases." (A good thing, as Amaral is, like Cristiano Ronaldo, from Portugal.)

Such a system could theoretically also help identify the best team players in all sorts of venues, including at work. So be careful not to forward this to your boss; it might rank as a bad move.