Think that song has appeared in your playlists just a few too many times? David Braue puts the randomness of Apple's song shuffling to the test -- and finds some surprising results.
Quick -- think of a number between one and 20. Now think of another one, and another, and another.
Starting to repeat yourself? No surprise: in practice, many series of random numbers are far less random than you would think.
Computers have the same problem. Although all systems are able to pick random numbers, the method they use is often tied to specific other numbers -- for example, the time -- which means you could get a very similar series of 'random' numbers in different situations.
This tendency manifests itself in many ways. If you use your iPod heavily, you've probably noticed that your supposedly random 'shuffling' iPod seems to be particularly fond of the Bee Gees, Melissa Etheridge or Pavarotti. Look at a random playlist that iTunes generates for you, and you're likely to notice several songs from one or two artists, while other artists go completely unrepresented.
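To make the point about seeds concrete, here is a minimal Python sketch. It is not Apple's code: the function name and the seed value are made up for illustration, and it simply uses Python's standard random module to show that a generator started from the same number (such as the same clock reading) produces the same 'random' series.

```python
import random

def pick_numbers(seed, count=5):
    """Return `count` picks between 1 and 20 from a generator seeded with `seed`."""
    rng = random.Random(seed)  # hypothetical seed, e.g. a clock reading in seconds
    return [rng.randint(1, 20) for _ in range(count)]

# Two runs that happen to start from the same seed value
first = pick_numbers(seed=1172700000)
second = pick_numbers(seed=1172700000)

print(first == second)  # True: same seed, same series
```

A different seed will usually give a different series, which is why well-behaved programs take care over where their seed comes from.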
This phenomenon has been observed widely across the world, with many conspiracy theorists suggesting there is more method than madness to Apple's randomisation routines.
Just what are they implying? Consider, for a minute, that you're a music industry marketer. There could be little more tempting than direct access to the ears -- and, indirectly, the wallets -- of tens of millions of iPod users around the world.
Through payment of a fee, the theory goes, a record label could increase the rotation frequency of their own music by tweaking Apple's randomisation formula. Popular songs and artists from their catalogue would pop up on playlists time and again, potentially explaining why your 50-strong playlist includes half a dozen Jackson 5 tracks but no Jackson Browne.
Less insidiously, iTunes could be tracking the songs you like the most -- it already does this -- then rotating them more often into its playlists.
Concerns over the randomness of Apple's randomness have even reached the ears of Steve Jobs, who has emphatically denied that the iPod's shuffle feature -- and the design of the iPod Shuffle itself -- is anything other than random. Just tell that to the hundreds of forum posters who have aired their complaints about the devices' playlist approach.
After an afternoon spent listening to far too much Bon Jovi, we decided to put iTunes to the test.
Building the perfect library
To evaluate iTunes' randomness, we borrowed a Mac Mini from Apple, with its fresh install of Mac OS X ensuring that we were working with an empty iTunes library and an otherwise completely clean slate.
We purchased AU$170 worth of Apple iTunes Music Store prepaid cards, then proceeded to go on a carefully planned shopping spree. As it was necessary to have multiple songs from one artist to observe any untoward clustering, we purchased five songs from each of 16 artists: four artists chosen arbitrarily from the online artist list of each of the four major music labels (EMI, Sony, Universal and Warner Music).
This gave us a total of 80 songs. To see whether popular songs were being rotated more frequently, we also purchased 20 more songs from Billboard's current (as of late February) Top 50 chart, which represented a variety of labels. All told, we purchased and downloaded 100 iTunes songs from the iTunes Music Store (download the spreadsheet for the full song list here).
We then used the Smart Playlist feature to force iTunes to make random playlists 25 and 40 songs long, respectively. Ten playlists of each length were created, providing a total of 20 playlists and 650 possible song positions. Each song list was exported to a text file for analysis using Microsoft Excel.
If Apple and the labels were including any information to change songs' priority, it would arguably be stored in the downloaded AAC files. To test this, we also added another 100 MP3 files, previously ripped from a variety of CDs, that definitely contained no extra coding information whatsoever. These artists included Def Leppard, Bon Jovi, Erasure, Maroon 5, Bob Seger and even John Denver & The Muppets for variety.
With 200 songs in the iTunes Library, we then repeated the random playlist test, creating an additional ten playlists at each length, 25 and 40 songs.
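The slot arithmetic behind the two rounds of testing can be checked with a few lines of Python. This is only a sketch of the counts quoted in the text; the variable names are our own.

```python
PLAYLIST_LENGTHS = (25, 40)   # songs per playlist
PLAYLISTS_PER_LENGTH = 10     # ten playlists at each length
ROUNDS = 2                    # 100-song library, then 200-song library

# Song positions filled in one round of testing
slots_per_round = sum(length * PLAYLISTS_PER_LENGTH for length in PLAYLIST_LENGTHS)

# Totals across both rounds
total_playlists = len(PLAYLIST_LENGTHS) * PLAYLISTS_PER_LENGTH * ROUNDS
total_slots = slots_per_round * ROUNDS

print(slots_per_round, total_playlists, total_slots)  # 650 40 1300
```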
Say You, Say What?
Using Excel, we collated the results and counted the number of times that each song and artist appeared in the playlists iTunes had generated.
All things being equal (and random), one would expect that a field of 1300 song slots would provide enough opportunities for each of the 56 artists to be equally represented. However, this was not the case. In fact, even though each specifically chosen artist had five songs in the library, there was a large discrepancy between the most popular and least popular artists.
Lionel Richie (Universal) proved to be iTunes' most popular artist, appearing 59 times all told, for an average of 1.475 times per possible playlist (or TPP, an objective measure reflecting the fact that iTunes-purchased songs were available to iTunes during creation of 40 playlists while CD-ripped songs could only have appeared on 20 playlists).
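For readers who want to check the TPP figure, the calculation is simply appearances divided by the number of playlists the artist could have appeared on. A minimal Python sketch follows; the function and constant names are our own shorthand, not anything taken from iTunes.

```python
ITUNES_PLAYLISTS = 40  # purchased songs were in the library for every playlist
CD_PLAYLISTS = 20      # ripped songs were only present for the second batch

def times_per_possible_playlist(appearances, possible_playlists):
    """Appearances divided by the playlists the artist could have been on."""
    return appearances / possible_playlists

lionel_richie_tpp = times_per_possible_playlist(59, ITUNES_PLAYLISTS)
print(round(lionel_richie_tpp, 3))  # 1.475
```

The same function applies to a CD-ripped artist by passing CD_PLAYLISTS as the second argument.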
Red Hot Chili Peppers and The Veronicas tied for second place (55 times / 1.375 TPP), with Keane and Robbie Williams (53/1.325 TPP), Eskimo Joe (52/1.3 TPP), Good Charlotte (51/1.275 TPP) and Grinspoon (50/1.25 TPP) also appearing at least 50 times in our playlists.
Looking further down the list, however, a curious trend appeared. Artists whose five songs were bought from iTunes were consistently more likely to appear on the random playlists than those whose songs were ripped from CD. Each of the top 15 artists, by number of songs played and songs per playlist, was bought from iTunes; those ripped from CD were far less likely to rate.
Songs by Def Leppard, the most frequently played artist from CD, were chosen 24 times in 20 possible playlists for a TPP of 1.2 -- but the rate of selection for other CD artists quickly dropped off: Bon Jovi (21/1.05 TPP), Creed (20/1.0 TPP) and Gloria Estefan (18/0.9 TPP) were all slightly ahead of Andrew Lloyd Webber, the Bee Gees, Dido, Erasure, Jackson Browne, Maroon 5 and Mötley Crüe, all of which were played 17 times for a TPP of 0.85.
The least frequently played artists were all those whose songs were taken from CD, with the bottom rungs occupied by Kate Bush (12/0.6 TPP), Anderson Bruford Wakeman Howe (11/0.55 TPP), and Christina Aguilera and Oasis (10/0.5 TPP).
When the artists with just one song were factored in, things got even more interesting. Smack That -- a current hit by Akon and Eminem -- was played 17 times, which was the mean for the artists with five songs in the iTunes Library. Its TPP was 0.425, a frequency that translates to 2.125 if treated as though there were five copies of the song (we'll call that ratio TPP5). In other words, we were 1.44 times as likely to hear Smack That as any song by Lionel Richie, even though Lionel Richie had five songs in the iTunes Library.
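The TPP5 comparison can be reproduced in a few lines. This Python sketch uses only the figures quoted above; the variable names are our own.

```python
SONGS_PER_ARTIST = 5  # each specifically chosen artist had five songs

smack_that_tpp = 17 / 40                             # one song, 40 possible playlists
smack_that_tpp5 = smack_that_tpp * SONGS_PER_ARTIST  # scaled as if it were five songs
lionel_richie_tpp = 59 / 40                          # five songs, 40 possible playlists

ratio = smack_that_tpp5 / lionel_richie_tpp

print(round(smack_that_tpp5, 3), round(ratio, 2))  # 2.125 1.44
```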
A role for labels?
Statistically speaking, we would have expected songs from artists with five tunes each to be heard more frequently, since they represented a larger proportion of the total song pool. In practice, however, this simply was not the case. Most of the singles we purchased -- all are currently in the Billboard top 50 and all are wildly popular with iTunes buyers -- were chosen disproportionately often.
By contrast, four songs -- all ripped from CD and from Christina Aguilera, Creed, Crowded House and Led Zeppelin -- were never chosen at all. Not once, in 40 playlists representing 1300 songs.
To address conspiracy theorists who might suggest this imbalance was the result of manipulation on the part of the studios, we calculated the percentage of playlist songs attributable to each label, then compared it with the percentage of all songs each label represented.
If the labels were each weighted evenly, these percentages should have been similar. We found, however, that there were noticeable differences. Songs handled by Universal, for example, were loaded onto 24.46% of available playlists but represented just 21.43% of all songs. Warner, on the other hand, comprised just 21.01% of the playlists but 26.53% of our songs. EMI was higher than expected, with 19.07% of the playlists and 17.86% of all songs. Sony, despite having the largest number of songs (67, or 34.18%) out of the 196 ultimately shuffled, represented just 18.8% of selected songs.
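One simple way to read these figures is as a gap, in percentage points, between each label's share of playlist slots and its share of the library. The Python sketch below uses only the percentages quoted above; the helper name is our own and the approach is just one reasonable way to make the comparison.

```python
# (share of playlist slots %, share of library %) as quoted in the text
label_shares = {
    "Universal": (24.46, 21.43),
    "Warner": (21.01, 26.53),
    "EMI": (19.07, 17.86),
    "Sony": (18.80, 34.18),
}

def over_representation(playlist_pct, library_pct):
    """Percentage-point gap: positive means chosen more often than library share suggests."""
    return round(playlist_pct - library_pct, 2)

gaps = {label: over_representation(*shares) for label, shares in label_shares.items()}

# Cross-check Sony's library share from the raw counts given in the text
sony_library_pct = round(67 / 196 * 100, 2)

for label, gap in gaps.items():
    print(f"{label}: {gap:+.2f} points")
print(sony_library_pct)  # 34.18
```

A positive gap means a label turned up more often than its share of the library would suggest; a negative gap means less often.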
When only the iTunes-sourced songs were considered, Universal, Warner and Sony were actually represented on a smaller percentage of playlists (26.57%, 24.55% and 23.1%) than would be expected (27%, 28%, and 28%) while EMI was disproportionately represented: its songs were, on average, found on 23.59% of playlists but only comprised 16% of available songs.
The new random
It's hard to say any results are absolutely conclusive -- after all, strange things happen all the time completely at random -- but the findings of our study certainly lend weight to the suggestion that people who suspect iTunes is less than random may be on to something.
Here's a summary of our results:
20 playlists (10 of 25 songs, and 10 of 40 songs) were created from a pool of 100 iTunes Music Store sourced songs, and 20 additional playlists when the pool was expanded to 200 songs using CD-ripped songs. This provided a total of 1300 slots to be filled at random.
On average, one would expect each song to appear on 6.5 playlists.
Popular, top-50 singles were rotated onto our playlists far more frequently than would be expected. Some artists, having just one song in the iTunes Library, were played more often than the entire 5-song collections of other artists.
Artists and singles purchased through iTunes were played more frequently than those that were not.
Four songs -- Christina Aguilera's At Last, Creed's What's This Life For, Crowded House's World Where You Live and Led Zeppelin's Nobody's Fault But Mine -- were in the iTunes Library but were not chosen for any of the 40 playlists generated during this exercise.
Lionel Richie (Universal) was iTunes' favourite artist; his songs were chosen 59 times for 40 playlists [iTunes songs only]. Times per possible playlist (TPP) = 1.475.
Def Leppard (Universal) was iTunes' favourite artist among songs ripped from CD; their songs were chosen just 24 times for 20 playlists [iTunes songs and ripped MP3s]. TPP = 1.2.
John Mayer (Sony) was iTunes' least favourite artist among those bought from iTunes; his songs were chosen just 32 times for 40 playlists [iTunes songs only]. TPP = 0.8.
Oasis (Sony) was iTunes' least favourite artist among songs ripped from CD; their songs were chosen just 10 times out of 20 playlists [iTunes songs and ripped MP3s]. TPP = 0.5.
Songs from Universal and EMI showed up in more playlists than their share of the iTunes Library would suggest.
Songs from Warner and Sony showed up in fewer playlists than their share of the iTunes Library would suggest. The disparity was striking in Sony's case, with the company's 67 songs (the largest single label representation in our Library) accounting for 34.18% of our songs, but chosen for just 18.8% of possible playlists.
Could this be a result of the relative popularity of each label's artists, or is somebody conspiring to keep Sony's numbers lower? Or is this just a natural manifestation of the known deficiencies in computers' random-number algorithms?
It's obviously difficult to tell whether back-room marketing deals or just dumb luck were responsible for the results we saw, but it appears that we can safely lend credence to the suspicions of myriad iPod users around the world. When it comes to choosing songs, 'random' clearly is relative.
Have you noticed any patterns in iTunes' randomiser? Do you think the matter should be studied further? Leave your comments below.