Hadoop breaks data-sorting world records

The Hadoop project is the first open-source product to break records in the annual GraySort contest.

Yahoo's grid-computing team announced that Apache Hadoop set world records in both the Gray and Minute sorts of the annual GraySort contest, competing in the general-purpose (Daytona) category.

Hadoop is the only open-source software ever to win the GraySort competition. The result builds on last year's win in the Terasort competition, where Hadoop sorted 1 terabyte of data in 209 seconds, beating the previous terabyte sort record of 297 seconds.

According to the team, within the rules for the 2009 Gray sort, its 500 GB sort set a new record for the minute sort, and its 100 TB sort set a new record of 0.578 TB/minute. A 1 PB sort ran after the 2009 deadline but improved the speed to 1.03 TB/minute. A 62-second terabyte sort would also have set a new record, but the terabyte benchmark that Hadoop won last year has been retired.
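To put those figures on a common scale, the short Python sketch below converts the quoted terabyte sort times into TB/minute and estimates how long the larger sorts would take at the quoted rates. It is a back-of-the-envelope calculation only: the function names are made up for illustration, decimal units are assumed (1 PB = 1000 TB), and the elapsed times it prints are derived from the rounded rates above rather than taken from the official results.

```python
def tb_per_minute(terabytes: float, seconds: float) -> float:
    """Sort rate in TB/minute for a run that sorted `terabytes` in `seconds`."""
    return terabytes / (seconds / 60.0)


def minutes_at_rate(terabytes: float, rate_tb_per_min: float) -> float:
    """Approximate elapsed minutes to sort `terabytes` at a given TB/minute rate."""
    return terabytes / rate_tb_per_min


if __name__ == "__main__":
    # Terabyte sort times quoted in the article, expressed as rates.
    for label, secs in [("previous record", 297), ("last year's win", 209), ("this year's run", 62)]:
        print(f"1 TB in {secs} s ({label}): {tb_per_minute(1, secs):.3f} TB/min")

    # Approximate run lengths implied by the quoted rates
    # (assumption: decimal units, so 1 PB = 1000 TB).
    print(f"100 TB at 0.578 TB/min: about {minutes_at_rate(100, 0.578):.0f} minutes")
    print(f"1 PB at 1.03 TB/min: about {minutes_at_rate(1000, 1.03) / 60:.1f} hours")
```

Seen this way, the 62-second terabyte run works out to a little under 1 TB/minute, which is in the same range as the 1.03 TB/minute quoted for the petabyte sort.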

If you want to learn more about Hadoop, the Cloudera blog has a great post titled "5 Common Questions About Hadoop" that explains the basics pretty well.

