
Hadoop breaks data-sorting world records

The Hadoop project is the first open-source product to break records in the annual GraySort contest.

Dave Rosenberg Co-founder, MuleSource
Dave Rosenberg has more than 15 years of technology and marketing experience that spans from Bell Labs to startup IPOs to open-source and cloud software companies. He is CEO and founder of Nodeable, co-founder of MuleSoft, and managing director for Hardy Way. He is an adviser to DataStax, IT Database, and Puppet Labs.


Yahoo's grid-computing team announced that Apache Hadoop broke world records in the annual GraySort contest in the Gray and Minute sorts in the general-purpose (Daytona) category.

Hadoop is the only open-source software ever to win the GraySort competition. The result builds on last year's win in the terabyte sort benchmark, where Hadoop sorted 1 terabyte of data in 209 seconds, beating the previous record of 297 seconds.

The team summarized the results this way: "Within the rules for the 2009 Gray sort, our 500 GB sort set a new record for the minute sort and the 100 TB sort set a new record of 0.578 TB/minute. The 1 PB sort ran after the 2009 deadline, but improves the speed to 1.03 TB/minute. The 62 second terabyte sort would have set a new record, but the terabyte benchmark that we won last year has been retired."
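To put those rates on a common scale, here is a rough back-of-the-envelope sketch in Python. It assumes decimal units (1 PB = 1,000 TB), and the function names are made up for illustration. The run times it derives are only what the rounded TB/minute figures imply (about 173 minutes for the 100 TB sort and roughly 971 minutes for the 1 PB sort), not officially reported times.

```python
# Back-of-the-envelope conversions between sort size, run time, and TB/minute.
# Assumption: decimal units, so 1 PB is treated as 1,000 TB.

def tb_per_minute(terabytes, seconds):
    """Throughput in TB/minute for a sort of the given size and duration."""
    return terabytes / (seconds / 60.0)

def minutes_to_sort(terabytes, rate_tb_per_min):
    """Run time in minutes implied by a data size and a TB/minute rate."""
    return terabytes / rate_tb_per_min

# Terabyte sorts mentioned in the article, expressed as rates.
rate_209s = tb_per_minute(1, 209)   # last year's winning terabyte sort
rate_62s = tb_per_minute(1, 62)     # this year's 62-second terabyte sort

# Run times implied by the quoted (rounded) rates.
est_100tb_minutes = minutes_to_sort(100, 0.578)
est_1pb_minutes = minutes_to_sort(1000, 1.03)

print(f"1 TB in 209 s: {rate_209s:.3f} TB/minute")
print(f"1 TB in 62 s:  {rate_62s:.3f} TB/minute")
print(f"100 TB at 0.578 TB/minute: about {est_100tb_minutes:.0f} minutes")
print(f"1 PB at 1.03 TB/minute:    about {est_1pb_minutes:.0f} minutes")
```

Seen this way, the 62-second terabyte run works out to a little under 1 TB/minute, in the same range as the rate reported for the petabyte sort.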

If you want to learn more about Hadoop, the Cloudera blog has a great post titled "5 Common Questions About Hadoop" that explains things pretty well.
