
Hadoop, the elephant in the enterprise

The open-source software, which has emerged as the de facto standard for big data processing, may be what tips the enterprise in favor of open source, according to some high-level execs.

Dave Rosenberg Co-founder, MuleSource
Dave Rosenberg has more than 15 years of technology and marketing experience that spans from Bell Labs to startup IPOs to open-source and cloud software companies. He is CEO and founder of Nodeable, co-founder of MuleSoft, and managing director for Hardy Way. He is an adviser to DataStax, IT Database, and Puppet Labs.

PALO ALTO, Calif.--This is a big-data week in Silicon Valley, kicking off last night with a Churchill Club event here called "The Elephant in the Enterprise: What Role will Hadoop Play?" and featuring a high-powered group of big-data executives.

Hadoop, the open-source software that has emerged as the de facto standard for big data processing, may be what tips the enterprise in favor of open source. The desire to gather more data and find value in it has become a business priority, and Hadoop is playing a major role in making sense of that data.

And while the Hadoop platform is enterprise-ready, applications are what will drive the business case, according to Cloudera CEO Mike Olson, a sentiment echoed by MapR CEO John Schroeder.

Hadoop at Churchill Club
The Hadoop panel included Oracle's Andy Mendelsohn, Cloudera's Mike Olson, and Facebook's Jay Parikh. (Credit: Matt Asay)

Facebook is a longtime user and runs the platform across its entire business, according to Jay Parikh, Facebook's vice president of infrastructure engineering. The company has seen a broad set of use cases for the technology and found that it needs even more extensibility. Facebook's single largest cluster holds more than 100 petabytes of data, and the company is looking to make that data available at all times.

Metamarkets CEO Mike Driscoll told the crowd that Hadoop is a technology -- not a solution. Instead of batch processing for later analysis, users need interactive dialogs with their data. Data needs to be in conversation, and Hadoop is more like a pen pal; it should also be available as a service. Olson and Oracle SVP Andy Mendelsohn said that users want their data source to be close to their Hadoop installation.

One of the big issues discussed by the panel was trust between the data owner and the data custodian. According to Olson, applications are more easily accepted than data stores. Metamarkets uses Amazon's Elastic MapReduce (EMR) -- a service that allows it to scale efficiently in conjunction with AWS storage, a wholesale cloud approach to big data.

The open-source angle factors heavily into the adoption of Hadoop as well as its evolution. According to Parikh, open source leads to faster, cheaper evolution of the software. The use cases are so different that it's essential it remain open source. More openness in APIs and software will move the needle forward, according to Schroeder.

According to Olson, customers continue to like open source because it lets them avoid being locked in to a vendor. Hadoop is really the first open-source project that is innovative -- not an open alternative that knocks off a proprietary vendor's product. No single vendor can outcompete a global open-source community.

The winning open-source company of the future will be based on an open-source platform with unique proprietary components. Making it easy to use is what people will pay for. And more proprietary complements are coming because enterprises need a vendor to support the open-source software.