Hadoop, the elephant in the enterprise
The open-source software, which has emerged as the de facto standard for big data processing, may be what tips the enterprise in favor of open source, according to some high-level execs.
PALO ALTO, Calif.--This is a big-data week in Silicon Valley, kicking off last night with a Churchill Club event here called "The Elephant in the Enterprise: What Role will Hadoop Play?" and featuring a high-powered group of big-data executives.
Hadoop, the open-source software that has emerged as the de facto standard for big data processing, may be what tips the enterprise in favor of open source. The desire to collect more data and find value in it has become a business priority, and Hadoop is playing a major role in making sense of that data.
And while the Hadoop platform is enterprise-ready, applications are what will drive the business case, according to Cloudera CEO Mike Olson, a sentiment echoed by MapR CEO John Schroeder.
Facebook is a long-time user and uses the platform across its entire business, according to Jay Parikh, Facebook's vice president of infrastructure engineering. The company has seen a broad set of use cases for the technology and found that it needs even more extensibility. Facebook's single largest cluster holds more than 100 petabytes of data, and the company is looking to make the data available at all times.
Metamarkets CEO Mike Driscoll told the crowd that Hadoop is a technology -- not a solution. Instead of batch processing for later analysis, users need interactive dialogs with data. You need to be in conversation with your data, and Hadoop is more like a pen pal; it should be available as a service. Olson and Oracle SVP Andy Mendelsohn stated that users want their data source to be close to their Hadoop installation.
One of the big issues discussed by the panel was trust between the data owner and the data custodian. According to Olson, applications are more easily accepted than data stores. Metamarkets uses Amazon's Elastic MapReduce (EMR) -- a service that allows the company to scale efficiently in conjunction with AWS storage, a wholesale cloud approach to big data.
The open-source angle factors heavily into the adoption of Hadoop as well as its evolution. According to Parikh, open source leads to faster, cheaper evolution of the software. The use cases are so different that it's essential it remain open source. More openness in APIs and software will move the needle forward, according to Schroeder.
According to Olson, customers continue to like open source because it keeps them from being locked in to a vendor. Hadoop is really the first open-source project that is innovative -- not an open alternative that knocks off a proprietary vendor's product. No single vendor can outcompete a global open-source community.
The winning open-source company of the future will be built on an open-source platform with unique proprietary components. Making the software easy to use is what people will pay for. And more proprietary complements are coming because enterprises need a vendor to support the open-source software.