San Francisco-based Capricorn Technologies has crafted blueprints, available from the Internet Archive on an open-source basis, that let people build multi-terabyte and multi-petabyte storage systems fairly inexpensively. The company also builds its own line of storage systems, called the PetaBox, and its low-budget approach has landed deals with several universities and research departments.
Capricorn Technologies has developed blueprints that allow users to build multi-terabyte and multi-petabyte storage systems. The company also builds high-capacity storage systems for the relatively low price of $2 a gigabyte.
Universities and research departments have purchased Capricorn's storage systems, although competitors and industry observers say that those clients tend not to require the higher-performance (and much more expensive) storage systems needed by mainstream businesses. Still, Capricorn says it plans to become a major player in the storage market.
How cheap are they? Capricorn's storage systems cost about $2 a gigabyte, said the company's chief executive, C.R. Saikley. At that price, the cost breakdown would be about 65 cents for the gigabyte of storage and $1.35 for racks, software, networking, management tools and other components.
That means that a Capricorn 1-terabyte system (which consists of 1,000 gigabytes) would sell for about $2,000, while a 1-petabyte system (1,000 terabytes) would cost about $2 million.
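The arithmetic behind those figures can be checked with a short sketch. The prices come from Saikley's quoted numbers; the constant and function names are illustrative assumptions, and amounts are kept in cents so the integer math stays exact.

```python
# Figures quoted in the article, in cents per gigabyte.
PRICE_PER_GB_CENTS = 200   # about $2 a gigabyte, all in
DRIVE_SHARE_CENTS = 65     # the gigabyte of storage itself
OTHER_SHARE_CENTS = 135    # racks, software, networking, management tools, etc.

# Decimal units, as the article uses them.
GB_PER_TB = 1_000
TB_PER_PB = 1_000


def system_price_dollars(capacity_gb: int) -> int:
    """Return the approximate system price in whole dollars (hypothetical helper)."""
    return capacity_gb * PRICE_PER_GB_CENTS // 100


one_tb = system_price_dollars(GB_PER_TB)
one_pb = system_price_dollars(GB_PER_TB * TB_PER_PB)

print(f"1 TB system: ${one_tb:,}")
print(f"1 PB system: ${one_pb:,}")
```

The two component shares add up to the $2 total, and the results match the roughly $2,000 and $2 million figures above.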
By contrast, a petabyte-class storage system from EMC might cost $20 a gigabyte, while similar systems from smaller companies might cost $10 a gigabyte, said Arun Taneja, an analyst with Taneja Group. A petabyte-class storage system will run into the millions, said an EMC spokesman.
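To put those per-gigabyte estimates side by side at petabyte scale, here is a rough sketch. The rates are the analyst's "might cost" estimates and Capricorn's own quoted price, not list prices, and the labels and names in the code are illustrative assumptions.

```python
# Per-gigabyte prices in dollars. The EMC and smaller-vendor figures are
# Taneja's rough estimates; the Capricorn figure is the company's own quote.
PRICE_PER_GB = {
    "Capricorn": 2,
    "Smaller vendors": 10,
    "EMC": 20,
}

GB_PER_PB = 1_000 * 1_000  # decimal units: 1,000 GB per TB, 1,000 TB per PB

# Cost of one petabyte at each per-gigabyte rate.
petabyte_cost = {name: rate * GB_PER_PB for name, rate in PRICE_PER_GB.items()}

# How many times more expensive each option is than Capricorn's quoted price.
baseline = petabyte_cost["Capricorn"]
multiple = {name: cost // baseline for name, cost in petabyte_cost.items()}

for name, cost in petabyte_cost.items():
    print(f"{name}: ${cost:,} per petabyte ({multiple[name]}x Capricorn)")
```

On these estimates, a petabyte would run about $10 million from a smaller vendor and $20 million from EMC, or five and ten times Capricorn's price.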
"We're a fraction of the price of those guys," Saikley said. "Our goal is to become the low-cost leader in storage."
The growth of the Internet and of services such as Google's Gmail and Apple Computer's iTunes has caused a corresponding explosion in the amount of data that needs to be archived. A petabyte is a vast amount of storage space. It represents around 450,000 hours' worth of TV programming, or all the e-mail produced in the world on a single day, according to storage makers.
Mushrooming amounts of data have in turn fueled demand for large storage systems. Luckily, the drive industry has continued to improve its technology, doubling the density of hard drives every two years or so while dropping the price. While drive makers regularly lose money, consumers and others benefit.
The higher price of commercial storage systems comes with significant performance advantages, said an EMC spokesman. The systems that EMC specializes in are geared toward handling thousands of transactions simultaneously for hours on end without failure. A lot of university labs don't need that sort of horsepower.
"The challenge is providing the performance that scales with capacity," the spokesman said.
A Taneja analyst added that the low price raises red flags about Capricorn's commercial viability and performance of the systems, particularly for mainstream business users. Still, "two dollars is a miserably low price for disk-based storage," the analyst said. "The price they are talking about is about the price of the hardware."
The company emerged out of a collaboration between Brewster Kahle, founder of the Internet Archive, and Saikley. The archive needed to expand its storage capacity but was constrained by its budget. It also wanted to keep power consumption down.
"We were unable to find what we knew was possible," said Saikley, who added, "I've been a personal friend of Brewster's since the Carter administration."
In 2004, Saikley devised a 100-terabyte storage system that consumed approximately 60 watts per terabyte.
Subsequently, he formed Capricorn and continued to tweak the technology. The company's flagship product is now the PetaBox TB64, a 64-terabyte storage system that consists of several 1U (1.75 inches high) modules slotted into a rack measuring approximately 2 by 2 by 6 feet. It consumes 50 watts per terabyte. The modules come with 400GB hard drives and with processors from Via Technologies. Versions using Intel chips are also available.
In June, Capricorn shipped a petabyte's worth of PetaBoxes to the Internet Archive. The petabyte system occupies about 16 racks and contains a few thousand hard drives.
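Those figures hang together, as a back-of-the-envelope sketch shows. It assumes that every rack is a 64-terabyte TB64, that all capacity comes from 400GB drives with no spares, and that the 50-watts-per-terabyte figure applies across the whole installation; the article states none of these outright, and the variable names are illustrative.

```python
# Figures from the article.
RACKS = 16                # "about 16 racks"
RACK_CAPACITY_TB = 64     # PetaBox TB64
DRIVE_SIZE_GB = 400       # 400GB drives
WATTS_PER_TB = 50         # quoted power draw of the TB64
GB_PER_TB = 1_000         # decimal units

# Assumption: every rack is a fully populated TB64.
total_tb = RACKS * RACK_CAPACITY_TB

# Assumption: all capacity comes from 400GB drives, with no spares.
drive_count = total_tb * GB_PER_TB // DRIVE_SIZE_GB

# Assumption: the per-terabyte power figure scales linearly.
total_watts = total_tb * WATTS_PER_TB

print(f"Capacity: {total_tb:,} TB")
print(f"Drives:   {drive_count:,}")
print(f"Power:    {total_watts / 1_000:.1f} kW")
```

Under those assumptions, 16 racks hold 1,024 terabytes on 2,560 drives and draw about 51 kilowatts, in line with the petabyte of capacity and "a few thousand" drives described above.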
The Internet Archive submits all of its intellectual property to the open-source community. Because the storage system was designed on commission from the archive, the archive owns the designs and has opened them to the public. Still, because customers don't necessarily want to assemble storage systems themselves, Capricorn is landing contracts. The company is also looking at ways to enhance its portfolio.
"I see us expanding our market presence and adding features and services," Saikley said.