Start-up looks to do the math with new chip

ClearSpeed prepares to show off a chip designed to handle the complex, but repetitive, calculations involved in protein matching, data compression and other demanding tasks.

Michael Kanellos Staff Writer, CNET News.com
Michael Kanellos is editor at large at CNET News.com, where he covers hardware, research and development, start-ups and the tech industry overseas.
Start-up ClearSpeed Technology hopes to turbocharge supercomputers and servers with a new chip designed to absorb a lot of the excess math.

The company's CS301 processor, which will be demonstrated publicly for the first time at the SC2003 supercomputing conference in Phoenix next week, is a low-power, multiple-core chip built to handle the complex, but repetitive, math calculations involved in data compression, protein matching or film rendering. These tasks can bog down even the most powerful computers.

Although the chip runs at only 200MHz, it contains 64 separate "processor elements" that can work simultaneously on parallel instructions, that is, on different but structurally similar math problems. In terms of raw calculating power, that gives the CS301 a top performance of 25 gigaflops, or 25 billion floating-point operations per second. That's about twice as many as a 3GHz Pentium 4. The gigaflops measure doesn't give an exact indication of performance, but it gives some insight into how quickly a chip or computer can complete certain tasks.

"We see a phenomenal need, particularly on the biosciences side," said Mike Calise, ClearSpeed's president. "The matrix of multiples gets astronomical."

Lockheed Martin has already incorporated the chip into some of its computers. Los Gatos, Calif.-based ClearSpeed is also talking to several server makers, in particular smaller manufacturers such as Linux Networx and RLX Technologies, about incorporating the chip into servers next year. IBM Semiconductor is manufacturing the chip.

ClearSpeed's philosophy can largely be summed up with the phrase "divide and conquer." In most chips, instructions are fetched from memory, decoded and then sent to floating point units or integer units for execution. A typical microprocessor on the market today will have three floating point and three integer units.

By contrast, the CS301 is a single chip made up of 65 separate processors: a single RISC-based processor that fetches and decodes instructions from memory, and 64 processing elements, small math factories that then execute the instructions in parallel. Each processing element contains two floating point units, for calculations on numbers with fractional parts; an integer unit, for whole numbers; and 4K of memory.
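The 25-gigaflops figure quoted for the chip is consistent with a back-of-the-envelope count of the hardware described here. The short Python sketch below assumes that each floating point unit completes one operation per clock cycle; that is an assumption made for illustration, not something ClearSpeed states, and the function name is hypothetical.

```python
# Back-of-the-envelope peak throughput estimate for the CS301.
# Figures come from the article; OPS_PER_FPU_PER_CYCLE is an assumption.
CLOCK_HZ = 200_000_000        # 200MHz clock
PROCESSING_ELEMENTS = 64      # processing elements per chip
FPUS_PER_ELEMENT = 2          # floating point units per element
OPS_PER_FPU_PER_CYCLE = 1     # ASSUMPTION: one operation per unit per cycle


def peak_gflops(clock_hz=CLOCK_HZ,
                elements=PROCESSING_ELEMENTS,
                fpus=FPUS_PER_ELEMENT,
                ops_per_cycle=OPS_PER_FPU_PER_CYCLE):
    """Return the theoretical peak in billions of operations per second."""
    return clock_hz * elements * fpus * ops_per_cycle / 1e9


print(peak_gflops())
```

Under that assumption the estimate comes to 25.6 gigaflops, in line with the roughly 25 gigaflops the company cites. It is a ceiling, not a measure of what real applications would achieve.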

The architecture works because the underlying equations in many applications are often identical; only the numerical variables change. In biological research, for instance, scientists will prepare data on the hundreds of different ways a single protein can be folded, and then need the exact same corpus of folding data for thousands of separate proteins. With 64 processing units, this sort of computational grunt work can be accomplished in parallel, in less time than usual.

"Each of these processing elements is going to run the same code," said David Hoff, director of technical marketing for ClearSpeed.
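Hoff's point can be illustrated with a toy model in Python. This is a sketch only: the names `score` and `run_on_elements` and the sample data are hypothetical, and the loop runs sequentially, merely mimicking the way a single instruction stream is applied to 64 different data items at a time.

```python
NUM_ELEMENTS = 64  # processing elements on the CS301, per the article


def score(x):
    """Hypothetical kernel: the 'same code' that every element runs."""
    return 3.0 * x * x + 2.0 * x + 1.0


def run_on_elements(kernel, data, width=NUM_ELEMENTS):
    """Apply one kernel to the data in batches of `width` items.

    Each batch stands in for one parallel step: on real hardware the
    items in a batch would be handled at the same time, one per element.
    Returns the results and the number of batches (parallel steps).
    """
    results = []
    steps = 0
    for start in range(0, len(data), width):
        batch = data[start:start + width]
        results.extend(kernel(x) for x in batch)
        steps += 1
    return results, steps


data = [float(i) for i in range(256)]
results, steps = run_on_elements(score, data)
print(len(results), "results in", steps, "parallel steps")
```

In this model, 256 inputs need four passes of 64 rather than 256 separate evaluations, which is the kind of saving the design is aiming for when the equation stays the same and only the inputs change.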

Energy consumption is also reduced. Because math can be performed simultaneously in 64 places, the chip can be run at a far slower speed than a standard microprocessor. The chip consumes less than three watts of power, far less than even a standard notebook chip.

The chip's architecture, however, means that the CS301 can't be used in all situations. The chip will mostly be used as an offload or co-processor for Intel- or AMD-based server clusters, said Calise.

"This will be an augmentation to existing hardware," Calise said. Applications have to be recompiled so that the computer will direct tasks to the co-processor, but ClearSpeed provides a C compiler with its boards.

In other situations, it will function as a primary processor for fixed tasks. Defense agencies, for instance, could use it inside radar equipment or to compress images.

Another benefit that defense contractors in particular like is gradual degradation. If a single processing element burns out or fails to work, the remaining 63 will continue to function. At that point, the chip will provide slightly less performance, or can be sped up to compensate for a loss of 1/64th of its execution power.
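The arithmetic behind that trade-off is simple. The Python sketch below assumes that throughput scales linearly with both the number of working elements and the clock speed, which is a simplification; the function names are illustrative and not part of any ClearSpeed software.

```python
from fractions import Fraction

TOTAL_ELEMENTS = 64   # processing elements, per the article
BASE_CLOCK_MHZ = 200  # nominal clock speed, per the article


def remaining_capacity(failed, total=TOTAL_ELEMENTS):
    """Fraction of execution power left after `failed` elements drop out."""
    if not 0 <= failed < total:
        raise ValueError("failed must be between 0 and total - 1")
    return Fraction(total - failed, total)


def compensating_clock_mhz(failed, total=TOTAL_ELEMENTS, base=BASE_CLOCK_MHZ):
    """Clock speed needed to restore full throughput (ASSUMES linear scaling)."""
    return float(base / remaining_capacity(failed, total))


print(float(remaining_capacity(1)))          # 0.984375
print(round(compensating_clock_mhz(1), 2))   # 203.17
```

On those assumptions, losing one element leaves about 98.4 percent of the chip's execution power, and a clock increase of a little over 3MHz would make up the difference.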

The chip is also expandable, Calise added, noting that there is no reason a 128- or 256-unit chip couldn't be built.

Eventually, ClearSpeed's technology could be used to enhance desktops, but the applications right now really don't exist, Calise said.

The company's heritage can largely be traced back to PixelFusion, one of the several graphics companies that crumbled in the late '90s. At the time, there were more than 40 graphics chip companies and the majority were losing money. Most of these companies were bought, or faded away. PixelFusion's Fuzion 150 chip could process different scenes or images in parallel, similar to how the CS301 works.

PixelFusion "had stunning technology, but the product got launched at the wrong time," Hoff said. PixelFusion's patents, along with the design team, went to form ClearSpeed.

If anything, the co-processor market could prove to be more lucrative. The company's development kits, which include a ClearSpeed board and all of the necessary software to incorporate it into a server, start at $25,000, while additional boards without the software start at $10,000.

Next year, the company will sell individual chips, without boards, for $1,000 each in volume quantities.