A decade of research into that problem has resulted in Tilera, Agarwal's company, which has invented a 64-core processor with an embedded high-speed network that can pass up to 32 terabits of data a second between the various cores.
The company's Tile64--designed for networking equipment and video streaming servers--can provide 10 times the performance of an Intel Xeon chip while consuming far less power, or 40 times the performance of a digital signal processor from Texas Instruments, the company says.
And 64 cores is just the start.
Agarwal and other executives from the company will discuss the architecture further on Monday at the Hot Chips conference here at Stanford University. Researchers at Intel, IBM, Advanced Micro Devices, and the
Promising chip companies with strong technical backgrounds rise and fade out on a regular basis in the semiconductor industry. Still, Tilera is trying to tackle one of the thorniest--and thus one of the potentially most lucrative--problems for computer designers today: slow, clogged data paths. Processor speeds and transistor counts have climbed at a rapid, steady pace for decades, but the buses and interconnects between them get upgraded at a much slower rate.
"The fundamental limitation of CPUs is no longer (core) performance but I/O (input/output)," said in a presentation to reporters in June on Sun Microsystems' efforts in supercomputing. "You don't get more I/O just because you shrink the manufacturing process."
Sun has been working on a technology that allows different chips to talk to each other without wires by virtue of just being close. It's not ready yet.
Last September, Intel's Justin Rattner unveiled Intel's proposed answer: an 80-core chip in which the cores are linked through an embedded network.
The Intel chip is conceptually similar to the Tile64, Agarwal said. Intel, though, has given itself five years to come out with 80-core chips.
Tilera has already delivered samples to customers and will start shipping chips commercially in the fourth quarter. It has 12 customers including networking gear manufacturers 3Com and TopLayer.
Intel's 80-core chip, however, also contains features that unclog the processor-to-memory pathways. The Tile64 employs conventional memory controllers.
Under the hood
Tilera's chips consist of small, individual building blocks, or tiles. Each tile sports a RISC processing core that runs at 600MHz to 1GHz as well as a switch that can send data in four directions: up, down, right and left. These switches form a mesh network, called iMesh, that lets the chips communicate.
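To picture how data might hop from switch to switch across such a mesh, here is a minimal Python sketch. It assumes the 64 tiles sit in an 8-by-8 grid and that traffic moves along a row first and then along a column, a common textbook scheme. Neither assumption comes from Tilera, and the function names are made up for illustration.

```python
# Illustrative sketch only. The 8 x 8 layout and the row-then-column
# routing rule are assumptions, not Tilera's documented iMesh design.
GRID = 8  # 64 tiles arranged as 8 columns by 8 rows (assumed)


def tile_xy(tile_id):
    """Map a tile number 0..63 to (column, row) on the assumed grid."""
    return tile_id % GRID, tile_id // GRID


def xy_route(src, dst):
    """List the tiles visited from src to dst, one switch hop at a time."""
    x, y = tile_xy(src)
    dst_x, dst_y = tile_xy(dst)
    path = [src]
    # Move left or right until the destination column is reached.
    while x != dst_x:
        x += 1 if dst_x > x else -1
        path.append(y * GRID + x)
    # Then move up or down until the destination row is reached.
    while y != dst_y:
        y += 1 if dst_y > y else -1
        path.append(y * GRID + x)
    return path


corner_to_corner = xy_route(0, 63)
hops = len(corner_to_corner) - 1  # number of switch-to-switch links crossed
print(hops)
```

Under these assumptions, the longest trip on the grid, from one corner to the opposite corner, crosses 14 links, while neighboring tiles are a single hop apart.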
The mesh network itself is also divided up into five layers, depending on the type of transaction. One layer handles cache-to-cache transfers, while another handles streaming data.
Each tile contains two caches of memory for rapid data access. Although each tile has its own caches, a tile can also reach the cache on every other tile, depending on how the chip is programmed.
Individual tiles consume a low 170 milliwatts to 300 milliwatts on average. Cores also power up and down independently when not in use to cut power consumption.
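A quick back-of-the-envelope calculation puts those per-tile figures in chip-level terms. The sketch below simply multiplies the quoted range by 64 tiles. It assumes every tile is running at once and counts nothing outside the tiles, such as memory controllers or I/O, so it is a rough estimate rather than a Tilera specification.

```python
# Rough estimate from the per-tile figures quoted in the article.
# Assumes all 64 tiles are active; other on-chip logic is not counted.
TILES = 64
LOW_MW, HIGH_MW = 170, 300  # average milliwatts per tile

low_watts = TILES * LOW_MW / 1000
high_watts = TILES * HIGH_MW / 1000

print(f"{low_watts:.2f} W to {high_watts:.2f} W for the tiles alone")
```

That works out to roughly 11 to 19 watts for the tiles of a 64-tile chip, before any savings from idle cores powering down.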
The size of the chip, and its ultimate performance, depend on how many tiles are included. The first product will contain 64 tiles and a 5MB distributed cache. Next year, the company says it will come out with a less expensive 36-tile version and then a 120-tile version close to, or in, 2009. Tiles on a single chip can be grouped into virtual processors assigned to different computing tasks.
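The idea of carving one chip into virtual processors can be illustrated with a short Python sketch. Everything here is hypothetical: the function, the task names and the split are invented for illustration and do not reflect Tilera's actual software tools.

```python
# Hypothetical illustration of grouping tiles into virtual processors.
# The function name, task names and tile counts are all invented.
def group_tiles(total_tiles, requests):
    """Hand out contiguous tile IDs to each named task, in request order."""
    needed = sum(requests.values())
    if needed > total_tiles:
        raise ValueError(f"requested {needed} tiles, only {total_tiles} exist")
    groups = {}
    next_tile = 0
    for name, count in requests.items():
        groups[name] = list(range(next_tile, next_tile + count))
        next_tile += count
    return groups


groups = group_tiles(64, {"packet_filtering": 32, "video_streaming": 24, "control": 8})
for name, tiles in groups.items():
    print(name, "uses tiles", tiles[0], "through", tiles[-1])
```

In this made-up split, half of the 64 tiles go to one task, 24 to a second and the remaining eight to a third, with no tile shared between groups.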
Performance gains over conventional chips arise directly out of Tile64's design. A distributed network of slower processors can get jobs done faster and with less overall energy than two or four larger, faster, more complex cores. Rather than powering a large bus, the chip can rely on shorter connections.
The Tile64 runs Linux and can be optimized for different applications.