Chipmakers aim to unclog data paths

Tilera's Tile64 chip, along with other cutting-edge designs, will take center stage at this week's Hot Chips conference.

Michael Kanellos Staff Writer, CNET News.com
Michael Kanellos is editor at large at CNET News.com, where he covers hardware, research and development, start-ups and the tech industry overseas.
PALO ALTO, Calif.--If you are going to build processors with large numbers of cores, argues Anant Agarwal, you have to figure out how to connect them to each other, too.

A decade of research into that problem has resulted in Tilera, Agarwal's company, which has invented a 64-core processor with an embedded high-speed network that can pass up to 32 terabits of data a second between the various cores.

The company's Tile64--designed for networking equipment and video streaming servers--can provide 10 times the performance of an Intel Xeon chip while consuming far less power, or 40 times the performance of a digital signal processor from Texas Instruments, the company says.

And 64 cores is just the start.

"By 2014, we could have 1,000-core chips," Agarwal, a professor at the Massachusetts Institute of Technology and CTO of the Santa Clara, Calif.-based company, said in an interview. "Buses (the traditional chip-to-chip connection) start to run out of steam after four to eight cores. You really have to go with a clean slate."

Agarwal and other executives from the company will discuss the architecture further on Monday at the Hot Chips conference here at Stanford University. Researchers at Intel, IBM, Advanced Micro Devices and the University of Texas, among others, also will present papers.

Promising chip companies with strong technical backgrounds rise and fade out on a regular basis in the semiconductor industry. Still, Tilera is trying to tackle one of the thorniest--and thus potentially one of the most lucrative--problems for computer designers today: slow, clogged data paths. Processor speeds and transistor counts have climbed at a rapid, steady pace for decades, but the buses and interconnects that link processors get upgraded at a much slower rate.

HyperTransport, found inside processors from AMD, has probably been the most significant achievement in this regard in the last decade. HyperTransport accounted for a substantial percentage of the performance gains AMD achieved with the Athlon chip.

"The fundamental limitation of CPUs is no longer (core) performance but I/O (input/output)," Andy Bechtolsheim said in a presentation to reporters in June on Sun Microsystems' efforts in supercomputing. "You don't get more I/O just because you shrink the manufacturing process."

Sun has been working on a technology called proximity communication that allows different chips to talk to each other without wires by virtue of just being close. It's not ready yet.

Last September, Intel's Justin Rattner unveiled Intel's proposed answer: an 80-core chip in which the cores are linked through an embedded network.

The Intel chip is conceptually similar to the Tile64, Agarwal said. Intel, though, has given itself five years to come out with 80-core chips.

Tilera has already delivered samples to customers and will start shipping chips commercially in the fourth quarter. It has 12 customers including networking gear manufacturers 3Com and TopLayer.

Intel's 80-core chip, however, also contains through-silicon vias, which unclog the processor-to-memory pathways. The Tile64 employs conventional memory controllers.

Under the hood
Tilera's chips consist of small, individual building blocks, or tiles. Each tile sports a RISC processing core that runs at 600MHz to 1GHz as well as a switch that can send data in four directions: up, down, right and left. These switches form a mesh network, called iMesh, that lets the tiles communicate.
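To make the four-direction switching concrete, here is a minimal sketch in Python of how a message might hop from tile to tile across such a mesh. It is an illustration only: the 8-by-8 grid layout and the row-first, column-second routing rule are assumptions chosen for simplicity, not details Tilera has confirmed, and the names xy_route and hop_count are hypothetical.

```python
from typing import List, Tuple

Tile = Tuple[int, int]  # (row, column) position of a tile in the grid


def xy_route(src: Tile, dst: Tile) -> List[Tile]:
    """Return the tiles visited going from src to dst, moving left/right
    along the row first and then up/down along the column.
    This routing rule is an assumption, not Tilera's documented scheme."""
    row, col = src
    path = [src]
    # Step horizontally until the destination column is reached.
    while col != dst[1]:
        col += 1 if dst[1] > col else -1
        path.append((row, col))
    # Then step vertically until the destination row is reached.
    while row != dst[0]:
        row += 1 if dst[0] > row else -1
        path.append((row, col))
    return path


def hop_count(src: Tile, dst: Tile) -> int:
    """Number of switch-to-switch hops between two tiles."""
    return len(xy_route(src, dst)) - 1


# Corner to corner on an assumed 8-by-8 grid of 64 tiles.
print(hop_count((0, 0), (7, 7)))  # 14
```

Under these assumptions, the farthest two tiles can be from each other is 14 hops, and neighboring tiles are a single hop apart, with no shared bus involved.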

The mesh network itself is also divided up into five layers, depending on the type of transaction. One layer handles cache-to-cache transfers, while another handles streaming data.

Each tile contains two caches of memory for rapid data access. Although each tile has its own cache, a tile can also reach the caches of the other tiles, depending on how the chip is programmed.

Individual tiles consume a low 170 milliwatts to 300 milliwatts on average. Cores can also power down independently when not in use to cut power consumption.
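A quick back-of-the-envelope calculation shows what those per-tile figures imply for a 64-tile chip. The sketch below assumes every tile is active at once and counts only the tiles themselves; memory controllers and other on-chip circuitry are not covered by the quoted numbers, so this is not a figure for the whole chip. The function name total_watts is hypothetical.

```python
TILES = 64                  # tile count of the first product, per the article
LOW_MW, HIGH_MW = 170, 300  # average per-tile power range, in milliwatts


def total_watts(tiles: int, milliwatts_per_tile: float) -> float:
    """Combined power of all tiles, in watts, assuming every tile is active."""
    return tiles * milliwatts_per_tile / 1000


low = total_watts(TILES, LOW_MW)
high = total_watts(TILES, HIGH_MW)
print(f"{low:.1f} W to {high:.1f} W")  # 10.9 W to 19.2 W
```

On those assumptions, the 64 tiles together would draw roughly 11 to 19 watts, and less whenever idle cores power down.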

The size of the chip, and its ultimate performance, depend on how many tiles are included. The first product will contain 64 tiles and a 5MB distributed cache. Next year, the company says it will come out with a less expensive 36-tile version and then a 120-tile version close to, or in, 2009. Tiles on a single chip can be grouped into virtual processors assigned to different computing tasks.
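One way to picture the grouping of tiles into virtual processors is as a partition of the tile grid into blocks, each handed to a different job. The sketch below is purely illustrative: the 8-by-8 layout, the rectangular shape of the groups and the task names are all assumptions, and the article does not describe how Tilera's software actually assigns tiles.

```python
from typing import Dict, List, Tuple

Tile = Tuple[int, int]  # (row, column)


def rectangle(top: int, left: int, rows: int, cols: int) -> List[Tile]:
    """List the tiles in a rows-by-cols block whose top-left corner is (top, left)."""
    return [(r, c)
            for r in range(top, top + rows)
            for c in range(left, left + cols)]


# Hypothetical task names on an assumed 8-by-8 layout of the 64 tiles.
groups: Dict[str, List[Tile]] = {
    "packet_inspection": rectangle(0, 0, 8, 4),  # left half: 32 tiles
    "video_streaming": rectangle(0, 4, 4, 4),    # upper right: 16 tiles
    "control_plane": rectangle(4, 4, 4, 4),      # lower right: 16 tiles
}

# Check that every tile belongs to exactly one group.
assigned = [tile for tiles in groups.values() for tile in tiles]
assert len(assigned) == len(set(assigned)) == 64

for task, tiles in groups.items():
    print(task, len(tiles))
```

A heavier workload simply gets a larger block of tiles, and the blocks can be redrawn as the mix of tasks changes.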

Performance gains over conventional chips arise directly out of Tile64's design. A distributed network of slower processors can get jobs done quicker and with less overall energy than two or four larger, faster, more complex cores. Rather than powering a large bus, the chip can rely on shorter connections.

The Tile64 runs Linux and can be optimized for different applications.

Who needs this sort of computing power? Firewalls, Agarwal said. The avalanche of spam has created a market for networking devices that can more accurately and thoroughly examine data packets and toss out the unwanted ones. Video-on-demand systems, high-definition video, security systems and video conferencing are also growing and will require faster systems.

Ultimately, these kinds of computing tasks are also going to downgrade the role of processing cores in the computing world.

"The processor is becoming more and more anonymous, and the system is becoming more and more important," Agarwal said. "The processor is the new transistor."

Who is this guy?
Agarwal has been a fixture in high-end chip design for years. While a professor at Stanford in the early 1980s, he worked on the design of the MIPS chips, which helped Silicon Graphics achieve its gains back then. (Stanford University President John Hennessy was the leader of that project.) In 1991, Agarwal was a co-author on a paper presented at Hot Chips on Sparcle, a Sun processor that touted multithreading.

In 1996, he started to work on integrating mesh networking into chips. Tilera was founded in 2004.

So far, the company has raised $40 million from Bessemer Partners, Walden International and VTA, the venture capital arm of Taiwan Semiconductor Manufacturing Co. TSMC also will manufacture the chip.

The architecture behind the Tile64, however, may only be adequate for cutting-edge chips for a decade or so, Agarwal theorized. His lab at MIT--along with labs at Intel, Luxtera and a number of other companies--is already examining ways to replace the metal connections between chip cores with faster, cooler optical fibers.

Shrinking optical components so that they can connect chip cores will take time. The technology will likely be used first to connect boards and components. Still, the mushrooming growth in cores may demand it.

MIT's research into inter-core optical connections "may see the light of day in 12 years--maybe 2016, 2017," Agarwal said. "When we want to go to 4,000 or 5,000 cores, we may need other technologies."