In "core hopping," for example, a stream of calculations will jump from one microprocessor core to another. Localized heat generated by transistors during intense number-crunching can create "hot spots" that place a ceiling on performance, said Wilf Pinfold, technical director of microprocessor research at Intel labs. By rotating application processing, key transistors will stay cooler, heat will be dispersed over a greater geographic area and overall performance will climb.
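The core-hopping idea described above can be sketched in a toy simulation: work migrates to the coolest core whenever the active core's temperature crosses a limit, so no single spot stays hot. All names, thresholds and cooling rates here are illustrative assumptions, not Intel's actual design.

```python
# Toy model of "core hopping": the active core heats up as it works,
# idle cores cool toward ambient, and the work hops to the coolest
# core whenever the active one gets too hot. Numbers are arbitrary.

HOT_THRESHOLD = 80.0   # hypothetical junction-temperature limit (degrees C)
AMBIENT = 40.0

def run_with_core_hopping(work_units, num_cores=2):
    temps = [AMBIENT] * num_cores
    active = 0
    hops = 0
    for _ in range(work_units):
        temps[active] += 5.0              # active core heats up
        for i in range(num_cores):        # idle cores cool off
            if i != active:
                temps[i] = max(AMBIENT, temps[i] - 3.0)
        if temps[active] >= HOT_THRESHOLD:
            active = temps.index(min(temps))  # hop to the coolest core
            hops += 1
    return hops, temps

hops, temps = run_with_core_hopping(100)
```

With two cores the work bounces back and forth like the "hot potato" described later in the article, and neither core stays far above the threshold for long.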
Other areas of research involve specialization, in which small chips-within-a-chip perform specific jobs such as encryption or compression, improving the way processing priorities are set.
The changes are important in that, without a microprocessor overhaul, the industry's ability to continually make computing faster, cheaper and more powerful will begin to slow.
"What we are really doing is moving beyond simply using the (manufacturing) process improvements and known architectural techniques," Pinfold said. "We're heading into a newer area of architectural design. We've run out of ideas and are now forging ahead with completely new ideas."
Other companies are working on the dual-core theme as well. IBM last year released the Power 4, the first dual-core chip for servers, and it's planning to expand the concept in future chips. Sun Microsystems is also working on a dual-core UltraSparc, according to sources.
Meanwhile, vacant space on the memory controller inside Advanced Micro Devices' Hammer chip indicates the chip could be transformed into a dual-core processor, said Kevin Krewell, an analyst at Microprocessor Report, an industry newsletter.
Intel's research direction, though, sheds light on the diverse evolutionary path for multicore chips, and highlights the importance of controlling power consumption in chips, which will contain 1 billion transistors by 2005.
"What do you do with a 1 billion transistor budget? That is the bleeding edge of research," Krewell said. "When it comes to power, that is the No. 1 concern. People in servers don't want to go back to water cooling yet." An approach like core hopping, which Krewell likened to a game of "hot potato," could help alleviate the problem.
The creative leap that lies ahead for microprocessor design could be termed Moore's Sunburn. Microprocessor performance has steadily increased over the past 30 years through the action of Moore's Law, which holds that the number of transistors on a given chip doubles roughly every two years.
Transistor budgets are growing so quickly, though, that in the next decade or so microprocessors could begin giving off extraordinary amounts of heat for their size.
To defuse the conflict, engineers are concentrating more on efficiency than on simply pumping up performance. In other words, they're trying to figure out ways to better exploit the computing power of a chip and the growing transistor budget, rather than gunning the clock speed, juicing the voltage or throwing even more transistors into a device to make it more powerful.
"You can very efficiently use the expensive resources on a processor," Pinfold said. "There has been a progressive change in our thinking. In the early '90s we were still in the mode of 'performance is everything.' Now we start by defining a power envelope."
Two basic approaches
In general, there are two basic approaches to building multicore processors. Symmetric multiprocessing chips, such as IBM's Power 4 and presumably chips with core hopping, essentially squeeze two equal processors into a single piece of silicon, so that the chip provides the same computing power as a dual-processor server.
The approach saves on computing real estate and can increase efficiency because the chip cores can share cache memory or buses.
In asymmetric multiprocessing, the two internal chip cores differ from each other and perform specific functions, offloading work from the central processor. Additionally, you could get "little co-processors that do various tasks now handled by software," said Krewell, jobs such as TCP/IP processing or encryption.
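The asymmetric idea amounts to a dispatch problem: route each job to a specialized unit when one exists, and fall back to the general-purpose core otherwise. The sketch below models that in software, with ordinary functions standing in for fixed-function hardware blocks; the task names and handlers are illustrative assumptions.

```python
# Sketch of asymmetric dispatch: named tasks go to specialized
# "co-processor" handlers when available, otherwise to the general
# core. zlib stands in for a hardware compression/checksum engine.

import zlib

def general_core(payload: bytes) -> bytes:
    return payload  # placeholder for general-purpose processing

SPECIALIZED = {
    "compress": zlib.compress,                            # compression engine
    "checksum": lambda b: zlib.crc32(b).to_bytes(4, "big"),  # CRC unit
}

def dispatch(task: str, payload: bytes) -> bytes:
    handler = SPECIALIZED.get(task, general_core)
    return handler(payload)

data = b"hello hello hello hello"
compressed = dispatch("compress", data)
untouched = dispatch("unknown-task", data)
```

The payoff in hardware is the same as in the sketch: the general core never spends cycles on jobs a dedicated unit can do faster.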
Similarly, chip designers could build high-intensity regions into the chip. Intense, high-priority number-crunching calculations could be directed toward certain transistors being supplied with greater amounts of power, Pinfold said. Less significant tasks, meanwhile, could be shunted off to other regions.
The ultimate design and techniques used will depend on whether the chip will go into mobile devices, servers or desktops. Research is being conducted in the company's labs in the United States, as well as in Israel and Spain, owing to the diverse nature of the work.
"The microprocessor of the future will be much more appropriate to its use," Pinfold said. "We will go to where we can find the best architects."
The processor changes will take place in tandem with increased thread-level parallelism. Under thread-level parallelism, software instructions get separated into individual streams. Once broken down, the streams of an application can be processed in parallel, rather than sequentially, thereby saving time.
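Thread-level parallelism as described above can be illustrated with a minimal sketch: one job is split into independent streams that run concurrently rather than one after another, and the combined result matches the sequential answer.

```python
# Minimal thread-level parallelism: split a summation into four
# independent streams and process them concurrently.

from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    return sum(chunk)

numbers = list(range(1000))
chunks = [numbers[i::4] for i in range(4)]   # four independent streams

with ThreadPoolExecutor(max_workers=4) as pool:
    total = sum(pool.map(partial_sum, chunks))
```

A chip with multiple cores, or with hyperthreading, can execute such streams simultaneously; a single in-order core would have to run them back to back.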
Cache misses--when a processor doesn't have the required data in its nearby cache of memory--can hinder computer performance because the processor has to spend cycles digging the data out of main memory. Helper threads could anticipate potential cache misses and retrieve the data before the required calculation, Pinfold said.
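The helper-thread idea can be sketched in software: while the main thread works, a helper thread pre-loads data the main thread will need into a small cache, so the expensive fetch is already done by the time it is requested. Here `slow_fetch` stands in for a trip to main memory; all names and timings are illustrative assumptions.

```python
# Sketch of a prefetching helper thread: the helper warms a small
# "cache" ahead of the main thread, hiding simulated memory latency.

import threading
import time

def slow_fetch(key):
    time.sleep(0.01)          # simulated main-memory latency
    return key * key

cache = {}

def helper_prefetch(keys):
    for k in keys:
        cache[k] = slow_fetch(k)   # warm the cache ahead of use

keys = list(range(20))
helper = threading.Thread(target=helper_prefetch, args=(keys,))
helper.start()

results = []
for k in keys:
    time.sleep(0.01)               # the main thread's own work per item
    # Hit the warm cache when possible; fall back to a slow fetch (a miss).
    results.append(cache[k] if k in cache else slow_fetch(k))
helper.join()
```

Because the helper runs one fetch ahead of the consumer, most lookups hit the warm cache, which is the effect the speculative helper threads described above aim for in hardware.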
Current benchmarks show substantial performance benefits through application threading. Hyperthreading, which takes advantage of threaded applications, was introduced to Intel's Xeon line earlier this year and will soon come to the Pentium line, sources have said.
The increased emphasis on application threading comes at a crucial time. For the past decade or so, designers have squeezed performance out of instruction-level parallelism, which involves juggling the processor's instructions for greater efficiency. But the ceiling is in sight.
"We've pretty much mined that vein as far as performance is concerned," Pinfold said.
Although all of the ideas show promise, it's difficult to predict how they will be embodied in commercial products. Some of these chips will need extraordinary amounts of cache to work properly, which will force designers to balance the performance-power equation, said Nathan Brookwood, principal analyst at Insight 64.
Some processors, such as the Itanium chip, already take advantage of application threading, so they don't need to adopt multicore ideas right away. Still, the mathematics make multicore chips inevitable in many markets.
"Multicore is a very efficient way to use up transistors and increase performance," Brookwood said.