IBM plans to endow its Power5 and Power6 processors with an ability called "Fast Path" to take over tasks that software currently handles more slowly.
Power5 will be able to take over software tasks commonly used in the operating system such as packaging data to be sent to networks, said Ravi Arimilli, an IBM Fellow and the chief technology officer for the Power line of chips. Power6 will extend its reach further, taking over tasks now handled by higher-level software such as IBM or Oracle database software or IBM's WebSphere e-commerce software, he said.
The Power5 and Power6 processors, to be detailed at IBM's analyst conference in Palisades, N.Y. and expected to arrive in 2004 and 2006, are the successors to the highly regarded Power4 chip at the heart of the p690 Unix servers. IBM hopes the chips will help increase its 20.3 percent share of the Unix server market, second to Sun's 35.2 percent.
The technique is sound and likely to be followed by competitors, experts said.
"Power4 is the most sophisticated server processor on the market. It looks like they're simply going to be extending their lead," said Peter Glaskowsky, a senior editor for Microprocessor Report, referring to IBM's technical lead. The application acceleration is an innovative answer to the question of what tasks chips should handle as ever-smaller circuitry gives the chips more processing power.
IBM launched the Power6 team in December, Arimilli said. "Everything is in the concept phase," with the team currently working on the vision of where Power6 and later Power7 can take IBM in the second half of the decade.
Sun, however, called software acceleration unremarkable. "It sounds like what every architect pursues. You try to pick what you want to be really good at and move it into the hardware," said Sue Kunz, director of marketing for Sun's processor group, adding that Sun has had chip-based acceleration of multimedia operations such as media streaming for years.
Brad Day, an analyst with Giga Information Group, however, sees a nearer-term challenge for IBM, since its Power4 processor still has two years to go. This year, IBM will repackage Power4 into "single-chip modules" for lower-end systems with one to four Power4 processors.
"What IBM needs to do is to bring their Power4 architecture down through their product line such that they're competing from a position of lower cost but higher performance," Day said.
But Day is impressed with IBM's ability to have multiple product groups work together--in particular, formerly isolated teams from different server hardware and software groups. "You're starting to see these fruits of integrated product development," he said.
Walking and chewing gum at the same time
Power5, which will be built initially with 130-nanometer (0.13 micron) features, also will feature "simultaneous multithreading," a feature that allows a single chip to act as two. Intel's "hyperthreading" version of this technology adds a modest performance increase--roughly 20 percent or so, depending on what program the chip is running--but Arimilli said Power5's multithreading will allow a single processor to behave like two processors running full throttle.
Here, IBM's lead isn't as strong. Not only does Intel have some multithreading abilities today in its Xeon server chip line, but it inherited the team designing the now-cancelled EV8 processor that had very sophisticated multithreading.
"That's likely to show up in," Glaskowsky said, and there are some comparatively simple ways Intel can speed multithreading in its Pentium and Xeon lines.
Power4 already has two CPUs (central processing units) on each slice of silicon, with four Power4 processors mounted into a large package called a multichip module with thousands of high-speed wires. With Power4, each module has eight CPUs, but the arrival of simultaneous multithreading will increase that to 16, Arimilli said.
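The module arithmetic above can be sketched in a few lines of Python. This is only an illustration of the counts quoted in the article; it assumes a Power5 module keeps the same layout of two CPUs per chip and four chips per module, which the article implies but does not state outright. The function and constant names are made up for the example.

```python
# Illustrative arithmetic only, using the figures quoted in the article.
CORES_PER_CHIP = 2        # Power4: two CPUs on each slice of silicon
CHIPS_PER_MODULE = 4      # four chips in one multichip module
SMT_THREADS_PER_CORE = 2  # simultaneous multithreading: each CPU acts as two


def logical_cpus(cores_per_chip, chips_per_module, threads_per_core=1):
    """Return how many CPUs the operating system would see per module."""
    return cores_per_chip * chips_per_module * threads_per_core


# Power4: no simultaneous multithreading.
power4 = logical_cpus(CORES_PER_CHIP, CHIPS_PER_MODULE)

# Power5 (assumed same module layout), with each CPU acting as two.
power5 = logical_cpus(CORES_PER_CHIP, CHIPS_PER_MODULE, SMT_THREADS_PER_CORE)

print(power4, power5)  # 8 16
```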
Power5 won't be much larger than Power4 in terms of transistor count, Arimilli said. By minimizing the increases in circuitry, "We're trying to drive the cost way down," he added.
IBM gets the simultaneous multithreading abilities not through new circuitry but through a different use of existing "execution units," the part of the chip responsible for digesting and executing instructions.
"We didn't grow more units, we just used the existing units more intelligently," Arimilli said. The new chip also has faster communications channels feeding it, so it isn't starved of data, as well as better sharing of data in high-speed "cache" memory.
IBM plans several other features in the chip as well:
The system will come with added circuitry not only to detect when errors have occurred in transmitting data but also to fix those errors, a feature that historically has been reserved for mainframes. It's part of IBM's initiative to make servers self-healing.
"With Power4, we detected a lot of errors and recovered on a significant amount of them. With Power5, we detect errors and recover from almost every one. We're now maybe 95 or 97 percent of a mainframe" in terms of chip technology, Arimilli said.
Where Power4 was intended for high-end Unix servers, Power5 has a broader mandate, Arimilli said. IBM plans to use it in "blade" servers as well, super-thin servers stacked densely like books in a bookshelf.
Glaskowsky said IBM will have to curtail the sizable power consumption and resulting waste heat of Power4 to achieve this target. Power4 consumes 125 watts of power, but a blade processor is constrained to about 25 to 40 watts.
"Partitioning," the ability to split a single big server into several smaller ones, will improve. Power4 permits a partition that's the size of a single processor, but Power5 will allow hundreds of partitions, Arimilli said.
That hardware move will dovetail with coming versions of AIX--5.2 in late 2002 and 5.3 in 2003--that increasingly will let hardware resources be easily reassigned to different partitions, Day said.
But the Fast Path acceleration feature wins the spotlight, Glaskowsky said.
"We've heard nothing from Intel [about] application-specific acceleration features in Itanium, and we can see out into the 2004, 2005 timeframe," Glaskowsky said, adding that Sun can't afford to spend as much on chip design as IBM and Intel and that SGI and Hewlett-Packard Unix chips are eventually being phased out.
Arimilli said CPUs tend to spend a large fraction of their time executing a relatively small number of software tasks. It's these tasks the Fast Path acceleration features offload. IBM selected only mature software processes that don't change often so it's not a problem when the operation is hardwired into immutable silicon.
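One way to reason about this kind of offload is Amdahl's law, which is a framing brought in here for illustration and not something IBM cited. If a fraction of CPU time goes to the offloaded tasks and hardware runs those tasks some factor faster, the overall gain is capped by the time spent everywhere else. The numbers in the sketch below are hypothetical and are not IBM figures.

```python
def overall_speedup(offloaded_fraction, hardware_factor):
    """Amdahl's law: overall speedup when only part of the work gets faster.

    offloaded_fraction: share of CPU time spent in the accelerated tasks (0..1).
    hardware_factor: how many times faster hardware runs those tasks (> 0).
    """
    if not 0.0 <= offloaded_fraction <= 1.0:
        raise ValueError("offloaded_fraction must be between 0 and 1")
    if hardware_factor <= 0:
        raise ValueError("hardware_factor must be positive")
    remaining = 1.0 - offloaded_fraction             # time that is not accelerated
    accelerated = offloaded_fraction / hardware_factor  # time left in offloaded tasks
    return 1.0 / (remaining + accelerated)


# Hypothetical example: 40 percent of time in offloaded tasks, run 4x faster.
print(round(overall_speedup(0.4, 4.0), 2))  # 1.43
```

The sketch shows why picking frequently called tasks matters: accelerating work that accounts for little CPU time yields almost no overall gain, however fast the hardware path is.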
"We reached around and tried to find some common things customers do within the operating system that get called frequently," Arimilli said of the feature.
The acceleration feature will speed up several communication tasks, including the TCP/IP processing used to read and write data on the Internet and corporate networks.
Accelerating TCP/IP makes sense, Glaskowsky said; the software for running a single network connection with a 1 gigabit-per-second transfer capacity soaks up the entire processing power of an UltraSparc processor in a Sun server, he said. However, other chips can handle the task, and indeed companies such as Alacritech and Adaptec are working on special-purpose chips that do so.
Power5 will accelerate other communications processes as well, including the Message Passing Interface (MPI) used to harness clusters of computers into a collective supercomputer, Arimilli said. And the chip will accelerate the virtual memory subsystem, a frequently used operating system feature that manages how higher-speed regular memory can be expanded by using slower but bigger hard drives.
Sun cautions that there can be problems accelerating software functions. "If all you're going to do is a custom operation, then building a custom chip makes sense. But computers are still in the general-purpose range," Kunz said.
And designing the processor poorly--for example, hardwiring specific software operations that aren't used frequently--wastes silicon real estate, making the chip more costly and power-hungry.
Glaskowsky said a good chip design could intercept requests to the operating system to handle jobs the chip itself can perform faster, but that automation would be difficult for higher-level software.
Arimilli said Power5 and Power6 will be faster for any software, but the acceleration features will require support from software makers. Such support isn't too difficult, Glaskowsky said.
Initially, IBM's version of Unix, AIX, will be able to take advantage of the new chip features, Arimilli said. The company also is working with Linux programmers so that Unix variant also can tap into the chip's acceleration resources, he said.
"We created these interfaces to the silicon accelerators open so the Linux guys could take advantage of it," Arimilli said.