AMD's 3DNow technology was designed to improve the 3D graphics of games. But the feature can also be used to speed up mathematical calculations, said Hank Dietz, a professor at the University of Kentucky and the architect of a new 64-processor Linux supercomputer built out of 700-MHz AMD Athlon microprocessors.
The use of AMD chips is unusual for such systems. Athlons, and AMD processors in general, are found almost exclusively in single-processor consumer computers. The company plans one of its first forays into the business market later this year, and the university's project could conceivably smooth that move by effectively serving as a lab rat.
"Because of the 3DNow support, we've been able to get an awful lot more performance out of them than we're able to get with the current Intel line," Dietz said. AMD donated the chips to the university.
The machine, called the Kentucky Linux Athlon Testbed 2 (KLAT2), is a "Beowulf" computer, a collection of smaller computers networked together to throw their collective might at a single computational task. Each of the nodes of such machines typically runs the Linux operating system, a clone of Unix that's popular in academia because it can be tweaked as much as a researcher wants.
KLAT2 isn't very powerful compared with some Beowulf systems. But it's only a step on a path to greater glory for AMD and the University of Kentucky's computing program. Within the next year, Dietz expects his university will have made a supercomputer with at least 1,000 processors and perhaps as many as 4,000.
High-powered supercomputers are used for computationally intense problems that can't be solved with lesser machines. Typical customers include researchers who want to simulate three-dimensional models of nuclear explosions, intelligence agencies that need to decode messages, car manufacturers that need to model car crashes, and companies that need to mine gems of useful information out of mountains of data.
Beowulf clusters provide a cheap way to get supercomputer performance without having to pay supercomputer prices. They aren't good for all types of computing tasks, though, because communication between nodes typically is slower than in more specialized designs.
Long common at labs and universities, Beowulf machines are now attracting the attention of major companies such as Dell, Compaq and IBM and specialized firms such as High Performance Technologies (HPTI) and Atipa.
Both Atipa and HPTI use Compaq's Alpha chip. While performance remains higher with Alpha, Alphas cost much more than Athlons. Compaq hopes Beowulf computers will boost sales of its Alpha chips.
KLAT2 offers good bang for the buck, Dietz said. The machine cost $41,000, not including labor, and can perform 64 billion calculations per second. That makes it faster than the machine ranked 150th on the Top 500 list of the fastest supercomputers.
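The price-performance claim can be checked with simple arithmetic. The sketch below uses only the figures quoted above (cost, speed and the 64-processor count). The variable names are illustrative, and treating each "calculation" as one operation is an assumption, since the article does not define the term.

```python
# Figures quoted in the article; the cost excludes labor.
cost_dollars = 41_000
calcs_per_second = 64e9   # "64 billion calculations per second"
processors = 64

# Assumption: one "calculation" counts as one operation.
dollars_per_billion = cost_dollars / (calcs_per_second / 1e9)
calcs_per_processor = calcs_per_second / processors

print(f"about ${dollars_per_billion:.0f} per billion calculations per second")
print(f"{calcs_per_processor / 1e9:.1f} billion calculations per second per processor")
```

By this reckoning the machine costs a little over $640 for each billion calculations per second, and each 700-MHz Athlon contributes about 1 billion calculations per second.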
To get the machine to work, though, the University of Kentucky researchers had to take advantage of a number of tricks.
First was the use of 3DNow, which actually handles numbers more like the Alpha chip does. Using 3DNow is hard, though, because ordinary programming tools don't know how to exploit it. The software has to be written in a somewhat unusual programming language before it can use the 3DNow capabilities, Dietz said.
Second was the extension of the "lots of cheap parts" Beowulf philosophy to the network infrastructure that ties the nodes together. Usually a few fast, expensive switches connect the nodes, but the University of Kentucky used a collection of nine cheaper switches.
The key to this approach was figuring out how to wire together 64 computers with four network cards each through nine 31-port switches such that any pair of computers was as close as possible--a mathematical tangle the researchers had to use another of the university's computers to solve, Dietz said.
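The numbers leave little slack: 64 computers with four cards each need 256 switch ports, and nine 31-port switches supply 279. Finding a good wiring plan is the hard part, but checking a candidate plan is easy. The Python sketch below is illustrative only and is not the researchers' software. The function name `check_wiring` is hypothetical, and it assumes that "as close as possible" means every pair of computers shares at least one switch.

```python
from collections import Counter
from itertools import combinations

def check_wiring(wiring, nics_per_node, ports_per_switch):
    """Return a list of problems found in a node -> set-of-switches plan."""
    problems = []
    # A node cannot plug into more switches than it has network cards.
    for node, switches in wiring.items():
        if len(switches) > nics_per_node:
            problems.append(f"node {node} needs {len(switches)} cards")
    # A switch cannot serve more nodes than it has ports.
    load = Counter(s for switches in wiring.values() for s in switches)
    for switch, used in load.items():
        if used > ports_per_switch:
            problems.append(f"switch {switch} needs {used} ports")
    # Assumed goal: every pair of nodes shares at least one switch.
    for a, b in combinations(wiring, 2):
        if not set(wiring[a]) & set(wiring[b]):
            problems.append(f"nodes {a} and {b} share no switch")
    return problems

# Toy plan: 6 nodes, 2 cards each, 3 four-port switches named A, B and C.
toy = {
    0: {"A", "B"}, 1: {"A", "B"},
    2: {"A", "C"}, 3: {"A", "C"},
    4: {"B", "C"}, 5: {"B", "C"},
}
print(check_wiring(toy, nics_per_node=2, ports_per_switch=4))  # prints []
```

The toy plan passes because any two of the pairings AB, AC and BC have a switch in common, and each switch ends up with exactly four nodes attached. At KLAT2's scale the number of possible plans is far too large to try by hand, which is consistent with the researchers turning to another computer to search for one.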
While the Athlon chips work well, one disadvantage is that each node can use only one processor. Intel chips can be used in two-processor or four-processor configurations, a design that can circumvent the sometimes laggardly communications between nodes on a Beowulf system.
Multiprocessor AMD machines will arrive eventually, Dietz said. "They'll be coming, but AMD has been slow with the chipset on that," he said.