SAN FRANCISCO--More than a decade ago, Intel ran into an issue trying to deliver what was to be the world's top-ranked supercomputer: it looked possible that its new Pentium Pro processors at the heart of the system might not arrive in time.
As a result, the chipmaker made an unusual move by paying Hewlett-Packard $100,000 to evaluate building the system with HP's PA-RISC processors instead, said Paul Prince, now Dell's chief technology officer for enterprise products but then Intel's system architect for the supercomputer. Called ASCI Red and housed at Sandia National Laboratories, it was designed to be the first supercomputer to cross the threshold of a trillion math calculations per second.
Intel ultimately met that 1-teraflops performance deadline using the Intel chips, HP dropped its PA-RISC line in favor of Intel's Itanium processor line, and the Pentium Pro paved the way for Intel's present powerhouse status in the server market. But the supercomputing division within Intel was phased out, and ASCI Red was its last job, Prince said in an interview here on the eve of the Intel Developer Forum.
The division had enough independence that it could have used another company's chips, but doubtless eyebrows would have been raised had a rival processor design shown up in such a high-profile machine that ultimately used more than 9,000 processors.
It wasn't the only hurdle the Intel group overcame in the design and construction of ASCI Red, which used ordinary processors but plenty of one-off technology including a customized operating system and Intel's own router chips to send data through the system.
The first version of the router chip had a data integrity problem, and Intel didn't have time to fully validate a fixed version even though the engineers knew what caused the problem, Prince said. However, in a presentation titled "Statistics for the Common Man," Prince convinced Intel management that a variety of worst-case scenario tests could reduce the validation time from more than a dozen weeks to about four to six weeks. He prevailed.
"It worked, and they didn't fire me," Prince said. ASCI Red, developed for the Energy Department's Accelerated Strategic Computing Initiative to simulate nuclear weapons physics in a computer rather than with real-world tests, led the Top500 list of supercomputers from June 1997 until November 2000, when IBM's ASCI White took the top spot.
Meanwhile, in today's world
Naturally, Prince is now focused on the best directions for getting Dell servers, storage, and networking gear into customers' hands. And though he's comfortable with nitty-gritty chip details, he said customers these days are gravitating toward higher-level discussions.
"At this point nobody's keeping up with the gigahertz rating of chips," he said, no doubt to the delight of Intel and AMD, who ran into physical limits on clock speed and focused their attention on multiple processing cores and getting more work done in each tick of a chip's clock.
Instead, he said, customers are asking, "How does this fit into my virtual environment? What's my management look like?" Thus, Dell is leading a lot of marketing with virtualization, which lets a single physical computer house many independent operating systems called virtual machines. Dell had expected Microsoft and various Linux players to challenge virtualization expert and EMC subsidiary VMware, but VMware has withstood the competition so far, he said.
Dell itself has about 6,000 VMware-hosted virtual machines running on about 620 physical machines in its own computing infrastructure, but those hosts are only a small fraction of the company's 12,000 physical servers. Some physical machines house as many as 20 virtual machines, but for business-critical tasks, Dell puts 10 virtual machines on a physical server, Prince said.
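For readers who want to check the arithmetic behind those figures, a short Python sketch follows. It uses only the approximate numbers Prince quoted; the variable names are illustrative and not anything Dell uses.

```python
# Approximate figures quoted in the article.
virtual_machines = 6_000          # VMware-hosted virtual machines
vm_hosts = 620                    # physical machines running those VMs
physical_servers_total = 12_000   # all physical servers at the company

# Average consolidation ratio across the virtualized hosts.
avg_vms_per_host = virtual_machines / vm_hosts

# Share of the physical fleet that is hosting virtual machines.
host_share = vm_hosts / physical_servers_total

print(f"Average VMs per host: {avg_vms_per_host:.1f}")
print(f"Virtualization hosts as share of all servers: {host_share:.1%}")
```

The average works out to a little under 10 virtual machines per host, in line with the 10-per-server figure Prince gave for business-critical tasks, and the hosts make up roughly 5 percent of the physical fleet.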
In Dell's analysis, using virtual machines saved $60 million in capital equipment expenses, he said. But virtualization poses problems, too--the virtual equivalent of server sprawl, in which new servers are added to a company's infrastructure faster than administrators can keep up.
"You can deploy new servers in hours instead of weeks. The downside is you crank 'em out, so you have this proliferation of resources," Prince said, and virtual machines don't come with handy tracking technology. "The reason it's hard to get rid of them is it's hard to track them. There's no asset tag. There's no depreciation on a virtual server."
Hardware still matters
Though sales have moved to a higher level, hardware details still matter, Prince said. One he's most excited about is solid-state drives, which use flash memory rather than the spinning platters of conventional hard drives.
Many SSDs today directly replace hard drives, using the same size and SATA or SAS communication protocols to connect to a machine in a way that makes them interchangeable with conventional hard drives. But Prince is more interested in a technology that bypasses that older hard drive technology in favor of a more direct connection over a computer's PCI Express subsystem.
Companies including Fusion-io and Texas Memory Systems supply the technology, and Prince is among those in the server realm who like the idea. "You can get a massive performance upgrade in terms of IOPS," or input-output operations per second.
He's also a believer in a technology called wear leveling, which moves data around the physical storage device so that no individual memory cells are overused and worn out prematurely. "The life has to be better than enterprise-class drives," he said.
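The idea behind wear leveling can be illustrated with a toy model in Python. This is a minimal sketch under simplifying assumptions, not how any particular SSD controller works: the class name, the block count, and the "pick the least-erased free block" policy are all hypothetical, and real controllers also deal with garbage collection, static data, and bad blocks.

```python
class WearLeveledDevice:
    """Toy flash device that spreads writes across physical blocks."""

    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks  # wear per physical block
        self.data = [None] * num_blocks       # contents per physical block
        self.mapping = {}                     # logical address -> physical block

    def write(self, logical, value):
        # Release the block that previously held this logical address.
        old = self.mapping.pop(logical, None)
        if old is not None:
            self.data[old] = None

        # Choose the free physical block with the fewest erases so far.
        in_use = set(self.mapping.values())
        free = [b for b in range(len(self.erase_counts)) if b not in in_use]
        if not free:
            raise RuntimeError("no free blocks")
        target = min(free, key=lambda b: self.erase_counts[b])

        # Each write to a block is counted as one erase cycle.
        self.erase_counts[target] += 1
        self.data[target] = value
        self.mapping[logical] = target

    def read(self, logical):
        return self.data[self.mapping[logical]]


# Rewrite the same logical address 100 times on a 4-block device.
dev = WearLeveledDevice(4)
for i in range(100):
    dev.write(0, i)

print(dev.erase_counts)
```

Even though the same logical address is rewritten every time, the wear ends up spread evenly across all four physical blocks instead of landing on one.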
Prince also predicted the eventual triumph of Ethernet over more special-purpose high-speed network fabrics, Fibre Channel and InfiniBand. Fibre Channel will reach 16 gigabits per second but probably won't move beyond 40 gigabits per second, whereas Ethernet is headed for 40 and 100 gigabits per second today, with 400 gigabits and even 1 terabit per second on the horizon, he said.
"Everybody is converging on Ethernet as the high-performance fabric of the future," Prince said.