Sun finds glitch in new UltraSparc III chip

The server giant discovers a problem with computers that use its new UltraSparc III processor and says it's working on a software fix.

Stephen Shankland
Sun Microsystems has found a problem with its new UltraSparc III processor, and fixing it will cause system performance to drop by about 5 percent for some customers, the company said Wednesday.

The problem affects only initial models of the Sun Blade 1000 workstation, one of the first computers based on the critical new design from the Palo Alto, Calif.-based company, said Fred Kohout, director of Sun's technical markets products group.

Though the problem can be fixed by running a software patch that disables a feature of the chip, the repair reduces performance and a true fix won't come until a future revision to the chip, he said.

The UltraSparc III, the company's first chip redesign in years, is the centerpiece of Sun's effort to revamp its entire server line. The problem is a black eye for Sun, but because the fix is relatively simple, Sun is likely to sustain more damage to its reputation than to its finances.

The problem causes the computer to perform incorrect mathematical calculations when the chip accesses data in a certain pattern, Kohout said.

"We characterize this as highly unlikely, occurring only in a rare set of circumstances, but we felt the right thing to do was to put patch in place that would eliminate any possibility of this occurring at all," Kohout said.

"I don't think it looks good for them, obviously, but I don't think this is going to have too much bearing on new product sales" because Sun appears to have the problem in hand, said ARS Market Intelligence analyst Steve Greenberg.

But Sun already has had some problems with its new UltraSparc III chip, which is critical to its effort to keep IBM, Hewlett-Packard and Microsoft at bay. For one thing, Sun couldn't meet initial demand for UltraSparc III computers. For another, manufacturing problems have prevented the company from switching as fast as planned from 750MHz chips to 900MHz models using copper circuitry.

In addition, the UltraSparc III chip and systems using it are late. The chip originally was expected to debut at the end of 1999 but didn't emerge until late last year. And servers using the chip are arriving months later than Sun hoped.

The problem affects the part of the chip that controls the "prefetch pipeline," which essentially tries to predict what information the chip will need, Kohout said. The fix shuts off the feature, which is new with the UltraSparc III.

Insight 64 analyst Nathan Brookwood said that part of the chip is in charge of anticipating what information might be needed and fetching it from memory so the chip doesn't have to wait when it needs the information.
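
The general idea Brookwood describes can be shown with a toy model. The sketch below is not Sun's design: the function name `count_stalls`, the "next-line" prefetch policy and the 10-cycle miss penalty are all assumptions made for illustration, and the numbers say nothing about the UltraSparc III itself.

```python
def count_stalls(lines, prefetch_enabled, miss_penalty=10):
    """Count stall cycles for a sequence of memory-line accesses.

    Toy model only (hypothetical, not the UltraSparc III pipeline):
    with prefetching on, touching line n also fetches line n + 1 in
    the background, so a later access to n + 1 does not stall.
    """
    ready = set()  # memory lines already fetched
    stalls = 0
    for line in lines:
        if line not in ready:
            # The chip has to wait for the data to arrive from memory.
            stalls += miss_penalty
            ready.add(line)
        if prefetch_enabled:
            # Guess that the next line will be needed soon.
            ready.add(line + 1)
    return stalls


# A program that walks through memory in order: 100 consecutive lines.
sequential = list(range(100))
with_prefetch = count_stalls(sequential, prefetch_enabled=True)
without_prefetch = count_stalls(sequential, prefetch_enabled=False)
```

In this artificial, perfectly sequential case the stall count falls from 1,000 cycles to 10 when prefetching is on. Real programs access memory far less predictably, so the gain, and the loss when the feature is switched off, is much smaller. Setting `prefetch_enabled` to `False` stands in for what the patch does.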

"If you turn that off, you clearly are going to lose some potential performance. How much is really hard to say," Brookwood said.

Sun's problem bears some similarities to Intel's embarrassing experience with its first Pentium chips, which were afflicted with the "FDIV" glitch that caused some mathematical calculations to go awry. Though Intel argued few customers would suffer from the problem, public outcry forced the company to recall the chips to try to prevent further damage to its reputation.

However, Sun caught its problem earlier than Intel did. Intel had no software fix for the FDIV problem and spent $500 million to replace all the defective chips it had shipped.

Sun has been steadily winning an ever-larger share of the server market, which IDC said accounted for $60 billion in sales in 2000.

The problem doesn't affect the Sun Fire 280R server, introduced at the same time as the workstation, or the later Sun Fire 3800, 4800 and 6800 models introduced in March, Sun spokeswoman Kasey Holman said.

Sun discovered the problem itself, Holman said. That contrasts with a glitch involving high-speed cache memory that afflicted high-end UltraSparc II-based systems. With that problem, which caused computers to unexpectedly reboot, Sun customers brought the problem to Sun's attention.

Though Sun knows exactly which customers were affected, the company refused to say how many computers have the problem. No systems shipped after March 14 were affected, Kohout said.

Sun began notifying customers three weeks ago, though the company found the problem considerably earlier than that, soon enough that Sun could fix the problem before any customers with the 280R server were affected.

The computing jobs typically handled by servers don't involve as much mathematical calculation as workstation tasks do, so server performance won't be affected as much.

But workstation users are in a different category. "The majority of applications, on average, are impacted by less than 5 percent," Kohout said.

Sun plans to republish lowered performance benchmarks, Kohout said.