IBM to research proteins with supercomputer

IBM launches a $100 million research initiative to build a supercomputer that can help researchers understand how proteins develop, which could lead to a better understanding of diseases and uncover possible cures.

IBM today launched a $100 million research initiative to build a supercomputer that can help researchers understand how proteins are created, knowledge that could lead to a better understanding of diseases and uncover possible cures.

The 50-person initiative, dubbed Blue Gene, will culminate in a computer that can perform a quadrillion calculations per second, about a thousand times faster than the company's Deep Blue machine that beat world chess champion Garry Kasparov two years ago and about 500 times faster than IBM's Blue Pacific nuclear weapons computer at Lawrence Livermore National Laboratory, IBM said.

The Blue Gene program is the successor to Deep Blue: an initiative meant to push IBM researchers to develop more quickly the software and hardware that ultimately will be useful for ordinary businesses as well as scientific researchers needing the most powerful machines, Paul Horn, senior vice president of IBM Research, said at a news conference today.

But IBM will have to wrangle with others. Sun Microsystems, for example, has its eye on the supercomputer market, a new area for the company. Sun is set to announce that the Naval Oceanographic Office, a customer of Cray Research supercomputers from SGI, has bought three top-end E10000 computers from Sun. The E10000 machines--which Sun originally acquired from Cray--have given Sun a significant presence on a list of the top 500 computers.

Blue Gene will take on the problem of protein folding, the biochemical process by which complex molecules are constructed by instructions carried in DNA. As proteins are assembled from components called amino acids, the long strand of molecules twists and folds into a three-dimensional bundle, leaving some "active" sites protruding from the protein to react with the environment.

How exactly the protein will fold up is governed by basic rules of how atoms attract and repel each other, Horn said. But the size of proteins, often with thousands of atoms, makes predicting that arrangement a very difficult task. Hemoglobin--the protein in red blood cells that carries oxygen throughout the body--is made of 600 amino acids, for example.

Blue Gene's final product, due in four or five years, will be able to "fold" a protein made of 300 amino acids, Horn said. But that job will take an entire year of full-time computing. In the meantime, IBM will produce lesser computers and tackle simpler proteins, including some whose structure already is known, Horn said.
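To give a sense of scale, the year-long folding job can be turned into a rough operation count. The sketch below assumes the machine sustains its full petaflop rate for a 365-day year, which the article does not state; the names are illustrative only.

```python
# Rough operation count for the year-long folding run described above.
# Assumption (not from IBM): the machine sustains its full target rate
# of one petaflop for every second of a 365-day year.

PETAFLOP = 10**15                     # operations per second (IBM's stated goal)
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def operations_in_one_year(rate_per_second: int = PETAFLOP) -> int:
    """Total operations performed in one year at the given rate."""
    return rate_per_second * SECONDS_PER_YEAR


if __name__ == "__main__":
    total = operations_in_one_year()
    print(f"Operations in one year at a petaflop: {total:.3e}")
```

Under those assumptions, folding a single 300-amino-acid protein would take on the order of 3 x 10^22 operations.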

The research will speed up the design of new drugs--theoretically even drugs customized to individuals.

The computer will use a new architecture that has more than a million CPUs connected in ever-larger bunches, said Ambuj Goyal, vice president of computer science at IBM Research.

The chip itself will extend an IBM design philosophy that will emerge in coming years with IBM's Power4 processor. That processor will package two CPUs on a single chip, IBM has said.

Blue Gene will use 32 CPUs in a single chip, Goyal said. But in a new twist, these chips will contain the computer memory as well, which in today's computers is completely separate from the CPUs. "We'll put memory together with processors, many packaged on the same chip. We'll get more mathematical calculations out of a chip than is traditionally possible," he said.

A total of 64 of those 32-CPU chips will be packaged in a computing node; then eight nodes will be stacked in each rack. Building 64 of these racks will get IBM to its goal of a petaflop--a quadrillion "floating-point" mathematical operations per second.
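The packaging figures above can be checked with a few lines of arithmetic. The sketch below multiplies out the quoted counts and, assuming the petaflop target is spread evenly across every CPU (an assumption, not something IBM stated here), estimates the rate each CPU would need to deliver.

```python
# Multiply out the packaging figures quoted in the article.
CPUS_PER_CHIP = 32
CHIPS_PER_NODE = 64
NODES_PER_RACK = 8
RACKS = 64
TARGET_FLOPS = 10**15  # one petaflop


def total_cpus() -> int:
    """Number of CPUs in the full 64-rack machine."""
    return CPUS_PER_CHIP * CHIPS_PER_NODE * NODES_PER_RACK * RACKS


def flops_per_cpu() -> float:
    """Per-CPU rate needed to reach the target.

    Assumption: the work is divided evenly across all CPUs.
    """
    return TARGET_FLOPS / total_cpus()


if __name__ == "__main__":
    print(f"Total CPUs: {total_cpus():,}")
    print(f"Required rate per CPU: {flops_per_cpu() / 1e9:.2f} gigaflops")
```

The product comes to 1,048,576 CPUs, in line with the "more than a million" figure, and each CPU would need to deliver a little under one gigaflop.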

The system will inherit some "self-healing" capability that IBM uses in its S/390 mainframe line, he added. That feature lets a computer automatically detect and shut down faulty processors. This overall approach is called SMASH, which stands for "simple, many and self-healing."

The chips will understand a new language, or "instruction set," that's pared down to be as simple as possible. It's a step beyond the simplification that took place when the industry moved from complex instruction set computer (CISC) designs to reduced instruction set computers (RISC), Goyal said.

IBM expects pharmaceutical companies and others to begin helping with the research, Horn said.

The computer itself will be housed at IBM's Watson research center in Yorktown Heights, New York.
