Experts work to aid compiler behind open source

GCC is used to produce most programs in the open-source movement, so a little improvement or degradation can propagate widely.

Programmers are working to debug and speed up the newly released GCC 4.0, the compiler at the foundation of the open-source and free-software movements.

Lead programmer Mark Mitchell released GCC 4.0 on April 22. It includes a new optimization framework designed to improve the process of translating source code written by humans into binary code a computer understands.

The new version is still very much a work in progress, though, and it will take time for a clear performance advantage to emerge, Mitchell said in an interview. "It's got all this new optimization infrastructure. All that new infrastructure hasn't been as carefully tuned as much as the old one was," Mitchell said.

One of the first rocky moments of the GCC 4.0 debut came with KDE, the graphical interface software widely used on Linux computers. The package wouldn't compile with GCC 4.0, and KDE organizers blacklisted GCC 4.0 for the time being.

The bug that hampered KDE has now been fixed, and the fix should be available soon, Mitchell said. "We'll probably do a 4.0.1 refresh release earlier than planned," within a month rather than two months as originally forecast, he said.

GCC is used to produce almost all programs in the free and open-source software movements, so a little improvement or degradation in the compiler can propagate to thousands of projects.

Another rocky patch for GCC 4.0 was a review published this week by programmer and author Scott Ladd. He compared GCC 4.0 to its predecessor, GCC 3.4.3, and found that the new version often took longer to produce code and that the code was bulkier and ran more slowly.

"Is GCC 4.0 better than its predecessors? In terms of raw numbers, the answer is a definite 'no,'" Ladd said after testing the compilers on four software packages. But, he cautioned, "No one should expect a 'point-oh-point-oh' release to deliver the full potential of a product."

Mitchell was more neutral about performance. "As far as I can tell, it's roughly a wash, and I think that's a tremendous achievement," he said. "To rewrite your optimizer top to bottom and not make things substantially worse is a real achievement."

There are other issues besides performance, Mitchell added. GCC 4.0 fixes hundreds of bugs, can produce software for processors that were previously unsupported, and can compile software written in the Fortran 95 programming language, widely used in scientific circles.

Making better use of registers
One improvement Mitchell hopes to introduce in GCC 4.2 will change how software stores data in precious processor resources called registers. Widely used 32-bit x86 chips such as Intel's Pentium have a scarce supply of registers, but newer 64-bit versions double the number, making x86 chips more like other processor families such as IBM's Power or Intel's Itanium.

But choosing which data to store in registers is a complicated task that only gets more complicated when there are more registers to choose from.

On 32-bit x86 chips, the small number of registers means their contents must constantly be swapped with data stored in a computer's slower but larger main memory. But as the number of registers increases, software can run faster if it keeps the right data in the registers, Mitchell said.

"Your program decides which data gets put in at what point. If it doesn't put the right data in, it has to move data back and forth between memory more than it should," Mitchell said. "We call that 'spills and fills,' and it can really interfere with performance."

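To make the "spills and fills" idea concrete, here is a minimal toy model in Python. It is only a sketch and does not reflect how GCC actually allocates registers: the function name `count_memory_traffic`, the 12-variable workload, and the least-recently-used eviction rule are all assumptions made for illustration. The register counts of 8 and 16 loosely mirror the general-purpose registers on 32-bit and 64-bit x86 chips.

```python
from collections import OrderedDict


def count_memory_traffic(accesses, num_registers):
    """Toy model (hypothetical, not GCC's algorithm).

    Counts fills (values loaded from memory into a register) and
    spills (values pushed out of a register to make room), evicting
    the least recently used value when the registers are full.
    """
    registers = OrderedDict()  # keys ordered from least to most recently used
    fills = 0
    spills = 0
    for var in accesses:
        if var in registers:
            # Already in a register: no memory traffic, just mark as recent.
            registers.move_to_end(var)
            continue
        fills += 1
        if len(registers) >= num_registers:
            # No free register: evict the least recently used value.
            registers.popitem(last=False)
            spills += 1
        registers[var] = None
    return fills, spills


# Assumed workload: a loop that touches 12 variables in turn, 100 times.
workload = [f"v{i}" for i in range(12)] * 100

for k in (8, 16):
    fills, spills = count_memory_traffic(workload, k)
    print(f"{k} registers: {fills} fills, {spills} spills")
```

In this toy workload, 8 registers produce 1,200 fills and 1,192 spills, because every access misses, while 16 registers produce only the 12 initial fills and no spills. A real compiler makes these choices ahead of time from the program's structure rather than by watching accesses at run time, so the model shows only why having the right data in registers matters, not how the choice is made.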