Almost all open-source software is built with GCC, a compiler that converts a program's source code--the commands written by humans in high-level languages such as C--into the binary instructions a computer understands. The forthcoming GCC 4.0 includes a new foundation that will allow that translation to become more sophisticated, said Mark Mitchell, the GCC 4 release manager and "chief sourcerer" of a small company called CodeSourcery.
"The primary purpose of 4.0 was to build an optimization infrastructure that would allow the compiler to generate much better code," Mitchell said.
A new version of the GNU Compiler Collection, a set of tools used to build nearly all open-source software, is set to debut in the next few months.
The new compiler software is likely to create programs that are higher quality and faster, meaning that everything from Apache to Linux and Firefox should get a speed boost.
Compilers are rarely noticed outside the software development community, but GCC carries broad significance. For one thing, an improved GCC could boost performance for the open-source software realm--everything from Linux and Firefox to OpenOffice.org and Apache that collectively compete with proprietary competitors from Microsoft, IBM and others.
For another, GCC is a foundation for an entire philosophy of cooperative software development. It's not too much of a stretch to say GCC is as central an enabler to the free and open-source programming movements as a free press is to democracy.
GCC, which stands for , was one of the original projects in the Gnu's Not Unix effort. Richard Stallman launched GNU and the accompanying Free Software Foundation in the 1980s to create a clone of Unix that's free from proprietary licensing constraints.
The first GCC version was released in 1987, and. A company called , an open-source business pioneer acquired in 1999 by Linux seller Red Hat, funded much of the compiler's development.
But improving GCC isn't a simple matter, said Evans Data analyst Nicholas Petreley. There have been performance improvements that came from moving from GCC 3.3 to 3.4, but at the expense of backwards-compatibility: Some software that compiled fine with 3.3 broke with 3.4, Petreley said.
RedMonk analyst Stephen O'Grady added that updating GCC shouldn't compromise its ability to produce software that works on numerous processor types.
"If they can achieve the very difficult goal of not damaging that cross-platform compatibility and backwards-compatibility, and they can bake in some optimizations that really do speed up performance, the implications will be profound," O'Grady said.
What's coming in 4.0
GCC 4.0 will bring a foundation to which optimizations can be added. Those optimizations can take several forms, but in general, they'll provide ways that the compiler can look at an entire program.
For example, the current version of GCC can optimize small, local parts of a program. But one new optimization, called scalar replacement and aggregates, lets GCC find data structures that span a larger amount of source code. GCC then can break those objects apart so that object components can be stored directly in fast on-chip memory rather than in sluggish main memory.
"Optimization infrastructure is being built to give the compiler the ability to see the big picture," Mitchell said. The framework is called Tree SSA (static single assignment).
However, Mitchell said the optimization framework is only the first step. Next will come writing optimizations that plug into it. "There is not as much use of that infrastructure as there will be over time," Mitchell said.
One optimization that likely will be introduced in GCC 4.1 is called autovectorization, said Richard Henderson, a Red Hat employee and GCC core programmer. That feature economizes processor operations