Open-source developers have long done their work in the public eye. Now, they're doing it under an academic microscope.
Walt Scacchi, a senior research scientist at the University of California at Irvine's Institute for Software Research, has been looking at open-source projects from an analytical perspective, studying the open-source model in an ongoing, 10-year project that draws some comforting conclusions for open-source sponsors and developers.
Scacchi and fellow researchers have found a significant failure rate among open-source projects. But among those that get off the ground, research has shown not only that the open-source approach can yield better software more quickly
and for less money than traditional methods but also that volunteering for an open-source
project can be an effective way to get a job.
Often, Scacchi's work is as much sociological as technical, as he and
colleagues examine phenomena like "community building" and cultural institutions
alongside drier subjects like code and project design.
And academia's work on open source is more than academic. Three projects by
Scacchi and colleagues at UC, Santa Clara University and the University of
Illinois will use the data to design new development tools for big,
Scacchi and colleagues are at work on four different research projects. Their
first National Science Foundation grants came through in the fall of 2000, following a few
years of unfunded research. Current funding will bring the project through 2006, and
Scacchi estimates that it will be at least a 10-year research investment. He spoke to
CNET News.com about his work from his office in Irvine.
What exactly is your research trying to determine?
In general, we're trying to understand how free and open-source software development works
in practice. Are the processes the same that are taught in engineering classes, the
guidelines that we teach in academia? Or are they doing something else? If so, is it a poor
version of software engineering, bumbling along in such a way that they wouldn't do it if
they knew better?
We've looked at multiple free and open-source projects in multiple
communities, not only in popular areas like Web or Web infrastructure software--Mosaic and
Apache are two examples--but also in the computer game
community, in the world of astrophysics and deep-space imaging, and in academic-software design. By looking at multiple projects across these different arenas, what we see is
something different than what's advocated in the principles of software engineering.
What are some of the differences you've found, apart from the obvious ones?
For example, in software engineering, there's a widespread view that it's necessary to elicit
and capture the requirement specifications of the system to be developed so that once implemented, it's possible to pose questions as to what was implemented, compared with what was specified.
We do not see or observe or find in open-source projects any online documents that
software engineers would identify as a software requirements specification. That poses the
question: What problem are they solving, if they haven't written down the problem? While it's
true that there's no requirements specification, what there is instead is what we've
identified as a variety of software informalisms.
What do you mean by "informalism"?
That word is chosen to help compare to the practice advocated in software engineering, in which one creates a formal systems specification or design that might be delivered to the
customer. Informalisms are such things as information posted on a Web page, a threaded
e-mail discussion or a set of comments in source code in a project repository. It may be a
set of how-tos or FAQs on how to get things accomplished. Each is a carrier of fragments of
what the requirements for the system are going to be.
If they're put together in such a haphazard way, can they really be considered
Yes and no. Clearly, they're distributed, but in order for people to contribute to the
project, those people need to understand where those requirements are and how they relate to
each other and how to pull them together. Part of how the community works is that each of
the participants discusses what the system should do in whatever informalism they feel
is the most appropriate to them.
Once the requirements are figured out, how are systems designed in open source?
We've begun to codify the practice we observe with the label "continuous
design." That would mean, much like requirements, that there is no unique baseline design, necessarily. Instead, there is today's understanding of the design, which may be different
from yesterday's or tomorrow's. It's not a bounded activity with a fixed and targeted
deliverable, but an ongoing activity, and the system design is represented across the web of
these informalisms. The key point is the evolving nature of it.
What's the relationship between your design informalisms and continuous design?
Informalisms refer to artifacts, to the medium. Continuous design characterizes the practice
or process: what produces or consumes the artifacts.
What about management?
There's a self-management process we're calling "virtual project management." As the
participants start to make choices, to create functionality in certain ways, using certain
tools with certain architectural tendencies, they constrain how subsequent choices can be
made and how the system can be expanded or not. I look at the project and say, "Here's
something I can do," and I become a virtual owner or a designated leader who has a certain
amount to say in an area. From an organizational standpoint, this looks less like a
hierarchical organization than a meritocracy. People ascend to positions of authority based
on accomplishment and expertise.
And yet, there is a hierarchy, isn't there? Projects have owners.
The idea of a meritocracy is not independent of hierarchies. These hierarchies tend to be shorter and broader than others. There might be a group of elders or a single individual providing the vision. There is a sort of layering going on here, but the layers are permeable.
Have you been looking at just the sort of grassroots open-source projects? Or have you
also been looking at the more recent corporate projects?
Yes, we have been looking at the modern-day version of these corporate-sponsored open-source
projects. An example would be NetBeans at Sun. Another
is Eclipse at IBM, and a third is the Gelato Federation sponsored by Hewlett-Packard. There's a growing number of these large corporate-sponsored open-source projects,
meaning that the corporation is assigning its salaried employees to work full- or part-time
on the project. They may be either trying to put together what the volunteer community
is doing or addressing the parts of the system no volunteers have stepped forward to do.
Open-source projects also serve as venues for recruiting, with companies looking at the volunteers
as potential employees. Companies that get involved in sponsoring a project can find out who the good people in the community are, the ones who have the natural talent or the track record or experience, people they would be unlikely to find through traditional recruiting means. These people might be in geographic locations that are inconvenient, but they are really capable and have deep expertise, so the thinking is, let's see if there is a new kind of employment relationship that might engage them both to make the voluntary contributions and to work for pay. And those people tend to get higher-than-average pay. People who are typically among the core contributors--people near the center of the project--are the ones who have this higher level of participation; their work products are publicly available for others to
individually evaluate, and companies find that that's an extremely important resource.
So what does your research say about the effectiveness of open-source development?
One thing we find with respect to participation is that in a
couple of other surveys, 60 percent of open-source software developers who show up as
core contributors tend to be contributors to two to 10 other projects. Once you've established
a reputation of expertise in a certain area, you can take that to another project, or
conversely, people seek out your expertise, because you know how to do certain kinds of things.
The overall dynamic that starts to emerge is that there's a social mechanism for the creation
of critical mass that lets these projects coalesce and come together, so systems can grow and
evolve at rates that far exceed what's predicted by good software practice.
Software engineering predicts that projects grow by the inverse square law, meaning
that initial growth is fast. It then slows down, and then, with a project shift, you get steady growth.
But in the more successful open-source projects, you get a hockey stick (curved line) on your
graph--a longer period of slow growth, then critical mass starts to kick in, and the growth curve starts
to shoot up in a greater-than-linear growth rate.
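The two growth shapes described above can be compared in a short sketch. It rests on assumptions that are not in the interview: the inverse square law is taken in the form commonly attributed to Turski's software-evolution model, where each release adds an increment of effort divided by the square of the current size, and the hockey stick is stood in for by an arbitrary quadratic curve. The function names and the sample numbers are hypothetical and chosen only for illustration.

```python
def inverse_square_growth(initial_size, effort, releases):
    """Sizes under S[i+1] = S[i] + effort / S[i]**2.

    Assumption: this is the inverse-square form commonly attributed to
    Turski's model; the interview does not spell out a formula.
    """
    sizes = [float(initial_size)]
    for _ in range(releases):
        current = sizes[-1]
        # Each increment shrinks as the system gets bigger.
        sizes.append(current + effort / current ** 2)
    return sizes


def hockey_stick_growth(initial_size, rate, releases):
    """Hypothetical super-linear curve: size = initial + rate * i**2.

    A stand-in for greater-than-linear growth, not a fitted model.
    """
    return [initial_size + rate * i ** 2 for i in range(releases + 1)]


def increments(sizes):
    """Release-to-release growth, i.e. the difference between neighbors."""
    return [later - earlier for earlier, later in zip(sizes, sizes[1:])]


if __name__ == "__main__":
    slow = inverse_square_growth(initial_size=10, effort=1000, releases=20)
    fast = hockey_stick_growth(initial_size=10, rate=0.5, releases=20)
    for release, (a, b) in enumerate(zip(slow, fast)):
        print(f"release {release:2d}  inverse-square {a:7.2f}  hockey-stick {b:7.2f}")
```

Under these assumptions, the inverse-square increments shrink with every release, while the quadratic curve's increments keep growing, which is the contrast between the two curves drawn in the answer above.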
So what, exactly, is happening to spur that faster growth you're seeing in open source?
What's an example?
Let's say you're a master of UI (user interface)
technology, so you hook up to another project and can import or reuse the code and the
expertise that's been acquired so far. If our projects form that symbiosis, they can merge
with a third, so this starts to account for why you see that substantial growth. This is a
manifestation of software reuse that is different than what's being advocated by the software
engineering community, which says everything's in a library that everyone dips into in order to take what they need, and then it goes away. Here you say, "I create something you want, so I make my work
in both contexts and together we have new context, and as we build this social network, what
we're doing is bringing software expertise and source code with us so that in comparatively
short amounts of time we can have large amounts of people create a large system without the
coordination or management of a central corporate authority or project manager."
What else are you looking at in your research?
One thing is the role of free and open-source public licenses--things like the GPL. We're not going to address legal issues, but what the licenses do in practice is reinforce and institutionalize a set of beliefs, values and norms for how free or open-source software should be developed. It's a statement of affiliation, of
how to build software, of the reasons why to build software. Here open-source licenses not
only serve community property rights but also act as a way of declaring affiliation with this
broader social movement. Open-source is becoming a global social movement, so it can grow
beyond the boundaries of software.
Where are you seeing open-source principles adopted beyond
There's an open-source community in architecture: people working in developed countries
who contribute their designs to developing or emerging countries, where hiring an architect to do something is prohibitively expensive. There's open-source education--
Like at MIT.
That's at the college
level, but also in grade schools and high schools globally. People in the United States and Europe are
contributing content for math and science
classes for their own countries and developing countries, where purchasing textbooks is prohibitively
expensive. In the visual-arts community, there's a movement to explore what it means to do
works of art for sharing, or building upon works of art of other people. People are breaking
away from the tradition of the individual artist, saying there's another way to build upon the work of others.
And in the area of government,
a number of European and Third World countries are looking to adopt open-source systems for
reasons of low cost, or perceived low cost. But at the same time that they bring in the open-source
systems, they also embrace the ideology of openness, which in turn may be a revitalization of
what it means to be an open, democratic nation or government. So the process becomes open source so that citizens can better
understand how their governments work and whether a corporate provider of information technology is serving its own interest in selling systems to the government or
helping the people.