The attribution problem

As open data, creative writing and media, and code merge, we're going to increasingly need to reconcile the issues that matter most to the communities who own the copyrights to their respective bodies of work.

One of the reasons I attend O'Reilly's Open Source Conference (OSCON) is that, more so than others I go to, it gets into the intellectual and--dare I say--philosophical underpinnings of things as well as the things themselves.

To be sure, this sort of thing may not be especially important if we're talking about things like servers--although these too interact with long-term undercurrents, such as massively multi-core programming, that are largely removed from day-to-day concerns but immensely important in the long view. Open Source, however much it has blended into the mainstream of software, is still very much part and parcel of the history and motivations behind it.

Much of that background, the continuing areas of conflict that are part and parcel of it, hints at how Open Source may evolve, and some of the opportunities (and challenges) of bringing Open Source into domains other than code were all on display at the Participate 08 panel discussion yesterday. The complexities of the many interweaving threads are neatly captured in these whiteboards drawn by Collective Next during the panel.

But for our purposes here I'm going to focus on one specific thread. I'll be following up with further discussion of other points.

One of the panelists was John Wilbanks, who runs the Science Commons project (within Creative Commons). He had some interesting perspectives on the concerns of scientists, as opposed to programmers. For example, in the Open Source code world, as it has evolved, attribution (at least formal attribution) isn't a component of most licenses. But, in the academic community, it's all about attribution. As he described it: "the motivation is to be associated with the publication of an idea... to own a fact."

This is a potentially huge disconnect between the data/science world and the code world. This is especially so because attribution clauses are left out of most Open Source licenses for a deliberate reason. The problem is that attributions "stack"--that is, they acquire threads of contributors that may go back years. Thus, a legal requirement to preserve a list of all that historical accretion of intellectual property would be enormously unwieldy to implement in practice.

Academics deal with this sort of thing all the time. However, it's handled within the context of social norms and customs, and violations are dealt with largely by corresponding social censures rather than legal ones. Attribution is serious business in academia--but it's not implemented through formal legal strictures that require literature searches for previously unknown Russian papers of 30 years past. (Of course, there are often bruised egos and perceived slights--welcome to the world--but these are issues mostly resolved within a community rather than in a court of law.)

As a side issue, John also noted that, in the sciences, he does not recommend limiting work to non-commercial use or prohibiting derivative (i.e., transformed) use of the work. He said that such restrictions have a very chilling effect on integration and federation. I've written previously about the Non-Commercial clause of some Creative Commons licenses in the context of photographs. Increasingly, strictures against commercial use, an area that Open Source code licenses have largely stayed away from to their betterment, seem to be something that appears reasonable and fair but, in fact, has far more cons than pros.

As open data, creative writing and media, and code merge, we're going to increasingly need to reconcile the issues that matter most to the communities who own the copyrights to their respective bodies of work.