Standing Against License Proliferation

Wednesday, May 28, 2008 at 5:14 PM



The proliferation of Open Source licenses has been a known problem in the community for a few years. As the number of licenses grows, the possible combinations and interactions skyrocket. This makes it very difficult to mix code from multiple sources because the licenses may be incompatible or create burdensome notification or publishing requirements. Many people in the community have been talking about how to stem the tide of new Open Source licenses, and how to reduce the existing set, in order to simplify the legal framework surrounding the Open Source universe.

When we set out to create the project hosting service on Google Code, I turned to my colleague Chris DiBona and said, "Dude. I think we have an opportunity here. What do you think if we offer only a limited set of licenses?" He replied, "Hunh... Cool idea!" Fifteen minutes later, we had a list of six licenses: Apache License v2.0, (New) BSD, MIT, GPLv2, LGPL, and MPL 1.1. About four weeks before launch, Robert Spier pointed out our oversight in not remembering the Artistic/GPLv2 dual-license used by the Perl community. Whoops! We launched with those seven licenses (and added GPLv3 a year later). Further, we do not allow any dual-licensing (other than the Artistic/GPLv2 combination), nor the dreaded tri-license. I'm not going to get started on what I think about that.

Two years later, we are nearing 100,000 projects hosted on Google Code. The trend around licensing is obvious: GPLv2/GPLv3 represent 42.6% of the projects, and Apache is 25.8%. MIT, BSD, and LGPL are at about 8% each, Artistic at 3.5%, and MPL 1.1 at a mere 2.7%. This follows my own observation about how people license their projects. If they are advocates of Free Software, they will choose GPL; advocates of Open Source will choose Apache (a more modern and thorough permissive license, compared to BSD or MIT). And this is exactly what I recommend to people: choose GPLv3 or Apache v2 based on your personal philosophy.

People very rarely choose licenses like MPL, EPL, or CDDL unless they are part of those communities (Mozilla, Eclipse, and Sun, respectively). Unfortunately, licenses that are typically community-specific also tend to create islands of code. And while there is no good reason that I cannot lift code from an EPL'd Eclipse plugin and incorporate it into some Apache-licensed server application that I'm writing, that kind of blending just does not happen. Based on the low popularity, and in an attempt to limit this artificial segregation of code, we plan to remove the MPL from the set of licenses on our project hosting service (for new projects; existing projects will not have to relicense).

At Google, we believe that the software community would be much better served by a reduction in the number of licenses used by software projects. We hope to steer people in that direction with our license choices on Google Code. We also highly recommend choosing the Apache License or GPLv3 for your projects.

11 comments:

jackr said...

What about Affero (now GPL and OSI)? We can't lean on past use patterns to know how popular this one will be, it's too new. It's also not quite the same sort of successor to (non-A) GPL as is, say, Apache to BSD/MIT: Apache refines intent by clarifying terms and extending permission, while Affero consciously seeks to retain intent by explicitly expanding prohibitions (viewing the scope limitation of GPL as a bug).

At Tigris, we've decided to permit it, on an "it's OSI-approved" argument (which I'm not by any means saying makes everything clear and wholesome). I believe Google has decided to deny it, on an argument that appears to me unclear. What's up with that?

Russell said...

Unfortunately, Greg, the problem is MUCH harder than you perceive it to be. Take, for example, iometer. It's a project created by Intel, run for a few years, and then thrown to the winds. It's now at http://www.iometer.org/ and is still licensed under the Intel Public Source License. It's pretty clearly an Open Source license, and yet Intel isn't using it for new projects so it's a deprecated license, and yet iometer is stuck with it.

If you can figure out what to tell the iometer folks, then you are smarter than me. I suspect that "just relicense" will fall on deaf ears, but by all means try it.

Russell said...

One more thing: be very, very careful about generalizing the community's interest in using licenses from Google Code users' interest in using licenses. You limited licenses from the start to certain licenses. There are many alternatives to Google Code, so just because you've gotten away with restricting licenses, that doesn't mean that there's any reduced interest in other licenses. You may simply have sent these people elsewhere.

Shot said...

The problem with this set is that it lacks AGPL, which is basically ‘GPL for web applications’. With this set, there’s no way to have a web application licensed in a GPL-like way (‘take my code, but if you modify and offer it to someone else, you also need to offer them your patches’), so there’s little incentive for Free Software-style web applications to be hosted on Google Code.

Vinnl said...

I suppose I'll play the devil's advocate: what about the Affero GPL? Too threatening for Google I suppose?

Sandy said...

Awesome work guys, I'm glad to see progress on this front.

As a contributor to GPL, LGPL, and MIT-style projects, I certainly feel the pain of excess license proliferation. Thanks for taking steps to help.

I too think that Affero GPL should be offered, as there is nothing else that really fills that niche, and it is very relevant in today's environment. I understand there's probably not a lot of data on the demand and use of it, but I think it should be encouraged for free web applications. What do you think?

Henri said...

I think the Affero suggestions are off the mark - Affero uptake is not looking impressive so far. The number of AGPL projects listed here:

http://www.blackducksoftware.com/oss/project-list

is very small - 63 out of 2737 *GPL3 projects, and I've yet to see one that seems like a big deal.

With LGPL3/GPL3 making up 42% of the projects at Google.Code, that would imply that only 1 out of every 42 projects would want to move to AGPL. Too small a number to justify itself as a major license.

It's a shame to see MPL being removed - of the MPL/CDDL/EPL/CPL set that means there is no representative, but a) as with the justification for AGPL not being there above they aren't significant and b) as Greg points out, they are mostly restricted to a single community. From a conceptual point of view, it'd be nicer to see MPL stay and LGPL go given the page on "Don't use LGPL" at the FSF, but the numbers have spoken.

It's odd not to see Greg also suggesting Artistic should go. It's also another single-community license - why do we see the Perl community getting something but there's no Ruby license, Mozilla, Eclipse or CDDL (allegedly nicer version of Mozilla, but much less of a community around it)? Seems odd to me anyway.

Lastly - this is just what they state themselves as. Past experience with SF.net has shown that the claimed license in the metadata and the actual license in the source control can be quite different; seems likely the same would be happening at Google Code. "BSD, but oh look we modified it" etc.

Rob Lanphier said...

The "license proliferation problem" is way overblown, IMHO. I know from past experience that custom OSI-compliant licenses are a gateway drug to more standard licensing. By freaking out about custom licenses, the community is probably repelling companies that would consider moving toward open source models by reinforcing the stereotype that the community is filled with inflexible ideologues.

Greg Stein said...

jackr, and others: it is a simple numbers thing with respect to AGPL. If/when it gets a decent percentage of projects out there, then we'll consider it. Nothing conspiratorial.

russell: yeah, relicensing is a bitch. And that is exactly one of the problems that is caused by proliferation. You're right that there is little hope for iometer's licensing :-(

russell, again: I do recognize that I'm looking at a self-selected crowd. Freshmeat and SF have license breakdowns, which I look at occasionally.

Regarding license types: things like Affero and MPL/CDDL/EPL fill niches, yet complicate the licensing regime out there. I'd rather see those projects stick to a strong copyleft license, or just go with a fully permissive one. The in-between is troublesome.

Henri: thanks for the commentary. We might remove Artistic at some point -- you're right about its popularity and community. IMO, the Perl guys should have just gone with Apache rather than revising the Artistic license. Why Perl? Mostly I'd say that I allowed it because I didn't want to alienate the entire Perl community. I didn't think the numbers would be this low tho. EPL and CDDL combined were less than Artistic, so those were not on the list.

If I wanted to completely disregard percentages, then I'd only have GPLv3 and Apache on the site. But that's not very realistic :-P

And claiming one license, but using another? Not good. You're misrepresenting your project to users. We've found some like that. They've been asked to relicense properly, or to leave Google Code. There are plenty of hosting options which have flexibility around licensing.

Greg Stein said...

Rob: proliferation is very much a problem. The Open Source Program Office at Google spends an inordinate amount of time dealing with all this stuff. It really sucks. Our job would be much easier if there were fewer licenses.

Rob Lanphier said...

I think non-OSI-compliant license proliferation is a WAY bigger problem than OSI-compliant license proliferation. You and I may spend an inordinate amount of time untangling all of the different OSI licenses (believe me, I spend a lot of time on it, too), but I'm WAY happier spending time on that than I am when I have to read a non-OSI compliant license.

The non-open source world considers it a given that every single company has multiple unique licenses for the products they produce (often more than one per product). It's hard enough introducing the seemingly orthodox position that OSI-compliance is important. It sucks that it's considered a dealbreaker if one of a small menu of licenses is chosen.

It's totally your prerogative to limit Google hosting to a small list of licenses. I still stand by my statement that this problem is generally blown way out of proportion by the community.