Why would a Googler use Solr for search?

The open-source Apache Solr, rather than Google's own search technology, is powering the All for Good site, which says as much about Google as it does about Solr.

Google is arguably the world's largest open-source company: it has released at least 14 million lines of open-source code, hosts over 250,000 open-source projects on Google Code, and runs open-source advocacy programs such as Summer of Code.

Despite these open-source bona fides, it's still surprising to see someone at Google adopt Solr, an open-source search server based on Apache Lucene, for the All for Good site.

Google is the world's search market leader by a very long stretch. Why not use its own search technology? Why use Solr?

Google's Public Sector team suggested an answer last week:

One of the top concerns we've been hearing from nonprofit organizations who list volunteer opportunities on All for Good is that their opportunities aren't updated on the site as frequently as they need. This happens because...we crawl feeds from partners like VolunteerMatch and Idealist just like Google web search crawls web pages. Crawlers don't immediately update, they take time to find new information.

Today, we're rolling out improvements to All for Good that will help solve this problem and improve search quality for users. The biggest change, which you won't see directly, is that our search engine is now powered by SOLR, an incredible open source project that will allow us to provide higher quality and more up-to-date opportunities. Nonprofits should start seeing their opportunities indexed faster, and users should see more relevant and complete results.

I don't think this means that Google thinks Solr provides better results than its own code. Rather, I suspect this was simply a case of a Googler using her 20 percent "free" time to get a job done. It was likely easier to roll a service using Solr than to get official approval from Google to use its search technology for an important but nonprofit purpose. (My request for comment from Google had not been answered at the time of this post's publication.)

To me, this says much about the power of Google's culture: Googlers appear to be free to use the best tool to get a job done, which may not always be the best technology, per se, but simply the most readily available technology for a given project at a given time.

The decision also says a tremendous amount about the value of open source, and of Solr in particular. If it's good enough for Google, as David Fishman notes, it's probably going to be just fine for you, too.

Update, 11:41 a.m. PDT: I heard back from Chris DiBona, open source and public sector program manager at Google, who offered this reasoning behind the move, in response to the suggestion that Google uses Solr:

I think you meant "Googler chooses Solr." You see, Allforgood.org is run by Our Good Works, a non-profit that works with technology companies and the whitehouse on that site. I'm on the Board of OGW, but it is run by Jonathan Greenblatt.

That said, we chose Solr because it made sense for the project we had. We want other companies/countries to be able to use the code we've written for Allforgood.org and to have it depend too heavily on Google Base precluded that, but specifically, technically speaking Solr fit the problem better than Google Base did.

So, it's not accurate to say that "Google chose Solr," but it is accurate to suggest that All for Good was founded by Googlers in their "20 percent time" and continues to be hosted by Google, as TechCrunch has reported, and that those Googlers, along with the rest of the board, opted for Solr over Google's own technology.

As DiBona mentions, and as I suggested above, this is a matter of fitness for purpose, not of any problem with Google's code. All for Good is completely open source, so it makes sense that it would opt for open-source Solr over Google Base.

