“How much is open source worth?” is an age-old question. Thanks to new work from professors at Harvard and the University of Toronto, it is also a question with a new, creative, and important answer.
You can trace the roots of this question to David Wheeler’s work on the “sloccount” tool, which attempted to measure the size of codebases and estimate their value, using cost-estimation models from the established software engineering literature. That tool was then used in a variety of ways, including measuring the value of the entire Debian operating system, c. 2002. That work suggested that, as early as 2000, Debian (including the Linux kernel) would have cost somewhere between one and two billion US dollars to replicate. Of course, this was not a measure of “the value of open source”; it measured just one part of it, as captured by a specific operating system. There was, at the time, no good way to understand how widely used Debian was, nor to understand which parts of it were used more or less widely.
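To make the “size in, dollars out” idea concrete, here is a rough sketch of the kind of arithmetic a line-counting estimate relies on. It assumes the basic COCOMO “organic” model, and the default salary and overhead figures are illustrative assumptions on my part rather than numbers taken from the Debian study. The function name `cocomo_organic_cost` is hypothetical.

```python
def cocomo_organic_cost(sloc, annual_salary=56_286.0, overhead=2.4):
    """Estimate effort, schedule, and cost from a source-line count.

    Assumes the basic COCOMO "organic" coefficients (2.4, 1.05, 2.5, 0.38).
    The salary and overhead defaults are illustrative assumptions.
    """
    ksloc = sloc / 1000.0                                 # thousands of source lines
    effort_person_months = 2.4 * ksloc ** 1.05            # development effort
    schedule_months = 2.5 * effort_person_months ** 0.38  # calendar time
    # Cost = person-years of effort * salary * overhead multiplier
    cost = (effort_person_months / 12.0) * annual_salary * overhead
    return effort_person_months, schedule_months, cost


if __name__ == "__main__":
    # Hypothetical codebase size, chosen only for illustration.
    effort, schedule, cost = cocomo_organic_cost(250_000)
    print(f"effort: {effort:,.0f} person-months")
    print(f"schedule: {schedule:,.1f} months")
    print(f"cost: ${cost:,.0f}")
```

Note what this does and does not capture: the estimate depends only on how much code exists, which is why it says nothing about how widely that code is used.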
In 2013, Jonathan Band and Jonathan Gerafi took a different tack to estimate the value of another branch of open: Wikipedia. Their work suggested that Wikipedia could be valued anywhere from two to eighty billion dollars, depending on the methodology. Critically, unlike the Debian estimates of the early 2000s, these researchers had some idea of how many people use Wikipedia. This allowed them to calculate not just “how much would it take to rewrite Wikipedia from scratch?” but also more nuanced answers like “how much would people pay to keep it in their lives?” or “how much would a business with this many users be valued at?” Each of these is a valid answer, and by any of these metrics, Wikipedia’s value is huge.
More recently, the Open Source Impact Study, conducted by OpenForum Europe and Fraunhofer ISI for the European Commission, assessed the impact of open source on Europe's technological independence, competitiveness, and innovation. The study estimates that OSS contributes between €65 and €95 billion to the European Union's GDP, a more dynamic, annual measure than the ones above.
Because this estimate focused on the EU, and was done for the benefit of the European Commission, it has gone beyond earlier hypothetical estimates and become a helpful tool in European policy discussions. In particular, I have heard from friends in Brussels that it has helped the EU realize that open source has “homegrown” economic benefits, just like transportation infrastructure and manufacturing. This counters narratives that open source is “just” a hobby, or that it is “just” an American “big tech” project.
A new paper from Harvard Business School tries to pull a lot of these threads together, and reaches some pretty bold conclusions.
Critically, the paper makes two important moves. Like the EU and original Debian papers, it tries to value open source by measuring how big the code bases are. It also uses an interesting dataset that scrapes websites to understand what open source is actually in use in the wild, allowing the authors (like the Wikipedia paper) to estimate the value of open source from the point of view of users.
Their conclusion is staggering: on what they call the “supply side” (what it would take to reproduce the code), they estimate open source is worth something like four billion dollars. On the “demand side”, i.e., how it is used, they estimate that the value of open source is $8.8 trillion. (Insert Doctor Evil pinky image here!)
This multi-trillion dollar valuation is amazing in a couple of ways.
First, even if this is too optimistic, and the real number is smaller, open source still has economic value similar to very visible pieces of infrastructure like national highways, railways, or the electrical grid. Also, unlike all those other forms of infrastructure, it has achieved this despite receiving very few direct subsidies from national governments.
Second, the number is almost certainly too small. The authors explicitly exclude operating systems from their measures, and yet we know that open source operating systems are (1) extremely complex to create (see Debian's valuation of $1-2B as of 2000!) and (2) extremely widely deployed, both on servers and on consumer devices (Android phones and newer embedded home devices like TVs). In addition, the paper does not appear to capture the value of web browsers, which are second only to operating system kernels in complexity, possibly more widely deployed than open source operating systems, and central to modern e-commerce. A total value number that captured those missing components would likely be even larger.
In some ways, this paper means very little: we’ve long known that open source is very impactful to the world. At the same time, policymakers can’t help but take notice when Harvard Business School uses numbers starting with a “t”. (We’ll certainly be using this liberally in our future responses to federal government requests for information!)
At a time when governments all over the world are discussing regulating open source, this number will absolutely drive conversations. And we hope, with a newly quantified value measuring just how impactful open source is, that it’ll drive investment as well.