On June 7, 2021, Tidelift hosted a new event called Upstream, a one-day celebration of open source, the developers who use it, and the maintainers who make it. All of the talks are available to watch at upstream.live, and several of our speakers offered to share their thoughts in a post-event blog post as well. In this post, originally published on opensource.com, Georg Link shares five best practices for using open source community leaderboards.
It takes a community of people with varying skill sets and expertise to build open source software. Leaderboards have become a way for open source communities to track progress and showcase and celebrate top-performing contributors. If leaderboards are done right, they can increase participation, motivate contributors with gamification, and enhance the community. But leaderboards can also have adverse outcomes—including discouraging participation.
The Community Health Analytics Open Source Software (CHAOSS) community, a Linux Foundation project, focuses on bringing standardization to open source project health and metrics. Leaderboards are a topic that keeps coming up in those conversations. This blog post began as a presentation I gave at Upstream 2021, the Tidelift event that kicked off Maintainer Week, a weeklong celebration of open source maintainers. This article explores five best practices to help communities use leaderboards successfully and improve their project health through metrics.
Leaderboards display rankings among a group of people or companies. Leaderboards work very well in sports, where athletes compete with the same goal of running the fastest, jumping the furthest, or lifting the most weight. There is no such competition in open source communities, and the context is much more complex, but leaderboards may still be helpful. For example, GitHub Insights shows the contributors with the most commits to a repository, which reveals at a glance whether a community has more than just one or two active contributors. Another example is the Drupal Marketplace, which ranks companies using a community-created algorithm that rewards companies that are active in the community with better positioning.
(Georg Link, CC BY-SA 4.0)
The above graphic shows two examples of sites measuring community involvement: GitHub Insights and the Drupal Marketplace.
Open source communities depend on voluntary contributions, whether from individuals or company employees (where a company expects an employee to devote a certain amount of time to supporting a specific open source project). Tracking top contributors helps people to easily see who is active, what they're achieving, and who is behind a project.
Leaderboards create a sense of community by highlighting some members for their open source work. For contributors who enjoy competition and like to engage in the activities counted toward the leaderboard's score, the desire to appear on the leaderboard can inspire increased participation. This narrative makes it tempting to use leaderboards in open source communities. However, contributors who participate in ways that don't get counted in the scoring have no chance of showing up on the leaderboard and might be discouraged by it. This shows the importance of understanding how leaderboards work before using one in an open source community.
There are multiple theories about how leaderboards incentivize people. Primarily, leaderboards serve to satisfy a human need for competence, autonomy, and social relatedness, as well as a human tendency to rank and compare ourselves to one another to better understand our abilities. Gamification inspires competition with a continual goal-attainment-reward design.
The schematic below is an imperfect representation of how leaderboards in open source work because it ignores the social elements of community engagement. Community activity gets logged as trace data, which a scoring algorithm converts into a score that determines the ranking in the leaderboard. Community members respond to the ranking and may change their community activity, which, in turn, may influence their leaderboard ranking.
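The schematic's loop can be sketched in code. This is a hypothetical illustration, not a real system: the event kinds, weights, and names are all made up, but the flow matches the schematic: logged activity becomes trace data, a scoring algorithm converts it into scores, and the scores determine the ranking.

```python
from dataclasses import dataclass

@dataclass
class TraceEvent:
    """One logged community activity (a piece of trace data)."""
    contributor: str
    kind: str  # e.g. "commit", "review", "issue_comment"

def score(events, weights):
    """Scoring algorithm: convert trace data into per-contributor scores."""
    totals = {}
    for event in events:
        # Activities with no weight contribute nothing to the score.
        totals[event.contributor] = totals.get(event.contributor, 0) + weights.get(event.kind, 0)
    return totals

def leaderboard(events, weights):
    """Rank contributors by descending score."""
    return sorted(score(events, weights).items(), key=lambda item: item[1], reverse=True)

events = [
    TraceEvent("ada", "commit"),
    TraceEvent("ada", "review"),
    TraceEvent("grace", "commit"),
    TraceEvent("lin", "issue_comment"),  # this work scores zero below
]
weights = {"commit": 2, "review": 1}  # issue comments carry no weight here

print(leaderboard(events, weights))
```

Note how the choice of weights, not the contributors' actual effort, decides the ranking: "lin" ends up at the bottom only because issue comments were left out of the scoring algorithm.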
(Georg Link, CC BY-SA 4.0)
However, leaderboards must be designed well to be effective. Leaderboards score contributors based on specified algorithms and data. The score indicates progress toward a success measure, as determined by what the leaderboard is designed to measure. Open source communities can use trace data for this purpose. Trace data is transparent, but it is created "accidentally" as a byproduct of other activity, and it contains personally identifiable information (PII) such as names and email addresses. The data used for leaderboards therefore needs to be obtained and used carefully, with awareness of privacy laws such as the European Union's GDPR.
For example, GitHub Insights uses commits as the data being counted: contributors are scored by how many commits they contributed to an open source repository. This simple example demonstrates the limitation of leaderboards. Because leaderboards rely on a scoring algorithm, anything that cannot be reduced to a score is ignored. In this example, the complexity and value of each commit are ignored, so a vulnerability fix that eliminates a cybersecurity risk counts the same as a typo fix in the documentation.
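Both points above can be made concrete with a short sketch. This is a hypothetical commit-count scorer, not GitHub Insights' actual algorithm: it pseudonymizes author emails (the PII in the trace data) before publishing the ranking, and it deliberately shows the flat-scoring limitation, since every commit counts as one regardless of its value. A real system would still need a lawful basis for processing the underlying data under the GDPR.

```python
import hashlib
from collections import Counter

def pseudonymize(email):
    """Replace an email address with a stable, non-reversible token
    so the published leaderboard does not expose raw PII."""
    return hashlib.sha256(email.lower().encode("utf-8")).hexdigest()[:12]

def commit_leaderboard(commits):
    """Count commits per (pseudonymized) author.

    Every commit scores exactly 1: a security fix and a typo fix
    are indistinguishable to this algorithm.
    """
    counts = Counter(pseudonymize(author) for author, _message in commits)
    return counts.most_common()

commits = [
    ("dev@example.org", "Fix remote code execution vulnerability"),
    ("docs@example.org", "Fix typo in README"),
    ("docs@example.org", "Fix another typo"),
]
print(commit_leaderboard(commits))
```

With these inputs, the author of two typo fixes outranks the author of the security fix, which is precisely the limitation flat commit counting introduces.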
This brings us to the problem of gamification. Gamification principles state that if contributors are not of equal ability, less skilled competitors will become less motivated as the top of the leaderboard seems unreachable. In contrast, highly skilled competitors will find the activities too easy, eventually demotivating them as well. Gamification through leaderboards also drives contributors to focus on the activities rewarded in the scoring, often to the detriment of other significant contributions.
Open source engagement is a social activity, and reducing it to a game of numbers can alienate many contributors who would otherwise have made the community whole. Further, many activities in open source cannot be counted or measured because they leave no trace data. For example, learning, marketing, socializing, and coordinating happen through direct interaction rather than on community platforms. Consequently, leaderboards can worsen the issues open source faces in creating a welcoming and inclusive environment—only a specific subset of contributors can be rewarded by leaderboards, regardless of how well they are designed.
I wish I could say we have best practices for leaderboards. The reality is that leaderboards are extremely challenging to do well in open source, and attempts to create leaderboards can result in community backlash.
However, the CHAOSS community has developed best practices around project health and measuring communities to guide our thoughts and decision-making.
Lastly, several open source metrics solutions are available. They were developed to understand project health, not to create leaderboards, but they solve the problem of collecting the data a leaderboard's scoring algorithm needs.
With thoughtful design and in limited contexts, leaderboards can be a great way to help open source communities thrive, inspire participation, and celebrate and reward contributors. However, we cannot solve the overarching goal of understanding and promoting project health with leaderboards. The Upstream 2021 talk and this blog post are only one argument and a forcing function to formalize ideas in the ongoing conversation around leaderboards. Join the CHAOSS Community to continue the conversation.
You can also join the conversation for the Panel Discussion: Contributor Leaderboards to Incentivize Good Community Citizenship at the Open Source Summit North America 2021 in Seattle, WA.
This was originally posted on opensource.com.