January 24, 2022
Software for all: How do open-source communities work?
Open-source software is software that can be freely modified and distributed. Open-source projects are at the heart of the infrastructure of our digital society, but they face significant sustainability problems because many people use them while very few contribute to their development.
Research by Javier Cánovas, a member of the UOC's Faculty of Computer Science, Multimedia and Telecommunications and a researcher with the Systems, Software and Models Research Lab (SOM Research Lab) group at the Internet Interdisciplinary Institute (IN3), together with Jordi Cabot, ICREA research professor and leader of the group, has analyzed the profiles of the users involved in these projects. The results show that contributors who do not write code are a highly significant presence, and that these contributors show a certain degree of specialization. According to the researchers, these data "demystify the idea that only developers drive open-source projects" and could be used to design new strategies to improve the sustainability of such initiatives.
Completing the partial picture of open-source projects
An open-source project depends both on its community of contributors, who keep the project alive, and on those contributors collaborating actively and productively. However, the vast majority of research on these communities focuses on the profiles of users who are responsible for programming and other technical tasks, such as reviewing or merging code. "This is only a partial picture of what an open-source project really consists of and how it moves forward, which is generally based on a community of users in charge of a wide variety of tasks (such as marketing, promotion and design), who also help draw up documentation or take part in discussions on the future evolution of the project," explained Javier Cánovas.
To gain a deeper understanding of collaboration dynamics in open-source systems, the researchers analyzed the 100 most important npm projects (npm is the package manager for Node.js, a widely used JavaScript runtime) found on GitHub, a leading social coding platform. "This study has allowed us to verify that non-code tasks (non-technical), such as reporting a problem, suggesting an improvement, taking part in a discussion or simply reacting to other people's comments (for example, with an emoji to communicate acceptance of a proposal), are a common feature in open-source systems. In fact, their presence is highly significant, demonstrating their involvement in the life of the project," pointed out Javier Cánovas.
Division of project tasks
The study also investigated whether project contributors usually stick to a single kind of task or perform several, in which case the different roles overlap. The results show that some users contribute to a project only through non-technical activities. Their work complements that of the people who focus on programming and code development, who in turn have little involvement in other tasks.
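The idea of checking whether roles overlap can be illustrated with a minimal sketch. It is not the researchers' method: the activity labels, the function names and the sample events below are all hypothetical, chosen only to mirror the code and non-code tasks mentioned in the article. Each contributor is labeled according to whether their recorded activity is code-related, non-code, or both.

```python
from collections import Counter, defaultdict

# Hypothetical activity labels (an assumption for illustration only;
# the study's actual taxonomy may differ).
CODE_ACTIVITIES = {"commit", "code_review", "merge"}
NON_CODE_ACTIVITIES = {"issue", "comment", "reaction"}


def classify_contributors(events):
    """Label each user as "code_only", "non_code_only" or "both".

    `events` is an iterable of (user, activity) pairs.
    """
    kinds = defaultdict(set)
    for user, activity in events:
        if activity in CODE_ACTIVITIES:
            kinds[user].add("code")
        elif activity in NON_CODE_ACTIVITIES:
            kinds[user].add("non_code")
        # Activities outside both sets are ignored in this sketch.

    profiles = {}
    for user, seen in kinds.items():
        if seen == {"code"}:
            profiles[user] = "code_only"
        elif seen == {"non_code"}:
            profiles[user] = "non_code_only"
        else:
            profiles[user] = "both"
    return profiles


def profile_shares(profiles):
    """Return the fraction of contributors that falls in each profile."""
    total = len(profiles)
    if total == 0:
        return {}
    counts = Counter(profiles.values())
    return {profile: count / total for profile, count in counts.items()}


# Made-up sample events, not data from the study.
sample_events = [
    ("ana", "commit"), ("ana", "code_review"),
    ("ben", "issue"), ("ben", "reaction"),
    ("cal", "commit"), ("cal", "comment"),
    ("dee", "reaction"),
]

if __name__ == "__main__":
    sample_profiles = classify_contributors(sample_events)
    print(sample_profiles)
    print(profile_shares(sample_profiles))
```

In this toy data, a large "non_code_only" share alongside a small "both" share would correspond to the kind of specialization the study describes, with little overlap between roles.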
These data give new clues for designing onboarding and governance strategies that facilitate the evolution of these users and better collaboration between the various roles. "In most open-source projects, efforts to attract and bring in new contributors are clearly aimed at developers, but this means they miss the opportunity to attract other types of profiles that could be easier to bring in and could also help the progress and long-term sustainability of the project," the authors of the study noted.
"In fact," they added, "projects interested in attracting more technical contributors should also make an additional effort to help some of the non-technical contributors to take part in the programming side, as this is not a natural evolution."
Studying the evolution of the community over time
This research is part of the SOM Research Lab's work on optimizing and promoting contributor collaboration in open-source systems, which spans several lines of research. "The most significant aspect right now is considering the temporal dimension, i.e. how the state of a project and its community evolves over time," the researcher said.
Other lines of work in this area include studying mechanisms for attracting new contributors to open-source projects, exploring new ways of visualizing the contributions of community members or proposing solutions for defining community governance rules (or models).
The research was published in Empirical Software Engineering.