Software Solutions For Newcomers' Onboarding in Software Projects: A Systematic Literature Review
Procedia Computer Science 00 (2024) 1–??
Abstract
[Context] Newcomers joining an unfamiliar software project face numerous barriers; therefore, effective onboarding is essential to help them
engage with the team and develop the behaviors, attitudes, and skills needed to excel in their roles. However, onboarding can be a lengthy, costly,
and error-prone process. Software solutions can help mitigate these barriers and streamline the process without overloading senior members.
[Objective] This study aims to identify the state-of-the-art software solutions for onboarding newcomers. [Method] We conducted a systematic
literature review (SLR) to answer six research questions. [Results] We analyzed 32 studies about software solutions for onboarding newcomers
and identified several key findings: (1) a range of strategies exists, with recommendation systems being the most prevalent; (2) most solutions are
web-based; (3) solutions target a variety of onboarding aspects, with a focus on process; (4) many onboarding barriers remain unaddressed by
existing solutions; (5) laboratory experiments are the most commonly used method for evaluating these solutions; and (6) diversity and inclusion
aspects primarily address experience level. [Conclusion] We shed light on current technological support and identify research opportunities
to develop more inclusive software solutions for onboarding. These insights may also guide practitioners in refining existing platforms and
onboarding programs to promote smoother integration of newcomers into software projects.
Keywords: Systematic Literature Review, Software projects, Open source software, Onboarding, Turnover, Tool, Newcomers, Novices
tributing [103]. Therefore, investigating newcomer onboarding in this context is crucial [5, 42, 102, 103, 106, 114, 130].
Onboarding strategies can include courses [83, 96], bootcamps [82], and mentorship [4, 17, 34, 82]. These strategies are known for being costly in terms of time and money and lack scalability [17]. For example, senior developers pointed out that working as mentors impacts their productivity [82]. Having new hires read relevant source code without assistance is also costly regarding time investment [7], leading to long adjustment periods.
Some solutions can be automated to facilitate the onboarding process for a large number of projects and newcomers. Such software solutions are still not largely used in practice but have been investigated in the scientific literature. However, this evidence is spread across different venues and disciplines.
This study aims to identify studies that propose software solutions that facilitate the onboarding of newcomers in software projects [8] using a Systematic Literature Review (SLR). Software solutions can actively support diverse aspects of onboarding. The literature is vast and covers many of these aspects, such as reducing onboarding time and cost for companies [3], supporting independent learning [119], supporting the need for training [21], helping newcomers to deal with the high amount of information [25, 130], and supporting newcomers in understanding complex source code structures [17, 34].
Our systematic literature review consolidates this information into a single resource, providing a clearer understanding of the existing software solutions that facilitate onboarding newcomers in software projects. This paper presents a comprehensive analysis of 32 primary studies published until 2023 to identify the state-of-the-art related software solutions for newcomers’ onboarding and to identify potential gaps that can be addressed by developing new software solutions. The outcomes of this study inform practitioners and researchers working on smoothing onboarding for newcomers and provide a basis for further research in this area.
We have organized the remainder of this paper as follows. Section 2 details the SLR planning and its execution. Next, Section 3 presents the results and answers the study research questions. Section 4 outlines the paper discussion, Section 5, the implications. The threats to validity are discussed in Section 6. In Section 7, we introduce related work. Finally, Section 8 concludes the work concerning our main findings and suggests future work.
In this section, we detail the protocol used for the systematic literature review, specifying the research questions and defining the search strategy, selection process, selection criteria, and data collection and synthesis processes. We present the results for the research questions in Section 3.
2.1. Research questions
According to Park and Jensen [80], the continuous influx of newcomers and their active participation in development activities play a vital role in the success of software projects. In this context, this SLR aims to identify studies that propose software solutions that facilitate the onboarding processes for newcomers in software projects. We translated our research goal into the following research questions (RQs):
RQ1. What software solutions are proposed in the literature to facilitate newcomers’ onboarding in software projects?
As the field of software development continually evolves, the challenges newcomers face during their onboarding process continue. By answering RQ1, we aim to provide a comprehensive understanding of the existing software solutions for supporting newcomers during the onboarding process. By leveraging existing knowledge and commonly used onboarding solutions, organizations can create a smooth onboarding process, promote productivity, and foster a positive team dynamic.
RQ2. How were the software solutions implemented?
While numerous software solutions have been proposed to enhance newcomers’ integration into software projects, understanding the specific implementation details is essential to assess their feasibility, effectiveness, and real-world impact. Answering RQ2 enables the software development community to identify successful approaches and technological gaps.
RQ3. How do the proposed software solutions improve newcomers’ onboarding?
Onboarding is a complex and multifaceted process. By answering RQ3, we aim to provide evidence and insights into which aspects of onboarding have been addressed by existing solutions. Understanding the goals of those proposed software solutions enables software projects to find solutions that better address their needs.
Santos et al. / Procedia Computer Science 00 (2024) 1–?? 3
Software projects considering the adoption of a software solution may be particularly interested in how these solutions have been evaluated, especially in practical settings. Addressing RQ5 helps to understand the research strategies employed to evaluate the quality and applicability of these solutions, guiding transfer to practice and future research in the field.
RQ6. How do the software solutions address the diversity and inclusion of newcomers?
Literature shows [19, 23, 98] that the way information is currently provided in software projects (e.g., documentation, issue description) benefits certain cognitive styles (e.g., those who learn by tinkering) over others (e.g., process-oriented learners). The prevalent approach in building software solutions is more beneficial to the majority, and the literature shows that not considering the minorities in the design increases barriers to their participation [92]. This is counter-intuitive to most designers because software is often built/designed by representatives of the majorities. Therefore, the information architecture of documentation and tools usually appeals to those who have high self-efficacy and are motivated by individual pursuits such as intellectual stimulation, competition, and learning technology for fun. These pursuits cater to characteristics associated with men, which can neglect women and other contributors who may have different motivations and personal characteristics [19]. RQ6 brings this awareness and contributes to the effort of making projects more welcoming for people who do not follow the cognitive and behavioral standards of the majority.
2.2. Selection criteria
For the selection criteria, we established one Inclusion Criterion (IC) and five Exclusion Criteria (EC), detailed below:
IC1 – The primary study proposes software solutions for newcomers’ onboarding in software projects;
EC1 – The study does not have an abstract;
EC2 – The study is just published as an abstract;
EC3 – The study is not written in English;
EC4 – The study is an older version of another study already considered;
EC5 – The study is not a scientific paper, such as editorials, summaries of keynotes, workshop proposals/reports, and tutorials.
In our review, we focused on papers that propose software solutions (such as tools, applications, or platforms) designed to facilitate the onboarding of newcomers to software projects. These software solutions support various aspects of onboarding, such as reducing time and cost and aiding newcomers’ learning. We excluded papers that only investigated the onboarding process without proposing software solutions, such as studies that examined the code of conduct, as they do not align with our
we consider that software solutions are alternatives to mitigate onboarding barriers and offer (semi-)automated support, helping newcomers adapt to new environments, understand complex systems, and access the necessary information without requiring constant human guidance.
2.3. Search strategy and selection process
We systematically searched for relevant studies, as illustrated in Figure 1. The search process included eight stages, applied sequentially, as follows.
Stage 1. For our search string formulation, we defined our population as ‘software projects’ and the intervention as ‘onboarding newcomers’, both derived from our research questions. Upon careful analysis of terms associated with the population and intervention components, we formulated a set of keywords and their synonyms to construct our search string. The selection of these synonyms was carried out with the assistance of domain experts, and we also drew upon relevant SLRs [56, 105] to enrich our collection of synonyms further. Subsequently, we performed a pilot search on Google Scholar to fine-tune the search string, and we created a control group containing a set of five (5) studies previously known by the authors for search string validation [3, 48, 99, 106, 120]. The first author (named R1, Researcher 1, from this point on) used the keywords and their respective synonyms, presented in Table 1, to build the search string, as detailed in Table 2. The final search string was derived after numerous trials and iterations, considering the studies established as the control group. R1 applied the search string in the most commonly used publication databases in Computer Science [13, 32, 60], including IEEE Xplore (http://ieeexplore.ieee.org), ACM Digital Library (http://portal.acm.org), Scopus (http://www.scopus.com), Springer Link (https://link.springer.com/), and Web of Science. We did not include Google Scholar in our search because it can produce inaccurate results and has considerable overlap with other databases we used in our search. For example, Valente et al. [116] found that Scopus alone returns 93% of relevant papers in a computer science literature review, and although Google Scholar’s recall is high, its precision is low due to the inclusion of non-peer-reviewed documents like arXiv, PhD theses, and technical reports. Similarly, Harzing and Alakangas [45] concluded that while Google Scholar provides broader coverage for most disciplines, Web of Science and Scopus yield fairly similar results. This is consistent with the concerns of other researchers [22, 59, 126] about Google Scholar’s effectiveness in retrieving primary studies. For instance, Kitchenham et al. [59] suggest that Google Scholar is more suitable for searching grey literature, which was not the focus of our review.
Our search across the five selected digital libraries yielded 9,734 candidate studies, was conducted in January 2023, and
[Figure 1: flow diagram of the search process. Search string (01, 2023): 9734 studies. Stage 1 (remove duplicates): 9734 in, 2245 excluded, 7489 included. Stage 2 (read title, abstract, and keywords): 7489 in, 6440 excluded, 1049 included. Stage 3 (read introduction): 1049 in, 735 excluded, 314 included. The diagram also shows Stage 7 (snowballing) and a consensus meeting, with the counts 37 and 32.]
Figure 1. Search and selection process describing the number of studies selected in each stage: Ⓢ: studies; √: included; ×: excluded.
no period restrictions were applied. Subsequently, we removed 2,245 duplicate candidate studies, resulting in an initial set of 7,489 unique candidate studies to commence the selection process. To mitigate biases related to the search string, we included the forward and backward snowballing approaches to find other relevant studies that could not be returned in the initial search.
Stage 2. Our study selection was a multistage process [60]. Initially, R1 reviewed the candidate studies’ titles and abstracts to assess their adherence to the inclusion and exclusion criteria described in Subsection 2.2, and 1,049 studies were included. We applied the selection criteria, and unless a study could be excluded only based on the title and abstract, we obtained its full text to have additional information [60].
Stage 3. Since many SE abstracts are too poor to rely on when selecting studies [16], we decided to exclude a study only after reading other sections (such as the introduction and, if necessary, the conclusions). R1 re-evaluated the studies selected in the previous stage, adding the reading of their introduction sections. 314 candidate studies were included since they matched the inclusion criteria. For example, in some cases, we identified from the abstract that a software solution was proposed. However, whether it could support onboarding newcomers needed to be clarified, as required in our inclusion criteria (IC1 – The primary study proposes software solutions for newcomers’ onboarding in software projects). Therefore, we read the introduction section to clarify the context. In cases of doubt, we also read the conclusion to understand better how the proposed software solutions support newcomers and to ensure that IC1 was met.
Stage 4. Aiming to obtain a new layer of information, in addition to the sections already read, the conclusions of the studies included in the previous stage were then analyzed, and R1 reapplied the selection criteria, resulting in 43 studies
Table 1. Keyword and synonyms used to build search string terms.
Keyword | Synonyms
Software project | “software project”, “software engineering”, “software development”, OSS, “open source”, “open-source”, “free software”, FOSS, FLOSS, “OSS projects”, “open source software”
Onboarding newcomers | Onboarding, onboard, joining, engagement, newcomer, contributors, novice, newbie, “new developer”, “early career”, “new member”, “new contributor”, “new people”, beginner, “potential participant”, joiner, “new committer”
Table 2. Final search string.
(“software project” OR “software engineering” OR “software development” OR “open source” OR “open-source” OR “free software” OR FOSS OR FLOSS OR OSS OR “OSS projects” OR “open source software”) AND (“joining process” OR onboarding OR onboard OR joining OR engagement OR newcomer OR novice OR newbie OR “new developer” OR “early career” OR “new member” OR “new contributor” OR beginner OR “potential participant” OR joiner OR entrance)
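As an illustration of how a population/intervention search string like the one in Table 2 can be assembled from keyword lists, the following is a minimal sketch. The function names (`quote`, `or_group`, `build_search_string`) and the quoting rule are assumptions made for this example and are not taken from the study; the term lists are copied from Table 2, with straight quotes in place of typographic ones. Individual digital libraries may require their own syntax adjustments.

```python
# Minimal sketch: assemble a Boolean search string from two synonym lists.
# All names here are hypothetical; only the terms come from Table 2.

POPULATION = [
    "software project", "software engineering", "software development",
    "open source", "open-source", "free software", "FOSS", "FLOSS", "OSS",
    "OSS projects", "open source software",
]

INTERVENTION = [
    "joining process", "onboarding", "onboard", "joining", "engagement",
    "newcomer", "novice", "newbie", "new developer", "early career",
    "new member", "new contributor", "beginner", "potential participant",
    "joiner", "entrance",
]


def quote(term: str) -> str:
    """Wrap multi-word or hyphenated terms in double quotes (assumed rule)."""
    return f'"{term}"' if (" " in term or "-" in term) else term


def or_group(terms: list[str]) -> str:
    """Join the terms of one component with OR inside parentheses."""
    return "(" + " OR ".join(quote(t) for t in terms) + ")"


def build_search_string(population: list[str], intervention: list[str]) -> str:
    """Combine the population and intervention groups with AND."""
    return f"{or_group(population)} AND {or_group(intervention)}"


SEARCH_STRING = build_search_string(POPULATION, INTERVENTION)
print(SEARCH_STRING)
```

Keeping the two components as separate lists makes it straightforward to add or drop a synonym during the pilot iterations and regenerate the string.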
Sciences.
Table 6 presents the geographic distribution of the selected primary studies, which originate from five continents and nine
[Figure: chart over publication year; x-axis “Publication Year” with ticks 2003, 2007, 2008, 2009, 2011, 2012, 2015, 2016, 2017, 2018, 2020, 2021, 2022, 2023; values not recoverable.]
all the data at hand. These systems proactively tailor suggestions that meet users’ information needs and preferences. Rec-
Note: A single study may fit into multiple categories.
cal system. Humans prefer receiving information in a graphic format to process it efficiently. Some solutions focused on information visualization tools (PS01, PS11, PS18, PS20, PS32) that provide dynamic and visual representations of project resources, documentation, and contributions. In addition, some solutions use tools and techniques to capture, organize, and present data within a project environment, enhancing the accessibility and comprehensibility of project-related information, including metrics (PS07, PS30) and structured documentation (PS15).
Information visualization tools can enhance user engagement and retention by making content more interactive. PS01, PS11, PS18, PS20, and PS32 propose dependency visualization tools for organizing and visually presenting information related to the project. For example, Azanza et al. [3] (PS01) introduced SPL Cmaps to aid newcomers in grasping the complexity of SPL by visually representing concepts and connections, and Nagel et al. [75] (PS11) developed node-link diagrams to visually represent source code by presenting code relationships. The other three studies (PS18, PS20, PS32) explored different aspects of relationships between OSS projects: socio-technical networks including developers, code, and software bugs (PS18 and PS32); and the relationship between program structure and project versions to explore the software evolution (PS20).
In the metrics subcategory, Guizani et al. [44] (PS07) propose a dashboard solution to support community managers in monitoring and acknowledging newcomers’ contributions. In addition, Venigalla et al. [118] (PS30) present GitQ to automatically augment GitHub repositories with badges representing source code and project maintenance information.
Concerning structured documentation, Steinmacher et al. [106] (PS15) proposed a web portal that guides newcomers in their first contribution. These solutions encompass pertinent and complementary concepts and provide valuable information for software projects, aiding the onboarding of new contributors.
Environment redesign. Some software solutions were designed to foster an environment facilitating active newcomer engagement. The studies often include the implementation of gamification (PS04, PS17, PS26), which introduces game-like elements to enhance newcomers’ motivation, participation, and learning within the project context. Among the studies, two (PS04 and PS17) delved into the integration of game design elements such as Rankings, Quests, Points, and Levels (PS04) and Gameboard, Unlocking, Tips, Badges, Forum, Voting, Profile, and Leaderboard (PS17). The authors applied those game elements in distinct contexts, specifically in GitLab (PS04) and the FLOSScoach portal (PS17). Heimburger et al. [48] (PS26) was the only study that explored gamification by developing a mobile onboarding application tailored explicitly for younger generations. The gamification solutions used game elements to orient, engage, and motivate users (PS04, PS17, PS26). These findings emphasize increased newcomers’ motivation when using these solutions, even though they took place in specific contexts, like OSS platforms (PS04 and PS17) and private companies (PS26).
Additionally, other software solutions encompass changes to the project interface (PS23), such as platform usability enhancements, to create a more user-friendly and welcoming atmosphere for newcomers. PS23 aims to optimize GitHub’s effectiveness by addressing distinct aspects. Santos et al. [92] (PS23) included in the GitHub interface visual elements such as tooltips, progress bars, and feedback messages. Environment redesign solutions focus on enhancing the platform’s usability for newcomers during the contribution process (PS23). Santos et al. [92] (PS23) highlight that the current environment does not adequately support newcomers’ onboarding. However, with changes in the interface, the platform can become more inclusive (PS23) and enhance users’ performance when onboarding.
Research Question 1
Answer: The software solution strategies proposed in the literature incorporate systems that recommend projects, artifacts, tasks, labels, labeling, and mentors. Other solutions focus on gamification for engagement and enhancements, providing information via dashboards, web portals, and graphical aids.
3.2. RQ2. How were the software solutions implemented?
The software solutions for onboarding were organized in a taxonomy by implementation type, presented in Table 8. The rows represent categories of how the software solutions are implemented, such as web environment, machine learning model, and IDE plugin. The columns are the software solution types previously mentioned in RQ1, including project and issue label recommendations. It is important to note that a study may fit into multiple categories.
Web environment. In a web environment, end users can configure or program applications using domain-specific or even application-specific languages [55]. Throughout our research, we identified studies that proposed modifications to the environment to facilitate the success of newcomers during the onboarding process and that were implemented in a web environment setting, with a focus on gamification (PS04, PS17), platform usability enhancement (PS23), metrics (PS07, PS30), structured documentation (PS15), information visualization (PS18, PS32), issue label (PS07, PS08), mentor/expert (PS02), project (PS21), artifact (PS03, PS10, PS27) and task/bug (PS12, PS13, PS18).
Concerning the gamification solutions, two studies (PS04 and PS17) demonstrate the potential of integrating gamification elements into web environments to enhance engagement and motivation among newcomers in OSS projects. Diniz et al. [30] (PS04) integrated gamification elements on GitLab for undergraduate students, and Toscani et al. [113] (PS17) demonstrate that gamification can be effective in engaging a diverse range of newcomers. This opportunity implies that gamification can be customized to cater to various demographic groups, ensuring inclusivity and widespread participation.
Platform usability enhancement solutions, such as the OSS environment redesign (PS23), facilitated newcomers’ understanding of repositories and aided their decision-making process. Santos et al. [92] (PS23) tackled inclusivity bugs on the GitHub interface by implementing fixes via a JavaScript plugin, contributing to a more inclusive experience.
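Table 8. Taxonomy overview of software solutions for newcomers’ onboarding by implementation types.
Columns (solution types): Project recommendation; Issue label recommendation; Mentor/Expert recommendation; Artifact recommendation; Task/Bug recommendation; Information visualization; Structured documentation; Metrics; Gamification; Platform usability enhancement.
Rows recoverable from this excerpt (implementation type | solution type: studies):
IDE plugin | Project recommendation: PS31 | Mentor/Expert recommendation: PS09, PS16, PS25
Mobile application | Gamification: PS26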
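The remark that a study may fit into multiple categories can be made concrete with a small lookup structure. The sketch below is only an illustration: the dictionary name, the helper function, and the choice of sets are assumptions for this example, while the study identifiers are those listed in the text of this section for each implementation type. The chatbot category is left out because its list of studies is incomplete in this excerpt.

```python
# Minimal sketch: map implementation types to the primary studies named in
# the text, then find studies that appear under more than one type.
# Names are hypothetical; the chatbot category is omitted (list incomplete).

IMPLEMENTATION_TYPES = {
    "web environment": {
        "PS02", "PS03", "PS04", "PS07", "PS08", "PS10", "PS12", "PS13",
        "PS15", "PS17", "PS18", "PS21", "PS23", "PS27", "PS30", "PS32",
    },
    "machine learning": {
        "PS06", "PS08", "PS14", "PS19", "PS22", "PS24", "PS28", "PS29",
    },
    "IDE plugin": {"PS09", "PS16", "PS25", "PS31"},
    "interactive graph": {"PS01", "PS11", "PS20"},
    "mobile application": {"PS26"},
}


def types_per_study(taxonomy: dict[str, set[str]]) -> dict[str, list[str]]:
    """Invert the taxonomy: study id -> sorted list of implementation types."""
    index: dict[str, list[str]] = {}
    for impl_type, studies in taxonomy.items():
        for study in studies:
            index.setdefault(study, []).append(impl_type)
    return {study: sorted(types) for study, types in index.items()}


# Studies that fall under more than one implementation type.
MULTI_CATEGORY = {
    study: types
    for study, types in types_per_study(IMPLEMENTATION_TYPES).items()
    if len(types) > 1
}
print(MULTI_CATEGORY)
```

Inverting the mapping once keeps the per-study lookup cheap and makes overlaps between categories easy to list.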
Concerning project information visualization, PS30 presented visual cues conveying project information to developers on GitHub repositories, and PS07 introduced dashboard prototypes. In addition, PS15 developed a web portal to provide targeted information and recommendations. Other studies (PS03, PS10, PS27, PS15) emphasize the need to facilitate newcomers’ access to relevant information. Some studies proposed software solutions that assist newcomers with issue labels (PS07, PS08). Moreover, other studies presented software solutions to engage newcomers with tasks matching their skills and interests (PS12, PS13) and to enable newcomers to explore project bug descriptions (PS18).
Machine learning. According to Lo et al. [66], machine learning is adopted broadly in many areas, and data plays a critical role in machine learning systems due to its impact on model performance. Machine learning is an artificial intelligence technique that makes decisions or predictions based on data [1]. We identified eight (8) studies that used machine learning techniques. These studies predominantly center on offering recommendations to newcomers, focusing on aspects such as issue label (PS08, PS14, PS19, PS24), mentor/expert (PS06) and projects (PS22, PS28, PS29). Across these studies, Fu et al. [39] (PS06) used machine learning techniques to provide expert recommendations by using the random forest method to suggest suitable experts for developers based on domain-specific file embedding. Meanwhile, He et al. [47] (PS08) showcase the integration of machine learning into newcomer onboarding by automating task selection and enhancing newcomers’ participation. Software projects can optimize collaboration and knowledge sharing using domain-specific file embedding and behavioral patterns, as demonstrated by Fu et al. [39] (PS06), by connecting newcomers with experienced individuals who can guide them.
The utilization of historical data and machine learning techniques (PS14, PS19, PS22, PS24) highlights the importance of automating the categorization of issues based on their characteristics and historical context. Projects can improve efficiency by automatically assigning relevant labels and tags, analyzing resolved issues, extracting pertinent details from titles and descriptions, and simplifying the issue management process.
Two studies (PS28 and PS29) introduced ML-driven solutions recommending repositories to developers. Both works leverage historical development activities, technical features, and social connections to predict developers’ interests and preferences.
IDE plugin. Integrated Development Environment (IDE) plugins are software extensions or add-ons that enhance the functionality and features of software. Four software solutions (PS09, PS16, PS25, PS31) developed a plugin they applied as an external software component in an IDE, which users can add to enhance and extend its functionality. Those software solutions are related to mentor/expert (PS09, PS16, PS25) and project recommendation (PS31).
Each study offers unique perspectives on how the solutions can guide and engage developers. A significant subset of studies (PS09, PS16, PS25) focuses on enhancing collaboration among newcomers, developers, and the project community through various means, such as suggesting mentors (PS16) and identifying experts in real-time (PS25). Some studies (PS09, PS16, PS25, PS31) leverage historical project data, such as source code history, email threads, development activities, and social connections, to inform their recommendations and tailor their software solutions to individual newcomers.
Interactive graph. When developers aim to commit a contribution to an existing project, their initial step involves reading and comprehending the project’s code in alignment with their contribution objectives [127]. In our results, we came across three studies (PS01, PS11, PS20) incorporating visualizations to aid newcomers in understanding complex aspects of software projects. These visualizations range from domain-specific visualizations in SPL (PS01) to visualizations for unfamiliar codebases (PS11) and visualizations for knowledge graphs (PS20). These studies support newcomers’ comprehension of complex concepts, help them navigate project environments, and facilitate their learning paths within software projects.
Chatbot. According to Nagarhalli et al. [74], because chatbots can perform many tasks at lower costs across a wide range of fields, such as customer service, healthcare, pedagogy, and personal assistance, many companies have invested heavily in this technology. Three primary studies proposed chatbots to aid onboarding. They proposed chatbots that focus on different types of interactions with users by recommending issue label (PS08),
Research Question 2
Answer: The studies implemented software solutions utilizing web environment enhancements, machine learning, IDE plugins, interactive graphs, chatbots, and mobile applications. A trend is the prevalence of web-based implementations over the years.
3.3. RQ3. How do the proposed software solutions improve newcomers’ onboarding?
To address RQ3, we categorized the goal of each primary study into four categories, as presented in Table 9. The categories draw parallels with the categorization outlined by Balali et al. [4], although we tailored them to the context of software solutions for onboarding. Software solutions focusing on process revolve around refining onboarding procedures and workflows within a software project. Regarding the personal aspects, we found solutions geared toward enhancing individual newcomers’ needs and experiences during the onboarding process. Software solutions that focus on interpersonal aspects encompass those that enhance relationships among team members, including both newcomers and existing contributors. Furthermore, software solutions focusing on technical aspects aim to provide newcomers with the necessary tools, resources, and technical skills required for their roles within the software project. It is important to note that some studies appeared in multiple categories.
Process. PS08, PS10, PS13, PS14, and PS24 proposed solutions that improved how newcomers select a task to start contributing by streamlining the assignment process based on newcomers’ skills and interests. Additionally, PS07, PS19, and PS24 improved how issues could be better labeled to support maintainers. Four studies (PS01, PS11, PS20, PS32) changed the artifact representation and enabled interactive exploration of the relationships among different project elements to reduce information overload. Furthermore, some primary studies (PS21, PS22, PS28, PS29, PS30, PS31) enhanced project discovery, helping newcomers find projects aligned with their interests and skills.
Personal. In our analysis, we identified four studies (PS04, PS23, PS17, PS26) that addressed individual newcomers’ needs and experiences. Such solutions increased engagement and motivated newcomers to accomplish tasks (PS04, PS17, PS26). These software solutions primarily utilized gamification techniques with newcomers, fostering their engagement and boosting motivation. Additionally, the solution proposed by Santos et al. [92] (PS23) improved the newcomers’ self-efficacy by providing a software solution that enhances newcomers’ belief in their ability to perform tasks within the project context. Further, their solution improved the onboarding experience of newcomers with different cognitive styles.
Interpersonal. We identified five studies (PS02, PS06, PS09, PS25, PS26) that propose solutions that foster community building among newcomers in OSS projects. One of these solutions (PS26) enhanced social integration and team building by introducing an application designed to support the onboarding process within a software company, particularly targeting users from generations Y and Z. Four solutions (PS02, PS06, PS09, PS25) facilitate mentorship for newcomers by enhancing mentor and expert recommendations.
Technical. Two studies (PS03, PS27) improved artifact recommendation based on user requirements. PS03 and PS27 aimed to refine how OSS projects suggest and deliver artifacts to newcomers, aligning with their needs and preferences. Additionally, one study (PS11) enhanced newcomers’ code comprehension by providing visual representations of OSS projects.
Table 10. Software solutions for onboarding to overcome barriers identified by Steinmacher et al. [107]. Only 18 out of the 58 barriers are addressed by existing software solutions.
BARRIERS/SOFTWARE SOLUTIONS PS01 PS02 PS03 PS04 PS05 PS06 PS07 PS08 PS09 PS10 PS11 PS12 PS13 PS14 PS15 PS16 PS17 PS18 PS19 PS20 PS21 PS22 PS23 PS24 PS25 PS26 PS27 PS28 PS29 PS30 PS31 PS32
Newcomers’ orientation
Finding a task to start with – – – X X – X X – – – X X X X – X X X – – – – X – – – – – – X –
Finding a mentor – X – – X X – – X – – X X – X X – – – – – – – – X – – – – – – –
Finding the correct artifacts to fix an issue – – X – – – – – – – – – – – – – – – – – – – – – – – – – – – – –
Newcomers’ characteristics
Lack of domain expertise X – X – X – X X – X X X – – X – – – – X – – X – – – X – X – X X
Lack of knowledge in project process and practices X X X X X X X X – X X – – – X X X – – X X X X X – X X X X X X –
Lack of technical background X – X – X – – – X X X – – – X – – X – X – – X – X – X – – – X –
Communication
Not receiving an answer – – – – – – – – – – – – – – X X – – – – – – – – X – – – – – – –
Send a message that is considered impolite – – – – – – – – – – – – – – X X – – – – – – – – X – – – – – – –
Delayed answers – – – – – – – – – – – – – – – X – – – – – – – – X – – – – – – –
Documentation problems
Information overload X – X – – – – – – – X – – – X – – – – X – – X – – – – – – – – X
Lack of documentation – – – – – – – – – – – – – – X – – – – – – – – – – – – – – – – –
Spread documentation – – – – – – – – – – – – – – X – – – – – – – – – – – – – – – – –
Technical hurdles
Local environment setup hurdles X – X X – – – – – X – – – – X – – – – – – – – – – – – – – – – –
Code/architecture hurdles – – X – – – – – – – X – – – X – – – – X – – – – – – X – – – – –
Understanding flow of information – – – – – – – – – – – – – – – – – – – – – – X – – – – – – – – –
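The per-barrier counts discussed alongside Table 10 can be recomputed directly from its rows. The following is a minimal sketch in Python, assuming the three “Newcomers’ orientation” rows are transcribed as strings of X/– marks, one mark per primary study from PS01 to PS32. The names `ROWS` and `studies_addressing` are illustrative and do not come from the reviewed studies.

```python
# Rows transcribed from Table 10 (Newcomers' orientation barriers).
# Each string holds one mark per primary study, PS01 through PS32.
ROWS = {
    "Finding a task to start with":
        "– – – X X – X X – – – X X X X – X X X – – – – X – – – – – – X –",
    "Finding a mentor":
        "– X – – X X – – X – – X X – X X – – – – – – – – X – – – – – – –",
    "Finding the correct artifacts to fix an issue":
        "– – X – – – – – – – – – – – – – – – – – – – – – – – – – – – – –",
}


def studies_addressing(marks: str) -> list[str]:
    """Return the IDs (PS01..PS32) of the studies marked with X in a row."""
    cells = marks.split()
    if len(cells) != 32:
        raise ValueError(f"expected 32 marks, got {len(cells)}")
    return [f"PS{i:02d}" for i, cell in enumerate(cells, start=1) if cell == "X"]


# Map each barrier to the list of studies that address it.
coverage = {barrier: studies_addressing(marks) for barrier, marks in ROWS.items()}

for barrier, studies in coverage.items():
    print(f"{barrier}: {len(studies)} studies ({', '.join(studies)})")
```

For the first row this yields 13 studies, consistent with the count reported for the barrier of finding a task in Section 3.4. The same function can be applied to any other row of the table once it is transcribed in the same way.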
Research Question 3
Answer: Our research emphasizes the significant impact of software solutions on newcomers’ onboarding in OSS projects, categorizing onboarding into personal aspects (focusing on boosting motivation and self-efficacy); interpersonal (focusing on community building and mentorship); process (addressing task selection and information overload); and technical (emphasizing skill development and artifact recommendations).

3.4. RQ4. How do the software solutions mitigate newcomers’ barriers to joining software projects?
Steinmacher et al. [107] conducted a qualitative analysis of relevant literature and collected data from practitioners to identify the barriers that hinder newcomers’ initial contributions to OSS projects. As a result of their comprehensive investigation, the authors developed a model comprising 58 distinct barriers. Based on the previously published studies, our study analyzes the existing software solutions for onboarding and how they could mitigate these identified barriers. It is important to note that only 18 out of the 58 barriers were covered by the existing software solutions, as illustrated in Table 10.
Newcomers’ orientation. Newcomers’ orientation is a critical phase in facilitating newcomers’ successful integration and contribution to various settings, and several barriers hinder this process. Among the primary studies, 13 could address the challenge of finding a task for newcomers. These solutions offer insights into task selection (PS12, PS13, PS18, PS24), provide clear guidelines (PS04, PS14, PS15, PS17), use task recommendation systems (PS08, PS19), and leverage task complexity levels to match newcomers’ skills and interests (PS05, PS07, PS31). Concerning the barrier of finding a mentor, some studies shed light on solutions to streamline finding a mentor from different perspectives, such as mentorship programs (PS12, PS15), mentor-mentee matching systems (PS02, PS06, PS09, PS16, PS25), and establishing efficient communication channels between newcomers and mentors (PS05, PS13).
The literature lacks methods to assist newcomers in finding the correct artifacts to fix an issue. Cubranic and Murphy [25] (PS03) is the only study that presents a solution to recommend artifacts from the archives that are relevant to a task that a newcomer is trying to perform, and it was published 20 years ago. Concerning the barrier of poor “How to Contribute” availability, it is crucial to emphasize the need for improving the availability and accessibility of comprehensive, user-friendly resources that can guide newcomers through the contribution process. To overcome this barrier, PS15 delivers well-structured documentation, tutorials, and interactive guides. Only the solution presented by Steinmacher et al. [106] (PS15) offers clear and concise guidance to address the barrier of newcomers’ lack of awareness of the contribution flow, ensuring that newcomers comprehend the necessary steps and expectations for their contributions.
Newcomers’ characteristics. Newcomers are expected to possess a minimum requirement of previous technical background to perform a development task [107]. Fifteen solutions can address the barrier of lack of domain expertise, bridging the knowledge gap and gradually empowering newcomers to
acquire domain expertise, enabling them to contribute within their domain of expertise. These solutions include broadening newcomers’ domain knowledge and reducing information overload (PS01, PS11, PS20, PS32), forming an implicit group memory from the information stored in a project’s archives (PS03, PS31), supporting newcomers not only during their first contribution (PS23, PS27, PS29) but also by acting as an agent to engage them in the project (PS05, PS10, PS12, PS15), and promoting collaboration between newcomers and domain experts (PS07, PS08).
To mitigate the barrier of lack of knowledge in project process and practices, 24 solutions can enhance newcomers’ technical skills, fill the gaps in their knowledge, and build their confidence to contribute to technical projects actively. These include providing comprehensive documentation (PS03, PS15) and resources that explain project workflows (PS01, PS04, PS07, PS11, PS17, PS20, PS23, PS26), project recommendations (PS05, PS21, PS22, PS28, PS29, PS31), mentoring (PS02, PS06, PS10, PS16), coding standards (PS08, PS24, PS27, PS30), and communication channels (PS05). Additionally, to overcome the barrier of lack of technical background, 13 solutions can help by offering guidance during the contribution process (PS01, PS11, PS15, PS20, PS23, PS27), recommending project documentation (PS03, PS05, PS15, PS18, PS31), and pairing newcomers with experienced developers as mentors (PS09, PS10, PS25).
Communication. According to Steinmacher et al. [107], newcomers are sometimes unaware of the community’s communication protocol. Three solutions (PS15, PS16, PS25) can tackle the barriers related to not receiving an answer and sending impolite messages. To alleviate the first barrier, PS15 focuses on creating designated communication channels to improve the visibility of newcomers’ questions and increase the chances of receiving timely answers from the community members. PS16 and PS25 recommend appointing experienced members as mentors, ensuring newcomers receive timely responses. For the second barrier, PS15 offers newcomers guidance on effective communication with other project members, while PS16 and PS25 recommend avoiding unintentional rudeness or misunderstandings.
Nine studies can address the barrier of need to contact a “real” person. These include mentoring initiatives such as pairing newcomers with experienced community members by recommending mentors to newcomers (PS02, PS06, PS09, PS16, PS25) and providing clear guidelines (PS05, PS12, PS15, PS26). Concerning the barrier of receiving delayed answers, two solutions (PS16, PS25) recommended mentors who can expedite responses and collaborate with newcomers to assist them in their initial contributions.
Documentation problems. We identified six (6) software solutions that can mitigate barriers related to documentation problems. Solutions that can tackle the barrier of information overload include creating clear and concise documentation (PS15), breaking down complex concepts into manageable sections (PS03), providing a straightforward visual representation of the project (PS01, PS11, PS20, PS32), and offering contextual guidance to help newcomers find the most relevant information based on their specific needs (PS23). Moreover, to mitigate the barrier of lack of documentation, only one solution (PS15) focuses on actively creating and improving documentation resources, including dedicating resources and efforts to document essential aspects of the project. To tackle the barrier of spread documentation, one solution (PS15) delved into methods of consolidating and centralizing documentation resources. Steinmacher et al. [106] (PS15) offered newcomers a dedicated “Documentation” section, housing project documentation organized into subsections for easy access and navigation.
Technical hurdles. We found nine (9) software solutions targeting technical challenges newcomers encounter when trying to understand and navigate the technical aspects of a project. Concerning the barrier of local environment setup hurdles, three solutions (PS04, PS10, PS15) can provide orientation on how to set up the development environment. PS01 and PS03 suggest pre-configured development environments to ensure a smooth onboarding experience. Five (5) software solutions can mitigate the barrier of code/architecture hurdles. These solutions encompass various initiatives to assist newcomers in their codebase navigation and comprehension of the project’s architecture, such as furnishing architectural diagrams (PS15) and presenting high-level project structure overviews (PS03, PS10, PS20, PS27). We want to highlight that Santos et al. [92] (PS23) was the only work that could mitigate newcomers’ cognitive barriers during the contribution process.

Research Question 4
Answer: Most software solutions for onboarding presented in the literature focus on mitigating the barriers related to newcomers’ characteristics. The software solutions assist newcomers in finding suitable tasks and mentors, bridging gaps in domain knowledge, project processes, and technical background, improving communication, maintaining user-friendly documentation, simplifying technical aspects, and enhancing their onboarding experience. Our results also reveal a need for solutions that target communication barriers, documentation issues, technical challenges, and newcomers’ orientation.

3.5. RQ5. What research strategies were employed to evaluate the software solutions?
This question investigates the research strategies used to evaluate the proposed software solutions for newcomers’ onboarding. Table 11 presents the study types identified in the selected primary studies.
We categorized the evaluation methods employed in the primary studies according to the ABC Framework, as initially defined by Stol and Fitzgerald [109]. The ABC Framework underscores the essence of knowledge-seeking research, emphasizing the involvement of actors (A) engaging in behavior (B) within a specific context (C). Within this framework, we identified three predominant research strategies to assess primary studies concerning software solutions for onboarding.
The predominant research strategy employed by the primary studies (23 studies, 71%) was laboratory experimentation, involving meticulous manipulation of variables to obtain precise measurements of actors’ behavior [109]. These experiments encom-
diversity gap in OSS, progress in addressing this issue has been limited [35, 87, 115].
Our analysis of the selected studies showed that 15 out of 32 (47%) proposed software solutions for onboarding targeting a general newcomer population without considering or evaluating their effectiveness for integrating different types of users into OSS projects. Ten studies (PS01, PS03, PS04, PS11, PS13, PS15, PS17, PS26, PS30, PS32) proposed solutions addressing the diversity aspect of educational backgrounds, specifically aiding students during the onboarding process. This is particularly pertinent given previous research indicating that variations in educational backgrounds can lead to heightened task-related discussions within work teams [52]. Additionally, nine studies (PS02, PS07, PS13, PS14, PS17, PS20, PS21, PS24, PS32) presented software solutions targeting newcomers with more development experience, that is, developers transitioning to new software projects who seek solutions to comprehend project characteristics and source code structures.
Only two studies (PS23 and PS26) focused on providing support tailored to newcomers with specific cognitive styles (PS23) and to newcomers’ age (PS26). Santos et al. [92] (PS23) focused on mitigating cognitive barriers faced by newcomers due to inclusivity bugs. The study revealed that platforms like GitHub, which newcomers use to contribute to OSS, create barriers for users with different characteristics, disproportionately impacting underrepresented groups. Heimburger et al. [48] (PS26) developed a mobile app for generations Y and Z entering the workforce. This solution acknowledges these generations’ unique characteristics and communication styles, allowing organizations to create onboarding experiences that resonate with their target audience. Our results highlight the need for more research in the software engineering field that specifically targets increasing diversity and inclusion in software communities to improve and facilitate more inclusive software solutions for onboarding.

Research Question 6
Answer: Among the 32 analyzed studies, the predominant focus on diversity and inclusion dimensions pertained to information diversity (i.e., background and experience). Only two studies specifically addressed the unique needs of newcomers from minority groups, focusing on gender and age.

4. Discussion
This section delves into our research findings, exploring insights and potential areas for further investigation.
Momentum of recommendation systems and machine learning. There is a rise in recommendation systems designed to aid newcomers in diverse activities. These systems assist developers in finding relevant information and evaluating alternative decisions, thereby covering a broad spectrum of software engineering tasks [26, 86]. The intersection of machine learning and software engineering has become increasingly prominent [63, 70]. By harnessing machine learning techniques, researchers can tackle software engineering problems that are challenging to model purely through algorithms or lack satisfactory solutions [128]. This integration allows for innovative solutions and advancements in the field. Among the primary studies, machine learning techniques were employed to improve recommendation systems, enabling personalized and automated suggestions.
The web environment offers a versatile platform for creating software applications that are universally accessible and can be executed through web browsers. Furthermore, the openness and flexibility of the web simplify the process of writing and deploying code, contributing to the proliferation of a rich and diverse array of applications globally [78]. In the software solutions highlighted in this study, the predominant implementation types observed were based on web environments. These solutions significantly contribute to fostering a more welcoming and supportive onboarding experience for newcomers by leveraging the advantages offered by web environments.
Increasing newcomers’ engagement and motivation. The OSS movement has attracted a globally distributed community of volunteers, and the increasing demand for professionals with OSS knowledge has prompted students to contribute to OSS projects [40]. Students gain real-world skills and experiences by engaging in OSS projects, making them more competitive in the job market [73, 76]. Additionally, exposing students to OSS projects benefits the communities by increasing the number of potential contributors and fostering collaboration.
Gamification has gained attention as a way to enhance student engagement and motivation in software projects. Gamification applies game elements in non-gaming contexts to motivate and engage participants [28]. In the context of OSS, gamification techniques are vital in promoting healthy competition and instilling a sense of achievement [10, 12]. Our findings show a growing interest in utilizing gamification and modifying the OSS environment to enhance newcomer engagement and motivation. By incorporating gaming elements, students remain engaged, persist in their contributions, and derive satisfaction from their involvement. Furthermore, gamification offers learning and skill development opportunities as students acquire new technical skills, learn collaboration, and gain insights into project management practices [6, 29, 81].
Impact of software solutions for onboarding. Newcomers need proper orientation to navigate the project and correctly make contributions [106]. Motivating, engaging, and retaining new developers in a project is essential to sustain a healthy OSS community [84]. Our findings demonstrate that software solutions significantly impact newcomers’ onboarding experiences in OSS projects, with onboarding aspects categorized into four key areas (i.e., personal, interpersonal, process, and technical). Collectively, these software solutions shape and enhance newcomers’ onboarding journeys, facilitating their integration into OSS projects. Begel and Simon [9] discuss the importance, advantages, and challenges of mentoring novices in the software industry. Mentoring is crucial in pairing experienced contributors with newcomers to provide guidance, support, and knowledge transfer. By establishing constructive learning relationships between mentors and mentees, these solutions foster the growth and integration of newcomers in the OSS project.
Our findings highlight the diverse impact of software solutions on newcomers’ onboarding in OSS projects. Solutions that focus on engagement and motivation, mentoring, labeling and task selection, project recommendation, and reducing information overload help facilitate the integration of newcomers in software development communities.
Investigating newcomers’ barriers. A better understanding of the barriers enables communities and researchers to design and produce tools and conceive software solutions to support newcomers [4]. We identified research gaps in addressing barriers newcomers face during onboarding. Only 18 out of the 58 barriers were covered by the existing software solutions. In particular, software solutions that tackle barriers related to communication, documentation issues, technical challenges, and newcomers’ orientation are lacking. Additionally, there is room for exploring tools and techniques to assist newcomers in finding the correct artifacts to understand the contribution process workflow. Existing software solutions for onboarding addressed communication barriers to some extent. However, research opportunities remain for further improvements to support newcomers in better communicating with members of the OSS communities. Furthermore, new studies can explore documentation barriers by removing the overload of information newcomers face when onboarding and making it simple to share documentation. Additionally, future studies can investigate another interesting gap in supporting newcomers in understanding code and architecture hurdles, focusing on the cognitive processes required to comprehend the code information flow.
Beyond the laboratory to explore new horizons. Software engineering is a dynamic and interdisciplinary domain encompassing various social and technological aspects. It is crucial to deeply understand human activities to explore how individual software engineers engage in software development and how teams and organizations coordinate their efforts to achieve success. By studying these aspects, researchers can gain a holistic understanding of software engineering practices and enhance the ability to support software development processes [33]. The analysis of the selected primary studies revealed several types of evaluations. Overall, our findings highlight the different research strategies employed to evaluate the software solutions for onboarding, with the predominant strategy being laboratory experiments. However, future research endeavors could benefit from transitioning beyond the laboratory and conducting field experiments in real-world settings to offer a more comprehensive evaluation of software solutions for onboarding over an extended period, ensuring their long-term success.
Diversity and inclusion in software solutions. Newcomers encounter various challenges, which affect underrepresented populations differently and can result in a steeper learning curve, a lack of community support, and difficulties in initiating contributions, all contributing to the existing diversity imbalance in OSS [79, 103, 115]. Numerous studies emphasized the positive impact of social diversity on productivity, teamwork, and the quality of contributions. The literature has highlighted concerns regarding the low diversity in OSS, considering factors such as gender, language, and location [15, 43, 110, 115]. Previous research has demonstrated that diverse teams are more productive, reinforcing the significance of addressing diversity-related issues in OSS [117]. Our analysis revealed that nearly half of the proposed software solutions for onboarding targeted a general newcomer population without considering or evaluating different user types in OSS projects.
Developing inclusive software solutions for onboarding is required to foster diversity and inclusion in software communities. Our study underscores the scarcity of software solutions for onboarding addressing diversity and inclusion. By addressing the specific needs and barriers underrepresented groups face, it is possible to create more inclusive onboarding processes and foster greater diversity within OSS projects. Our study serves as a call to action for the software engineering community to actively work towards creating inclusive environments that welcome individuals from diverse backgrounds and leverage their unique perspectives to benefit the community.

5. Implications for practitioners
In this section, we outline the implications of our study for practitioners.
Implications for project maintainers. Project maintainers have many responsibilities, including attracting and retaining new contributors to promote the project’s growth and sustainability. They can leverage the insights gained from our study to create welcoming, inclusive, and supportive environments to onboard and retain newcomers. For example, they can facilitate the integration of newcomers into their projects by recognizing the value of mentorship recommendation solutions and focusing on developing structured documentation and resources to lessen newcomers’ cognitive overload when onboarding a new software project.
Implications for tool developers. Tool developers can use our results to understand how to alleviate newcomers’ onboarding barriers and use this knowledge to implement new tools. These tools could represent project information through dashboards, web portals, and visualization techniques to support newcomers with the necessary resources for successful navigation and performing better at tasks. Moreover, developers could focus on designing tools that consider the needs of minority groups, such as women or generations Y and Z.

6. Limitations
Although we have adopted the SLR guidelines proposed by Kitchenham et al. [58], this study has some limitations. This section presents the study’s limitations and discusses how we mitigate them.
Search strategy. It is possible that the search process might miss relevant primary studies [51]. We defined and followed the search strategy described in subsection 2.3 to mitigate this threat. One author extracted the search terms based on our research questions, and the search string was iteratively developed. The search string terms (detailed in Table 2) are broad, aiming to retrieve as many relevant studies as possible. Moreover, we incorporated author and citation analysis, which allowed us to identify other studies beyond our initial search.
Studies selection. A significant threat in secondary studies is recognized to be the validity of study selection [2]. We predefined inclusion and exclusion criteria (see Subsection 2.2) in the protocol and used them to filter relevant studies. Additionally, two researchers applied the selection criteria in different stages of the study’s selection process and jointly conducted a consensus decision-making meeting.
Data extraction. Inconsistent extraction is a fundamental threat in SLR studies (Khan et al. [57]). We mitigate this threat by defining a data extraction form, detailed in subsection 2.4, to extract relevant data to answer our RQs consistently. One author initially extracted the data, and the other authors participated in the discussion meetings to resolve doubts and double-check data, as suggested by Wohlin et al. [122].
Data analysis. The risk of inaccurate data classification and mapping can cause subjective interpretation bias. We lessened this threat by following an inductive approach inspired by the open coding and axial coding procedures from Grounded Theory (GT) by Corbin and Strauss [24] for analyzing qualitative data.
Generalizability. We do not assert the complete generalizability of this study. Nevertheless, we have tried to enhance its applicability by providing a comprehensive overview of software solutions for onboarding and by logically structuring the study’s collected data, results, analysis, and conclusions. To promote the potential for generalizability in our findings, we thoroughly examined a wide array of studies across various subfields of software engineering. As an outcome, we described the implications of our results to social coding platforms, software development organizations, maintainers of OSS projects, software projects, tool developers, and researchers.

7. Related work
This section overviews the relevant work concerning newcomers’ onboarding in software projects and literature reviews focusing on onboarding practices. By exploring these areas, we aim to understand the challenges and software solutions associated with integrating newcomers into software projects.
Newcomers’ onboarding. Onboarding is a crucial process that facilitates the transition of new employees and enables them to acquire the necessary attitudes, knowledge, skills, and behaviors for effective work [20, 61, 112]. According to Bauer and Erdogan [8], onboarding is a crucial process encompassing the activities and initiatives designed to equip new hires with the knowledge, skills, and behaviors necessary to succeed in the new work environment. Newcomers in the software development environment face challenges in becoming fully integrated and productive team members, which includes acquiring organizational knowledge, project knowledge, product and domain knowledge, and knowledge of the technical environment [41]. Fagerholm et al. [34] executed a case study to evaluate the influence of mentoring support on developers. Their findings revealed that mentoring played a crucial role in the onboarding process for newcomers, empowering them to become more engaged and active participants. Gregory et al. [41] examined onboarding practices in a co-located agile project team within a large IT department that regularly welcomed inexperienced newcomers, exploring the activities and adjustments made by individuals and the workplace. As a result, they developed an agile onboarding model encompassing various onboarding activities, individual adjustments made by newcomers, and workplace adjustments to facilitate their integration into the team.
A multitude of empirical studies dedicated their focus to examining the process of newcomers joining community-based OSS projects [21, 80, 91, 101, 102, 120]. These studies offer insights into the factors influencing newcomers’ onboarding experiences within OSS communities. Fronchetti et al. [38] investigated the factors influencing the onboarding of new contributors in OSS projects. The authors analyzed 450 repositories and identified project popularity, review time for pull requests, project age, and programming languages as the main factors explaining newcomers’ growth patterns. Understanding these factors helps project maintainers optimize software solutions for onboarding. Furthermore, a separate body of research has focused on understanding newcomers’ barriers during their onboarding journey [104, 123].
Our study stands out from existing literature due to its unique focus on providing knowledge on software solutions for newcomers’ onboarding within software projects. To the best of our knowledge, our research is the first to investigate software solutions for onboarding. We offered a literature review detailing software solutions and their practical implementation, impact on the onboarding process, research methodologies employed, and potential to reduce barriers for newcomers. We also investigated whether these solutions prioritize aspects of diversity and inclusion for newcomers into software projects.
Literature reviews. The systematic mapping study conducted by Kaur et al. [56] examined community participation and engagement in OSS projects. The authors analyzed 67 studies to address the joining process, contribution barriers, motivation, retention, and abandonment. The study also highlighted gaps in mentoring newcomers, finding starting tasks, and identifying factors influencing developer participation and engagement. Steinmacher et al. [105] identified and aggregated 20 studies that provided evidence of barriers newcomers face when onboarding to OSS projects. The study highlighted the most studied barriers and showed that successful contributions require domain knowledge, technical skills, and social interaction, emphasizing the importance of community receptivity, simple code, and organized documentation.
Some literature reviews focused on diversity and inclusion aspects in software engineering that can influence software development. Trinkenreich et al. [115] examined women’s participation in OSS projects, focusing on their demographics, motivations, types of contributions, challenges, and the proposed strategies to address those challenges. The study reveals a significant gender disparity in OSS, with women representing only about 10% of participants. Gender biases exist in various aspects, such as differential acceptance rates for pull requests based on gender identification. Women also face social challenges, including a lack of peer parity, non-inclusive communication, a toxic culture, impostor syndrome, and bias in peer review.
Considering the need for more diversity in software projects, our study emphasizes the importance of examining and improving current software solutions for onboarding. Additionally, Rodríguez-Pérez et al. [88] conducted an SLR to understand the relationship between perceived diversity aspects (gender, age, race, and nationality) in software engineering. The authors analyzed 131 previous studies to identify factors influencing diverse developers’ engagement and permanence in software engineering, methods used to improve perceived diversity in teams, and limitations of previous studies. The study highlights gaps in the current literature and emphasizes the need for future action in addressing perceived diversity in software engineering.
Pedreira et al. [81] conducted a mapping study focusing on the potential benefits of gamification to the Software Engineering (SE) field. The study findings highlight that gamification can be a promising field that can help improve software engineers’ daily engagement and motivation in their tasks. The authors also observed that the adoption of gamification in SE is going more slowly than in other domains such as marketing, education, or mobile applications. This trend is consistent with our finding that only three software solutions adopted gamification elements to improve onboarding. Furthermore, Darejeh and

(LLMs) can be used to enhance onboarding processes for newcomers and evaluate their impacts on newcomers’ activities.

Acknowledgment
The National Science Foundation (NSF) partially supports this work under grant numbers 2236198, 2247929, 2303042, and 2303612. Katia Romero Felizardo is funded by a research grant from the Brazilian National Council for Scientific and Technological Development (CNPq), Grant 302339/2022-1.

References
[1] A. Agrawal, J. S. Gans, and A. Goldfarb. Artificial intelligence adoption and system-wide change. Journal of Economics & Management Strategy, 2023.
[2] A. Ampatzoglou, S. Bibi, P. Avgeriou, M. Verbeek, and A. Chatzigeorgiou. Identifying, categorizing and mitigating threats to validity in software engineering secondary studies. Information and Software Technology, 106:201–230, 2019.
[3] M. Azanza, A. Irastorza, R. Medeiros, and O. Díaz. Onboarding in software product lines: concept maps as welcome guides. In IEEE/ACM 43rd International Conference on Software Engineering: Software En-
Salim [27] conducted an SLR to thoroughly examine gamifica- gineering Education and Training (ICSE-SEET), pages 122–133. IEEE,
tion solutions addressing user engagement issues across various 2021.
software categories. Their findings highlighted gamification as [4] S. Balali, I. Steinmacher, U. Annamalai, A. Sarma, and M. Gerosa. New-
comers’ barriers... is that all? an analysis of mentors’ and newcom-
a viable approach for enhancing user engagement and perfor- ers’ barriers in OSS projects. Computer Supported Cooperative Work
mance. Most gamification solutions aim to motivate users to (CSCW), 2018.
contribute more content to software, encourage active software [5] S. Balali, U. Annamalai, H. S. Padala, B. Trinkenreich, M. A. Gerosa,
usage, and improve the software’s appeal to induce behavior I. Steinmacher, and A. Sarma. Recommending tasks to newcomers in
OSS projects: How do mentors handle it? In 16th International Sympo-
change. Moreover, their results show a limited focus on moti- sium on Open Collaboration (OpenSym), pages 1–14, 2020.
vating users to effectively utilize software content, addressing [6] A. Bartel and G. Hagel. Gamifying the learning of design patterns in
learning challenges, and integrating users’ real identities within software engineering education. In IEEE Global Engineering Education
the software environment. Conference (EDUCON), pages 74–79. IEEE, 2016.
[7] V. R. Basili. Evolving and packaging reading technologies. Journal of
Systems and Software, 38(1):3–12, 1997.
8. Conclusion [8] T. N. Bauer and B. Erdogan. Organizational socialization: The effec-
tive onboarding of new employees., pages 51–64. APA handbooks in
In this paper, we conducted an SLR analyzing 32 primary psychology. American Psychological Association, Washington, DC, US,
2011. doi: 10.1037/12171-002. URL https://doi.org/10.1037/
studies to investigate the software solutions proposed in the 12171-002.
literature to enhance the onboarding processes for newcomers [9] A. Begel and B. Simon. Novice software developers, all over again.
in software projects. The proposed software solutions for on- In Fourth International Workshop on Computing Education Research
(ICER), pages 3–14, 2008.
boarding focused on recommendation systems using web-based [10] J. Bell, S. Sheth, and G. Kaiser. Increasing student engagement in soft-
implementations, and the impact of those software solutions in- ware engineering with gamification. In 4th International Workshop on
volves personal, interpersonal, technical, and process aspects. Social Software Engineering (SSE), pages 1–2, 2012.
Moreover, laboratory experiments were the most common re- [11] L. M. Berlin. Beyond program understanding: a look at programming
expertise in industry. ESP, 93(744):6–25, 1993.
search strategy for evaluation. Concerning diversity, software [12] A. P. O. Bertholdo and M. A. Gerosa. Promoting engagement in open
solutions for onboarding mainly consider newcomers’ back- collaboration communities by means of gamification. In HCI Interna-
grounds and experience levels. tional 2016–Posters’ Extended Abstracts: 18th International Confer-
We recognize that various project domains may exhibit dis- ence, pages 15–20. Springer, 2016.
[13] J. Biolchini, P. G. Mian, A. C. C. Natali, and G. H. Travassos. Systematic
tinct characteristics and requirements during the onboarding pro- review in software engineering. System engineering and computer sci-
cess, and the software solutions found in our SLR may not ap- ence department COPPE/UFRJ, Technical Report ES, 679(05):45, 2005.
ply equally to all project domains. As a future work opportu- [14] K. Blincoe, O. Springer, and M. R. Wrobel. Perceptions of gender diver-
nity, exploring onboarding solutions tailored to different project sity’s impact on mood in software development teams. IEEE Software,
36(5):51–56, 2019.
domains is essential, allowing for a more nuanced understand- [15] A. Bosu and K. Z. Sultana. Diversity and inclusion in open source
ing of the unique scenarios. Moreover, as future work, we aim software (OSS) projects: where do we stand? In ACM/IEEE Interna-
to investigate the diversity and inclusion aspects of onboard- tional Symposium on Empirical Software Engineering and Measurement
ing and propose inclusive software solutions that contribute to (ESEM), pages 1–11. IEEE, 2019.
[16] P. Brereton, B. A. Kitchenham, D. Budgen, M. Turner, and M. Khalil.
the diversity and inclusion of more users in software projects. Lessons from applying the systematic literature review process within
Additionally, we aim to explore how large language models
the software engineering domain. Journal of systems and software, 80(4):571–583, 2007.
[17] R. Britto, D. S. Cruzes, D. Smite, and A. Sablis. Onboarding software developers and teams in three globally distributed legacy projects: a multi-case study. Journal of Software: Evolution and Process, 30(4):e1921, 2018.
[18] J. Buchan, S. G. MacDonell, and J. Yang. Effective team onboarding in agile software development: techniques and goals. In ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), pages 1–11. IEEE, 2019.
[19] M. Burnett, S. D. Fleming, S. Iqbal, G. Venolia, V. Rajaram, U. Farooq, V. Grigoreanu, and M. Czerwinski. Gender differences and programming environments: across programming populations. In Proceedings of the 2010 ACM-IEEE international symposium on empirical software engineering and measurement, pages 1–10, 2010.
[20] D. M. Cable, F. Gino, and B. R. Staats. Reinventing employee onboarding. MIT Sloan Management Review, 2013.
[21] G. Canfora, M. Di Penta, R. Oliveto, and S. Panichella. Who is going to mentor newcomers in open source projects? In ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering (FSE), pages 1–11, 2012.
[22] A. Carrera-Rivera, W. Ochoa, F. Larrinaga, and G. Lasa. How-to conduct a systematic literature review: A quick guide for computer science research. MethodsX, 9:101895, 2022.
[23] A.-M. Cazan, E. Cocoradă, and C. I. Maican. Computer anxiety and attitudes towards the computer and the internet with romanian high-school and university students. Computers in Human Behavior, 55:258–267, 2016.
[24] J. Corbin and A. Strauss. Techniques and procedures for developing grounded theory. Basics of Qualitative Research, 3rd ed.; Sage: Thousand Oaks, CA, USA, pages 860–886, 2008.
[25] D. Cubranic and G. C. Murphy. Hipikat: recommending pertinent software development artifacts. In 25th International Conference on Software Engineering (ICSE), pages 408–418. IEEE, 2003.
[26] B. Dagenais and M. P. Robillard. Recommending adaptive changes for framework evolution. ACM Transactions on Software Engineering and Methodology (TOSEM), 20(4):1–35, 2011.
[27] A. Darejeh and S. S. Salim. Gamification solutions to enhance software user engagement—a systematic review. International Journal of Human-Computer Interaction, 32(8):613–642, 2016.
[28] S. Deterding, D. Dixon, R. Khaled, and L. Nacke. From game design elements to gamefulness: defining “gamification”. In 15th International Academic MindTrek Conference: Envisioning Future Media Environments (MindTrek), pages 9–15, 2011.
[29] D. Dicheva, C. Dichev, G. Agre, and G. Angelova. Gamification in education: a systematic mapping study. Journal of Educational Technology & Society (JSTOR), 18(3):75–88, 2015.
[30] G. C. Diniz, M. A. G. Silva, M. A. Gerosa, and I. Steinmacher. Using gamification to orient and motivate students to contribute to OSS projects. In IEEE/ACM 10th International Workshop on Cooperative and Human Aspects of Software Engineering (CHASE), pages 36–42. IEEE, 2017.
[31] J. Dominic, J. Houser, I. Steinmacher, C. Ritter, and P. Rodeghero. Conversational bot for newcomers onboarding to open source projects. In IEEE/ACM 42nd International Conference on Software Engineering Workshops (ICSEW), pages 46–50, 2020.
[32] T. Dyba, T. Dingsoyr, and G. K. Hanssen. Applying systematic reviews to diverse study types: An experience report. In First international symposium on empirical software engineering and measurement (ESEM), pages 225–234. IEEE, 2007.
[33] S. Easterbrook, J. Singer, M.-A. Storey, and D. Damian. Selecting empirical methods for software engineering research. Guide to Advanced Empirical Software Engineering, pages 285–311, 2008.
[34] F. Fagerholm, A. S. Guinea, J. Borenstein, and J. Münch. Onboarding in open source projects. IEEE Software, 31(6):54–61, 2014.
[35] D. Ford, A. Harkins, and C. Parnin. Someone like me: how does peer parity influence participation of women on stack overflow? In IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC). IEEE CS, 2017.
[36] D. Ford, N. Shrestha, and T. Zimmermann. Reboc: recommending bespoke open source software projects to contributors. In IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC), pages 1–5. IEEE, 2022.
[37] A. Forte and C. Lampe. Defining, understanding, and supporting open collaboration: Lessons from the literature. American behavioral scientist, 57(5):535–547, 2013.
[38] F. Fronchetti, I. Wiese, G. Pinto, and I. Steinmacher. What attracts newcomers to onboard on OSS projects? tl; dr: Popularity. In 15th IFIP Advances in Information and Communication Technology (OSS), pages 91–103. Springer, 2019.
[39] C. Fu, M. Zhou, Q. Xuan, and H.-X. Hu. Expert recommendation in OSS projects based on knowledge embedding. In International Workshop on Complex Systems and Networks (IWCSN), pages 149–155. IEEE, 2017.
[40] V. Goduguluri, T. Kilamo, and I. Hammouda. Kommgame: a reputation environment for teaching open source software. In 7th IFIP Advances in Information and Communication Technology (OSS), pages 312–315. Springer, 2011.
[41] P. Gregory, D. E. Strode, H. Sharp, and L. Barroca. An onboarding model for integrating newcomers into agile project teams. Information and Software Technology (IST), 143:106792, 2022.
[42] M. Guizani, A. Chatterjee, B. Trinkenreich, M. E. May, G. J. Noa-Guevara, L. J. Russell, G. G. Cuevas Zambrano, D. Izquierdo-Cortazar, I. Steinmacher, M. A. Gerosa, et al. The long road ahead: Ongoing challenges in contributing to large oss organizations and what to do. ACM on Human-Computer Interaction, 5(CSCW2):1–30, 2021.
[43] M. Guizani, I. Steinmacher, J. Emard, A. Fallatah, M. Burnett, and A. Sarma. How to debug inclusivity bugs? a debugging process with information architecture. In ACM/IEEE 44th International Conference on Software Engineering: Software Engineering in Society (ICSE-SEIS), 2022.
[44] M. Guizani, T. Zimmermann, A. Sarma, and D. Ford. Attracting and retaining OSS contributors with a maintainer dashboard. In ACM/IEEE 44th International Conference on Software Engineering: Software Engineering in Society (ICSE-SEIS), pages 36–40, 2022.
[45] A.-W. Harzing and S. Alakangas. Google scholar, scopus and the web of science: a longitudinal and cross-disciplinary comparison. Scientometrics, 106:787–804, 2016.
[46] Ø. Hauge, C. Ayala, and R. Conradi. Adoption of open source software in software-intensive organizations–a systematic literature review. Information and Software Technology, 52(11):1133–1154, 2010.
[47] H. He, H. Su, W. Xiao, R. He, and M. Zhou. Gfi-bot: automated good first issue recommendation on GitHub. In 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), pages 1751–1755, 2022.
[48] L. Heimburger, L. Buchweitz, R. Gouveia, and O. Korn. Gamifying onboarding: how to increase both engagement and integration of new employees. In International Conference on Social and Occupational Ergonomics (AHFE), pages 3–14. Springer, 2020.
[49] F. Heimerl, S. Lohmann, S. Lange, and T. Ertl. Word cloud explorer: text analytics based on word clouds. In 47th Hawaii International Conference on System Sciences (HICSS), pages 1833–1842. IEEE, 2014.
[50] S. K. Horwitz and I. B. Horwitz. The effects of team diversity on team outcomes: a meta-analytic review of team demography. Journal of Management, 2007.
[51] S. Jalali and C. Wohlin. Systematic literature studies: database searches vs. backward snowballing. In ACM-IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), pages 29–38, 2012.
[52] K. A. Jehn, G. B. Northcraft, and M. A. Neale. Why differences make a difference: A field study of diversity, conflict and performance in workgroups. Administrative Science Quarterly, 44(4):741–763, 1999.
[53] A. Ju, H. Sajnani, S. Kelly, and K. Herzig. A case study of onboarding in software teams: tasks and strategies. In IEEE/ACM 43rd International Conference on Software Engineering (ICSE), pages 613–623. IEEE, 2021.
[54] H. Kagdi, M. Hammad, and J. I. Maletic. Who can help me with this source code change? In IEEE International Conference on Software Maintenance (ICSM), pages 157–166. IEEE, 2008.
[55] L. C. Kats, R. G. Vogelij, K. T. Kalleberg, and E. Visser. Software development environments on the web: a research agenda. In ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software (Onward!), pages 99–116,
2012.
[56] R. Kaur, K. K. Chahal, and M. Saini. Understanding community participation and engagement in open source software projects: a systematic mapping study. Journal of King Saud University - Computer and Information Sciences, 34(7):4607–4625, 2022.
[57] A. A. Khan, A. Ahmad, M. Waseem, P. Liang, M. Fahmideh, T. Mikkonen, and P. Abrahamsson. Software architecture for quantum computing systems—a systematic review. Journal of Systems and Software, 201:111682, 2023.
[58] B. Kitchenham, S. Charters, et al. Guidelines for performing systematic literature reviews in software engineering version 2.3. Engineering, 45(4ve):1051, 2007.
[59] B. Kitchenham, L. Madeyski, and D. Budgen. How should software engineering secondary studies include grey material? IEEE Transactions on Software Engineering, 49(2):872–882, 2022.
[60] B. A. Kitchenham, D. Budgen, and P. Brereton. Evidence-based software engineering and systematic reviews, volume 4. CRC press, 2015.
[61] H. J. Klein, B. Polin, and K. Leigh Sutton. Specific onboarding practices for the socialization of new employees. International Journal of Selection and Assessment, 23(3):263–283, 2015.
[62] A. J. Ko. Mining the mind, minding the mine: grand challenges in comprehension and mining. In 26th Conference on Program Comprehension (ICPC), pages 1–1, 2018.
[63] Z. Kotti, R. Galanopoulou, and D. Spinellis. Machine learning for software engineering: a tertiary study. ACM Computing Surveys, 55(12):1–39, 2023.
[64] A. Labuschagne and R. Holmes. Do onboarding programs work? In IEEE/ACM 12th Working Conference on Mining Software Repositories (MSR), pages 381–385. IEEE, 2015.
[65] C. Liu, D. Yang, X. Zhang, B. Ray, and M. M. Rahman. Recommending GitHub projects for developer onboarding. IEEE Access, 6:52082–52094, 2018.
[66] S. K. Lo, Q. Lu, C. Wang, H.-Y. Paik, and L. Zhu. A systematic literature review on federated machine learning: from a software engineering perspective. ACM Computing Surveys (CSUR), 54(5):1–39, 2021.
[67] Y. Malheiros, A. Moraes, C. Trindade, and S. Meira. A source code recommender system to support newcomers. In IEEE 36th Annual Computer Software and Applications Conference (COMPSAC), pages 19–24. IEEE, 2012.
[68] J. Marlow, L. Dabbish, and J. Herbsleb. Impression formation in online peer production: activity traces and personal profiles in GitHub. In Conference on Computer Supported Cooperative Work. ACM, 2013.
[69] R. Medeiros and O. Díaz. Assisting mentors in selecting newcomers’ next task in software product lines: A recommender system approach. In Advanced Information Systems Engineering: 34th International Conference (CAiSE), pages 460–476. Springer, 2022.
[70] K. Meinke and A. Bennaceur. Machine learning for software engineering: Models, methods, and applications. In IEEE/ACM 40th International Conference on Software Engineering: Companion (ICSE-Companion), pages 548–549, 2018.
[71] S. Minto and G. C. Murphy. Recommending emergent teams. In Fourth International Workshop on Mining Software Repositories (MSR-ICSEW), pages 5–5. IEEE, 2007.
[72] D. Moody. The “physics” of notations: toward a scientific basis for constructing visual notations in software engineering. IEEE Transactions on Software Engineering (TSE), 35(6):756–779, 2009.
[73] B. Morgan and C. Jensen. Lessons learned from teaching open source software development. In 10th IFIP Advances in Information and Communication Technology (OSS), pages 133–142. Springer, 2014.
[74] T. P. Nagarhalli, V. Vaze, and N. Rana. A review of current trends in the development of chatbot systems. In 6th International Conference on Advanced Computing and Communication Systems (ICACCS), pages 706–710. IEEE, 2020.
[75] L. Nagel, O. Karras, and J. Klünder. Ontology-based software graphs for supporting code comprehension during onboarding. In 47th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), pages 158–165. IEEE, 2021.
[76] D. M. Nascimento, K. Cox, T. Almeida, W. Sampaio, R. A. Bittencourt, R. Souza, and C. Chavez. Using open source projects in software engineering education: a systematic mapping study. In IEEE Frontiers in Education Conference (FIE), pages 1837–1843. IEEE, 2013.
[77] F. Nayebi, J.-M. Desharnais, and A. Abran. The state of the art of mobile application usability evaluation. In 25th IEEE Canadian Conference on Electrical and Computer Engineering (CCECE), pages 1–4. IEEE, 2012.
[78] A. Nederlof, A. Mesbah, and A. V. Deursen. Software engineering for the web: the state of the practice. In 36th International Conference on Software Engineering, pages 4–13, 2014.
[79] S. H. Padala, C. J. Mendez, L. F. Dias, I. Steinmacher, Z. S. Hanson, C. Hilderbrand, A. Horvath, C. Hill, L. D. Simpson, M. Burnett, et al. How gender-biased tools shape newcomer experiences in OSS projects. IEEE Transactions on Software Engineering (TSE), 2020.
[80] Y. Park and C. Jensen. Beyond pretty pictures: examining the benefits of code visualization for open source newcomers. In 5th IEEE International Workshop on Visualizing Software for Understanding and Analysis (VISSOFT), pages 3–10. IEEE, 2009.
[81] O. Pedreira, F. García, N. Brisaboa, and M. Piattini. Gamification in software engineering–a systematic mapping. Information and Software Technology (IST), 57:157–168, 2015.
[82] R. Pham, S. Kiesling, L. Singer, and K. Schneider. Onboarding inexperienced developers: struggles and perceptions regarding automated testing. Software Quality Journal, 25(4):1239–1268, 2017.
[83] L. Pradel. Quantifying the ramp-up problem in software projects. In 20th International Conference on Evaluation and Assessment in Software Engineering (EASE), pages 1–4, 2016.
[84] I. Qureshi and Y. Fang. Socialization in open source software projects: a growth mixture modeling approach. Organizational Research Methods, 14(1):208–238, 2011.
[85] A. Rastogi, S. Thummalapenta, T. Zimmermann, N. Nagappan, and J. Czerwonka. Ramp-up journey of new hires: do strategic practices of software companies influence productivity? In 10th Innovations in Software Engineering Conference (ISEC), pages 107–111, 2017.
[86] M. Robillard, R. Walker, and T. Zimmermann. Recommendation systems for software engineering. IEEE Software, 27(4):80–86, 2009.
[87] G. Robles, L. A. Reina, J. M. González-Barahona, and S. D. Domínguez. Women in free/libre/open source software: the situation in the 2010s. In 12th IFIP Advances in Information and Communication Technology (OSS). Springer, 2016.
[88] G. Rodríguez-Pérez, R. Nadri, and M. Nagappan. Perceived diversity in software engineering: a systematic literature review. Empirical Software Engineering, 26:1–38, 2021.
[89] K. Rollag, S. Parise, and R. Cross. Getting new hires up to speed quickly. MIT Sloan Management Review, 2005.
[90] F. Santos, I. Wiese, B. Trinkenreich, I. Steinmacher, A. Sarma, and M. A. Gerosa. Can i solve it? identifying apis required to complete OSS tasks. In IEEE/ACM 18th International Conference on Mining Software Repositories (MSR), pages 346–257. IEEE, 2021.
[91] I. Santos, I. Wiese, I. Steinmacher, A. Sarma, and M. A. Gerosa. Hits and misses: Newcomers’ ability to identify skills needed for oss tasks. In IEEE SANER, pages 174–183. IEEE, 2022.
[92] I. Santos, J. F. Pimentel, I. Wiese, I. Steinmacher, A. Sarma, and M. A. Gerosa. Designing for cognitive diversity: improving the GitHub experience for newcomers. In IEEE/ACM 45th International Conference on Software Engineering: Software Engineering in Society (ICSE-SEIS), pages 1–12, 2023.
[93] A. Sarma, L. Maccherone, P. Wagstrom, and J. Herbsleb. Tesseract: interactive visual exploration of socio-technical relationships in software development. In IEEE 31st International Conference on Software Engineering (ICSE), pages 23–33. IEEE, 2009.
[94] A. Sarma, M. A. Gerosa, I. Steinmacher, and R. Leano. Training the future workforce through task curation in an OSS ecosystem. In 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE), pages 932–935, 2016.
[95] L. P. Serrano Alves, I. S. Wiese, A. P. Chaves, and I. Steinmacher. How to find my task? chatbot to assist newcomers in choosing tasks in OSS projects. In Chatbot Research and Design: 5th International Workshop (CONVERSATIONS), pages 90–107. Springer, 2022.
[96] S. E. Sim and R. C. Holt. The ramp-up problem in software projects: a case study of how software immigrants naturalize. In 20th International Conference on Software Engineering (ICSE), pages 361–370. IEEE, 1998.
[97] L. Singer, F. Figueira Filho, B. Cleary, C. Treude, M.-A. Storey, and
K. Schneider. Mutual assessment in the social programmer ecosystem: an empirical investigation of developer profile aggregators. In Conference on Computer Supported Cooperative Work, 2013.
[98] A. Singh, V. Bhadauria, A. Jain, and A. Gurung. Role of gender, self-efficacy, anxiety and testing formats in learning spreadsheets. Computers in Human Behavior, 29(3):739–746, 2013.
[99] C. Stanik, L. Montgomery, D. Martens, D. Fucci, and W. Maalej. A simple nlp-based approach to support onboarding and retention in open source communities. In IEEE International Conference on Software Maintenance and Evolution (ICSME), pages 172–182. IEEE, 2018.
[100] I. Steinmacher, I. S. Wiese, and M. A. Gerosa. Recommending mentors to software project newcomers. In Third International Workshop on Recommendation Systems for Software Engineering (RSSE), pages 63–67. IEEE, 2012.
[101] I. Steinmacher, I. Wiese, A. P. Chaves, and M. A. Gerosa. Why do newcomers abandon open source software projects? In 6th International Workshop on Cooperative and Human Aspects of Software Engineering (CHASE), pages 25–32. IEEE, 2013.
[102] I. Steinmacher, M. A. Gerosa, and D. Redmiles. Attracting, onboarding, and retaining newcomer developers in open source software projects. In Workshop on Global Software Development in a CSCW Perspective, 2014.
[103] I. Steinmacher, T. Conte, M. Gerosa, and D. Redmiles. Social barriers faced by newcomers placing their first contribution in open source software projects. In 18th ACM Conference on Computer Supported Cooperative Work & Social Computing (CSCW), 2015.
[104] I. Steinmacher, T. U. Conte, and M. A. Gerosa. Understanding and supporting the choice of an appropriate task to start with in open source software communities. In 48th Hawaii International Conference on System Sciences (HICSS), pages 5299–5308. IEEE, 2015.
[105] I. Steinmacher, M. A. G. Silva, M. A. Gerosa, and D. F. Redmiles. A systematic literature review on the barriers faced by newcomers to open source software projects. Information and Software Technology (IST), 59:67–85, 2015.
[106] I. Steinmacher, T. U. Conte, C. Treude, and M. A. Gerosa. Overcoming open source project entry barriers with a portal for newcomers. In 38th International Conference on Software Engineering (ICSE), pages 273–284, 2016.
[107] I. Steinmacher, M. Gerosa, T. U. Conte, and D. F. Redmiles. Overcoming social barriers when contributing to open source software projects. Computer Supported Cooperative Work (CSCW), 28:247–290, 2019.
[108] K.-J. Stol and B. Fitzgerald. The abc of software engineering research. ACM Transactions on Software Engineering and Methodology (TOSEM), 27(3):1–51, 2018.
[109] K.-J. Stol and B. Fitzgerald. Guidelines for conducting software engineering research. In Contemporary Empirical Methods in Software Engineering, pages 27–62. Springer, 2020.
[110] M.-A. Storey, A. Zagalsky, F. Figueira Filho, L. Singer, and D. M. German. How social and communication channels shape and challenge a participatory culture in software development. IEEE Transactions on Software Engineering (TSE), 2016.
[111] X. Sun, W. Xu, X. Xia, X. Chen, and B. Li. Personalized project recommendation on GitHub. Science China Information Sciences, 61:1–14, 2018.
[112] N. Talya and D. Bauer. Onboarding new employees: Maximizing success, 2014.
[113] C. Toscani, D. Gery, I. Steinmacher, and S. Marczak. A gamification proposal to support the onboarding of newcomers in the flosscoach portal. In 17th Brazilian Symposium on Human Factors in Computing Systems (IHC), pages 1–10, 2018.
[114] B. Trinkenreich, M. Guizani, I. Wiese, A. Sarma, and I. Steinmacher. Hidden figures: Roles and pathways of successful oss contributors. ACM on Human-Computer Interaction, 4(CSCW):1–22, 2020.
[115] B. Trinkenreich, I. Wiese, A. Sarma, M. Gerosa, and I. Steinmacher. Women’s participation in open source software: a survey of the literature. ACM Transactions on Software Engineering and Methodology (TOSEM), 2022.
[116] A. Valente, M. Holanda, A. M. Mariano, R. Furuta, and D. Da Silva. Analysis of academic databases for literature review in the computer science education field. In 2022 ieee frontiers in education conference (fie), pages 1–7. IEEE, 2022.
[117] B. Vasilescu, D. Posnett, B. Ray, M. G. van den Brand, A. Serebrenik, P. Devanbu, and V. Filkov. Gender and tenure diversity in GitHub teams. In ACM CHI Conference, 2015.
[118] A. S. M. Venigalla, K. Boyalakuntla, and S. Chimalakonda. Gitq-towards using badges as visual cues for GitHub projects. In 30th IEEE/ACM International Conference on Program Comprehension (ICPC), pages 157–161, 2022.
[119] G. Viviani and G. C. Murphy. Reflections on onboarding practices in mid-sized companies. In IEEE/ACM 12th International Workshop on Cooperative and Human Aspects of Software Engineering (CHASE), pages 83–84. IEEE, 2019.
[120] J. Wang and A. Sarma. Which bug should i fix: helping new developers onboard a new project. In 4th International Workshop on Cooperative and Human Aspects of Software Engineering (CHASE), pages 76–79, 2011.
[121] C. Wohlin. Guidelines for snowballing in systematic literature studies and a replication in software engineering. In 18th International Conference on Evaluation and Assessment in Software Engineering (EASE), pages 1–10, 2014.
[122] C. Wohlin, P. Runeson, M. Höst, M. C. Ohlsson, B. Regnell, and A. Wesslén. Experimentation in software engineering. Springer Science & Business Media, 2012.
[123] V. Wolff-Marting, C. Hannebauer, and V. Gruhn. Patterns for tearing down contribution barriers to floss projects. In 12th International Conference on Intelligent Software Methodologies, Tools and Techniques (SoMeT). IEEE, 2013.
[124] W. Xiao, H. He, W. Xu, X. Tan, J. Dong, and M. Zhou. Recommending good first issues in GitHub OSS projects. In 44th International Conference on Software Engineering (ICSE), pages 1830–1842, 2022.
[125] C. Yang, Q. Fan, T. Wang, G. Yin, and H. Wang. Repolike: personal repositories recommendation in social coding communities. In 8th Asia-Pacific Symposium on Internetware (Internetware), pages 54–62, 2016.
[126] A. Yasin, R. Fatima, L. Wen, W. Afzal, M. Azhar, and R. Torkar. On using grey literature and google scholar in systematic literature reviews in software engineering. IEEE access, 8:36226–36243, 2020.
[127] H. Yin, Z. Sun, Y. Sun, and G. Huang. Automatic learning path recommendation for open source projects using deep learning on knowledge graphs. In IEEE 45th Annual Computers, Software, and Applications Conference (COMPSAC), pages 824–833. IEEE, 2021.
[128] J. Zheng, L. Williams, N. Nagappan, W. Snipes, J. P. Hudepohl, and M. A. Vouk. On the value of static analysis for fault detection in software. IEEE Transactions on Software Engineering, 32(4):240–253, 2006.
[129] M. Zhou and A. Mockus. Developer fluency: achieving true mastery in software projects. In 18th ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE), pages 137–146, 2010.
[130] M. Zhou and A. Mockus. What make long term contributors: willingness and opportunity in oss community. In 34th International Conference on Software Engineering (ICSE), pages 518–528. IEEE, 2012.
[131] Y. Zhou, J. Wu, and Y. Sun. Ghtrec: a personalized service to recommend GitHub trending repositories for developers. In IEEE International Conference on Web Services (ICWS), pages 314–323. IEEE, 2021.
Primary Studies

[PS01] M. Azanza, A. Irastorza, R. Medeiros, O. Díaz, Onboarding in software product lines: concept maps as welcome guides, in: IEEE/ACM 43rd International Conference on Software Engineering: Software Engineering Education and Training (ICSE-SEET), IEEE, 2021, pp. 122–133.
[PS02] G. Canfora, M. Di Penta, R. Oliveto, S. Panichella, Who is going to mentor newcomers in open source projects?, in: ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering (FSE), 2012, pp. 1–11.
[PS03] D. Cubranic, G. C. Murphy, Hipikat: recommending pertinent software development artifacts, in: 25th International Conference on Software Engineering (ICSE), IEEE, 2003, pp. 408–418.
[PS04] G. C. Diniz, M. A. G. Silva, M. A. Gerosa, I. Steinmacher, Using gamification to orient and motivate students to contribute to OSS projects, in: IEEE/ACM 10th International Workshop on Cooperative and Human Aspects of Software Engineering (CHASE), IEEE, 2017, pp. 36–42.
[PS05] J. Dominic, J. Houser, I. Steinmacher, C. Ritter, P. Rodeghero, Conversational bot for newcomers onboarding to open source projects, in: IEEE/ACM 42nd International Conference on Software Engineering Workshops (ICSEW), 2020, pp. 46–50.
[PS06] C. Fu, M. Zhou, Q. Xuan, H.-X. Hu, Expert recommendation in OSS projects based on knowledge embedding, in: International Workshop on Complex Systems and Networks (IWCSN), IEEE, 2017, pp. 149–155.
[PS07] M. Guizani, T. Zimmermann, A. Sarma, D. Ford, Attracting and retaining OSS contributors with a maintainer dashboard, in: ACM/IEEE 44th International Conference on Software Engineering: Software Engineering in Society (ICSE-SEIS), 2022, pp. 36–40.
[PS08] H. He, H. Su, W. Xiao, R. He, M. Zhou, Gfi-bot: automated good first issue recommendation on GitHub, in: 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), 2022, pp. 1751–1755.
[PS09] H. Kagdi, M. Hammad, J. I. Maletic, Who can help me with this source code change?, in: IEEE International Conference on Software Maintenance (ICSM), IEEE, 2008, pp. 157–166.
[PS10] R. Medeiros, O. Díaz, Assisting mentors in selecting newcomers’ next task in software product lines: A recommender system approach, in: Advanced Information Systems Engineering: 34th International Conference (CAiSE), Springer, 2022, pp. 460–476.
[PS11] L. Nagel, O. Karras, J. Klünder, Ontology-based software graphs for supporting code comprehension during onboarding, in: 47th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), IEEE, 2021, pp. 158–165.
[PS12] A. Sarma, M. A. Gerosa, I. Steinmacher, R. Leano, Training the future workforce through task curation in an OSS ecosystem, in: 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE), 2016, pp. 932–935.
[PS13] L. P. Serrano Alves, I. S. Wiese, A. P. Chaves, I. Steinmacher, How to find my task? chatbot to assist newcomers in choosing tasks in OSS projects, in: Chatbot Research and Design: 5th International Workshop (CONVERSATIONS), Springer, 2022, pp. 90–107.
[PS14] C. Stanik, L. Montgomery, D. Martens, D. Fucci, W. Maalej, A simple NLP-based approach to support onboarding and retention in open source communities, in: IEEE International Conference on Software Maintenance and Evolution (ICSME), IEEE, 2018, pp. 172–182.
[PS15] I. Steinmacher, T. U. Conte, C. Treude, M. A. Gerosa, Overcoming open source project entry barriers with a portal for newcomers, in: 38th International Conference on Software Engineering (ICSE), 2016, pp. 273–284.
[PS16] I. Steinmacher, I. S. Wiese, M. A. Gerosa, Recommending mentors to software project newcomers, in: Third International Workshop on Recommendation Systems for Software Engineering (RSSE), IEEE, 2012, pp. 63–67.
[PS17] C. Toscani, D. Gery, I. Steinmacher, S. Marczak, A gamification proposal to support the onboarding of newcomers in the FLOSScoach portal, in: 17th Brazilian Symposium on Human Factors in Computing Systems (IHC), 2018, pp. 1–10.
[PS18] J. Wang, A. Sarma, Which bug should I fix: helping new developers onboard a new project, in: 4th International Workshop on Cooperative and Human Aspects of Software Engineering (CHASE), 2011, pp. 76–79.
[PS19] W. Xiao, H. He, W. Xu, X. Tan, J. Dong, M. Zhou, Recommending good first issues in GitHub OSS projects, in: 44th International Conference on Software Engineering (ICSE), 2022, pp. 1830–1842.
[PS20] H. Yin, Z. Sun, Y. Sun, G. Huang, Automatic learning path recommendation for open source projects using deep learning on knowledge graphs, in: IEEE 45th Annual Computers, Software, and Applications Conference (COMPSAC), IEEE, 2021, pp. 824–833.
[PS21] D. Ford, N. Shrestha, T. Zimmermann, Reboc: recommending bespoke open source software projects to contributors, in: IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC), IEEE, 2022, pp. 1–5.
[PS22] C. Liu, D. Yang, X. Zhang, B. Ray, M. M. Rahman, Recommending GitHub projects for developer onboarding, IEEE Access 6 (2018) 52082–52094.
[PS23] I. Santos, J. F. Pimentel, I. Wiese, I. Steinmacher, A. Sarma, M. A. Gerosa, Designing for cognitive diversity: improving the GitHub experience for newcomers, in: IEEE/ACM 45th International Conference on Software Engineering: Software Engineering in Society (ICSE-SEIS), 2023, pp. 1–12.
[PS24] F. Santos, I. Wiese, B. Trinkenreich, I. Steinmacher, A. Sarma, M. A. Gerosa, Can I solve it? identifying APIs required to complete OSS tasks, in: IEEE/ACM 18th International Conference on Mining Software Repositories (MSR), IEEE, 2021, pp. 346–357.
[PS25] S. Minto, G. C. Murphy, Recommending emergent teams, in: Fourth International Workshop on Mining Software Repositories (MSR-ICSEW), IEEE, 2007, pp. 5–5.
[PS26] L. Heimburger, L. Buchweitz, R. Gouveia, O. Korn, Gamifying onboarding: how to increase both engagement and integration of new employees, in: International Conference on Social and Occupational Ergonomics (AHFE), Springer, 2020, pp. 3–14.
[PS27] Y. Malheiros, A. Moraes, C. Trindade, S. Meira, A source code recommender system to support newcomers, in: IEEE 36th Annual Computer Software and Applications Conference (COMPSAC), IEEE, 2012, pp. 19–24.
[PS28] C. Yang, Q. Fan, T. Wang, G. Yin, H. Wang, Repolike: personal repositories recommendation in social coding communities, in: 8th Asia-Pacific Symposium on Internetware (Internetware), 2016, pp. 54–62.
[PS29] Y. Zhou, J. Wu, Y. Sun, Ghtrec: a personalized service to recommend GitHub trending repositories for developers, in: IEEE International Conference on Web Services (ICWS), IEEE, 2021, pp. 314–323.
[PS30] A. S. M. Venigalla, K. Boyalakuntla, S. Chimalakonda, Gitq-towards using badges as visual cues for GitHub projects, in: 30th IEEE/ACM International Conference on Program Comprehension (ICPC), 2022, pp. 157–161.
[PS31] X. Sun, W. Xu, X. Xia, X. Chen, B. Li, Personalized project recommendation on GitHub, Science China Information Sciences 61 (2018) 1–14.
[PS32] A. Sarma, L. Maccherone, P. Wagstrom, J. Herbsleb, Tesseract: interactive visual exploration of socio-technical relationships in software development, in: IEEE 31st International Conference on Software Engineering (ICSE), IEEE, 2009, pp. 23–33.