English Article
2 authors, including:
Fadi Wedyan
Hashemite University
All content following this page was uploaded by Fadi Wedyan on 21 September 2019.
Abstract: The impact of design patterns on quality attributes has been extensively evaluated in studies with different perspectives,
objectives, metrics, and quality attributes, leading to contradictory and hard-to-compare results. Our objective is to explain these
results by considering confounding factors, practices, metrics, or implementation issues that affect quality. Furthermore, there is a
lack of research that connects design pattern evaluations to pattern development studies. Accordingly, we also aim to provide
initial guidance on how pattern structure and implementation can be improved to promote software quality.
To achieve our goals, we conducted a systematic literature review by searching the literature for related studies. The study covers
the period between years 2000 and 2018. We identified 804 candidate papers. After applying our inclusion and exclusion criteria,
we were left with 50 primary studies. Our results show that documentation of patterns, size of pattern classes, and the scattering
degree of patterns have a clear impact on quality. In case studies, researchers used different metrics applied to different modules.
Controlled experiments have major design differences.
Reaching consensus on the effect of patterns requires considering influencing factors, using unified metrics, and an
agreement on what modules to measure. Studying how to improve pattern modularity is recommended for future research.
1 Introduction
Design patterns represent solutions to frequently occurring software problems for designing good quality software. They were proposed for the first time by Alexander et al. [1] in 1977 in the field of building architecture. In the mid 1990s, Gamma et al. [2] proposed approaches for applying architectural patterns to object-oriented software. They proposed 23 design patterns called GoF (Gang of Four) patterns. GoF design patterns are classified into three categories: structural, creational, and behavioral patterns.
Design patterns are solutions to problems at the design and implementation levels of software components. Design patterns are claimed to provide various benefits including: (1) supporting better design decisions, (2) improving communication among developers, (3) enhancing or easing software maintainability and reusability, (4) helping to satisfy nonfunctional requirements of the system, and (5) saving cost, effort, and time by reusing existing proven solutions [2–4].
Many researchers conducted studies to investigate the above claims both empirically and analytically. Some of the studies indicate that design patterns impede the achievement of the required level of quality for the subject software. However, some other studies concluded that developers should use design patterns in order to improve software quality. For example, some studies concluded that the Abstract Factory pattern improves design extensibility (e.g., [5–7]), while other studies found that the use of Abstract Factory has a negative impact on extensibility ([8–10]). Such contradictory results have been reported for most of the evaluated patterns [11, 12]. Therefore, there is little consensus about the real impact of any particular design pattern on software quality.
Studies were performed in order to collect evidence from available research and reach an agreement on the effect of using design patterns on certain quality attributes. These studies were performed as a mapping study (e.g., [12, 13]), or a literature review (e.g., [11, 14, 15]). The results of these studies clearly show that generally there is no consensus on the effect of design patterns on software quality. Riaz et al. [13] found that it is hard to compare the findings of empirical studies due to differences in study design and execution. They also concluded that the available empirical findings of the primary studies they evaluated are not well understood, and have limited generalizability.
In this paper, we build upon previous work in the effort to provide a better understanding of the effect of using design patterns on software quality. We aim to explain the contradictory results obtained by various studies by examining the factors, practices, or implementation issues that affect quality attributes when design patterns are used. We also collected and compared the various metrics used in evaluating design patterns and how these metrics are applied. Moreover, by providing an explanation of the relationship between design patterns and quality that is built on a variety of empirical studies, we also aim to give suggestions on how design pattern structure and implementation can be improved to promote building high quality software.
In order to achieve our study goals, we conducted a Systematic Literature Review (SLR) of existing literature on the effect of using design patterns on software quality, in particular, GoF design patterns. The SLR covers the period between the years 2000 and 2018 (inclusive). We searched for relevant publications in primary digital libraries. We identified 804 candidate studies. After applying the inclusion and exclusion criteria, we were left with 50 primary studies. The primary studies include both analytical and empirical research.
The rest of this paper is organized as follows. Section 2 presents related work. Section 3 presents the procedure followed in the SLR and the research questions. Section 4 provides a summary of the primary studies. Section 5 discusses the results obtained from the selected primary studies and answers the research questions. Threats to validity are discussed in Section 6. Finally, Section 7 concludes the work.
2 Related Work
The study of design patterns has gained significant attention since they were first proposed. Many studies have been performed to detect, categorize, utilize, and evaluate patterns empirically and analytically. Literature surveys play an important role in evaluating studies and shed light on the state of research and the milestones achieved.
Weiss [14] surveys 16 studies published between the years 2000 and 2008 on design patterns and their impact on system concerns. The author divides the primary studies into three categories: (1) studies that explicitly link design patterns and system concerns, (2) studies with the goal of supporting the selection of patterns, and (3) studies that document the rationale for architectural decisions.
Zhang and Budgen [15] present a systematic literature review in the form of a mapping study. The study investigates which of the GoF design patterns have been subjected to empirical studies, and what conclusions are available about the effects of using them. The study considers 11 empirical studies documented in 10 papers published between the years 1995 and 2009.
Ampatzoglou et al. [12] present a mapping study of about 120 primary studies to provide an overview of the state of the art in research on GoF design patterns. They investigate: (a) whether research efforts on design patterns can be categorized into research subdomains, (b) which subdomains are the most active, and (c) what evidence exists about the effect of patterns on software quality attributes.
Ali and Elish [11] survey the literature for existing empirical evidence on the impact of GoF design patterns on software quality attributes. They evaluated the coverage of empirical evidence in terms of both quality attributes and design patterns, and provided a summary of the impact of design patterns on software quality attributes. The authors studied 17 papers published between the years 2001 and 2012.
Riaz et al. [13] present a study in which they analyzed 19 empirical studies documented in 17 papers published between the years 2000 and 2009. They extracted useful information, such as variables associated with participant demographics and with pattern and problem presentations, in addition to 10 evaluation criteria with their associated observable measures. They also identified challenges that researchers may face while conducting an empirical study.
Mayvan et al. [16] performed a mapping study on design patterns. They extracted data from 637 articles. Their results show that the design patterns field is an active and attractive research field. Moreover, they found that pattern development, pattern mining, and pattern usage are the most active topics, while fewer publications were found on pattern evaluation and pattern specification.
Previous literature reviews and mapping studies show that there are problems in how design patterns are evaluated in terms of their effect on quality. These problems lead to contradictions in the results obtained from different studies. Moreover, studying the impact of design patterns or any design module is far from simple, taking into consideration the large range of factors that influence how software is crafted. In order to successfully deploy a solution, a good understanding of the solution needs to be provided, including how to successfully deploy the solution, and how to identify and measure its effect.
In this paper, we present a systematic literature review of 50 primary studies published during the years 2000 through 2018. Our paper differs from the above studies in many aspects. We studied all types of approaches, i.e., both empirical and analytical, with respect to all quality attributes and GoF design patterns, which makes our work the most recent and comprehensive. Our results clearly identify confounding factors that affect software quality when design patterns are used. We also analyzed the differences in metrics used to measure quality, and differences in the measured software artifacts. Our findings can increase the understanding of the inconsistency in the results obtained by different studies. Moreover, we outlined the areas that need more research, mainly on how to improve software quality by carefully deploying design patterns, and how to study the effect of patterns on quality.
3 Systematic Review Procedure
According to Kitchenham [17], “A systematic literature review is a means of identifying, evaluating and interpreting all available research relevant to a particular research question, or topic area, or phenomenon of interest.” In this work, we carry out an SLR by following the guidelines provided by Kitchenham [17]. The objectives of this SLR are:
• Identifying confounding factors that affect the deployment of design patterns.
• Identifying whether the structure and implementation of a design pattern has an effect on quality attributes.
• Identifying the metrics used to measure quality attributes in the studies that evaluate the effect of design patterns on quality.
• Understanding how design patterns affect software quality and explaining how to reach consistency for this effect.
• Categorizing approaches used in evaluating the effect of design patterns on software quality.
• Identifying and cataloging threats to validity reported in the studies that evaluate the effect of design patterns on software quality.
In the following sections, we present the details of our methodology.
3.1 Research Questions
Based on the study objectives, the research questions of this SLR are:
• RQ1: What confounding factors, practices, or programming constructs affect quality attributes when design patterns are used?
• RQ2: What quality attributes are evaluated, what is measured, and what metrics are used?
• RQ3: What are the common threats to validity reported in the primary studies?
Software quality is often described as a context-dependent concept [18]. In order to achieve high quality software, various measures have to be considered. Many factors can contribute to producing better quality software. When design patterns are deployed, these factors cannot, and should not, be eliminated. Moreover, the correct use of design patterns can also be affected by how developers use design patterns and by their level of experience. By answering the first question, we collect from the primary studies the factors that researchers considered when evaluating design patterns. We also identify whether more factors should be considered.
The second question is related to the relevance of quality attributes, that is, which attributes researchers considered to be related to the use of design patterns. Software quality is multidimensional, consisting of many attributes, some of which might conflict. Therefore, we need to identify which attributes can be influenced by the use of design patterns, which have been largely evaluated, and which have been ignored.
Another important issue is the metrics used to evaluate quality attributes and how these metrics are applied. This helps in explaining the differences in results obtained by different studies.
Finally, the answer to the third question can help (1) future researchers avoid these threats while performing their empirical evaluations and (2) validate the results of existing studies.
3.2 Search process
The search process was performed in two steps. In the first step, a search for relevant publications was performed in digital libraries. This step includes identifying data sources and specifying search terms. In the second step, the study selection strategy is applied. The second step requires specifying the inclusion/exclusion criteria and the quality assessment strategy. In the following sections, we discuss the details of these steps.
3.2.1 Search terms and search resources: We derived the search terms using our research questions. We determined alternative spellings for each term and then performed the search process. We used the following search string:
In Table 1, we show the number of papers obtained from each of the digital libraries. The search was performed using the search string on the metadata (title, abstract, and keywords). The search was limited to the period between 2000 and 2018 (inclusive), which is the period covered in this SLR. The results obtained from the different databases contain many redundant articles (i.e., the same article is returned by more than one search engine). There is a high percentage of redundant articles because many articles might be indexed in more than one database (e.g., an ACM/IEEE conference article). The percentage of redundant retrieved articles in our study is about 55%. This is close to what was found by Jabangwe et al. [18] (about 51% of redundant articles). Dyba et al. [19] also reported that databases relevant to software engineering, such as ScienceDirect and Wiley, returned similar search results as ACM and IEEE. The total number of retrieved articles at this stage (after excluding duplicates) is 804 articles.
Table 1 Number of obtained papers from the searched digital libraries
Digital Library          No. Obtained Papers
ACM Digital Library      560
IEEE Digital Library     212
ScienceDirect            647
SpringerLink             369
Wiley Online Library     24
Total                    1812
3.2.2 Primary study selection: The following inclusion and exclusion criteria are applied:
• Only peer reviewed articles published in journals, conferences, and workshops indexed in Scopus [20] are included.
• Articles that study the impact of using at least one of the GoF design patterns on object-oriented software systems (e.g., Java, C#, or C++) are included. Articles that study non-GoF design patterns are excluded (e.g., agent design patterns [21], patterns for distributed, concurrent, or web-based systems, and UI patterns).
• Only articles written in English are included.
• In the case where the same study is published in more than one article, that is, when a study is first published as a short paper in a conference and subsequently as an extended journal version, we consider only the journal version as a primary study (e.g., Izurieta and Bieman’s [22] short paper was extended in the form of a journal paper [23]).
Papers that do not meet the inclusion criteria are excluded. Exclusion of irrelevant articles is decided by reading the paper title, abstract, and keywords.
We applied the inclusion and exclusion criteria to the 804 papers. Accordingly, 729 papers were removed. This left us with 75 papers. We then applied the quality assessment strategy to the remaining papers.
We assessed the 75 candidate primary studies using the following factors:
• F1: The article has clear objectives that are related to our study.
• F2: The article clearly describes the research methodology.
• F3: The article specifies what metrics are used to measure quality attributes and describes how these metrics are computed.
• F4: The article discusses threats to validity and/or study limitations.
Applying the quality assessment requires reading the full text of each candidate primary study. We performed the quality assessment in meetings between the first and second authors. First, we sorted the candidate primary studies in ascending order by publication date. Starting from the oldest papers in the list, before each meeting we selected a group of candidate primary studies to discuss. The number of articles discussed in each meeting depended on many factors, including the meeting duration and the size of the articles to be discussed. In each meeting, the first and second authors discussed whether the paper qualifies as one of the primary studies by assessing the quality factors. In the case where there was no agreement on whether to qualify an article, or one of the authors did not understand an aspect of the article, the article was added to a list of papers needing a second round of discussion. In this round, we invited reviewers (either graduate students or faculty members) to join the meeting to resolve the dispute.
3.3 Information extraction
For each study, we collected two sets of information. The first set, which we called general information, includes the following variables: type of publication (journal, conference), publication venue, publication date, and number of pages. We used this information to compute descriptive statistics about the obtained papers. In the second set, which we called detailed information, we extracted the following information:
• The design pattern(s) investigated
• The evaluated quality attributes
• Metrics used for evaluating the quality attributes, including the artifacts that each metric evaluates
• Research methodology (analytical, controlled experiment, case study)
• Threats to validity reported
The retrieved information is then classified according to: (1) design patterns, (2) quality attributes, and (3) research methodology.
4 Summary of Primary Studies
The list of the 50 primary studies used in this SLR is given in the appendix (Section 10). In the rest of this paper, we use the ID given to each primary study in Section 10 to refer to the corresponding paper. In this section, we summarize information obtained from the primary studies, including the trend of publication over the years, publication venues, design patterns evaluated, data sets, evaluated quality attributes, and research methods.
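The selection funnel reported in Sections 3.2.1 and 3.2.2 can be re-derived from the counts in Table 1. The following Python sketch is only an arithmetic check on the published figures; the function and variable names are ours, and the number of papers removed during quality assessment is derived here (75 candidates minus 50 primary studies) rather than stated in the text.

```python
# Counts copied from Table 1 and Section 3.2.2 of this review.
RETRIEVED_PER_LIBRARY = {
    "ACM Digital Library": 560,
    "IEEE Digital Library": 212,
    "ScienceDirect": 647,
    "SpringerLink": 369,
    "Wiley Online Library": 24,
}
UNIQUE_AFTER_DEDUPLICATION = 804   # articles left after removing duplicates
REMOVED_BY_CRITERIA = 729          # removed by inclusion/exclusion criteria
PRIMARY_STUDIES = 50               # final set of primary studies


def selection_funnel(per_library, unique, removed_by_criteria, primary):
    """Recompute the stage-by-stage counts of the study selection funnel."""
    total = sum(per_library.values())
    candidates = unique - removed_by_criteria
    return {
        "total_retrieved": total,
        # Share of retrieved records that were duplicates across libraries.
        "redundancy_ratio": (total - unique) / total,
        "candidates_for_quality_assessment": candidates,
        # Derived value (assumption of this sketch, not stated in the text).
        "removed_by_quality_assessment": candidates - primary,
    }


funnel = selection_funnel(
    RETRIEVED_PER_LIBRARY,
    UNIQUE_AFTER_DEDUPLICATION,
    REMOVED_BY_CRITERIA,
    PRIMARY_STUDIES,
)
for stage, value in funnel.items():
    if isinstance(value, float):
        print(f"{stage}: {value:.1%}")
    else:
        print(f"{stage}: {value}")
```

Under these inputs the redundancy ratio is 1008/1812, which is consistent with the "about 55%" reported above, and the candidate count matches the 75 papers passed to quality assessment.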
4.1 Publication Years, Venues, and Authors
In order to understand the general trend of evaluating the effect of design patterns on software quality, we counted the number of publications by year from 2000 to 2018 (the period covered in our study). The results are shown in Figure 2. While our results do not show a consistent increase in the number of publications, they show a similar number of publications each year, which indicates that interest in the field has not changed during the last 19 years. However, there is no direct evidence to explain the spikes in the years 2007 and 2011.
Table 3 Active researchers on the effect of design patterns on quality
Author              Primary Studies
Ampatzoglou, A.     PS18, PS31, PS33, PS35, PS39, PS45, PS47, PS49, PS50
Avgeriou, P.        PS45, PS47, PS49, PS50
Bieman, J. M.       PS3, PS4, PS9, PS38
Cerulo, L.          PS16, PS19, PS21, PS23
Chan, W.            PS13, PS14, PS17, PS37
Cheung, S.          PS13, PS14, PS17, PS37
Guéhéneuc, Y. G.    PS20, PS21, PS26, PS42
Ng, T.              PS13, PS14, PS17, PS37
Stamelos, I.        PS31, PS33, PS35, PS49
Tichy, W. F.        PS1, PS5, PS7, PS11
Yu, Y.              PS13, PS14, PS17, PS37
Aversano, L.        PS16, PS19, PS23
Counsell, S.        PS24, PS28, PS36
Di Penta, M.        PS16, PS19, PS23
Gatrell, M.         PS24, PS28, PS36
Gravino, C.         PS27, PS29, PS43
Risi, M.            PS27, PS29, PS43
Scanniello, G.      PS27, PS29, PS43
Tortora, G.         PS27, PS29, PS43
Unger, B.           PS1, PS5, PS7
Table 5 Most common datasets in case studies
Program       Prog. Lang.   No. Studies   Primary Studies
JHotDraw      Java          9             PS16, PS19, PS21, PS23, PS32, PS34, PS40, PS44, PS47
Eclipse       Java          5             PS19, PS21, PS23, PS32, PS40
ArgoUML       Java          5             PS19, PS23, PS38, PS40, PS42
Xerces-J      Java          3             PS21, PS32, PS42
JFreeChart    Java          2             PS42, PS46
JRefactory    Java          2             PS9, PS42
however, this can also decrease the generalization of the results and is a clear external threat to validity.
Third, design pattern mining (detection) tools are used in all primary studies that evaluated subject programs implemented in Java, except when the patterns are documented. In primary studies that evaluated subject programs implemented in C++ and C#, design patterns are detected manually (except PS10). While this might indicate that researchers generally trust design pattern mining tools for Java programs, other researchers argue that the precision and recall of these tools are still unsatisfactory; therefore, it is better to study software with documented design patterns (e.g., PS44), or to verify the tool output with manual inspection (e.g., PS38). For a comparison of design pattern mining tools, please refer to [27]. In the primary studies included in this paper, the following tools were used to detect design patterns in Java programs:
• The tool developed by Tsantalis et al. [28], which is based on a Similarity Scoring Algorithm (SSA). The algorithm computes scores that reflect the similarity of graphs representing the original patterns and the analyzed code. A pattern is considered correctly identified if the similarity score exceeds a pre-defined threshold. As shown in Table 4, the tool was used in 12 out of 18 case studies performed on Java programs with undocumented patterns. The current version of the tool can detect the following 11 design patterns: Adapter/Command, Composite, Decorator, Factory Method, Observer, Prototype, Proxy, Singleton, State/Strategy, Template Method, and Visitor. Notably, the tool cannot distinguish between some patterns (i.e., Adapter and Command, State and Strategy) because of their identical static structure.
• DeMIMA (Design Motif Identification Multilayered Approach) [29]. DeMIMA is built on the Ptidej [30] framework, which constructs static and dynamic models of Java source code. DeMIMA consists of three layers: two layers to recover an abstract model of the source code, and a third layer to identify design patterns in the abstract model. The current version of DeMIMA can detect the following 16 design patterns: Abstract Factory, Adapter, Builder, Chain of Responsibility, Command, Composite, Decorator, Factory Method, Facade, Observer, Prototype, Proxy, Singleton, State/Strategy, Template Method, and Visitor. DeMIMA was used by 3 of the primary studies as shown in Table 4.
• PINOT (Pattern Inference and Recovery Tool) [31]. The tool uses static analysis and knowledge about the behavioral aspects of the pattern. The current version of the tool can detect the following 17 patterns: Abstract Factory, Adapter, Bridge, Chain of Responsibility, Composite, Decorator, Facade, Factory Method, Flyweight, Mediator, Observer, Proxy, Singleton, Strategy, State, Template Method, and Visitor. PINOT was used by primary study PS45.
• Design Pattern Finder and Design Pattern Seeker. The tools were developed at Colorado State University for identifying intended design patterns. The tools were used by PS38 with the help of a script written by the paper authors. The authors also verified the results manually.
4.2.3 Controlled or quasi-experiments: Experiments are used for measuring the effects of manipulating one variable on other variables while holding other variables that might affect the results at a fixed level [32]. There are 14 primary studies that used controlled experiments. Their characteristics are given in Table 6. In the table, for each primary study (column one), we show which programming language was used (column two), the subjects who participated in the study (column three), and the approach used in preparing the task performed by the participating subjects (column four).
In controlled experiments, researchers need to consider limiting the effect of the environmental settings in order to increase the ability to generalize and replicate the study. One observation about the controlled experiments listed in Table 6 is that most of the participants are students. Only four experiments used software engineers from industry (PS5, PS11, PS15, and PS43). As discussed by Kitchenham [33], subjects must be representative of the population or you cannot draw conclusions from the experiment. However,
Fig. 4: Evaluation of Creational Patterns in Primary Studies
Our findings show that all of the 23 GoF design patterns were evaluated in the primary studies but with varying rates. Factory Method and Singleton are the most evaluated creational patterns. Among the structural patterns, Composite, Decorator, and Adapter are the most evaluated ones, while Observer, State, Command, and Strategy are the most evaluated behavioral patterns. Among all patterns, Observer, State, Factory Method, Composite, Decorator, and Strategy are the most evaluated patterns; their impact on software quality is evaluated in 27 or more of the 50 primary studies. On the other hand, Facade, Flyweight, Chain of Responsibility, Interpreter, Iterator, Mediator, and Memento are evaluated in 6 or fewer primary studies.
These results can be explained by the following reasons. First, software that contains documented patterns is used frequently as subject programs. As shown in Table 5, JHotDraw, which contains a set of documented instances of design patterns, has been used in 9 of the case studies in the primary studies. While the number of design pattern instances in JHotDraw varies across versions of the software, design patterns that are evaluated more than others have instances in most JHotDraw versions. Even in some controlled experiments, participating subjects were asked to perform tasks on real subject programs with documented design patterns, like JHotDraw.
Second, design patterns that can be mined from the subject programs using tools are evaluated more. For example, as we discussed in Section 4.2, the tool of Tsantalis et al. [28] is widely used. The patterns that the tool can detect are among the most evaluated patterns.
Third, the level of training and knowledge of the developers affects the choice of patterns when controlled experiments are performed. Finally, there might be a relationship between the number of evaluations of design patterns and their usage (popularity) in software. However, such a claim needs further investigation.
5 Results
In the following sections, we answer the research questions of this study.
5.1 What confounding factors, practices, or programming constructs affect quality attributes when design patterns are used?
Many factors contribute to the quality of software. When studying the effect of design patterns on quality, other factors that affect the use of patterns need to be identified. In the following, we discuss the factors, programming constructs, or practices addressed in the primary studies that influence software quality in the presence of design patterns.
5.1.1 Documentation of design pattern instances: A software document can be described as any artifact which aims at communicating information about the software it describes to readers involved in the software production [35]. Proper documentation of software is one of the oldest practices in software development that is still emphasized due to its importance for software development, understanding, and maintenance [36]. In primary studies PS7, PS27, PS29, and PS43, the effect of documenting design patterns on software quality is evaluated.
In primary studies PS7, PS27, PS29, and PS43, the effect of documenting design pattern instances on performing maintenance tasks has been evaluated using controlled experiments. Prechelt et al. (PS7) performed two controlled experiments and found that providing well-documented programs, commented with instances of design patterns, significantly decreases the cost of maintenance, in terms of time cost and errors, compared with code with no such comments.
In PS27, Scanniello et al. conducted a more recent controlled experiment that confirms the finding of Prechelt et al. that maintenance is performed faster (a measure the authors called effort) with proper documentation of design patterns. They also provided a new measurement which they called efficiency, computed by dividing the subject performance by the subject effort. They found that efficiency is also improved when design pattern instances are documented.
In PS29, Gravino et al. conducted two controlled experiments to measure the impact of documenting design pattern instances on the ability of subjects to understand a given source code (i.e., source code comprehension). They used two types of documentation, graphical (UML), and textual (source code comments). There
In PS43, Scanniello et al. extended the experiments in PS28 and PS30, performing the controlled experiments with subjects with different levels of experience. Their results, while confirming their previous experiments, also show that experienced developers benefit more from the documentation of design patterns, compared with less experienced developers.
The results of the primary studies agree on the positive effect of documenting the source code with information about instances of design patterns on software comprehension and maintenance. Since the primary studies show that source code comments about instances of design patterns are sufficient, we expect that the effort needed by developers to add such comments is not high. However, this effort needs to be evaluated in future studies.
Primary studies did not, however, study the effect of the lack of documentation for unintended design patterns, which are instances of design patterns that developers implement in the code unintentionally. Having these pattern instances in the source code without documentation makes it hard to understand the code. One suggestion here is to use a design pattern mining tool in order to find unintended patterns. Needless to say, this suggestion depends on having a tool with high mining precision and low cost. The primary studies discussed the importance of documenting design patterns for maintainability and program understanding. Future research can study the effect of documentation of design patterns on other quality attributes (e.g., extensibility).
5.1.2 Module Size: The size of a software module has been shown by many studies to have a clear effect on many software quality attributes, which is known as the confounding effect of class size. Studies show that large modules (classes in OOP) tend to be hard to maintain and test, and to be error-prone. El Emam et al. [37] examined the confounding effect of class size in the validation of object-oriented metrics, mainly the C&K metrics [38]. Their results confirm the confounding effect. In many primary studies, we found that class size is acknowledged as a threat to validity. In primary studies PS3, PS9, PS10, PS32, and PS48, the effect of the size of classes participating in design patterns is evaluated, while in PS44, the class size is used to normalize the metrics in order to eliminate its effect.
In PS3, Bieman et al. performed a case study to evaluate whether there is a relationship between the use of design patterns and the number of changes in evolving versions of an industrial software system. While the goal of the paper is to evaluate classes that participate in design patterns, they found a strong relationship between class size and change proneness, where larger classes changed more frequently. For the effect of design patterns, they found that classes which participate in design patterns are among the most change-prone classes in the software.
In PS9, Bieman et al. studied five systems to evaluate the effects of the use of design patterns on changes that occur as the systems evolve. They found that classes that participate in design patterns (pattern classes for short) are as change prone as other classes in four out of five systems. Pattern classes in one of the systems were less change prone. In this study, the authors reported results after normalizing for the effect of class size. Moreover, they found that larger classes are more change prone in two of the five systems.
In PS10, Vokáč performed a case study to evaluate the relationship between the use of design patterns and the number of defects in the code. The author compared defect rates for classes that participated in selected design patterns to the code at large, and found significant differences in defect rates among the patterns. Vokáč analyzed the effect of class size as a confounding factor. The results show that size is significantly correlated with defect frequency. However, no conclusive results were obtained for the correlation between particular patterns and class size (i.e., whether the use of a pattern requires larger participant classes).
In PS32, Posnett et al. studied the effect of the size of pattern classes and of classes playing metapattern roles on change proneness. The concept of a metapattern was introduced by Pree [39] which
results do not favor any type of documentation over another. They aims at capturing the pure structure of design patterns. Structurally
also found that the ability to understand the code is increased when similar design patterns instantiate the same metapatterns. Posnett et
developers correctly identify design patterns in the code. al. found that size explains more of the variance in change-proneness
than either design pattern or metapattern roles. They also found that both design pattern and metapattern roles were strong determinants of size. Therefore, size is a stronger determinant of change proneness than either design pattern or metapattern roles. Differences in change proneness between roles can be better explained by differences in the sizes of classes playing these roles.
In PS44, Elish and Mohammed measured and compared the fault density of pattern classes with that of non-pattern classes. The fault density of a class is the number of faults in the class divided by the class size. The authors' aim in normalizing the number of faults by the class size is to reduce the confounding effect of class size.
In PS48, Hussain et al. performed a case study to investigate the correlation between frequent use of design patterns and quality attributes. The authors studied the effect of the software size, measured in number of classes, on the correlation. Their results support the existence of the size effect as a confounding factor.
Since the class size effect on software metrics is widely accepted, it is necessary to evaluate whether the use of certain design patterns has an effect on the class size (i.e., requires increasing or decreasing the size of participating classes). This can provide a better understanding of the conflicting results of evaluating design patterns. In PS28 and PS36, the authors mentioned that some design patterns (e.g., singleton) increase the class size. However, this is an observation that needs to be empirically supported.
It is widely agreed in the software engineering community that large modules have a harmful effect on software quality. The primary studies discussed here support the view that size is a confounding factor that should be considered. However, the effect of the use of design patterns on module size is still not clear. Whether classes become larger because of the use of a design pattern or due to other factors (e.g., the task that a pattern performs) is an important issue that needs further research.

5.1.3 Design patterns as crosscutting concerns: A software system can contain two types of concerns: core, which is a main behavior that is needed by the software, and crosscutting, which refers to a behavior that is common to multiple core modules of the system [40]. Because object-oriented programming lacks a structure that modularizes a crosscutting concern, the implementation of a crosscutting concern is scattered through the modules of the core concerns, resulting in tangled code for the core concerns. In other words, the implementation of a crosscutting concern decreases modularity, which in turn negatively impacts software quality. Primary studies PS6, PS16, PS22, PS23, and PS41 discuss the degree of scattering of a design pattern and how the modularity of a design pattern can be increased.
In PS15, and PS22, Aversano et al. stated that introducing some design patterns into the code might result in scattering the code of the classes that interact with the design patterns (i.e., the clients of the design patterns). When the software evolves, the presence of such scattered code might make the software hard to change and introduce faults.
In PS16, Aversano et al. empirically investigated the relationships between the evolution of design patterns and the evolution of the crosscutting concerns they induce. They identified crosscutting concerns containing invocations of methods belonging to design pattern classes. Their results indicate that the crosscutting concerns change consistently with the pattern.
In PS23, Aversano et al. studied whether there is a relationship between the scattering degree of the crosscutting concern and fault proneness for crosscutting concerns induced by design patterns. The degree of scattering describes how the code of a concern is distributed among elements (classes or methods) [41]. Aversano et al. identified the degree of scattering in the context of a design pattern over its clients, that is, the number of callers spread among different classes with respect to the number of callees for classes participating in a design pattern. The metric of Aversano et al. differs from the Degree Of Scattering (DOS) metric created by Eaddy et al. [42] in that the latter is a measure of the variance of the concentration of a concern (how many of the source lines related to a concern are contained within a specific component) over all components with respect to the worst case.
The results of PS16 show that the scattering degree of a pattern-induced crosscutting concern can have a moderate to high correlation with the presence of defects. They also found that the scattering degree is correlated with the fault proneness of the design pattern code itself.
The results of primary studies PS16 and PS23 can be interpreted in many ways. Mainly, the results show that not all design patterns induce crosscutting concerns, and when they do, the problem is not related to the design pattern but to the existence of scattered code in object-oriented programs. The study of Eaddy et al. [41, 42] shows that there is a significant correlation between the scattering degree of any crosscutting concern and fault proneness. They also found that most concerns (95%) are crosscutting to various degrees, which suggests a potential need for improving modularity. In other words, crosscutting concerns are the norm in object-oriented programs, not the exception.
Modularization of crosscutting concerns can be obtained using Aspect-oriented Programming (AOP). In AOP, a crosscutting concern can be modularized using a construct called an Aspect. AOP is common in many industrial frameworks such as JBoss and Spring Framework [43]. Most common programming languages have an extension that supports developing aspect-oriented programs (e.g., AspectJ [44] for Java, AspectC++ [44] for C/C++).
In PS6, Hannemann and Kiczales developed AspectJ implementations of the 23 GoF design patterns. For each of the 23 GoF patterns, they developed a representative example that makes use of the pattern and implemented the example in both Java and AspectJ. Their study shows that using AspectJ improves the implementation of many GoF patterns in terms of modularization. In particular, design patterns with crosscutting concerns showed the most improvement. Another result the authors obtained is that by implementing a pattern in an aspect, the pattern implementation becomes reusable. This last result applies to half of the GoF patterns. Primary study PS6 quantified modularity using the four attributes of locality, reusability (of the pattern code), composition transparency, and (un)pluggability.
In PS22 and PS41, two case studies were conducted to evaluate the AOP implementation of design patterns, compared with the OOP implementation. In PS22, Garcia et al. performed a study comparing the AOP implementation of design patterns developed by Hannemann and Kiczales (PS6) with the OO implementation. Garcia et al. used the implementations from the Hannemann and Kiczales study and added some implementations they developed for the sake of the study. The comparison was performed using the following attributes: separation of concerns, coupling, cohesion, and size. They used 10 metrics for measurements. The results show that for most patterns, AO implementations improved separation of concerns, reduced coupling, increased cohesion, and reduced the size of the code.
In PS41, Cachoa et al. investigated how AOP can help to reduce the complexity involved in composing design patterns, that is, when a class or a method that plays a role in a particular pattern plays another role in a different pattern. In this case, the design pattern implementations become tangled with each other and with the core responsibilities of the patterns' client classes. They used three software systems originally implemented in Java and reengineered the existing Java implementations to produce aspect-oriented (AO) versions of them. They used the AspectJ implementations of the design patterns by Hannemann and Kiczales (PS6) in the AO versions. They measured pattern modularity with four attributes: separation of concerns, coupling, cohesion, and size. Their results were in favor of the AO implementations in most of the cases.
Modularization of design patterns using AOP is promising, as the primary studies suggest. Providing an off-the-shelf version of design patterns can promote reusability and ease of use of design patterns. However, there are some technical issues related to AOP that need to be handled in order to ease software development via AO programming languages. For example, in PS41, Cachoa et al. compared the design pattern implementations in AspectJ (the de facto AO compiler for Java) with implementations in another AO compiler called Compose* [45]. Their results show that Compose* implementations provide either similar or better separation of concerns, compared
with AspectJ. AOP is no longer new; research in AOP needs to solve the issues that prevent the wider adoption of the paradigm. Tanter et al. [46] presented several technical issues associated with the current state of affairs of aspect languages that need to be handled. Legar and Fukuda [47] asked the question of why developers do not take advantage of the progress in modularity (including AOP), despite the clear benefits of modularity. While the authors intend to answer the question in future work, we think, based on the discussed primary studies, that a solution which allows modularization of design patterns, either using Aspects or another approach, can promote software quality.

5.2 What quality attributes are evaluated, what is measured, and what metrics are used?

Results reported by the primary studies are obtained empirically using experiments with various settings and evaluation approaches. Here, in an effort to provide a better understanding of how these results compare to each other, and to build cumulatively on these results, we raise the following questions:

• What quality attributes have been measured? Are these attributes measured directly, or are internal quality attributes used as surrogates?
• What metrics are used for evaluating quality attributes?
• What software artifacts are measured using the chosen metrics? Even when the same metrics are used, empirical studies apply these metrics on different constructs or design motifs (e.g., classes that participate in design patterns, client classes, non-pattern classes, subsystems or libraries that contain design patterns).
• What design patterns are evaluated in the primary studies? Patterns differ in their design and implementation and consequently can have different influence on certain quality attributes.

In order to answer these questions, we summarized the settings and evaluation approaches used in the primary studies in Table 7 for case studies and Table 8 for controlled experiments. In Table 7, for each primary study in column one, we give in column two the quality attribute, and in parentheses the surrogate attribute, measured by the study. In column three, we give the module(s) used for measurements, while in column four, we give the metrics used. Table 8 is organized in the same way except for the modules column, which does not apply to the metrics used in controlled experiments. The following abbreviations are used in both tables: Pattern classes (PC), Non pattern classes (NPC), Classes that Co-change with pattern classes (CoPC), Classes that have static relationships with pattern classes (StPC), Classes of a certain role in a design pattern (RoPC), Package classes (PaC), and all classes in the software (AllC).
Evaluation of software quality requires relating measurements of software artifacts to external quality attributes or to one or more of the attribute characteristics. Models for software quality have been researched for decades and a large number of such models have been proposed. In the primary studies, researchers did not always specify which quality model they follow, although this is important, especially when an internal attribute is used as a surrogate for an external attribute. From Table 7, we can clearly conclude that maintainability is the most evaluated external quality attribute. We can also note that the internal quality attributes evaluated are mapped to maintainability using various quality models.
In the following, we discuss results for each of the quality attributes.

5.2.1 Maintainability: Most of the primary studies evaluated design patterns in the context of software maintenance and evolution. This trend can be explained by the fact that the benefits (if any) related to the use of design patterns become clear when the software is evolved and maintained. For example, the flexibility provided by design patterns becomes clear when the software is extended. On the other hand, the effect of complexity added due to the use of design patterns, if any, also becomes clear when the software is extended
or maintained. In controlled experiments, which require subjects to perform a task while their performance is measured, it is easier to frame the experiment as a maintenance task on existing code, due to the time and cost limitations of the study.
ISO-9126 [51] defined maintainability as "software quality characteristic concerning the effort needed to make specified modifications to an already implemented system". ISO-9126 decomposes maintainability into four characteristics: analyzability, changeability, stability, and testability. The more recent ISO/IEC 25010 [52] quality model rearranged the maintainability characteristics into: modularity, reusability, analyzability, modifiability, and testability. In the primary case studies, authors evaluated maintainability using the following attributes:

1. Change proneness. The attribute was evaluated by 8 of the case studies. These are: PS3, PS9, PS19, PS21, PS24, PS32, PS40, and PS42. Except in case studies PS19 and PS40, change proneness was measured by counting the number of changes to a class. A change is counted as a single change regardless of the change size, importance, or required effort. In case studies PS19 and PS40, change is normalized by dividing the number of committed changes (snapshots) that involve pattern classes by the number of snapshots. Moreover, the number of lines of code that co-changed in non-pattern classes is counted.
The change proneness metric was measured on pattern classes and non-pattern classes in order to find whether the use of design patterns requires more or fewer changes (PS3, PS9, PS21, and PS24). In PS32 and PS21, the metric is applied to classes with certain roles in a design pattern in order to find whether the role a class plays in a design pattern influences the change proneness of the class. The change proneness metric is applied to pattern classes and classes that co-change when a pattern class is changed in primary studies PS19, PS40, and PS42.
Changeability is considered one of the characteristics of maintainability according to the ISO-9126 quality model [51], in which changeability is defined as "Attributes of software that relate to the effort needed for modification, fault removal or for environmental change". Change proneness is a measure of the frequency of change, while changeability is concerned with the effort of change. Although the two concepts are related, a clear difference exists. The question then becomes: what does change proneness tell us about the class, or why do classes change?
Changes performed to a class can be corrective, adaptive, perfective, or preventive [53]. These changes can occur due to new requirements, debugging, changes that propagate from changes in other classes, and refactoring [50]. A class might change more frequently than others because it is easy to extend, or because it is correlated with other classes, which raises an alarm about the modularity of the class. In primary studies PS3, PS9, PS24, PS32, and PS42, changes were not classified. Primary studies PS19, PS21, and PS40 examined types of changes in order to find whether the role of a class in a design pattern is related to certain types of changes. However, the primary studies did not investigate why these changes occur.
Another concern arises from ignoring the size of a change. The size can be a proxy for the effort needed to perform the change (maintenance) task.
2. Fault proneness, defect frequency, or fault density. These were evaluated in case studies PS10, PS28, PS31, PS36, PS42, and PS44. Fault information was obtained from the bug reports and change logs of the studied systems. In case studies PS10, PS36, and PS42, fault proneness was measured by counting the number of faults per class, comparing pattern classes with non-pattern classes. In PS28, the authors measured the effort needed to fix a fault by considering the number of lines of code changed to fix a fault (added, deleted, or modified). In PS31, the authors counted the number of open bugs as a measure of defect frequency and the ratio (No. bugs fixed/No. bugs opened) as a measure of debugging frequency. The authors of PS44 measured the fault density of a class by dividing the number of faults found by the class size in order to reduce the effect of the class size as a confounding factor.
The results of the primary studies regarding fault proneness do not show a clear agreement. Primary studies PS28 and PS36 show that pattern classes are more fault prone than non-pattern classes. Primary studies PS28 and PS36 found that some pattern classes are more or less fault prone than non-pattern classes (e.g., structural pattern classes are less fault prone), depending on which pattern they belong to. Primary study PS42 did not find a clear tendency. In PS44, the authors found that classes which have static and co-change dependencies with pattern classes are more fault prone than other classes.
At this point, we need to ask ourselves "What do these results tell us?" There is no firm answer on whether pattern classes are more fault prone. Even if some studies can show that, we still need to know
why and when this happens. There are many factors that can determine the fault proneness of a class, including class responsibility or dependency relations. The developers' level of experience, lack of proper documentation of the software, or the maintainers' lack of understanding of the software are main factors that can lead to fewer or more faults. In the end, we have no advice we can give to software engineers on when to use design patterns to avoid producing software with more faults.
3. Stability. In PS45, Ampatzoglou et al. evaluated the stability of classes that participate in design patterns. According to ISO-9126 [51], stability refers to the "attributes of software that relate to the risk of unexpected effect of modifications". The standard also considers stability as one of the maintainability subfactors. To measure stability, Ampatzoglou et al. defined a metric called the Ripple Effect Measurement (REM), which attempts to quantify the probability of a change occurring in a class being propagated to a dependent class. Their results show that design pattern classes are more stable than non-pattern classes, depending on the class role in the pattern.
4. Size, complexity, coupling, and cohesion. In PS18, Ampatzoglou and Chatzigeorgiou evaluated the effect of using design patterns on the maintainability of games software. The authors calculated four categories of software metrics from the Chidamber and Kemerer (C&K) metrics suite [38]: size, complexity, coupling, and cohesion. The results of the study were in favor of using design patterns, mainly for coupling, cohesion, and complexity.
In the study performed by Jabangwe et al. [18], which aims at finding a link between object-oriented metrics and external quality attributes, the authors found that it is common practice to link object-oriented measures and proxies for maintainability. Moreover, they found that the C&K measurement suite was the most popular across studies. However, this trend is not followed by the primary studies that we investigated, except for PS18.

In the controlled experiments included in our study, researchers performed the study by preparing a maintenance task on an existing software system. The subjects are then asked to perform the maintenance task on the software, and/or are asked to answer a questionnaire about the task. For evaluation, the following metrics are used in four experiments (PS5, PS7, PS11, PS14, PS25):

• Time to perform the maintenance task.
• Number of faults subjects introduced to the code when performing the maintenance task.

In these experiments, the aim was to measure the effect of using design patterns on maintenance. To do so, the authors prepared two versions of each program, with and without design patterns. Then two groups of subjects, one group for each version of the program, are asked to perform the maintenance task. The time to perform the task is recorded, and the maintenance code is inspected for faults, except in PS14, where the authors ran test cases and counted the failed test cases as a measure of the number of faults.
The authors of PS25, PS27, PS29, and PS43 used a questionnaire about how to perform a maintenance task on a part of a software system that uses design patterns. Subjects' answers are then evaluated in terms of the time spent to answer the questions, as a measure of understandability, and the correctness of the answers, as a measure of modifiability.
Results obtained by controlled experiments are contradictory. The results of PS5 show that the same pattern can have a different effect on maintainability depending on the program complexity or domain. PS11 results show that some patterns (decorator, observer) promote maintainability, while other patterns have a negative impact (composite, visitor). PS14 results show that design patterns have a positive impact on maintainability, while PS25 results show that the use of design patterns decreases understandability and has no significant effect on modifiability.
In the remaining controlled experiments, primary studies PS27, PS29, and PS43 evaluated the effect of documentation on maintaining programs that use design patterns, while PS1 found that the use of design patterns can improve team communication, which improves program understandability.

5.2.2 Performance: Performance is a well-known external quality attribute which can be measured, according to the ISO-9126 [51] and ISO/IEC 25010 [52] quality models, by time behavior (response time) and resource utilization.
The study of Rudzki in PS12 evaluated two quality attributes: throughput, measured by the number of requests served per unit of time, and reliability, measured by the number of correctly served requests. The author also measured the response time for a request. The study of Rudzki aims at finding whether the changes that design patterns introduce to the software structure can have an effect on the software performance. The author evaluated two design patterns, Command and Facade, and found that in most of the cases, Facade outperformed the Command design pattern.
In PS47, Feitosa et al. investigated the effect of using design patterns on performance by measuring energy consumption. They compared the energy consumption of pattern-participating methods to the consumption of functionally equivalent alternative non-pattern solutions. Their results show that in most of the cases the alternative solutions are better in terms of energy consumption. Their results also show that for larger and more complex modules (methods), the design pattern solution consumes less energy.
The results reported in PS47 are important and should lead to more studies due to the increasing importance of energy-efficient software, especially for portable devices. In conclusion, there is a shortage of studies that link software performance to the use of design patterns. The reason might be that increasing performance is not among the promised advantages of using design patterns. Theoretically, however, design patterns increase flexibility and reusability, which may (or may not) come at the price of decreased performance. Therefore, there is a need to empirically investigate this issue.

5.2.3 Quality attributes from QMOOD quality model: In primary studies PS33, PS39, and PS48, the authors used the QMOOD [48] metrics suite to evaluate the effectiveness of using design patterns. The QMOOD model describes a set of measures that aims at enabling quantitative quality evaluation of internal characteristics that are unique to object-oriented designs. The measures are obtainable during the design phases, and therefore can be used to identify design flaws early in the software development cycle.
QMOOD defines six external properties: effectiveness, extendibility, flexibility, functionality, reusability, and understandability. These are linked to design (internal) properties for which the model provides corresponding object-oriented metrics [48].
In PS33, Ampatzoglou et al. investigated reusability in software packages. According to QMOOD, class reusability is calculated as follows:
$$\text{reusability} = \tfrac{1}{4} \ast \text{cohesion} + 0.5 \ast \text{messaging} + \tfrac{1}{2} \ast \text{size} - \tfrac{1}{4} \ast \text{coupling}$$
The authors computed the average reusability of design pattern classes and compared that with the average reusability of the software package. The results suggest that each pattern has a different level of reusability.
In PS39, Sfetsos et al. evaluated the six quality attributes suggested by QMOOD. These are: effectiveness, extendibility, flexibility, functionality, reusability, and understandability. The authors evaluated quality in software libraries and in standalone applications. The results, as described by the authors, are mixed, as they show that some design patterns have a stronger effect on quality in standalone applications than in libraries, while some other patterns have the opposite influence. There was no explanation of this.
In PS48, Hussain et al. evaluated the six quality attributes suggested by QMOOD. The authors also studied the effect of size as a confounding factor, and the effect of design patterns on software evolvability (how changing the number of pattern instances in subsequent releases affects quality attributes). Their results show that generally design patterns improve the reusability and flexibility attributes.
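As an illustration only, three of the metric definitions quoted in this section can be sketched in a few lines of Python: the fault density described for PS44, the snapshot-normalized change proneness described for PS19 and PS40, and the QMOOD reusability index, reading its weights as 1/4, 0.5, 1/2, and 1/4. All function and field names are hypothetical, the size unit (lines of code) is an assumption, and none of this is taken from the tooling of the primary studies.

```python
from dataclasses import dataclass


def fault_density(fault_count: int, class_size: int) -> float:
    """Faults divided by class size (PS44-style normalization).

    The unit of class_size (e.g., lines of code) is an assumption here.
    """
    if class_size <= 0:
        raise ValueError("class_size must be positive")
    return fault_count / class_size


def normalized_change_proneness(pattern_snapshots: int, total_snapshots: int) -> float:
    """Snapshots that involve pattern classes divided by all snapshots
    (the normalization described for PS19 and PS40)."""
    if total_snapshots <= 0:
        raise ValueError("total_snapshots must be positive")
    return pattern_snapshots / total_snapshots


@dataclass(frozen=True)
class DesignProperties:
    """Hypothetical container for the four design property values."""
    cohesion: float
    messaging: float
    size: float
    coupling: float


def qmood_reusability(p: DesignProperties) -> float:
    """Weighted sum with the weights 1/4, 0.5, 1/2 and -1/4 quoted in the text."""
    return 0.25 * p.cohesion + 0.5 * p.messaging + 0.5 * p.size - 0.25 * p.coupling


if __name__ == "__main__":
    # Made-up input values, purely to show how the functions are called.
    print(fault_density(3, 150))
    print(normalized_change_proneness(12, 48))
    print(qmood_reusability(DesignProperties(0.5, 10.0, 20.0, 4.0)))
```

Normalizing by size or by the number of snapshots is what makes values comparable across classes and systems of different scale, which is the confounding concern raised for the class size effect.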
5.3 What are the common threats to validity? • Mortality, which refers to having some subjects giving up on a
task before completing it: This threat was reported in primary studies
Validity is concerned with the credibility of results obtained by an PS11, PS14, and PS25.
empirical study. As pointed out by Yin [54], and Runeson and Mar- • Learning and fatigue effect: This threat was reported in primary
tin [25], there are four important aspects of validity, these are: (1) studies: PS11, PS25, PS26, PS27, and PS29.
Construct validity, (2) Internal validity, (3) External validity, and • Differences between subject groups (group balancing): This threat
(4) Reliability. Based on the threats to validity reported in the pri- was reported in primary studies PS5 and PS11.
mary studies, we identify the following main validity concerns for • Incorrect time measurement: This threat was reported in primary
empirical studies on evaluating the effect of using design patterns on studies PS5 and PS7.
software quality. • Communication and team work skills of subjects. Reported in
primary study PS1.
5.3.1 Construct Validity: Construct validity refers to the meaningfulness of measurements [55]: do the measurements actually measure what they claim to measure? The following common threats were reported in the primary studies:
• Accuracy of the pattern mining approach: Many tools are available that can mine design patterns from the design, the source code, or the executable code. However, all of these tools have limitations and can miss some patterns (i.e., false negatives). The same issue occurs when mining for patterns is performed manually. When the design patterns in the studied programs are documented (e.g., JHotDraw), unintentional patterns (i.e., pattern realizations that occur in the code unintentionally) are excluded. Pattern mining concerns are discussed in the following primary studies: PS10, PS16, PS19, PS21, PS23, PS28, PS34, PS38, PS42, and PS50.
One way to limit this threat is to use both mining approaches, manual inspection and tool-based mining. However, manual inspection is hard to perform when the subject programs are large. Therefore, improving the accuracy of design pattern mining tools is important to limit this threat.
• Assessment of changeability: In the primary studies, the changeability of design patterns is assessed by measuring changes in the classes that play a role in a design pattern. However, changes that occur in a class playing a role in a pattern are not necessarily related to the use of the pattern. The class might provide services to clients that require frequent changes, whether a pattern is used or not. This threat was reported in primary studies PS21, PS24, and PS28.
Another changeability assessment concern is treating all changes equally. As reported by Bieman et al. in PS9, some changes require more effort than others. In PS3 and PS9, Bieman et al. stated that a large number of changes over many versions should minimize the impact of change effort variability.
In PS19 and PS23, Aversano et al. reported a threat related to the assessment of change impact (changes in classes that interact with patterns). They used change sets to determine the impact of such changes in pattern classes. A change set represents the changes performed by a developer in terms of added, deleted, and altered source code lines.
• Social threats (or social desirability bias): These refer to the tendency of individuals to present themselves in the most favorable manner [56]. These threats appear in controlled experiments and questionnaires. The threats were reported in primary studies PS26, PS27, and PS29.
• Transformation, or preparing alternative solutions, may be error-prone due to the complexity of the solutions. This threat can occur in studies that compare non-pattern solutions with pattern solutions, where the authors have to prepare either solution. The threat was reported in PS47.
• Mapping between quality attributes: This threat can be minimized by using validated metrics. The threat was reported in PS49.
5.3.2 Internal Validity: Internal validity is concerned with cause and effect relationships, where certain conditions are believed to lead to other conditions [54]. In experiments, the following common threats were reported in the primary studies:
• Plagiarism, which occurs in controlled experiments when subjects communicate with each other to solve a task or pass information about the task to other groups. The threat was reported in primary studies PS7, PS11, PS13, PS14, PS17, PS25, and PS27.
• The accuracy of the metrics in measuring the quality attributes. Reported in primary study PS39.
In the case studies, primary studies PS3, PS23, PS38, PS42, and PS47 reported that it is not enough to show statistically significant relationships. It is also important to show temporal precedence, that is, evidence that the cause occurs before the result. To conclude, researchers need to report any confounding factors that might influence the studied relation between dependent and independent variables.
5.3.3 External Validity: External validity is concerned with the ability to generalize results [54]. The following common threats were reported in the primary studies:
• In case studies, generalization of results can be affected by the programming language of the subject programs, the domain of the subject programs, the number of subject programs, the sizes of the subject programs, and the number of realizations (instances) of design patterns in the subject programs. These threats were reported in the following primary studies: PS3, PS4, PS9, PS10, PS14, PS15, PS16, PS17, PS19, PS21, PS23, PS24, PS28, PS29, PS33, PS34, PS35, PS38, PS39, PS42, PS47, PS48, and PS50.
In PS34, Hegedűs et al. reported that a small number of changes can affect the evaluation of changeability in the results of their study. Ng et al. also discussed the effect of the number and type of changes on external validity in PS17.
• In controlled experiments and surveys, generalization of results is affected by the number of participants, the experience level of participants (all of them are either students or senior software engineers), the size and complexity of the tasks, the suitability of the task, the effect of teamwork (since realistic programming is teamwork), program and task representativeness, and maintenance situations. These threats were reported in the following primary studies: PS1, PS5, PS7, PS11, PS13, PS15, PS20, PS25, PS26, and PS27.
5.3.4 Reliability: Reliability is concerned with the ability to replicate the study, where the operations of a study can be repeated with the same results [54]. To facilitate replication of empirical studies, the data sets, the analysis approach, the tools used, the source code of the programs (whether subject programs in case studies or tasks in experiments), and the design pattern detection tool used (if any) need to be publicly available. Moreover, the details of the statistical analysis performed need to be described, and raw data must be available to allow replicating statistical analyses. Reliability concerns were discussed in the following primary studies: PS14, PS19, PS21, PS23, PS42, PS47, PS49, and PS50.
6 Threats to Validity
In this section, we discuss possible threats to the validity of our study. We identified three types of threats to our study: construct validity, internal validity, and external validity.
Construct validity refers to the meaningfulness of measurements, that is, to what extent the measurements represent what is investigated [25, 55]. In a literature review, construct validity threats are related to the identification of primary studies, including the following factors: inappropriate or incomplete search terms in automatic search,
incorrect search method, incomprehensive venues or databases, inappropriate inclusion and exclusion criteria, or lack of expert evaluation [57]. In our study, we performed the search on the candidate article metadata, including the article title, abstract, and keywords. Articles that do not mention the phrase “design pattern” in their metadata might be omitted. However, articles that evaluate design patterns would state that either in the article title or in the abstract and keywords. We searched five digital libraries that cover high-quality publications in software engineering. The inclusion and exclusion criteria that we used ensure that only quality articles are included (at least indexed by SCOPUS). We further assessed the list of articles manually through meetings and discussions between the authors using predefined quality factors. We can be fairly confident that we have identified the most relevant articles related to our study.
Internal validity is concerned with causal relationships, where certain independent variables lead to other dependent variables [25, 55]. In a literature review, internal validity threats are related to data extraction [57, 58]. We extracted data from the primary studies in a two-phase process carried out by the first and second authors, where each author extracted the data from the papers and the two sets of results were then compared. Sometimes we faced difficulties in classifying data because primary studies might use different names for the same quality attribute or metric. In such cases, we depended on the details provided by the primary study to properly classify the results.
External validity is concerned with the generalization of the study findings. The generalizability of this review is limited by the generalizability of the included primary studies [59].
7 Conclusions and Future Work
In this paper, we reported the results of a systematic literature review analyzing the current state of research on the effect of using design patterns on software quality. We identified confounding factors that can affect software quality, which were either studied in previous research or recognized as a threat to validity. Our study has shown that the primary studies provide empirical evidence of the positive effect of documenting design pattern instances on program comprehension and, therefore, maintainability. While this result is not surprising, it has two implications. First, developers should put more effort into adding such documentation, even if in the form of simple comments in the source code. Second, when comparing results of different studies, the effect of documentation has to be considered. Moreover, the cost of documentation needs to be counted. Unintended design patterns also need to be considered. One way to find these patterns is to use a pattern mining tool.
The size of a software module is widely agreed to affect many software quality attributes; in object-oriented software, this is known as the confounding effect of class size. However, the effect of using design patterns on modules, whether classes or sometimes methods, needs further investigation. Some design patterns can affect the size of participating classes, requiring either smaller classes or classes with larger size and larger methods. Furthermore, the pattern responsibility also needs to be considered, as design patterns might perform tasks that increase the size of participating classes. Studying these relations can provide a better understanding of the conflicting results of evaluating design patterns and guide researchers and practitioners on how to evaluate and deploy design patterns.
Modularity and separation of concerns are key concepts in object-oriented software. When using design patterns, providing a modular perspective of a pattern can ease maintaining and extending patterns. From the primary studies, we can identify two main issues when studying the modularity of design patterns. First, it is not clear how to delimit a design pattern as a module. There are many dependencies between design pattern classes and other classes. Considering participating classes only neglects such dependencies. Second, the metrics to be used to measure modularity need to be decided. As with measuring quality attributes, various metrics are available.
Some of the primary studies suggested modularizing design patterns using AOP. The results obtained by these studies are promising (PS6, PS22, and PS41). However, there are some issues related to AOP that need to be handled in order to ease software development via the AO paradigm. Research in AOP needs to solve the technical issues that prevent the wider adoption of the paradigm. However, whether using AOP or another approach, we recommend studying how to improve the modularity of design patterns, since providing modular solutions for patterns can promote software quality and ease maintenance.
Maintainability is the most evaluated external quality attribute. Internal quality attributes, mainly change proneness, fault proneness, and stability, are mapped to maintainability using various quality models. However, comparison of results is hard due to the use of different metrics and the measurement of a variety of artifacts (e.g., pattern classes versus non-pattern classes, pattern classes versus system classes, or pattern classes and their clients versus non-pattern classes). Additionally, the results of primary studies regarding change proneness and fault proneness are contradictory, with little explanation of why such results are obtained. In controlled experiments, our findings agree with the results of Riaz et al. [13], which show that there are subtle differences in the study designs that limit comparison of the results.
The results of our study highlight the current state of evaluating the impact of using design patterns on software quality. We hope that our findings and recommendations will provide an important guide for future studies.
8 Acknowledgments
The authors would like to thank the editors and the anonymous reviewers for their valuable comments and suggestions. We also sincerely thank Professor Sudipto Ghosh from the Department of Computer Science at Colorado State University for his insightful advice and many valuable suggestions to improve the quality of this paper.
9 References
1 Alexander, C., Ishikawa, S., Silverstein, M.: ‘A pattern language: towns, buildings, construction’. vol. 2. (Oxford University Press, 1977)
2 Gamma, E., Helm, R., Johnson, R., Vlissides, J.: ‘Design patterns: elements of reusable object-oriented software’. (Pearson Education, 1994)
3 Schmidt, D., Stal, M., Rohnert, H., Buschmann, F.: ‘Pattern-Oriented Software Architecture, Volume 1: A System of Patterns’. (John Wiley & Sons, 1996)
4 Schmidt, D., Stal, M., Rohnert, H., Buschmann, F.: ‘Patterns for Concurrent and Networked Objects, volume 2 of Pattern-Oriented Software Architecture’. (Wiley, 2000)
5 Prechelt, L., Unger, B., Tichy, W.F., Brossler, P., Votta, L.G.: ‘A controlled experiment in maintenance: comparing design patterns to simpler solutions’, IEEE Transactions on Software Engineering, 2001, 27, (12), pp. 1134–1144
6 Aversano, L., Cerulo, L., Di.Penta, M.: ‘Relating the evolution of design patterns and crosscutting concerns’. In: Seventh IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2007). (Paris, France: IEEE, 2007. pp. 180–192)
7 Gatrell, M., Counsell, S., Hall, T.: ‘Design patterns and change proneness: a replication using proprietary C# software’. In: 16th Working Conference on Reverse Engineering (WCRE’09). (Lille, France: IEEE, 2009. pp. 160–164)
8 Vokáč, M., Tichy, W., Sjøberg, D.I., Arisholm, E., Aldrin, M.: ‘A controlled experiment comparing the maintainability of programs designed with and without design patterns–a replication in a real programming environment’, Empirical Software Engineering, 2004, 9, (3), pp. 149–195
9 Ampatzoglou, A., Frantzeskou, G., Stamelos, I.: ‘A methodology to assess the impact of design patterns on software quality’, Information and Software Technology, 2012, 54, (4), pp. 331–346
10 Nanthaamornphong, A., Carver, J.C.: ‘Design patterns in software maintenance: An experiment replication at University of Alabama’. In: Second International Workshop on Replication in Empirical Software Engineering Research (RESER). (Banff, AB, Canada: IEEE, 2011. pp. 15–24)
11 Ali, M., Elish, M.O.: ‘A Comparative Literature Survey of Design Patterns Impact on Software Quality’. In: International Conference on Information Science and Applications (ICISA). (Suwon, South Korea: IEEE, 2013. pp. 1–7)
12 Ampatzoglou, A., Charalampidou, S., Stamelos, I.: ‘Research state of the art on GoF design patterns: A mapping study’, Journal of Systems and Software, 2013, 86, (7), pp. 1945–1964
13 Riaz, M., Breaux, T., Williams, L.: ‘How have we evaluated software pattern application? a systematic mapping study of research design practices’, Information and Software Technology, 2015, 65, pp. 14–38
14 Weiss, M.: ‘Patterns and their Impact on System Concerns’. In: 13th Annual European Conference on Pattern Languages of Programming (EuroPLoP). (Irsee, Germany, 2008. pp. S2–1–S2–10)
15 Zhang, C., Budgen, D.: ‘What do we know about the effectiveness of software design patterns?’, IEEE Transactions on Software Engineering, 2012, 38, (5),
pp. 1213–1231
16 Mayvan, B.B., Rasoolzadegan, A., Yazdi, Z.G.: ‘The state of the art on design patterns: A systematic mapping of the literature’, Journal of Systems and Software, 2017, 125, pp. 93–118
17 Kitchenham, B.: ‘Procedures for performing systematic reviews’. (Keele, UK, Keele University, 2004. TR/SE-0401)
18 Jabangwe, R., Börstler, J., Šmite, D., Wohlin, C.: ‘Empirical evidence on the link between object-oriented measures and external quality attributes: a systematic literature review’, Empirical Software Engineering, 2015, 20, (3), pp. 640–693
19 Dyba, T., Dingsoyr, T., Hanssen, G.K.: ‘Applying systematic reviews to diverse study types: An experience report’. In: First International Symposium on Empirical Software Engineering and Measurement (ESEM 2007). (IEEE, 2007. pp. 225–234)
20 Scopus: abstract and citation database of peer-reviewed literature. (Elsevier, 2019. https://www.elsevier.com/solutions/scopus)
21 Aridor, Y., Lange, D.B.: ‘Agent design patterns: elements of agent application design’. In: Proceedings of the Second International Conference on Autonomous Agents. (Minneapolis, Minnesota: ACM, 1998. pp. 108–115)
22 Izurieta, C., Bieman, J.M.: ‘How software designs decay: A pilot study of pattern evolution’. In: First International Symposium on Empirical Software Engineering and Measurement (ESEM). (Madrid, Spain: IEEE, 2007. pp. 449–451)
23 Izurieta, C., Bieman, J.M.: ‘A multiple case study of design pattern decay, grime, and rot in evolving software systems’, Software Quality Journal, 2013, 21, (2), pp. 289–323
24 Kothari, C.R.: ‘Research methodology: Methods and techniques’. (New Age International, 2004)
25 Runeson, P., Höst, M.: ‘Guidelines for conducting and reporting case study research in software engineering’, Empirical Software Engineering, 2008, 14, (2), pp. 131–164
26 Guéhéneuc, Y.G.: ‘P-mart: Pattern-like micro architecture repository’. In: Proceedings of the 1st EuroPLoP Focus Group on Pattern Repositories. (Ptidej, 2007. pp. 1–3)
27 Dong, J., Zhao, Y., Peng, T.: ‘A review of design pattern mining techniques’, International Journal of Software Engineering and Knowledge Engineering, 2009, 19, (06), pp. 823–855
28 Tsantalis, N., Chatzigeorgiou, A., Stephanides, G., Halkidis, S.T.: ‘Design pattern detection using similarity scoring’, IEEE Transactions on Software Engineering, 2006, 32, (11), pp. 896–909
29 Guéhéneuc, Y.G., Antoniol, G.: ‘Demima: A multilayered approach for design pattern identification’, IEEE Transactions on Software Engineering, 2008, 34, (5), pp. 667–684
30 Guéhéneuc, Y.G.: ‘Ptidej: A flexible reverse engineering tool suite’. In: Software Maintenance, 2007. ICSM 2007. IEEE International Conference on. (Paris, France: IEEE, 2007. pp. 529–530)
31 Shi, N., Olsson, R.A.: ‘Reverse engineering of design patterns from Java source code’. In: 21st IEEE/ACM International Conference on Automated Software Engineering (ASE’06). (Tokyo, Japan: IEEE, 2006. pp. 123–134)
32 Wohlin, C., Höst, M., Henningsson, K.: ‘Empirical research methods in software engineering’. In: Empirical methods and studies in software engineering. (Springer, 2003. pp. 7–23)
33 Kitchenham, B.A., Pfleeger, S.L., Pickard, L.M., Jones, P.W., Hoaglin, D.C., El.Emam, K., et al.: ‘Preliminary guidelines for empirical research in software engineering’, IEEE Transactions on Software Engineering, 2002, 28, (8), pp. 721–734
34 Robson, C.: ‘Real world research: A resource for social scientists and practitioners-researchers’, Massachusetts: Blackwell Publishers, 1993
35 Forward, A., Lethbridge, T.C.: ‘The relevance of software documentation, tools and technologies: a survey’. In: Proceedings of the 2002 ACM symposium on Document engineering. (McLean, Virginia, USA: ACM, 2002. pp. 26–33)
36 de Souza, S.C.B., Anquetil, N., de Oliveira, K.M.: ‘A study of the documentation essential to software maintenance’. In: Proceedings of the 23rd annual international conference on Design of communication: documenting & designing for pervasive information. (Coventry, United Kingdom: ACM, 2005. pp. 68–75)
37 El.Emam, K., Benlarbi, S., Goel, N., Rai, S.N.: ‘The confounding effect of class size on the validity of object-oriented metrics’, IEEE Transactions on Software Engineering, 2001, 27, (7), pp. 630–650
38 Chidamber, S.R., Kemerer, C.F.: ‘A metrics suite for object oriented design’, IEEE Transactions on Software Engineering, 1994, 20, (6), pp. 476–493
39 Pree, W.: ‘Meta patterns – a means for capturing the essentials of reusable object-oriented design’. In: European Conference on Object-Oriented Programming. ECOOP ’94. (London, UK: Springer-Verlag, 1994. pp. 150–162)
40 Wedyan, F., Ghosh, S., Vijayasarathy, L.R.: ‘An approach and tool for measurement of state variable based data-flow test coverage for aspect-oriented programs’, Information and Software Technology, 2015, 59, pp. 233–254
41 Eaddy, M., Zimmermann, T., Sherwood, K.D., Garg, V., Murphy, G.C., Nagappan, N., et al.: ‘Do crosscutting concerns cause defects?’, IEEE Transactions on Software Engineering, 2008, 34, (4), pp. 497–515
42 Eaddy, M., Aho, A., Murphy, G.C.: ‘Identifying, assigning, and quantifying crosscutting concerns’. In: Proceedings of the First International Workshop on Assessment of Contemporary Modularization Techniques. ACoM ’07. (Minneapolis, Minnesota: IEEE, 2007. pp. 2–. Available from: http://dx.doi.org/10.1109/ACOM.2007.4)
43 Wedyan, F., Ghosh, S.: ‘A dataflow testing approach for aspect-oriented programs’. In: High-Assurance Systems Engineering (HASE), 2010 IEEE 12th International Symposium on. (IEEE, 2010. pp. 64–73)
44 The AspectJ Team. ‘AspectJ Compiler 1.8.10’. (2017. http://www.eclipse.org/aspectj/)
45 Roo, A., Hendriks, M., Havinga, W., Durr, P., Bergmans, L.: ‘Compose*: a language- and platform-independent aspect compiler for composition filters’. In: International Workshop on Advanced Software Development Tools and Techniques. (Paphos, Cyprus, 2008. p. 14)
46 Tanter, É., Figueroa, I., Tabareau, N.: ‘Execution levels for aspect-oriented programming: Design, semantics, implementations and applications’, Science of Computer Programming, 2014, 80, pp. 311–342
47 Leger, P., Fukuda, H.: ‘Why do developers not take advantage of the progress in modularity?’. In: Proceedings of the 8th International Conference on Bioinspired Information and Communications Technologies. (ICST, 2014. pp. 388–389)
48 Bansiya, J., Davis, C.G.: ‘A hierarchical model for object-oriented design quality assessment’, IEEE Transactions on Software Engineering, 2002, 28, (1), pp. 4–17
49 Bakota, T., Hegedűs, P., Körtvélyesi, P., Ferenc, R., Gyimóthy, T.: ‘A probabilistic software quality model’. In: Software Maintenance (ICSM), 2011 27th IEEE International Conference on. (Williamsburg, VA, USA: IEEE, 2011. pp. 243–252)
50 Ampatzoglou, A., Chatzigeorgiou, A., Charalampidou, S., Avgeriou, P.: ‘The effect of gof design patterns on stability: A case study’, IEEE Transactions on Software Engineering, 2015, 41, (8), pp. 781–802
51 ISO 9126: ‘Information Technology: Software product evaluation, quality characteristics and guidelines for their use’, International Organization for Standardization, 1992
52 ISO 25010: ‘ISO/IEC 25010:2011: 2011 Systems and software engineering–Systems and software Quality Requirements and Evaluation (SQuaRE)–System and software quality models’, International Organization for Standardization, 2011, p. 34
53 Bieman, J.M., Jain, D., Yang, H.J.: ‘OO design patterns, design structure, and program changes: an industrial case study’. In: Proceedings of the IEEE International Conference on Software Maintenance. (Florence, Italy: IEEE, 2001. pp. 580–589)
54 Yin, R.K.: ‘Case study research: Design and methods’. 4th ed. (Sage publications, 2009)
55 Kerlinger, F.N., Lee, H.B.: ‘Foundations of behavioral research’. (Wadsworth Publishing, 1999)
56 Fisher, R.J.: ‘Social desirability bias and the validity of indirect questioning’, Journal of Consumer Research, 1993, 20, (2), pp. 303–315
57 Zhou, X., Jin, Y., Zhang, H., Li, S., Huang, X.: ‘A map of threats to validity of systematic literature reviews in software engineering’. In: 2016 23rd Asia-Pacific Software Engineering Conference (APSEC). (Hamilton, New Zealand: IEEE, 2016. pp. 153–160)
58 Afacan, T.: ‘State Design Pattern Implementation of a DSP processor: A case study of TMS5416C’. In: 6th IEEE International Symposium on Industrial Embedded Systems (SIES). (Vasteras, Sweden: IEEE, 2011. pp. 67–70)
59 Petersen, K., Vakkalanka, S., Kuzniarz, L.: ‘Guidelines for conducting systematic mapping studies in software engineering: An update’, Information and Software Technology, 2015, 64, pp. 1–18
10 List of Primary Studies
PS1 Unger, B., Tichy, W.F.: ‘Do design patterns improve communication? an experiment with pair design’. In: International Workshop Empirical Studies of Software Maintenance. (Limerick, Ireland: ACM, 2000. pp. 1–5)
PS2 Huston, B.: ‘The effects of design pattern application on metric scores’, Journal of Systems and Software, 2001, 58, (3), pp. 261–269
PS3 Bieman, J.M., Jain, D., Yang, H.J.: ‘OO design patterns, design structure, and program changes: an industrial case study’. In: Proceedings of the IEEE International Conference on Software Maintenance. (Florence, Italy: IEEE, 2001. pp. 580–589)
PS4 McNatt, W.B., Bieman, J.M.: ‘Coupling of design patterns: Common practices and their benefits’. In: 25th Annual International Computer Software and Applications Conference (COMPSAC). (Chicago, IL, USA: IEEE, 2001. pp. 574–579)
PS5 Prechelt, L., Unger, B., Tichy, W.F., Brossler, P., Votta, L.G.: ‘A controlled experiment in maintenance: comparing design patterns to simpler solutions’, IEEE Transactions on Software Engineering, 2001, 27, (12), pp. 1134–1144
PS6 Hannemann, J., Kiczales, G.: ‘Design pattern implementation in java and aspectj’. In: Proceedings of the 17th ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications. OOPSLA ’02. (Seattle, Washington, USA: ACM, 2002. pp. 161–173)
PS7 Prechelt, L., Unger.Lamprecht, B., Philippsen, M., Tichy, W.F.: ‘Two controlled experiments assessing the usefulness of design pattern documentation in program maintenance’, IEEE Transactions on Software Engineering, 2002, 28, (6), pp. 595–606
PS8 Baudry, B., Traon, Y., Sunyé, G., Jézéquel, J.M.: ‘Measuring and improving design patterns testability’. In: Proceedings Ninth International Software Metrics Symposium. (Sydney, NSW, Australia: IEEE, 2003. pp. 50–59)
PS9 Bieman, J.M., Straw, G., Wang, H., Munger, P.W., Alexander, R.T.: ‘Design patterns and change proneness: An examination of five evolving systems’. In: Proceedings of the Ninth international Software metrics symposium. (Sydney, NSW, Australia: IEEE, 2003. pp. 40–49)
PS10 Vokáč, M.: ‘Defect frequency and design patterns: An empirical study of industrial code’, IEEE Transactions on Software Engineering, 2004, 30, (12), pp. 904–917
PS11 Vokáč, M., Tichy, W., Sjøberg, D.I., Arisholm, E., Aldrin, M.: ‘A controlled experiment comparing the maintainability of programs designed with and without design patterns–a replication in a real programming environment’, Empirical Software Engineering, 2004, 9, (3), pp. 149–195
PS12 Rudzki, J.: ‘How design patterns affect application performance–a case of a multi-tier J2EE application’. In: FIDJI’04 Proceedings of the 4th international conference on Scientific Engineering of Distributed Java Applications. (Luxembourg-Kirchberg, Luxembourg: Springer, 2005. pp. 12–23)
PS13 Ng, T., Cheung, S., Chan, W., Yu, Y.: ‘Toward effective deployment of design patterns for software extension: a case study’. In: Proceedings of the 2006 international workshop on Software quality. (Shanghai, China: ACM, 2006. pp. 51–56)
PS14 Ng, T., Cheung, S., Chan, W.K., Yu, Y.T.: ‘Work experience versus refactoring to design patterns: a controlled experiment’. In: Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering. (Portland, Oregon, USA: ACM, 2006. pp. 12–22)
PS15 Ellis, B., Stylos, J., Myers, B.: ‘The factory pattern in API design: A usability evaluation’. In: Proceedings of the 29th international conference on Software Engineering. (Minneapolis, MN, USA: IEEE, 2007. pp. 302–312)
PS16 Aversano, L., Cerulo, L., Di.Penta, M.: ‘Relating the evolution of design patterns and crosscutting concerns’. In: Seventh IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2007). (Paris, France: IEEE, 2007. pp. 180–192)
PS17 Ng, T., Cheung, S., Chan, W., Yu, Y.T.: ‘Do maintainers utilize deployed design patterns effectively?’. In: Proceedings of the 29th international conference on Software Engineering. (Minneapolis, MN, USA: IEEE, 2007. pp. 168–177)
PS18 Ampatzoglou, A., Chatzigeorgiou, A.: ‘Evaluation of object-oriented design patterns in game development’, Information and Software Technology, 2007, 49, (5), pp. 445–454
PS19 Aversano, L., Canfora, G., Cerulo, L., Del.Grosso, C., Di.Penta, M.: ‘An empirical study on the evolution of design patterns’. In: Proceedings of the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering. (Dubrovnik, Croatia: ACM, 2007. pp. 385–394)
PS20 Khomh, F., Guéhéneuc, Y.G.: ‘Do design patterns impact software quality positively?’. In: 12th European Conference on Software Maintenance and Reengineering (CSMR). (Athens, Greece: IEEE, 2008. pp. 274–278)
PS21 Penta, M.D., Cerulo, L., Guéhéneuc, Y.G., Antoniol, G.: ‘An empirical study of the relationships between design pattern roles and class change proneness’. In: IEEE International Conference on Software Maintenance (ICSM). (Beijing, China: IEEE, 2008. pp. 217–226)
PS22 Garcia, A., Sant’Anna, C., Figueiredo, E., Kulesza, U., Lucena, C., von Staa, A.: ‘Modularizing design patterns with aspects: a quantitative study’. In: Transactions on Aspect-Oriented Software Development I. (Springer, 2006. pp. 36–74)
PS23 Aversano, L., Cerulo, L., Di.Penta, M.: ‘Relationship between design patterns defects and crosscutting concern scattering degree: an empirical study’, IET Software, 2009, 3, (5), pp. 395–409
PS24 Gatrell, M., Counsell, S., Hall, T.: ‘Design patterns and change proneness: a replication using proprietary C# software’. In: 16th Working Conference on Reverse Engineering (WCRE’09). (Lille, France: IEEE, 2009. pp. 160–164)
PS25 Garzás, J., García, F., Piattini, M.: ‘Do rules and patterns affect design maintainability?’, Journal of Computer Science and Technology, 2009, 24, (2), pp. 262–272
PS26 Jeanmart, S., Gueheneuc, Y.G., Sahraoui, H., Habra, N.: ‘Impact of the visitor pattern on program comprehension and maintenance’. In: Proceedings of the 3rd International Symposium on Empirical Software Engineering and Measurement. (Lake Buena Vista, Florida, USA: IEEE, 2009. pp. 69–78)
PS27 Scanniello, G., Gravino, C., Risi, M., Tortora, G.: ‘A controlled experiment for assessing the contribution of design pattern documentation on software maintenance’. In: Proceedings of the 2010 ACM-IEEE International Symposium on Empirical Software Engineering and Measurement. (Bolzano-Bozen, Italy: ACM, 2010. article 52)
PS28 Gatrell, M., Counsell, S.: ‘Design patterns and fault-proneness a study of commercial C# software’. In: Fifth International Conference on Research Challenges in Information Science (RCIS). (Gosier, France: IEEE, 2011. pp. 1–8)
PS29 Gravino, C., Risi, M., Scanniello, G., Tortora, G.: ‘Does the documentation of design pattern instances impact on source code comprehension? Results from two controlled experiments’. In: 18th Working Conference on Reverse Engineering (WCRE). (Limerick, Ireland: IEEE, 2011. pp. 67–76)
PS30 Hsueh, N.L., Wen, L.C., Ting, D.H., Chu, W., Chang, C.H., Koong, C.S.: ‘An approach for evaluating the effectiveness of design patterns in software evolution’. In: IEEE 35th Annual Computer Software and Applications Conference Workshops (COMPSACW). (Munich, Germany: IEEE, 2011. pp. 315–320)
PS31 Ampatzoglou, A., Kritikos, A., Arvanitou, E.M., Gortzis, A., Chatziasimidis, F., Stamelos, I.: ‘An empirical investigation on the impact of design pattern application on computer game defects’. In: Proceedings of the 15th International Academic MindTrek Conference: Envisioning Future Media Environments. (Tampere, Finland: ACM, 2011. pp. 214–221)
PS32 Posnett, D., Bird, C., Devanbu, P.: ‘An empirical study on the influence of pattern roles on change-proneness’, Empirical Software Engineering, 2011, 16, (3), pp. 396–423
PS36 Gatrell, M., Counsell, S.: ‘Faults and their relationship to implemented patterns, coupling and cohesion in commercial c# software’, International Journal of Information System Modeling and Design (IJISMD), 2012, 3, (2), pp. 69–88
PS37 Ng, T.H., Yu, Y.T., Cheung, S.C., Chan, W.K.: ‘Human and program factors affecting the maintenance of programs with deployed design patterns’, Information and Software Technology, 2012, 54, (1), pp. 99–118
PS38 Izurieta, C., Bieman, J.M.: ‘A multiple case study of design pattern decay, grime, and rot in evolving software systems’, Software Quality Journal, 2013, 21, (2), pp. 289–323
PS39 Sfetsos, P., Ampatzoglou, A., Chatzigeorgiou, A., Deligiannis, I., Stamelos, I.: ‘A comparative study on the effectiveness of patterns in software libraries and standalone applications’. In: 9th International Conference on the Quality of Information and Communications Technology (QUATIC). (Guimaraes, Portugal: IEEE, 2014. pp. 145–150)
PS40 Rossi, B., Russo, B.: ‘Evolution of design patterns: a replication study’. In: Proceedings of the 8th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement. (Torino, Italy: ACM, 2014. article 38)
PS41 Cacho, N., Sant’Anna, C., Figueiredo, E., Dantas, F., Garcia, A., Batista, T.: ‘Blending design patterns with aspects: A quantitative study’, Journal of Systems and Software, 2014, 98, pp. 117–139
PS42 Jaafar, F., Guéhéneuc, Y.G., Hamel, S., Khomh, F., Zulkernine, M.: ‘Evaluating the impact of design pattern and anti-pattern dependencies on changes and faults’, Empirical Software Engineering, 2015, 21, (3), pp. 896–931
PS43 Scanniello, G., Gravino, C., Risi, M., Tortora, G., Dodero, G.: ‘Documenting design-pattern instances: a family of experiments on source-code comprehensibility’, ACM Transactions on Software Engineering and Methodology (TOSEM), 2015, 24, (3), pp. 14
PS44 Elish, M.O., Mohammed, M.A.: ‘Quantitative analysis of fault density in design patterns: An empirical study’, Information and Software Technology, 2015, 66, pp. 58–72
PS45 Ampatzoglou, A., Chatzigeorgiou, A., Charalampidou, S., Avgeriou, P.: ‘The effect of gof design patterns on stability: A case study’, IEEE Transactions on Software Engineering, 2015, 41, (8), pp. 781–802
PS46 Walter, B., Alkhaeir, T.: ‘The relationship between design patterns and code smells: An exploratory study’, Information and Software Technology, 2016, 74, pp. 127–142
PS47 Feitosa, D., Alders, R., Ampatzoglou, A., Avgeriou, P., Nakagawa, E.Y.: ‘Investigating the effect of design patterns on energy consumption’, Journal of Software: Evolution and Process, 2017, 29, (2), pp. e1851
PS48 Hussain, S., Keung, J., Khan, A.A.: ‘The effect of gang-of-four design patterns usage on design quality attributes’. In: IEEE International Conference on Software Quality, Reliability and Security (QRS). (Prague, Czech Republic: IEEE, 2017. pp. 263–273)
PS49 Charalampidou, S., Ampatzoglou, A., Avgeriou, P., Sencer, S., Arvanitou, E.M., Stamelos, I.: ‘A theoretical model for capturing the impact of design patterns on quality: the decorator case study’. In: The 32nd ACM SIGAPP Symposium On Applied Computing. (Marrakech, Morocco: ACM, 2017. pp. 1231–1238)
PS50 Feitosa, D., Ampatzoglou, A., Avgeriou, P., Nakagawa, E.Y.: ‘Correlating pattern grime and quality attributes’, IEEE Access, 2018, 6, pp. 23065–23078
PS33 Ampatzoglou, A., Kritikos, A., Kakarontzas, G., Stamelos, I.: ‘An empiri-
cal investigation on the reusability of design patterns and software packages’,
Journal of Systems and Software, 2011, 84, (12), pp. 2265–2283
PS34 Hegedűs, P., Bán, D., Ferenc, R., Gyimóthy, T.: ‘Myth or reality? analyzing the
effect of design patterns on software maintainability’. In: Computer Applica-
tions for Software Engineering, Disaster Recovery, and Business Continuity.
(Springer, 2012. pp. 138–145)
PS35 Ampatzoglou, A., Frantzeskou, G., Stamelos, I.: ‘A methodology to assess
the impact of design patterns on software quality’, Information and Software
Technology, 2012, 54, (4), pp. 331–346