Abstract
Clone has emerged as a controversial term in software engineering research and practice. The impact of clones is of great importance from a software maintenance perspective. Stability is a well-investigated criterion for assessing the impact of clones on software maintenance. If code clones exhibit higher instability (i.e., higher change-proneness) than non-cloned code, then we can expect that code clones require higher maintenance effort and cost than non-cloned code. A number of studies have compared the stability of cloned and non-cloned code. However, these studies have not reached a consensus. While some studies show that code clones are more stable than non-cloned code, others provide empirical evidence of higher instability of code clones. The possible reasons behind these contradictory findings are that different studies investigated different aspects of stability, using different clone detection tools, on different subject systems, with different experimental setups. Also, the subject systems did not cover a wide variety. Addressing these issues (along with several others mentioned in the motivation), we have conducted a comprehensive empirical study in which we have (i) implemented and investigated seven existing methodologies that explored different aspects of stability, (ii) used two clone detection tools (NiCad and CCFinderX) to implement each of these seven methodologies, and (iii) investigated the stability of three types of clones (Type-1, Type-2, Type-3).
Our investigation on 12 diverse subject systems covering three programming languages (Java, C, C#) with a list of 8 stability assessment metrics suggests that (i) cloned code is often more unstable (change-prone) than non-cloned code in the maintenance phase, (ii) both Type-1 and Type-3 clones appear to exhibit higher instability than Type-2 clones, (iii) clones in Java and C are more change-prone than clones in C#, and (iv) changes to clones in procedural programming languages seem to be more dispersed than changes to clones in object-oriented languages. We also systematically replicated the original studies with their original settings and obtained results mostly equivalent to those of the original studies. We believe that our findings are important for prioritizing code clones from a management perspective.
Notes
SourceForge: http://sourceforge.net/
Additional information
Communicated by: Miryung Kim
About this article
Cite this article
Mondal, M., Rahman, M.S., Roy, C.K. et al. Is cloned code really stable?. Empir Software Eng 23, 693–770 (2018). https://doi.org/10.1007/s10664-017-9528-y
DOI: https://doi.org/10.1007/s10664-017-9528-y