Show simple item record

dc.contributor.author | Sourav, Sumit | en
dc.date | 2015-09-29 14:43:57.365
dc.date.accessioned | 2015-10-01T23:49:27Z
dc.date.available | 2015-10-01T23:49:27Z
dc.date.issued | 2015-10-01
dc.identifier.uri | http://hdl.handle.net/1974/13718
dc.description | Thesis (Master, Computing) -- Queen's University, 2015-09-29 14:43:57.365 | en
dc.description.abstract | Code clones are code snippets that come into existence when developers copy and paste (and possibly modify) an existing piece of code. Studies show that cloning is an inevitable phenomenon that leads to a significant presence of code clones in large software systems (cloned code makes up as much as 10%-30% of the source code). To effectively manage these clones, researchers have proposed multiple activities along two dimensions: 1) proactive clone management, and 2) post-mortem clone management. Proactive clone management emphasizes activities that prevent the introduction of new clones into the source code (e.g., identifying the factors that influence developers to clone code). Post-mortem clone management, on the other hand, focuses on managing the existing clones (e.g., detection or refactoring of a clone). In this thesis, we examine several open issues along both dimensions of clone management activities. For example, over 80% of research focuses on the detection of clones and on studying their impact on code quality, whereas limited research has examined the factors that make code more likely to be cloned. We find that an increase in the complexity of a method increases the likelihood of code being cloned from that method. Moreover, while there exist more than 70 clone detection tools, few techniques are available to evaluate the performance (especially the recall) of these tools. We find that the current state-of-the-art framework for the evaluation of clone detection tools tends to overestimate their recall. Hence, we propose a statistically rigorous framework to evaluate the recall of clone detection tools. In addition, little research has examined the life expectancy of an introduced clone. We find that the characteristics of a clone at the time of its introduction (e.g., the size of the clone, or the directory-structure distance between clone siblings) highly influence its life expectancy.
Practitioners and researchers can leverage our findings for: a) effective proactive clone management (e.g., by developing tools that propose the abstraction of code that is likely to be cloned), and b) effective post-mortem clone management, by leveraging our framework to more accurately evaluate the recall of clone detection tools and by determining whether an introduced clone will be short-lived or long-lived in order to efficiently recommend clone management activities (e.g., annotation or refactoring of clones, especially the long-lived ones). | en
dc.language.iso | eng | en
dc.relation.ispartofseries | Canadian theses | en
dc.rights | Queen's University's Thesis/Dissertation Non-Exclusive License for Deposit to QSpace and Library and Archives Canada | en
dc.rights | ProQuest PhD and Master's Theses International Dissemination Agreement | en
dc.rights | Intellectual Property Guidelines at Queen's University | en
dc.rights | Copying and Preserving Your Thesis | en
dc.rights | Creative Commons - Attribution - CC BY | en
dc.rights | This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner. | en
dc.subject | Clone Management | en
dc.subject | Survival Analysis | en
dc.subject | Clone Detection | en
dc.title | Leveraging Historical Code Changes to Support Clone Management Activities | en
dc.type | thesis | en
dc.description.degree | M.Sc. | en
dc.contributor.supervisor | Hassan, Ahmed E. | en
dc.contributor.department | Computing | en
dc.degree.grantor | Queen's University at Kingston | en

