dc.contributor.author: Chen, Tse-Hsun
dc.contributor.other: Queen's University (Kingston, Ont.). Theses (Queen's University (Kingston, Ont.)) [en]
dc.date: 2012-12-27 11:38:01.181 [en]
dc.date: 2013-01-04 22:58:42.323 [en]
dc.date: 2013-01-08 10:10:37.878 [en]
dc.date.accessioned: 2013-01-14T18:24:57Z
dc.date.available: 2013-01-14T18:24:57Z
dc.date.issued: 2013-01-14
dc.identifier.uri: http://hdl.handle.net/1974/7743
dc.description: Thesis (Master, Computing) -- Queen's University, 2013-01-08 10:10:37.878 [en]
dc.description.abstract: Software is an integral part of our everyday lives, and hence the quality of software is very important. However, improving and maintaining high software quality is a difficult task, and a significant amount of resources is spent on fixing software defects. Previous studies have examined software quality using various measurable aspects of software, such as code size and code change history. Nevertheless, these metrics do not consider all possible factors that are related to defects. For instance, while lines of code may be a good general measure for defects, a large file responsible for simple I/O tasks is likely to have fewer defects than a small file responsible for complicated compiler implementation details. In this thesis, we address this issue by considering the conceptual concerns (or features) of the code. We use a statistical topic modelling approach to approximate the conceptual concerns as topics. We then use topics to study software quality along two dimensions: code quality and code testedness. We perform our studies using three versions of four large real-world software systems: Mylyn, Eclipse, Firefox, and NetBeans. Our proposed topic metrics improve the defect explanatory power (i.e., the goodness of fit of the regression model) of traditional static and historical metrics by 4–314%. We compare one of our metrics, which measures the cohesion of files, with other topic-based cohesion and coupling metrics in the literature and find that our metric gives the greatest improvement in explaining defects over traditional software quality metrics (i.e., lines of code), at 8–55%. We then study how topics can help improve testing processes. By training on previous releases of the subject systems, we can predict which poorly tested topics will be defect-prone in future releases, with a precision of 0.77 and a recall of 0.75. We can map these topics back to files and help allocate code inspection and testing resources.
We show that our approach outperforms traditional prediction-based resource allocation approaches in terms of saving testing and code inspection effort. The results of our studies show that topics can be used to study software quality and support traditional quality assurance approaches. [en_US]
dc.language: en [en]
dc.language.iso: en [en_US]
dc.relation.ispartofseries: Canadian theses [en]
dc.rights: This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner. [en]
dc.subject: Cohesion [en_US]
dc.subject: Coupling [en_US]
dc.subject: Topic Model [en_US]
dc.subject: LDA [en_US]
dc.subject: Code Quality [en_US]
dc.subject: Software Quality [en_US]
dc.title: Studying Software Quality Using Topic Models [en_US]
dc.type: thesis [en_US]
dc.description.degree: Master [en]
dc.contributor.supervisor: Hassan, Ahmed E. [en]
dc.contributor.department: Computing [en]

