    Towards Generalizing Defect Prediction Models

    Zhang_Feng_201601_PHD.pdf (1.129Mb)
    Date
    2016-01-28
    Author
    Zhang, Feng
    Abstract
    Software quality is vital to the success of a software project. Fixing defects is the major activity for continuously improving software quality. Because a real development team usually has limited resources and a tight schedule, it is important to prioritize testing activities and optimize the use of development resources. Predicting defective entities (e.g., files or classes) in advance helps achieve this goal. Defect prediction has attracted considerable attention from both academia and industry in the last decade.

    A typical defect prediction model is built upon software metrics and labelled defect data collected from the historical data of a software project. A defect prediction model can be applied within the same project (within-project defect prediction) or to other projects (cross-project defect prediction). However, due to the diversity in development processes, a defect prediction model is often not transferable and must be rebuilt when the target project changes. Because building and maintaining a defect prediction model for a particular project consumes additional effort, generalizing defect prediction models is of significant interest. A generalized defect prediction model removes the need to rebuild a model for each target project. Moreover, it helps reveal a general relationship between software metrics and defect data.
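
As a minimal sketch of the two settings described above, the following Python example fits a one-metric threshold classifier on one project and applies it both within that project and to a second project. The data, the size metric, and the helper names (`fit_threshold`, `accuracy`) are hypothetical and chosen only for illustration; they are not taken from the thesis.

```python
# Illustrative only: synthetic (metric value, is_defective) pairs, not thesis data.
# Each entity is described by one hypothetical size metric (lines of code).

def accuracy(data, threshold):
    """Fraction of entities where (value >= threshold) matches the defect label."""
    hits = sum((value >= threshold) == label for value, label in data)
    return hits / len(data)

def fit_threshold(train):
    """Pick the metric threshold that maximizes accuracy on the training data."""
    candidates = sorted({value for value, _ in train})
    return max(candidates, key=lambda t: accuracy(train, t))

# Project A: small files; in this toy data, defects appear at roughly 200+ lines.
project_a_train = [(50, False), (80, False), (120, False),
                   (210, True), (300, True), (400, True)]
project_a_test = [(60, False), (150, False), (250, True), (350, True)]

# Project B: much larger files overall; defects appear at roughly 2000+ lines.
project_b = [(500, False), (900, False), (1500, False),
             (2200, True), (3000, True)]

threshold = fit_threshold(project_a_train)
within_acc = accuracy(project_a_test, threshold)  # within-project prediction
cross_acc = accuracy(project_b, threshold)        # cross-project prediction

print(f"threshold learned on A: {threshold}")
print(f"within-project accuracy: {within_acc:.2f}")
print(f"cross-project accuracy:  {cross_acc:.2f}")
```

In this toy setup the threshold learned on project A separates A's held-out entities well but mislabels most of project B, because B's metric values sit on a different scale. That is one simple way a model can fail to transfer.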

    In this thesis, we analyze the feasibility of generalizing defect prediction models. First, we analyze how the distribution of software metric values varies across projects with different context factors (e.g., programming language and system size). We observe that such distributions do vary across projects, but can also be similar across projects with different context factors. Second, we investigate the impact that pre-processing steps (in particular, transformation and aggregation of software metrics) have on the performance of defect prediction models. We find that the pre-processing steps affect the performance of defect prediction models and therefore need to be considered when generalizing them. Finally, we propose two approaches for generalizing defect prediction models, one supervised (requiring training data) and one unsupervised (requiring no training data). Our results show that both approaches are feasible for generalizing defect prediction models.
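
To illustrate why pre-processing matters for generalization, here is a minimal sketch that assumes a simple rank transformation followed by an unsupervised median cutoff. This is a swapped-in technique for illustration, not the approach proposed in the thesis, which the abstract does not detail. The function names and the data are hypothetical.

```python
# Illustrative only: rank transformation plus a fixed cutoff, used here as a
# stand-in technique. It is NOT the thesis's method. All data are synthetic.

def rank_transform(values):
    """Map each metric value to its relative rank in [0, 1] within its own project.

    Assumes at least two values; ties share the rank of their first occurrence.
    """
    ordered = sorted(values)
    n = len(ordered)
    # Position of each value in the project's sorted order, scaled to [0, 1].
    return [ordered.index(v) / (n - 1) for v in values]

def label_unsupervised(values, cutoff=0.5):
    """Flag entities whose relative rank exceeds the cutoff; uses no defect labels."""
    return [rank > cutoff for rank in rank_transform(values)]

# Hypothetical size metrics from two projects on very different scales.
project_a = [50, 80, 120, 210, 300]
project_b = [500, 900, 1500, 2200, 3000]

ranks_a = rank_transform(project_a)
ranks_b = rank_transform(project_b)
flags_b = label_unsupervised(project_b)

print(ranks_a)
print(ranks_b)
print(flags_b)
```

After the transformation, both projects occupy the same [0, 1] range, so a single cutoff can be applied to either one without labelled training data. Whether such a cutoff predicts defects well in practice is an empirical question that this sketch does not answer.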
    URI for this record
    http://hdl.handle.net/1974/13984
    Collections
    • Queen's Graduate Theses and Dissertations
    • School of Computing Graduate Theses

    DSpace software copyright © 2002-2015  DuraSpace
    Contact Us
    Theme by 
    Atmire NV
     

     
