An Exploration of Challenges Limiting Pragmatic Software Defect Prediction
Software Defect Prediction, Software Engineering, Software Maintenance
Software systems continue to play an increasingly important role in our daily lives, making the quality of software systems an extremely important issue. Therefore, a significant amount of recent research has focused on the prioritization of software quality assurance efforts. One line of work that has been receiving an increasing amount of attention is Software Defect Prediction (SDP), where predictions are made to determine where future defects might appear. Our survey showed that in the past decade, more than 100 papers were published on SDP. Nevertheless, the adoption of SDP in practice to date is limited. In this thesis, we survey the state of the art in SDP in order to identify the challenges that hinder its adoption in practice. These challenges include the fact that SDP research rarely considers the impact of defects when making predictions, seldom provides guidance on how to use the SDP results, and is too reactive and defect-centric in nature. We propose approaches that tackle these challenges. First, we present approaches that predict high-impact defects. These approaches illustrate how SDP research can be tailored to consider the impact of defects when making predictions. Second, we present approaches that simplify SDP models so they can be easily understood, and we illustrate how these simple models can be used to assist practitioners in prioritizing the creation of unit tests in large software systems. These approaches illustrate how SDP research can provide guidance to practitioners using SDP. Third, we argue that organizations are interested in proactive risk management, which covers more than just defects. For example, risky changes may not introduce defects, but they could delay the release of projects.
Therefore, we present an approach that predicts risky changes, illustrating how SDP can be more encompassing (i.e., by predicting risk, not only defects) and proactive (i.e., by predicting risky changes before they are incorporated into the code base). The presented approaches are empirically validated using data from several large open-source and commercial software systems. The presented research highlights how the challenges of pragmatic SDP can be tackled, making SDP research more beneficial and applicable in practice.