University of Waterloo
200 University Ave W, Waterloo, ON
N2L 3G1
Phone: (519) 888-4567
Staff and Faculty Directory
Contact the Department of Electrical and Computer Engineering
Ming Tan
Online Defect Prediction for Imbalanced Data
Lin Tan
Many defect prediction techniques are proposed to improve software reliability. Change classification predicts defects at the change level, where a change is the modifications to one file in a commit. In this thesis, we conduct the first study of applying change classification in practice and share the lessons we learned.
We identify two issues in the prediction process, both of which contribute to the low prediction performance. First, the data are imbalanced—there are much fewer buggy changes than clean changes. Second, the commonly used cross-validation approach is inappropriate for evaluating the performance of change classification. To address these challenges, we apply and adapt online change classification, resampling, and updatable classification techniques to improve the classification performance.
We perform the improved change classification techniques on one proprietary and six open source projects. Our results show that these techniques improve the precision of change classification by 12.2-89.5% or 6.4–34.8 percentage points (pp.) on the seven projects.
Furthermore, we integrate change classification in the development process of the proprietary project. We have learned the following lessons: 1 ) new solutions are needed to convince developers to use and believe prediction results, and prediction results need to be actionable, 2 ) new and improved classification algorithms are needed to explain the prediction results, and insensible and unactionable explanations need to be filtered or refined, and 3 ) new techniques are needed to improve the relatively low precision.
University of Waterloo
200 University Ave W, Waterloo, ON
N2L 3G1
Phone: (519) 888-4567
Staff and Faculty Directory
Contact the Department of Electrical and Computer Engineering
The University of Waterloo acknowledges that much of our work takes place on the traditional territory of the Neutral, Anishinaabeg and Haudenosaunee peoples. Our main campus is situated on the Haldimand Tract, the land granted to the Six Nations that includes six miles on each side of the Grand River. Our active work toward reconciliation takes place across our campuses through research, learning, teaching, and community building, and is centralized within our Office of Indigenous Relations.