MASc seminar - Taiyue Liu

Wednesday, July 27, 2016 1:30 pm - 1:30 pm EDT (GMT -04:00)

Candidate

Taiyue Liu

Title

Anti-Patterns for Automatic Program Repairs

Supervisor

Lin Tan

Abstract

Automated program repair has been a hot topic in software engineering. In recent years, we have witnessed many successful tools such as GenProg, SPR, and RSRepair. Given a bug and its test suite, which includes both passing and failing test cases, these tools aim to automatically generate a patch that fixes the bug without developers’ effort. All these tools adopt a “Generate-and-Validate” approach, which deems a tool-generated patch correct as long as it passes all the test cases. However, if a test suite is of poor quality and does not cover all the relevant cases, an incorrect tool-generated patch might pass every test and be regarded as correct. We call such patches, which are incorrect but pass their test suites, overfitted patches.

We perform an in-depth analysis of patches written by developers and of both correct and overfitted patches generated by GenProg and SPR, in order to investigate why overfitted patches are generated. In this work, we propose two orthogonal approaches to filter out overfitted patches: 1) We leverage machine learning techniques with meaningful features to predict the correctness of tool-generated patches. The results show that machine learning techniques cannot preserve correct patches well. In other words, they mis-classify some correct patches as overfitted and filter them out. 2) To better preserve correct tool-generated patches and filter out only overfitted ones, we propose concrete patterns, named anti-patterns, that can efficiently distinguish correct patches from overfitted patches. By embedding anti-patterns into SPR, we successfully filter out 24.2% of the overfitted patches while preserving all the correct patches for our studied bugs. Meanwhile, by filtering out overfitted patches at runtime, anti-patterns speed up SPR by a factor of 1.34 on average. These two orthogonal approaches provide automatic program repair tools with valuable guidance on how to avoid generating overfitted patches.
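
The idea of filtering candidates before validation can be illustrated with a toy sketch. This is not the implementation used in SPR or in the thesis: the Patch class, the "kind" labels, the ANTI_PATTERNS set, and the absolute-value example are all hypothetical and chosen only to show how an overfitted patch can pass a weak test suite and how a simple rule might reject it.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Patch:
    name: str
    kind: str                      # hypothetical edit category of the patch
    program: Callable[[int], int]  # patched version of the buggy function


# Weak test suite for an absolute-value function: (input, expected output).
# It exercises only one negative input, so it cannot expose every wrong patch.
TESTS: List[Tuple[int, int]] = [(3, 3), (-2, 2)]

# Hypothetical anti-pattern set: edit kinds rejected before validation.
ANTI_PATTERNS = {"insert_early_return"}

CANDIDATES = [
    # Overfitted: special-cases the single failing input, wrong for e.g. -5.
    Patch("special_case", "insert_early_return",
          lambda x: 2 if x == -2 else x),
    # Correct: negates every negative input.
    Patch("negate_if_negative", "replace_expression",
          lambda x: -x if x < 0 else x),
]


def passes_tests(patch: Patch) -> bool:
    """Validation step: the patch is accepted if every test passes."""
    return all(patch.program(x) == expected for x, expected in TESTS)


def repair(candidates: List[Patch], use_anti_patterns: bool) -> List[str]:
    """Generate-and-validate loop with an optional anti-pattern filter."""
    accepted = []
    for patch in candidates:
        if use_anti_patterns and patch.kind in ANTI_PATTERNS:
            continue  # rejected without running the test suite
        if passes_tests(patch):
            accepted.append(patch.name)
    return accepted


if __name__ == "__main__":
    print(repair(CANDIDATES, use_anti_patterns=False))
    print(repair(CANDIDATES, use_anti_patterns=True))
```

Without the filter, both candidates pass the two tests, including the special-cased one that is wrong for any other negative input. With the filter, the special-cased candidate is skipped before any test runs, which also hints at why filtering at runtime can save validation time.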