Candidate: Mostafa Jangali
Date: November 15, 2024
Time: 11:00am
Supervisor: Weiyi Shang
Location: Online
Join Zoom Meeting https://uwaterloo.zoom.us/j/93062084910?pwd=WPBBMPTbrkAbb7m3mH7lxuWx8fL12E.1 Meeting ID: 930 6208 4910 Passcode: 725803
Abstract:
Performance is a crucial non-functional requirement of many software systems. Despite the widespread use of performance testing, developers still struggle to construct performance tests and to evaluate their quality. To address these two major challenges, we implement a framework, dubbed ju2jmh, that automatically generates performance microbenchmarks from JUnit tests, and we use mutation testing to study the quality of the generated microbenchmarks. Specifically, we compare our ju2jmh-generated benchmarks against manually-written JMH benchmarks, JMH benchmarks automatically generated with the AutoJMH framework, and direct measurement of system performance with JUnit tests. For this purpose, we conducted a study on three subjects (RxJava, Eclipse Collections, and Zipkin) comprising ~454K source lines of code (SLOC), 2,417 JMH benchmarks (both manually-written and AutoJMH-generated), and 35,084 JUnit tests. We find that the ju2jmh-generated benchmarks consistently outperform both the use of JUnit test execution time and throughput as a proxy for performance and the benchmarks generated by the AutoJMH framework, while being comparable to manually-written JMH benchmarks in terms of stability and ability to detect performance bugs. Moreover, ju2jmh benchmarks cover more of the subject applications' code during benchmark execution than manually-written JMH benchmarks do. Furthermore, ju2jmh benchmarks are generated fully automatically, whereas manually writing JMH benchmarks requires many hours of careful work; our framework can therefore reduce developers' effort to construct microbenchmarks. In addition, we propose an enhancement to ju2jmh that executes similar microbenchmarks in batches, saving execution time while preserving benchmark quality. To the best of our knowledge, this is the first study aimed at assisting developers in fully automated microbenchmark creation and in assessing microbenchmark quality for performance testing.
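To make the core idea concrete, the sketch below illustrates the kind of transformation ju2jmh performs: a plain JUnit test is wrapped in a JMH microbenchmark so that JMH's measurement harness (warmup iterations, forks, statistical aggregation) can time its steady-state execution rather than relying on a single, noisy JUnit run. The test class, method names, and wrapper structure here are hypothetical illustrations, not the actual code that ju2jmh emits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

import org.junit.Assert;
import org.junit.Test;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

// A hypothetical pre-existing JUnit test, of the kind ju2jmh takes as input.
class ObservableTest {
    @Test
    public void testCollectsAllEmittedItems() {
        List<Integer> received = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            received.add(i);
        }
        Assert.assertEquals(100, received.size());
    }
}

// A JMH microbenchmark wrapping the test above, in the spirit of ju2jmh:
// the benchmark body invokes the test method, so JMH repeatedly executes
// and times it under its standard measurement protocol.
@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class ObservableTestBenchmark {
    @Benchmark
    public void benchmarkTestCollectsAllEmittedItems() {
        new ObservableTest().testCollectsAllEmittedItems();
    }
}
```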