Title: DataMill: Rigorous Performance Evaluation Made Easy
Publication Type: Conference Paper
Year of Publication: 2013
Authors: Oliveira, A., J-C. Petkovich, T. Reidemeister, and S. Fischmeister
Conference Name: Proc. of the 4th ACM/SPEC International Conference on Performance Engineering (ICPE)
Conference Location: Prague, Czech Republic
Keywords: benchmarking, empirical evaluation, systems
Empirical systems research faces a dilemma: minor aspects of an experimental setup can have a significant impact on its associated performance measurements and potentially invalidate the conclusions drawn from them. Examples of such influences, often called hidden factors, include binary link order, process environment size, compiler-generated randomized symbol names, and group scheduler assignments. The growing complexity and size of modern systems will further aggravate this dilemma, especially given the time pressure to produce results. How, then, can one trust any reported empirical analysis of a new idea or concept in computer science?
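One of the hidden factors named above, process environment size, can be probed directly: the bytes occupied by environment variables shift the initial stack layout of a child process and can measurably perturb its runtime. The sketch below is a minimal, hypothetical illustration of that idea (it is not DataMill's implementation); the command, padding sizes, and variable name `BENCH_PADDING` are assumptions chosen for the example.

```python
import os
import statistics
import subprocess
import time

def measure(cmd, pad_bytes, trials=5):
    """Median wall-clock time of `cmd` with the environment
    inflated by `pad_bytes` bytes of padding."""
    env = dict(os.environ)
    # Hypothetical padding variable; its only purpose is to
    # change the total size of the process environment.
    env["BENCH_PADDING"] = "x" * pad_bytes
    times = []
    for _ in range(trials):
        start = time.perf_counter()
        subprocess.run(cmd, env=env, check=True,
                       stdout=subprocess.DEVNULL)
        times.append(time.perf_counter() - start)
    return statistics.median(times)

if __name__ == "__main__":
    # Compare medians across several environment sizes for a
    # trivial command; a real benchmark would substitute the
    # workload under study.
    for pad in (0, 1024, 4096):
        print(f"padding={pad:5d}B  median={measure(['true'], pad):.6f}s")
```

A rigorous evaluation would treat such a factor as an experimental variable, randomizing or sweeping it across trials rather than leaving it fixed at whatever value the build machine happens to have.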