PhD defence - Leonardo Passos

Monday, April 25, 2016 1:00 pm - 1:00 pm EDT (GMT -04:00)

Candidate

Leonardo Passos

Title

Towards Better Understanding Variability Evolution

Supervisor

Krzysztof Czarnecki

Abstract

Variant-rich systems often leverage variability modeling to achieve systematic reuse and mass customization. When doing so, they become variability-aware. Although variability models facilitate variability management, they do not eliminate the variability in other artifacts. In fact, evolving a system’s variability is far from trivial, as variation points spread across different artifacts, possibly at multiple locations. To make matters worse, existing approaches for variability evolution have been widely criticized in practice, with industry reports describing them as ineffective.

Currently, variability evolution is not thoroughly understood. For instance, most work addressing variability evolution scopes its analysis to changes in variability models only, ignoring their relation to the variability evolution in build and implementation files. Moreover, when validating new variability evolution approaches, researchers often rely on randomly generated models or, in some situations, even on fictitious cases. Thus, it is not surprising that variability evolution support is ineffective in practice; rather, we see it as a natural consequence of a lack of understanding of how variability evolves in real-world settings. This thesis seeks to advance such understanding.

With our research goal in mind, we perform an in-depth analysis of variability evolution in large, complex, and real-world variability-aware systems in the systems software domain.

Our starting point is an exploratory analysis of a sample of the evolution of the Linux kernel, the largest and longest-lived open-source variability-aware system. Motivated by the impact of pattern analysis in modern software engineering (e.g., refactoring patterns), we set out to mine evolution patterns from the Linux kernel commit history. Specifically, our patterns focus on the variability evolution induced by adding or removing features in the variability model, capturing how other artifacts (e.g., Makefiles and code) coevolve as a consequence. We identify 23 coevolution patterns and cross-check their properties against the existing literature, revealing limitations in existing approaches and providing insights for tool builders. We also observe how developers implement new features, finding feature scattering to be a recurrent practice. This is particularly interesting, as feature scattering is often criticized in practice. We argue that scattering is not necessarily bad if used with care; in fact, as the Linux kernel case shows, existing systems have achieved long-term evolution while accepting some level of feature scattering. The limits of feature scattering, however, are currently unknown. This is not surprising, as no empirical study investigates feature scattering in the evolution of large and long-lived software systems.

Building on our exploratory analysis of the Linux kernel, we perform two further assessments to strengthen our understanding.

First, we set out to increase the external validity of our patterns by validating them in the context of three other systems: axTLS, Toybox, and uClibc. We find that our patterns cover as much as 64% of all feature addition and removal cases across the evolution of our three chosen subjects; altogether, our validation spans over 20 years of evolution. Moreover, we find 14 patterns whose use goes beyond Linux, and we claim these to be general cases within the systems software domain.

Second, seeking a better understanding of the limits of feature scattering, we return to the Linux kernel evolution. Unlike the pattern mining, this analysis considers the entire Linux kernel commit history rather than a sample, covering over eight years of evolution. Scoped to the scattering of device-driver features, the most common feature type in the Linux kernel, we set out to identify empirical limits within the codebase, including the proportion of scattered features and typical scattering degrees. We also note specific feature types that appear to be more prone to scattering. While we do not claim that the limits we find are universal, our study provides evidence that scattering can go as far as the limits we observe in the Linux kernel implementation.