Cost and Benefit of Tracing Features with Embedded Annotations

Title
Cost and Benefit of Tracing Features with Embedded Annotations
Author
Abstract

Features are commonly used to describe the functional and non-functional characteristics of software. Agile development methods in particular, such as Scrum, FDD, and XP, use features to plan and manage software development. Features are often the main units of software reuse, communication, and configuration, abstracting over code details. In the age of generative AI, where feature requirements are specified as prompts and substantial code is cloned, codebases are becoming increasingly complex and redundant. This requires raising the level of abstraction at which we manage and evolve software systems. However, using features effectively requires knowing their precise locations in the codebase, which is especially challenging when a feature is scattered across it. Once a feature is implemented, knowledge about its location quickly deteriorates as the software evolves or development teams change, requiring expensive recovery of feature locations. This decades-old problem is known in software engineering as the feature-location or concept-assignment problem, which researchers have tried for decades, without success, to address with automated feature-location recovery techniques.

The problem lies in the common belief that recording and maintaining feature locations during development is laborious and error-prone. In this study, we argue to the contrary. We hypothesize that such information can be effectively embedded into codebases, and that the arising costs will be amortized by the benefits of this information. We validated this hypothesis in a simulation study with three subject systems: a smaller open-source system, a large commercial firmware system, and an open-source mobile app. We designed a lightweight code annotation technique and simulated its use as if annotations had been added, maintained, and exploited during the original development. We identified evolution patterns and measured the cost and benefit of these annotations. Our results show that both the cost of adding annotations and the cost of maintaining them are negligible compared to the development and maintenance costs of the actual code. Embedding the annotations into the codebase significantly reduced their maintenance effort, because they naturally co-evolved with the code. The annotations provided a benefit for feature-related maintenance tasks, such as feature cloning or merging the clones into an integrated codebase, that exceeded the costs of using them.

Year of Publication
2025
Journal
ACM Transactions on Software Engineering and Methodology
Date Published
08/2025
Type of Article
journal
URL
https://dl.acm.org/doi/10.1145/3746060
DOI
10.1145/3746060
Refereed Designation
refereed