Project 14 - Iterative AI-Driven Memory Analysis of Concurrent Data Structures

Graduate Mentor: Tom Lagovet

Graduate Mentor's supervisor: Prof. Trevor Brown

Reasoning about the correctness and efficiency of multithreaded applications is notoriously challenging. Not only is it inherently difficult to consider interactions between independent computing entities working concurrently, but there are also many low-level factors not directly related to algorithmic specifications that can significantly impact an application's behaviour. As a result, there is often a large gap between a data structure's theoretical and observed performance.

Memory layout (i.e., where an application places objects in its address space) is one often-overlooked factor that can have a substantial impact on performance in modern systems. Fixing memory layout issues has been shown to nearly double the performance of certain concurrent data structures. However, diagnosing these issues traditionally requires manually gathering type information for all allocations, visualizing the layout of objects in cache lines and pages, computing cache set utilization statistics for each type, and wading through large amounts of collected data to draw conclusions.
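
To make the cache-line and cache-set terminology concrete, here is a minimal C++ sketch of the arithmetic that such a manual diagnosis involves. It is not part of HeapLENS. The names CacheGeometry, cache_line_of, cache_set_of, same_cache_line, and set_histogram are hypothetical. The default geometry (64-byte lines, 64 sets, simple modulo indexing) is an assumption that matches many L1 data caches but not all hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical cache geometry. The defaults assume 64-byte lines and 64 sets
// (for example, a 32 KiB, 8-way set-associative L1 data cache).
struct CacheGeometry {
    std::uint64_t line_bytes = 64;
    std::uint64_t num_sets = 64;
};

// Index of the cache line that contains the given address.
inline std::uint64_t cache_line_of(std::uint64_t addr,
                                   CacheGeometry g = CacheGeometry{}) {
    assert(g.line_bytes > 0);
    return addr / g.line_bytes;
}

// Cache set the address maps to, assuming simple modulo set indexing.
inline std::uint64_t cache_set_of(std::uint64_t addr,
                                  CacheGeometry g = CacheGeometry{}) {
    assert(g.num_sets > 0);
    return cache_line_of(addr, g) % g.num_sets;
}

// True when two addresses share a cache line. If different threads write to
// them, this is a candidate for false sharing.
inline bool same_cache_line(std::uint64_t a, std::uint64_t b,
                            CacheGeometry g = CacheGeometry{}) {
    return cache_line_of(a, g) == cache_line_of(b, g);
}

// Count how many of the given object start addresses map to each cache set.
// A heavily skewed histogram suggests poor cache set utilization.
inline std::vector<std::size_t> set_histogram(
        const std::vector<std::uint64_t>& addrs,
        CacheGeometry g = CacheGeometry{}) {
    std::vector<std::size_t> counts(g.num_sets, 0);
    for (std::uint64_t a : addrs) {
        ++counts[cache_set_of(a, g)];
    }
    return counts;
}
```

Under these assumptions, objects allocated exactly 4096 bytes apart all land in the same cache set, so a histogram over their addresses would show one crowded set and many empty ones. Doing this by hand for every allocated type is the kind of effort the paragraph above describes.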

Prof. Trevor Brown recently developed a system called HeapLENS to help researchers automatically examine the memory layout of multithreaded applications. HeapLENS is specifically designed to produce compact, high-quality, curated output suitable for AI-driven analysis. This output already enables AI agents to improve application memory layouts by a significant margin, but the current workflow is single-shot: HeapLENS is invoked once, and its output is used once. A natural research direction is therefore to adapt HeapLENS to support repeated interaction with an AI agent, enabling an iterative optimization cycle in which incremental changes are proposed, evaluated, and refined.

The students will work on extending the existing HeapLENS framework to enable iterative AI-driven memory layout optimization. This will involve the design and implementation of new functionality, experimental studies of optimization quality and performance, and comparisons against existing approaches. Strong results may lead to the preparation of a research paper for publication.

This project is best suited for students who are comfortable programming in C++ and have completed CS 240 or CS 240E (or have equivalent knowledge of data structures). Prior experience with multithreaded programming, operating systems, memory management, performance analysis, or concurrent data structures is helpful but not required. Students should be willing to learn new concepts independently, engage with research papers and technical documentation, and use modern AI tools as part of the research and development process.