Candidate: Xiang Zhou
Date: Nov 14, 2022
Time: 10:00am
Location: EIT 3141
Supervisor: Arie Gurfinkel
Abstract: Ensuring memory safety in programs has been an important yet difficult topic of research. Most static analysis approaches rely on the theory of arrays to model memory access. The limitation of the theory of arrays in terms of scalability and compatibility with SAT/SMT solvers is well-known, and there have been many attempts at optimizing either the theory itself or memory encodings based on theory of arrays. In this thesis, we demonstrate that existing arrays-based memory encodings miss potential optimization opportunities by omitting language specific properties such as alignment and pointer arithmetic in C. We present SeaM, a new memory representation for C programs built around a more expressive First-order Theory: the Theory of Memory. We show that by preserving more C language specific rules and properties, the Theory of Memory allows for more thorough optimization methods during eager rewriting of sequences of stores. We introduce two such optimization methods in this thesis. First, we over-approximate pointer comparison with an abstract interpretation-like approach called AddressRangeMap. Second, we compress sequences of stores with StoreMap for faster address offset look-ups. The new memory representation is implemented in SeaBmc, a new BMC tool for LLVM. We evaluate our approach on real-world bounded model checking tasks from the aws-c-common library and Sv-Comp benchmarks and compare it against two existing memory representations in SeaBmc. Our results show that SeaM outperforms the theory of array-based representation and is comparable with the ë based representation.