Interval Type Inference: Improvements and Evaluations

Thursday, December 2, 2021 3:00 pm - 3:00 pm EST (GMT -05:00)

Candidate: Di Wang
Title: Interval Type Inference: Improvements and Evaluations

Date: December 2, 2021
Time: 15:00
Place: MS Teams
Supervisor(s): Dietl, Werner

Abstract:
Interval analysis estimates the run-time intervals of numerical expressions in the source code by computing a lower bound and an
upper bound. Interval analysis for integral types is useful in providing facts of the target program to help developers find
issues such as unsafe narrowing cast, out-of-bound array index, numerical overflow/underflow, division-by-zero, or dead branch.

Various approaches are developed to achieve this goal. Pluggable type systems such as the Checker Framework allow developers to
customize type checkers of their own interest by associating a type with a particular property and defining specific type rules
that restrict the program behaviors. However, the type checkers are intra-procedural, which requires manual annotations on all
the subroutines invoked in the method being checked. This annotating effort can be a heavy burden to the development of
large-scale projects.

A solution to reduce the human effort is inter-procedural, whole-program type inference. Whole-program type inference takes
unannotated program as input and outputs an entire typing for the program that type-checks. If no such typing exists, the reason
is either a real type error or a false positive.

Checker Framework Inference is a framework of whole-program type inference built upon the Checker Framework. Constraint variables
and constraints are created throughout the whole program based on syntactical type rules. Then the constraints are encoded and
solved by a solver.

Value Range Inference is a whole-program inference approach for integral range (interval) analysis, which is implemented with
Checker Framework Inference.

This thesis proposes Interval Type Inference, which improves Value Range Inference from the following aspects.

1. Simplify the interval type hierarchy and the representation of interval types. Thereby reduce the size of the SMT encoding.
2. Redefine certain type rules as well as the flow-sensitive refinement on comparison, especially in the context of a loop.
3. Redefine the SMT encoding of constraints including well-formedness constraints, comparison constraints, etc.

To evaluate Interval Type Inference, this thesis conducts experiments on selected open-source projects. The experimental results
show that Interval Type Inference successfully discovers issues including unsafe narrowing cast and use of invalid input.