MASc Seminar: Pluggable Properties for Program Understanding: Ontic Type Checking and Inference

Thursday, March 29, 2018 1:30 pm - 1:30 pm EDT (GMT -04:00)

Candidate: Zhuo Chen

Title: Pluggable Properties for Program Understanding: Ontic Type Checking and Inference

Date: March 29, 2018

Time: 1:30 pm

Place: E5 4047

Supervisor(s): Dietl, Werner M.

Abstract:

Pluggable type systems is a powerful approach to add additional information on types, which can facilitate the understanding of programs. This thesis presents our work on three pluggable type systems for helping both programmers and other program analysis tools to better understand programs.

Pluggable type checking ensures the absence of formalized vulnerabilities. This thesis presents the EOF Value Checker, which prevents unsafe end-of-file (EOF) value comparisons. Unsafe EOF value comparisons can lead infinite loops in programs, and the details are described by the SEI CERT Oracle Coding standard for Java rule FIO-08J. EOF Value Checker examines additional safety-related properties on integer types, and statically ensures no FIO-08J rule violations appear in a given program.

Traditionally, pluggable type inference alleviates manual annotation efforts for type checking. This thesis presents work on extending the purpose of type inference to also produce high-level abstractions for programs. We presents a novel type system called Ontology, which reasons about a coarse abstraction for a given program based on ontic concepts. An ontic concept is a high-level semantic conclusion of concrete types or fields in programs. The goal of Ontology is to produce reasonable semantic abstractions for a given program, so that the produced abstraction may facilitate other program analysis tools on their tasks.

To effectively solve type constraints for Ontology, as foundational work, we extend the solver framework in the Checker Framework Inference from only supporting satisfiability problem (SAT) solvers to also supporting Satisfiability Modulo Theories (SMT) solvers. Then, a new SMT encoding approach based on this extension is proposed for Ontology type system. We also apply this new encoding on Dataflow type system, which is previous work on reasoning about concrete run-time types for the components in programs. Our new encoding makes it be possible to support partially annotated programs.

We evaluate EOF Value Checker on 35 real world projects, and it finds 3 defects, 8 bad coding practices, and no false positives are generated. Ontology  and Dataflow type systems are evaluated on 15 scientific libraries (range from 393 to 86k LoC). For Ontology type system, it summarizes 4937 built-in ontic concepts, and propagates 274 domain-specific concepts in two pre-annotated physical libraries. For Dataflow type system, the new approach summarizes 36k data types. In addition, we manually examine the inference results, and verify the two type systems produce meaningful program abstractions of projects in the benchmark. These results suggest that pluggable type systems can provide confidence on preventing formalized vulnerabilities, and be able to infer high-level abstractions for programs.