Ph.D. Defence - Jia Hui Liang

Thursday, June 28, 2018 1:00 pm - 1:00 pm EDT (GMT -04:00)

Candidate: Jia Hui Liang

Title: Machine Learning for SAT Solvers

Date: June 28, 2018

Time: 1:00 PM

Place: EIT 3142

Supervisor(s): Czarnecki, Krzysztof - Ganesh, Vijay


SAT solvers are indispensable tools in a variety of domains where efficient search is required. Not only does this reduce the burden of the users since the users avoid the need to implement their own search algorithm, they also leverage the surprising effectiveness of modern SAT solvers. Thanks to many decades of cumulative effort, researchers have made persistent improvements to SAT technology to the point where nowadays the best solvers are routinely used to solve extremely large instances with millions of variables. Even though our current paradigm of SAT solvers runs in worst-case exponential time, it appears that the techniques and heuristics embedded in these solvers avert the worst-case exponential time in practice. The implementations of these various solver heuristics and techniques are vital to the solvers effectiveness in practice.

The state-of-the-art heuristics and techniques gather data during the run of the solver to inform their decisions like which variable to branch on or when to invoke a restart. The methods in which these heuristics and techniques process the data generally do not have theoretical underpinnings. Consequently, understanding why these heuristics and techniques perform so well in practice remains a challenge and systematically improving them is rather difficult. This goes to the heart of this thesis, that is to utilize machine learning to process the data. Research in machine learning exploded over the past decade due to its success in extracting useful information out of large volumes of data. Machine learning outclasses manual handcoding in a wide variety of complex tasks where data is plentiful. This is also the case in modern SAT solvers where propagations, conflict analysis, and clause learning produces plentiful of data to be analyzed, and exploiting this data to the fullest is naturally where machine learning comes in. Many machine learning techniques have a theoretical basis that makes them easy to analyze and understand why they perform well.

The branching heuristic is the first target for injecting machine learning. First I studied extant branching heuristics to understand what makes a branching heuristics good empirically. The fundamental observation is that good branching heuristics incur lots of clause learning by triggering conflicts as quickly as possible. This suggests that variables that cause conflicts are a valuable source of data. Another important observation is that the state-of-the-art VSIDS branching heuristic internally implements an exponential moving average. This highlights the importance of accounting for the temporal nature of the data when deciding to branch. These observations led me to propose a series of machine learning-based branching heuristics with the common goal of selecting the branching variables to increase probability of inducing conflicts. These branching heuristics are shown empirically to either be on par or outcompete the current state-of-the art.

The second area of interest for machine learning is the restart policy. Just like in the branching heuristic work, I first study restarts to observe why they are effective in practice. The important observation here is that restarts shrink the assignment stack as conjectured by other researchers. I show that this leads to better clause learning by lowering the LBD of learnt clauses. Machine learning is used to predict the LBD of the next clause, and a restart is triggered when the LBD is excessively high. This policy is shown to be on par with state-of-the-art. The success of incorporating machine learning into branching and restarts goes to show that machine learning has an immportant role in the future of heuristic and technique design for SAT solvers.