PhD Comprehensive Seminar | Juju (Thanin) Quartz, Continuous Action Space Reinforcement Learning for Control Problems with Theoretical Guarantees | Applied Mathematics

Candidate

Juju (Thanin) Quartz | Applied Mathematics, University of Waterloo

Title

Continuous Action Space Reinforcement Learning for Control Problems with Theoretical Guarantees

Abstract

Learning a policy for high-dimensional control tasks with continuous action spaces is extremely challenging, but reinforcement learning offers a potential solution. In recent years, reinforcement learning has achieved considerable success in complex games like StarCraft and chess, garnering attention from the control community for its ability to learn controllers for complex, high-dimensional control tasks. While this empirical success is promising, important control properties of these reinforcement learning policies, such as reachability and stabilizability, remain underexplored.

In this talk, I will build on this recent success by discussing soft actor-critic, a variant of the actor-critic algorithm that performs well on many control problems. I will also discuss an improvement to this algorithm that combines it with the linear quadratic regulator to achieve stabilization about an equilibrium point in the case of unknown dynamics. I will emphasize both the theoretical guarantees and the numerical simulations. Lastly, I will mention other promising research directions, such as continuous-time reinforcement learning and theoretical guarantees in the stochastic setting, as potential avenues for my PhD thesis.