PhD Comprehensive Exam | Juju Quartz, Safe Data-Driven Control of Unknown Non-linear Systems with Guarantees

Wednesday, January 15, 2025 10:00 am - 11:00 am EST (GMT -05:00)

Location

MC 6460

Candidate

Juju Quartz | Applied Mathematics, University of Waterloo

Title

Safe Data-Driven Control of Unknown Non-linear Systems with Guarantees

Abstract

Learning for the control of high-dimensional nonlinear dynamical systems remains a challenging task. Classical methods suffer from the well-known curse of dimensionality and do not scale well into the high-dimensional regime. Recent advances in machine learning offer ways to address these issues. In this talk, I will present how data-driven approaches can be employed for stabilization and safety tasks.

I will discuss three distinct methods for tackling stabilization and safety. The first method utilizes reinforcement learning to train a controller that stabilizes a nonlinear dynamical system. I propose a reinforcement learning algorithm that achieves stabilization by learning a local linear representation of the system dynamics. The core component of the algorithm is the integration of the learned gain matrix directly into the neural policy. I then provide rigorous guarantees, stating conditions under which the neural policy converges to an asymptotically stabilizing controller. The second method presents an optimization-based approach for safe reach-avoid-stay tasks. This method efficiently computes Bézier curves that satisfy signal temporal logic (STL) specifications with piecewise time-varying robustness. The time-varying robustness is less conservative than the real-valued robustness, enabling more effective tracking in practical applications. Finally, I will discuss a theoretical result that provides conditions under which control barrier functions and control Lyapunov functions can be combined into a single control Lyapunov barrier function. The control Lyapunov barrier function can simultaneously certify both the asymptotic stability and forward invariance of a safe set. This result has practical implications, as it reduces the learning problem to a single function that certifies both stability and safety in control tasks.
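
As a rough illustration of why Bézier curves are convenient for reach-avoid-stay planning, the sketch below uses the standard convex-hull property: a Bézier curve lies inside the convex hull of its control points, so if every control point lies in a convex safe set, the whole curve does too. This is not the method presented in the talk. The control points, the box-shaped safe set, and all function names are hypothetical and chosen only for the example.

```python
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def de_casteljau(control_points: Sequence[Point], t: float) -> Point:
    """Evaluate a Bezier curve at t in [0, 1] by repeated linear interpolation."""
    pts = [tuple(p) for p in control_points]
    while len(pts) > 1:
        # Interpolate between each pair of neighbouring points.
        pts = [
            tuple((1.0 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(pts[:-1], pts[1:])
        ]
    return pts[0]


def inside_box(p: Point, lower: Point, upper: Point) -> bool:
    """True if p lies in the axis-aligned box [lower, upper]."""
    return all(lo <= x <= hi for x, lo, hi in zip(p, lower, upper))


def control_points_in_box(
    control_points: Sequence[Point], lower: Point, upper: Point
) -> bool:
    """Sufficient (not necessary) safety check via the convex-hull property."""
    return all(inside_box(p, lower, upper) for p in control_points)


# Hypothetical cubic curve and box-shaped safe set (assumptions for illustration).
CONTROL_POINTS = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
SAFE_LOWER = (-0.5, -0.5)
SAFE_UPPER = (4.5, 2.5)

# One check on four control points certifies the entire curve.
certified = control_points_in_box(CONTROL_POINTS, SAFE_LOWER, SAFE_UPPER)

# Sampling the curve is only a sanity check on the certificate, not a proof.
samples = [de_casteljau(CONTROL_POINTS, k / 50) for k in range(51)]
samples_inside = all(inside_box(p, SAFE_LOWER, SAFE_UPPER) for p in samples)
```

The check on the control points is conservative: a curve can remain inside the safe set even when some control point lies outside it, so a failed check does not by itself mean the curve is unsafe.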