Master's Thesis Defence | Joshua Joseph George, A Hamiltonian Systems Approach to Neural Network Optimization

Tuesday, May 12, 2026 10:00 am - 11:00 am EDT (GMT -04:00)

Location

MC 5417

Candidate 

Joshua Joseph George | Applied Mathematics, University of Waterloo

Title

A Hamiltonian Systems Approach to Neural Network Optimization

Abstract

We propose and analyze structure-preserving methods for first-order optimization of Lipschitz-smooth objectives by interpreting the optimization dynamics as a dissipative Hamiltonian system, in which the model parameters evolve jointly with an auxiliary momentum variable. This formulation induces a natural energy dissipation mechanism, which motivates optimization algorithms that inherit a discrete energy decay property. We develop discrete gradient (DG) methods that satisfy an exact discrete-time energy decay property, ensuring monotone dissipation independent of the step size. Building on this framework, we introduce variants that empirically reduce oscillations, shorten runtime, and improve robustness to ill-conditioned problems.
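As a rough illustration of the energy decay property described above, the following sketch applies one common discrete gradient construction to a toy quadratic objective f(x) = 0.5 x^T A x - b^T x. It is not taken from the thesis. It assumes the damped dynamics x' = p, p' = -grad f(x) - gamma p with energy H(x, p) = f(x) + 0.5 |p|^2, and it uses the midpoint discrete gradient, which is exact for quadratics. The function names (dg_step, energy, run) and the damping parameter gamma are hypothetical choices made for this example. Under these assumptions each step satisfies H_{k+1} - H_k = -gamma h |p_bar|^2 <= 0 for any step size h > 0, where p_bar is the average of the old and new momentum.

```python
import numpy as np

def energy(x, p, A, b):
    """Energy H(x, p) = f(x) + 0.5 |p|^2 with f(x) = 0.5 x^T A x - b^T x."""
    return 0.5 * x @ A @ x - b @ x + 0.5 * p @ p

def dg_step(x, p, A, b, h, gamma):
    """One implicit discrete-gradient step for the quadratic objective.

    Illustrative assumption: the scheme is
        (x_next - x) / h = p_bar
        (p_next - p) / h = -(A x_bar - b) - gamma * p_bar
    with x_bar, p_bar the averages of old and new values. For a quadratic
    f, the implicit equations reduce to one linear solve for p_bar.
    """
    n = x.size
    lhs = (2.0 / h + gamma) * np.eye(n) + 0.5 * h * A
    rhs = 2.0 * p / h - (A @ x - b)
    p_bar = np.linalg.solve(lhs, rhs)
    x_next = x + h * p_bar
    p_next = 2.0 * p_bar - p
    return x_next, p_next

def run(A, b, x0, h, gamma, steps):
    """Iterate from zero initial momentum and record the energy per step."""
    x, p = x0.copy(), np.zeros_like(x0)
    energies = [energy(x, p, A, b)]
    for _ in range(steps):
        x, p = dg_step(x, p, A, b, h, gamma)
        energies.append(energy(x, p, A, b))
    return x, np.array(energies)

# Ill-conditioned toy problem (condition number 1e4), chosen for illustration.
A = np.diag([1.0, 1e4])
b = np.array([1.0, 1.0])
x0 = np.array([5.0, 5.0])

for h in (0.01, 1.0, 100.0):
    _, energies = run(A, b, x0, h=h, gamma=1.0, steps=200)
    # Largest energy increase between consecutive steps; should be <= 0
    # up to floating-point rounding.
    print(h, np.max(np.diff(energies)))
```

In this toy setting the energy is non-increasing even for step sizes far beyond the stability limit of explicit gradient descent, although a very large step size does not by itself give fast convergence. The price is a linear solve per step, and for a general non-quadratic objective the update would be a nonlinear implicit equation.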

To address the computational cost of the implicit DG methods, we propose semi-implicit discrete gradient (SIDG) schemes obtained by linearizing the DG updates and incorporating curvature through L-BFGS inverse Hessian approximations, which are used to solve the resulting linear systems efficiently. These schemes retain key structure-preserving properties while significantly reducing computational cost, yielding a practical balance between stability and efficiency. We establish monotone energy decay, boundedness of the iterates, and sublinear convergence to first-order stationary points.

Numerical experiments on ill-conditioned least-squares problems, regularized logistic regression, physics-informed neural networks, and CIFAR-10 image classification show that the methods perform well despite ill-conditioning and are competitive with widely used optimizers such as Adam, stochastic gradient descent, and L-BFGS.