Master's Thesis Defence | Joshua Joseph George, A Hamiltonian Systems Approach to Neural Network Optimization

Tuesday, May 12, 2026 10:00 am - 11:00 am EDT (GMT -04:00)

Location

MC 5417

Candidate 

Joshua Joseph George | Applied Mathematics, University of Waterloo

Title

A Hamiltonian Systems Approach to Neural Network Optimization

Abstract

We propose and analyze structure-preserving methods for the first-order optimization of Lipschitz-smooth objectives by interpreting the optimization dynamics as a dissipative Hamiltonian system, in which the model parameters evolve jointly with an auxiliary momentum variable. This formulation induces a natural energy-dissipation mechanism that motivates optimization algorithms inheriting a discrete energy decay property. We develop discrete-gradient (DG) methods that preserve this decay exactly in discrete time, ensuring monotone dissipation independent of the stepsize. Building on this framework, we introduce variants that empirically reduce oscillations, improve runtime, and increase robustness to ill-conditioned problems.
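For intuition, the following is a minimal sketch (not the thesis's exact scheme) of a discrete-gradient step for the dissipative Hamiltonian system dq/dt = p, dp/dt = -grad f(q) - gamma*p, with energy H(q, p) = f(q) + ||p||^2/2. For a quadratic objective f(q) = (1/2) q·Aq the mean-value discrete gradient is available in closed form, A(q_new + q_old)/2, so the implicit step reduces to one linear solve and the energy decreases monotonically for any stepsize h — the assumed choices of f, A, h, and gamma here are illustrative only:

```python
import numpy as np

def dg_step(A, q, p, h, gamma):
    """One implicit midpoint-type discrete-gradient step; returns (q_new, p_new).

    Solves the coupled linear system
      q_new = q + (h/2) * (p_new + p)
      p_new = p - h * A @ (q_new + q) / 2 - (h/2) * gamma * (p_new + p)
    for the unknowns x = [q_new, p_new].
    """
    n = len(q)
    I = np.eye(n)
    M = np.block([[I, -0.5 * h * I],
                  [0.5 * h * A, (1 + 0.5 * h * gamma) * I]])
    rhs = np.concatenate([q + 0.5 * h * p,
                          p - 0.5 * h * A @ q - 0.5 * h * gamma * p])
    x = np.linalg.solve(M, rhs)
    return x[:n], x[n:]

def energy(A, q, p):
    # H(q, p) = f(q) + ||p||^2 / 2 with f(q) = 0.5 * q @ A @ q
    return 0.5 * q @ A @ q + 0.5 * p @ p

# Ill-conditioned quadratic: the energy decays monotonically even with a
# stepsize far larger than explicit gradient descent would tolerate.
A = np.diag([1.0, 100.0])
q, p = np.array([1.0, 1.0]), np.zeros(2)
h, gamma = 0.5, 1.0
energies = [energy(A, q, p)]
for _ in range(50):
    q, p = dg_step(A, q, p, h, gamma)
    energies.append(energy(A, q, p))
# Exact discrete decay: H_{k+1} - H_k = -h * gamma * ||(p_k + p_{k+1})/2||^2
assert all(e1 <= e0 + 1e-9 for e0, e1 in zip(energies, energies[1:]))
```

A short calculation shows why the decay is exact here: with midpoint averages, H_{k+1} - H_k telescopes to -h*gamma*||(p_k + p_{k+1})/2||^2, which is nonpositive regardless of h.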

To address the computational cost of the implicit DG methods, we propose semi-implicit discrete-gradient (SIDG) schemes, obtained by linearizing the DG updates and incorporating curvature through L-BFGS inverse-Hessian approximations, which are used to solve the resulting linear systems efficiently. These schemes retain the key structure-preserving properties while substantially reducing computational cost, striking a practical balance between stability and efficiency. We establish monotone energy decay, boundedness of the iterates, and sublinear convergence to first-order stationary points.
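To illustrate the semi-implicit idea, here is a hypothetical sketch, not the thesis's actual SIDG scheme: the implicit discrete gradient is linearized around the current iterate, grad_bar ≈ g_k + (1/2) B_k (q_{k+1} - q_k), which turns each step into the linear system [(1 + h*gamma/2) I + (h^2/4) B_k] dq = h p_k - (h^2/2) g_k. The test objective, the finite-difference curvature product (a stand-in for the L-BFGS inverse-Hessian machinery the abstract describes), and all parameter values are assumptions for illustration:

```python
import numpy as np

def f(q):
    # Illustrative smooth, strongly convex, nonquadratic test objective.
    return np.sum(np.log(np.cosh(q))) + 0.05 * q @ q

def grad(q):
    return np.tanh(q) + 0.1 * q

def hvp(q, v, eps=1e-6):
    # Curvature-vector product B_k v via a gradient finite difference
    # (stand-in for the thesis's L-BFGS curvature approximation).
    return (grad(q + eps * v) - grad(q)) / eps

def cg(matvec, b, iters=30, tol=1e-12):
    # Plain conjugate gradients for the SPD system matvec(x) = b.
    x = np.zeros_like(b)
    r = b.copy(); d = r.copy(); rr = r @ r
    for _ in range(iters):
        Ad = matvec(d)
        alpha = rr / (d @ Ad)
        x += alpha * d
        r -= alpha * Ad
        rr_new = r @ r
        if rr_new < tol:
            break
        d = r + (rr_new / rr) * d
        rr = rr_new
    return x

def sidg_step(q, p, h, gamma):
    """One linearized (semi-implicit) DG step: a single linear solve."""
    g = grad(q)
    matvec = lambda v: (1 + 0.5 * h * gamma) * v + 0.25 * h**2 * hvp(q, v)
    dq = cg(matvec, h * p - 0.5 * h**2 * g)
    q_new = q + dq
    p_new = 2.0 * dq / h - p  # from the midpoint relation dq = h*(p + p_new)/2
    return q_new, p_new

rng = np.random.default_rng(0)
q, p = rng.normal(size=20), np.zeros(20)
h, gamma = 0.5, 1.0
vals = [f(q)]
for _ in range(100):
    q, p = sidg_step(q, p, h, gamma)
    vals.append(f(q))
```

The exact energy identity of the fully implicit DG step no longer holds after linearization, but each iteration now costs one inexpensive linear solve rather than a nonlinear one, which is the stability/efficiency trade-off the abstract refers to.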

Numerical experiments on ill-conditioned least-squares problems, regularized logistic regression, physics-informed neural networks, and CIFAR-10 image classification demonstrate robustness to ill-conditioning and performance competitive with widely used optimizers such as Adam, stochastic gradient descent (SGD), and L-BFGS.