Applied Mathematics, University of Waterloo
Nonlinear Preconditioning Methods for Optimization and Parallel-in-Time Methods for the Linear Advection Equation
This thesis consists of two main parts, part one addressing problems from nonlinear optimization and part two based on solving systems of time dependent differential equations, with both parts describing strategies for accelerating the convergence of iterative methods.
In part one we present a nonlinear preconditioning framework for use with nonlinear solvers applied to nonlinear optimization problems, motivated by generalizations of linear left preconditioning and linear preconditioning via a change of variables for minimizing quadratic functions. Nonlinear preconditioning is used to generate a preconditioner direction that either replaces or supplements the gradient vector throughout the optimization algorithm. This framework is used to discuss previously developed nonlinearly preconditioned nonlinear GMRES and nonlinear conjugate gradients algorithms, as well as to develop two new nonlinearly preconditioned limited memory quasi-Newton methods. These methods are compared by computing two different types of tensor decomposition, both of which are defined by optimization problems that have alternating least squares (ALS) type fixed point iterations. While these ALS type iterations may be slow to converge in practice, they can serve as efficient nonlinear preconditioners for the other optimization methods. Numerical results show that the nonlinearly preconditioned methods offer substantial improvements in terms of time-to-solution and robustness over state-of-the-art methods for large tensors, in cases where there are significant amounts of noise in the data, and when high accuracy results are required.
In part two we apply a multigrid reduction-in-time (MGRIT) algorithm to scalar 1D hyperbolic partial differential equations. This study is motivated by the observation that sequential time-stepping is an obvious computational bottleneck when attempting to implement highly concurrent algorithms, thus parallel-in-time methods are particularly desirable, but existing parallel-in-time methods can have stability and convergence issues for advection dominated problems. MGRIT primarily uses temporal coarsening, but spatial coarsening can also be incorporated to produce cheaper multigrid cycles and to ensure stability conditions are satisfied on all levels for explicit time-stepping methods. However, uniform factor-two spatial coarsening can cause extremely slow MGRIT convergence when the wave speed is near zero, even if only locally, due to the weak spatial connections that result. The use of waveform relaxation multigrid on intermediate, temporally semi-coarsened grids is identified as a potential remedy for this issue, with the downside of creating a more intrusive algorithm that cannot be easily combined with existing time-stepping routines. As a less intrusive alternative we present an adaptive spatial coarsening strategy designed to prevent the observed slowdown. Serial numerical results show this method offers significant improvements over uniform coarsening and is convergent for inviscid Burgers' equation with and without shocks. Parallel scaling tests indicate that improvements over serial time-stepping strategies are possible when spatial parallelism alone saturates, and that scalability is robust for oscillatory solutions that change on the scale of the grid spacing.