MC 6460

Speaker: Michael Hynes, Department of Applied Mathematics, University of Waterloo

Title: A Taylor polynomial expansion line search for large-scale optimization

Abstract:

In trying to cope with the Big Data deluge, the landscape of distributed computing has changed. Large commodity hardware clusters, typically operating in some form of MapReduce framework, are becoming prevalent for organizations that require both tremendous storage capacity and fault tolerance. However, the high cost of communication can dominate the computation time in large-scale optimization routines in these frameworks.

This thesis considers the problem of how to efficiently conduct univariate line searches in commodity clusters in the context of gradient-based batch optimization algorithms, such as the staple limited-memory BFGS (LBFGS) method. In it, a new line search technique is proposed for cases where the underlying objective function is analytic, as in logistic regression and low-rank matrix factorization.

The technique approximates the objective function by a truncated Taylor polynomial along a fixed search direction. The coefficients of this polynomial may be computed efficiently in parallel with far less communication than is needed to transmit the high-dimensional gradient vector, after which the polynomial may be minimized to high accuracy in a neighbourhood of the expansion point without distributed operations.

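Concretely, in notation of our own choosing (the abstract itself fixes none): for an iterate x and fixed search direction p, PELS works with the univariate restriction \phi(\alpha) = f(x + \alpha p) and its degree-K truncated Taylor expansion about \alpha = 0,

    \phi(\alpha) \approx T_K(\alpha) = \sum_{k=0}^{K} \frac{\phi^{(k)}(0)}{k!} \alpha^k,

so that only the K + 1 scalar coefficients \phi^{(k)}(0)/k!, rather than a d-dimensional gradient vector, need to be accumulated across the cluster before T_K is minimized on a single machine near \alpha = 0.
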
This Polynomial Expansion Line Search (PELS) may be invoked iteratively until the expansion point and minimum are sufficiently accurate, and can provide substantial savings in time and communication costs when multiple iterations in the line search procedure are required.

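To make the iterative procedure concrete, the following is a minimal single-machine sketch for the logistic loss, assuming a quartic (K = 4) truncation and a unit trust region; the function names and the re-expansion loop below are illustrative choices of ours, not details taken from the thesis.

    import numpy as np

    def taylor_coeffs(w, p, A, y, K=4):
        """Coefficients c_k = phi^(k)(0)/k! of phi(a) = f(w + a*p) for the
        logistic loss f(w) = sum_i log(1 + exp(-y_i * (a_i . w))).  Every c_k
        is a sum of per-example scalars, which is what makes a low-traffic
        distributed reduction possible."""
        u = y * (A @ w)                     # margins at the expansion point
        v = y * (A @ p)                     # directional margins
        q = 1.0 / (1.0 + np.exp(u))        # sigma(-u)
        g = [np.log1p(np.exp(-u)),         # g(u) = log(1 + e^{-u})
             -q,                           # g'(u)
             q * (1 - q),                  # g''(u)
             -q * (1 - q) * (1 - 2 * q),   # g'''(u)
             q * (1 - q) * ((1 - 2 * q)**2 - 2 * q * (1 - q))]  # g''''(u)
        fact = [1.0, 1.0, 2.0, 6.0, 24.0]
        return np.array([np.sum(g[k] * v**k) / fact[k] for k in range(K + 1)])

    def minimize_poly(c, radius=1.0):
        """Minimize T(a) = sum_k c[k]*a^k over [-radius, radius] by checking
        the real stationary points of T and the interval endpoints."""
        dT = np.array([k * c[k] for k in range(len(c) - 1, 0, -1)])  # T'
        cands = [r.real for r in np.roots(dT) if abs(r.imag) < 1e-9]
        cands = [a for a in cands if abs(a) <= radius] + [-radius, radius]
        return min(cands, key=lambda a: np.polyval(c[::-1], a))

    # One optimizer step on toy data: re-expand until the correction is tiny.
    rng = np.random.default_rng(0)
    A = rng.normal(size=(200, 10))
    y = np.sign(rng.normal(size=200))
    w = np.zeros(10)
    q = 1.0 / (1.0 + np.exp(y * (A @ w)))
    p = A.T @ (q * y)                      # steepest-descent direction
    alpha = 0.0
    for _ in range(5):
        step = minimize_poly(taylor_coeffs(w + alpha * p, p, A, y))
        alpha += step
        if abs(step) < 1e-8 * max(1.0, abs(alpha)):
            break
    w += alpha * p
    print("accepted step size:", alpha)

In a cluster setting, only the call to taylor_coeffs touches the full data set; the polynomial minimization itself involves no distributed operations.
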
Three applications of the PELS technique are presented herein for important classes of analytic functions: (i) Logistic Regression (LR), (ii) low-rank Matrix Factorization (MF) models, and (iii) the feedforward multilayer perceptron (MLP).

In addition, for LR and MF, implementations of PELS in the Apache Spark framework for fault-tolerant cluster computing are provided. These implementations conferred significant convergence enhancements to their respective algorithms, and will be of interest to Spark and Hadoop practitioners.

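As one hypothetical illustration of the aggregation pattern such an implementation can exploit (a sketch only, not the thesis code; the PySpark program below, including its helper names and toy data, is our own), the coefficient sums can be formed with a single treeAggregate, so that only K + 1 scalars per task cross the network:

    import numpy as np
    from pyspark.sql import SparkSession

    K = 4
    FACT = np.array([1.0, 1.0, 2.0, 6.0, 24.0])

    def add_example(acc, row):
        """seqOp: fold one example's contribution g^(k)(u) v^k / k!
        into the running vector of K+1 coefficient sums."""
        a, y = row
        u = y * float(a @ w_b.value)       # margin at the expansion point
        v = y * float(a @ p_b.value)       # directional margin
        q = 1.0 / (1.0 + np.exp(u))
        g = np.array([np.log1p(np.exp(-u)), -q, q * (1 - q),
                      -q * (1 - q) * (1 - 2 * q),
                      q * (1 - q) * ((1 - 2 * q)**2 - 2 * q * (1 - q))])
        return acc + g * v ** np.arange(K + 1) / FACT

    spark = SparkSession.builder.appName("pels-sketch").getOrCreate()
    sc = spark.sparkContext

    rng = np.random.default_rng(0)
    data = [(rng.normal(size=10), float(np.sign(rng.normal())))
            for _ in range(1000)]
    points = sc.parallelize(data, numSlices=8).cache()

    w_b = sc.broadcast(np.zeros(10))       # current iterate
    p_b = sc.broadcast(np.ones(10) / 10.0) # fixed search direction

    # One small reduction: a length-(K+1) vector per task, versus a
    # length-d gradient for each probe of a conventional line search.
    coeffs = points.treeAggregate(np.zeros(K + 1), add_example, np.add)
    print("Taylor coefficients:", coeffs)  # polynomial minimized on the driver
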
For instance, the Spark PELS technique reduced the number of iterations and the time required by LBFGS to reach terminal training accuracies for LR models by factors of 1.8 to 2. Substantial acceleration was also observed for the Nonlinear Conjugate Gradient algorithm applied to MLP models, which is an interesting case for future study in optimization for neural network models.

The PELS technique is applicable to a broad class of models for Big Data processing and large-scale optimization, and can be a useful component of batch optimization routines.