Title: Why Random Reshuffling Beats Stochastic Gradient Descent
Speaker: Julian Romero
Affiliation: University of Waterloo
Room: MC 5479
Abstract: Over the first few lectures of the seminar we studied the Stochastic Gradient Descent (SGD) method for minimizing functions of the form $f=\sum_{i=1}^m f_i$ with $m$ large. In this talk I will go over a variant of (SGD) called Random Reshuffling (RR). In this method, the descent directions are still chosen from the component functions $f_i$ at random as in (SGD), but the choice is made \textit{without replacement} and in a cyclic fashion. Numerical experiments have long shown evidence of (RR) outperforming (SGD); however, a proof of this fact was missing until very recently. I will go over the basic ideas surrounding the proof for the case in which the functions $f_i$ are quadratics. Time permitting, I will explain the general case in which the $f_i$'s are smooth.
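To make the distinction between the two sampling schemes concrete, here is a minimal sketch (not from the talk) contrasting (SGD), which samples a component index with replacement at every step, with (RR), which draws a fresh permutation each epoch and cycles through all $m$ components once. The quadratic components $f_i(x)=\tfrac{1}{2}\|A_i x - b_i\|^2$, the step size, and the problem dimensions are illustrative assumptions.

```python
import numpy as np

def grad_fi(x, A_i, b_i):
    # Gradient of the quadratic component f_i(x) = 0.5 * ||A_i x - b_i||^2.
    return A_i.T @ (A_i @ x - b_i)

def sgd(x0, A, b, step, epochs, rng):
    # SGD: at each step, pick an index i uniformly *with* replacement.
    x, m = x0.copy(), len(A)
    for _ in range(epochs * m):
        i = rng.integers(m)
        x -= step * grad_fi(x, A[i], b[i])
    return x

def random_reshuffling(x0, A, b, step, epochs, rng):
    # RR: each epoch draws a fresh permutation of {0, ..., m-1} and cycles
    # through every component exactly once (sampling without replacement).
    x, m = x0.copy(), len(A)
    for _ in range(epochs):
        for i in rng.permutation(m):
            x -= step * grad_fi(x, A[i], b[i])
    return x

# Illustrative random quadratic problem.
rng = np.random.default_rng(0)
m, d = 50, 5
A = [rng.standard_normal((3, d)) for _ in range(m)]
b = [rng.standard_normal(3) for _ in range(m)]
x0 = np.zeros(d)
print("SGD:", sgd(x0, A, b, 0.01, 100, rng))
print("RR: ", random_reshuffling(x0, A, b, 0.01, 100, rng))
```

Both methods take the same number of component-gradient steps per epoch; the only difference is the sampling scheme, which is exactly the gap the recent convergence analysis addresses.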