Continuous Optimization Seminar - Julian Romero

Wednesday, November 14, 2018 4:00 pm - 4:00 pm EST (GMT -05:00)

Title: Why Random Reshuffling Beats Stochastic Gradient Descent

Speaker: Julian Romero
Affiliation: University of Waterloo
Room: MC 5479

Abstract: Over the first few lectures in the seminar we studied the Stochastic Gradient Descent (SGD) method for minimizing functions of the form $f=\sum_{i=1}^m f_i$ for $m$ large. In this talk I will go over a variant of SGD called Random Reshuffling (RR). In this method, the descent directions are chosen from the component functions $f_i$ at random as in SGD, but the choice is made \textit{without replacement} and in a cyclic fashion. In the past, numerical computations have shown evidence of RR outperforming SGD; however, a proof of this fact was missing until very recently. I will go over the basic ideas surrounding the proof for the case in which the functions $f_i$ are quadratics. Time permitting, I will explain the general case in which the $f_i$'s are smooth.
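The only algorithmic difference between the two methods is how the component index is drawn at each step: SGD samples an index with replacement, while RR draws a fresh random permutation of the indices at the start of each epoch and cycles through it. The sketch below illustrates this on a simple least-squares problem; the data, step size, and function names are illustrative assumptions, not part of the talk.

```python
import numpy as np

# Compare SGD (indices sampled with replacement) against Random Reshuffling
# (a new permutation of the indices each epoch) on a least-squares objective
# f(x) = sum_i 0.5 * (a_i^T x - b_i)^2, whose components f_i are quadratics.

rng = np.random.default_rng(0)
m, d = 200, 10
A = rng.standard_normal((m, d))
x_star = rng.standard_normal(d)
b = A @ x_star                       # consistent system, so x_star is a minimizer

def grad_i(x, i):
    # Gradient of the i-th component f_i(x) = 0.5 * (a_i^T x - b_i)^2.
    return (A[i] @ x - b[i]) * A[i]

def run(method, epochs=50, step=0.01):
    x = np.zeros(d)
    for _ in range(epochs):
        if method == "rr":
            order = rng.permutation(m)          # without replacement, reshuffled every epoch
        else:
            order = rng.integers(0, m, size=m)  # with replacement, as in plain SGD
        for i in order:
            x -= step * grad_i(x, i)
    return np.linalg.norm(x - x_star)

print("SGD error:", run("sgd"))
print("RR  error:", run("rr"))
```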