Title: How can we optimize nonsmooth objectives globally?
Affiliation: Humboldt University, Germany
In machine learning, objective functions that are only piecewise smooth and should be globally minimized abound. The standard way of dealing with them is to apply a stochastic gradient method, disregarding the rare points of nonsmoothness and hoping for the best as far as global optimality of the computed solution is concerned. Without doubting that this optimistic approach often works very well in practice, we explore the possibility of successively abs-linearizing such functions and solving the resulting local model problems in one of three ways: by Coordinate Global Descent (see S. Wright), by a savvy variant of the Heavy Ball method proposed by B.T. Polyak, or by solving an equivalent Mixed Integer Bilinear Optimization Problem with modern solvers such as Gurobi. We present numerical results from simple regression tasks and the inevitable MNIST problem. Joint work with Angel Rojas.
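To give a flavor of the abs-linearization idea mentioned above: for an objective written in terms of absolute values of smooth functions, one replaces each smooth argument by its first-order Taylor expansion at the current point while keeping the abs() calls intact, yielding a piecewise linear local model. The following is a minimal sketch with a hypothetical one-dimensional objective f(x) = |x² − 1| and hand-coded derivatives; the function names and the test point are illustrative choices, not part of the talk's method.

```python
def f(x):
    # hypothetical piecewise smooth objective: abs() of a smooth function
    return abs(x**2 - 1.0)

def abs_linearization(x0):
    """Local model of f at x0: the smooth inner function x^2 - 1 is
    replaced by its first-order Taylor expansion, while abs() is kept.
    The result is a piecewise linear function of the step dx."""
    inner = x0**2 - 1.0   # value of the smooth argument at x0
    slope = 2.0 * x0      # its derivative at x0
    return lambda dx: abs(inner + slope * dx)

model = abs_linearization(0.8)
# at dx = 0 the model reproduces f(0.8) = |0.64 - 1| = 0.36
print(model(0.0), f(0.8))
# the model's kink at dx = 0.36 / 1.6 predicts where f switches pieces
```

Minimizing such a piecewise linear model (plus a proximal term) is the local subproblem that the three listed methods attack.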