Special Seminar - Yi Zhou

Wednesday, January 17, 2018 9:30 am - 9:30 am EST (GMT -05:00)

Title: Stochastic algorithms for distributed optimization and machine learning

Speaker: Yi Zhou
Affiliation: Georgia Institute of Technology
Room: MC 5501


In the big data era, machine learning is a powerful tool for transforming data into knowledge that helps us make predictions and decisions. It has strong ties to the field of optimization, which supplies much of its methodology and theory. While data tend to be collected in a distributed and online fashion, standard machine learning models require centralizing the training data on one machine or in a data center, which incurs significant communication costs and puts data privacy at risk. To circumvent this issue, we focus on distributed machine learning models defined over networks with certain structures.

In this talk, we focus on designing and analyzing efficient stochastic algorithms for distributed machine learning problems, in particular problems defined over two different distributed topologies. In the first part of the talk, we present a class of communication-efficient algorithms for solving stochastic decentralized optimization problems in the setting without a central authority. These algorithms achieve the best-known complexity bounds on inter-node communication rounds and unimprovable complexity bounds on the intra-node subgradient evaluations (i.e., sampling complexities). In the second part of the talk, we consider a distributed topology with a central authority, the star network, and study distributed finite-sum optimization problems. We propose an optimal randomized incremental gradient method and show that it requires no exact gradient evaluations, even at the initial point, yet still achieves the optimal linear sampling complexity for solving deterministic finite-sum problems. Moreover, for stochastic finite-sum problems, the proposed algorithm maintains the optimal sublinear sampling complexity (up to a logarithmic factor) and attains linear communication complexity.
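To make the decentralized setting of the first part concrete, the following is a minimal toy sketch of a generic decentralized stochastic gradient scheme, not the speaker's algorithm: each node alternates a gossip-averaging step with its neighbors (using a doubly stochastic mixing matrix) and a step along a noisy local gradient. The ring topology, quadratic local objectives, and stepsize schedule are illustrative assumptions.

```python
import random

def decentralized_sgd(b, W, steps=2000, seed=0):
    """Toy decentralized stochastic gradient sketch (illustrative only).

    Node i holds the local objective f_i(x) = 0.5 * (x - b[i])**2, so the
    minimizer of the network-wide average (1/n) * sum_i f_i(x) is mean(b).
    Each round, every node (1) averages its iterate with its neighbors via
    the doubly stochastic mixing matrix W, then (2) steps along a noisy
    local gradient with a diminishing stepsize.
    """
    rng = random.Random(seed)
    n = len(b)
    x = [0.0] * n  # each node's local iterate
    for t in range(1, steps + 1):
        # Gossip / consensus step: x_i <- sum_j W[i][j] * x_j
        mixed = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
        step = 1.0 / t  # diminishing stepsize
        for i in range(n):
            # Noisy local gradient of f_i at the mixed iterate
            noisy_grad = (mixed[i] - b[i]) + rng.gauss(0.0, 0.1)
            x[i] = mixed[i] - step * noisy_grad
    return x

# Ring of 4 nodes: each node averages with itself and its two neighbors.
W = [[0.5 if i == j else (0.25 if abs(i - j) in (1, 3) else 0.0)
     for j in range(4)] for i in range(4)]
b = [1.0, 2.0, 3.0, 4.0]
x = decentralized_sgd(b, W)  # all iterates end up near mean(b) = 2.5
```

The key point the sketch illustrates is that no node ever sees the full objective or a central server; information spreads only through neighbor-to-neighbor averaging, which is why the number of communication (gossip) rounds becomes the complexity measure of interest alongside the sampling complexity.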

The first part of the talk is joint work with Dr. Guanghui (George) Lan and Dr. Soomin Lee; the second part is based on joint work with Dr. Guanghui (George) Lan.