Unsupervised learning: what can, what can't and what should not be done

Tuesday, May 28, 2019 11:00 am - 12:00 pm EDT (GMT -04:00)

Professor Shai Ben-David, School of Computer Science
University of Waterloo


Unsupervised learning refers to the process of finding patterns and drawing conclusions from raw data (in contrast to supervised learning, where the training data is labeled, or scored, and the learner is expected to figure out a labeling/scoring rule for use on yet-unseen examples). Unlabeled data is, naturally, more readily available than labeled examples, and there is therefore much to gain from being able to utilize such data. However, our understanding of unsupervised learning is much less satisfactory than the established theory of supervised learning.

In this talk, I will discuss several aspects of the theory of unsupervised learning and describe some recent results and insights, as well as provide my idiosyncratic advice about how the research and practice of this important task should (and should not) be carried out.

In particular, I will highlight joint work with Hassan Ashtiani, Nick Harvey, Chris Liaw, Abbas Mehrabian and Yaniv Plan that won a Best Paper Award at last year's NeurIPS, and work with Shay Moran, Pavel Hrubes, Amir Shpilka and Amir Yehudayoff that was featured last January in Nature Magazine, as well as work with other past and current students of mine.


Shai Ben-David grew up in Jerusalem, Israel. He attended the Hebrew University, where he studied physics, mathematics and psychology. He received his PhD under the supervision of Saharon Shelah and Menachem Magidor for a thesis in set theory. Professor Ben-David was a postdoctoral fellow at the University of Toronto in the Mathematics and the Computer Science departments, and in 1987 joined the faculty of the CS Department at the Technion (Israel Institute of Technology). He held visiting faculty positions at the Australian National University in Canberra (1997-8) and at Cornell University (2001-2004). In August 2004 he joined the School of Computer Science at the University of Waterloo.

His research interests span a wide spectrum of topics in the foundations of computer science and its applications, with a particular emphasis on statistical and computational machine learning. The common thread throughout his research is the aim of providing mathematical formulations and understanding of real-world problems. In particular, he has been looking at popular machine learning and data mining paradigms that seem to lack clear theoretical justification.

Date and Time
Tuesday, 28 May 2019
11:00 AM - 12:00 PM

DC 1302
University of Waterloo

Light refreshments will be available.