Seminar • Machine Learning | Artificial Intelligence • From Data, to Models, and Back: Making ML Predictably Reliable

Monday, February 12, 2024 10:30 am - 11:30 am EST (GMT -05:00)

Please note: This seminar will take place in DC 1304.

Andrew Ilyas, PhD candidate
Department of Electrical Engineering and Computer Science, MIT

Despite ML models’ impressive performance, training and deploying them is currently a somewhat messy endeavor. An ML developer might find, for example, that the training dataset they gathered was inadequate for the task being solved; that their model exhibits harmful biases; or that the data in the real world is dynamic or adversarial. Motivated by this state of affairs, my goal is to make ML “predictably reliable” — i.e., to enable developers to build models that they can be confident will work in the wild. To this end, I aim to develop a precise understanding of the ML pipeline, characterizing the role of — and interactions between — deployment environments, learning algorithms, and especially training data (and the way we collect it) in shaping ML models’ performance, robustness, and biases.

We begin this talk with a case study of adversarial inputs, using it to argue that the mechanics of the ML pipeline do not always align with human intuition. We then describe a framework for improving our understanding of this pipeline by characterizing the impact of training data on model predictions, and present a variety of its applications. We conclude by putting this framework into context within the broader goal of systematically understanding the ML pipeline.


Bio: Andrew Ilyas is a PhD student at MIT, advised by Constantinos Daskalakis and Aleksander Madry. His main interest is in reliable machine learning, where he seeks to understand the effects of the individual design choices involved in building ML models. He was previously supported by an Open Philanthropy AI Fellowship.