Please note: This seminar will take place in DC 1304.
Andrew Ilyas, PhD candidate
Department of Electrical Engineering and Computer Science, MIT
Despite ML models’ impressive performance, training and deploying them is currently a somewhat messy endeavor. An ML developer might find, for example, that the training dataset they gathered was inadequate for the task being solved; that their model exhibits harmful biases; or that the data in the real world is dynamic or adversarial. Motivated by this state of affairs, my goal is to make ML “predictably reliable” — i.e., to enable developers to build models that they can be confident will work in the wild. To this end, I aim to develop a precise understanding of the ML pipeline, characterizing the role of — and interactions between — deployment environments, learning algorithms, and especially training data (and the way we collect it) in shaping ML models’ performance, robustness, and biases.
We begin the talk with a case study of adversarial inputs, arguing that the mechanics of the ML pipeline do not always align with human intuition. We then describe a framework for improving our understanding of this pipeline by characterizing the impact of training data on model predictions, and survey several of its applications. We conclude by situating this framework within the broader goal of systematically understanding the ML pipeline.
Bio: Andrew Ilyas is a PhD student at MIT, advised by Constantinos Daskalakis and Aleksander Madry. His main interest is in reliable machine learning, where he seeks to understand the effects of the individual design choices involved in building ML models. He was previously supported by an Open Philanthropy AI Fellowship.