Please note: This distinguished lecture will take place in DC 1302 and will also be livestreamed over Zoom.
Professor, Department of Computer Science, University of Toronto
Canada CIFAR AI Chair, Vector Institute for Artificial Intelligence
Associate Director and Research Lead, Schwartz Reisman Institute for Technology and Society
Reinforcement Learning (RL) is proving to be a powerful technique for building sequential decision-making systems in cases where the complexity of the underlying environment is difficult to model. Two challenges that face RL are reward specification and sample complexity. Specification of a reward function (a mapping from state to numeric value) can be challenging, particularly when reward-worthy behaviour is complex and temporally extended. Further, when reward is sparse, an RL agent can require millions of exploratory episodes to converge to a policy of reasonable quality.
In this talk, I’ll show how formal languages and automata can be used to represent complex non-Markovian reward functions. I’ll present the notion of a Reward Machine, an automata-based structure that provides a normal form representation for reward functions, exposing function structure in a manner that greatly expedites learning. Finally, I’ll show how these machines can be generated via symbolic planning or learned from data, solving (deep) RL problems that could not otherwise be solved.
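For readers new to the idea, the following is a minimal illustrative sketch of an automaton that assigns reward based on the history of events rather than on the current state alone. It is not taken from the talk: the class name `RewardMachine`, the helper `total_reward`, the proposition names `coffee` and `office`, and the machine states `u0`, `u1`, `u_done` are all hypothetical, chosen only to show a temporally extended task ("get coffee, then reach the office") encoded as a small finite-state machine.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Label = FrozenSet[str]  # propositions observed to be true at one time step


@dataclass
class RewardMachine:
    """Hypothetical minimal reward machine (illustrative sketch only).

    A finite automaton whose transitions fire on observed propositions
    and emit a numeric reward.
    """
    initial: str
    # (machine state, proposition) -> (next machine state, reward)
    transitions: Dict[Tuple[str, str], Tuple[str, float]]
    state: str = field(init=False)

    def __post_init__(self) -> None:
        self.state = self.initial

    def reset(self) -> None:
        self.state = self.initial

    def step(self, label: Label) -> float:
        """Advance on this step's propositions and return the reward."""
        for prop in sorted(label):  # sorted for a deterministic choice
            key = (self.state, prop)
            if key in self.transitions:
                self.state, reward = self.transitions[key]
                return reward
        return 0.0  # no matching transition: stay in place, no reward


def total_reward(machine: RewardMachine, trace: Iterable[Set[str]]) -> float:
    """Reset the machine and sum the rewards it emits along a trace."""
    machine.reset()
    return sum(machine.step(frozenset(label)) for label in trace)


# Assumed example task: reward 1.0 only if "coffee" is seen before "office".
coffee_task = RewardMachine(
    initial="u0",
    transitions={
        ("u0", "coffee"): ("u1", 0.0),
        ("u1", "office"): ("u_done", 1.0),
    },
)

with_coffee = total_reward(coffee_task, [{"office"}, {"coffee"}, set(), {"office"}])
without_coffee = total_reward(coffee_task, [{"office"}, set(), {"office"}])
print(with_coffee, without_coffee)
```

In this sketch, both traces end at the office, yet only the one that passed through `coffee` first earns reward. That dependence on history is what makes the reward non-Markovian with respect to the environment state, and the machine state is the extra memory that tracks it.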
Bio: Sheila McIlraith is a Professor in the Department of Computer Science at the University of Toronto, a Canada CIFAR AI Chair (Vector Institute), and an Associate Director at the Schwartz Reisman Institute for Technology and Society. Prior to joining U of T, McIlraith spent six years as a Research Scientist at Stanford University, and one year at Xerox PARC.
McIlraith’s research is in the areas of AI knowledge representation and reasoning and machine learning, where she currently studies sequential decision-making, broadly construed, with a focus on human-compatible AI. McIlraith is a Fellow of the ACM and of the Association for the Advancement of Artificial Intelligence (AAAI). She and her co-authors have been recognized with two test-of-time awards: one from the International Semantic Web Conference (ISWC) in 2011 and one from the International Conference on Automated Planning and Scheduling (ICAPS) in 2022.
200 University Avenue West
Waterloo, ON N2L 3G1