Title: Learning from Non-Traditional Sources of Data
Abstract: Imitation learning has traditionally focused on learning a policy or a reward function from expert demonstrations. However, in many robotics applications, we have limited access to expert demonstrations in practice. Today, I will talk about a set of techniques that address some of the challenges of learning from non-traditional sources of data, i.e., suboptimal demonstrations, rankings, play data, and physical corrections. I will first talk about our confidence-aware imitation learning approach, which simultaneously estimates a confidence measure over demonstrations and the policy parameters. I will then talk about extending this approach to learn a confidence measure over the expertise of different demonstrators in an unsupervised manner. Following up, I will discuss how we can learn more expressive models, such as a multimodal reward function, when learning from a mixture of ranking data. Finally, I will talk about our recent efforts in learning from other non-traditional sources of data in interactive domains. Specifically, we show how predicting latent affordances can substantially help when learning from undirected play data in interactive domains, and how we can learn from a sequence of interactions through physical corrections.
Speaker Bio: Dorsa Sadigh is an assistant professor in Computer Science and Electrical Engineering at Stanford University. Her research interests lie at the intersection of robotics, learning, and control theory. Specifically, she is interested in developing algorithms for safe and adaptive human-robot interaction. Dorsa received her doctoral degree in Electrical Engineering and Computer Sciences (EECS) from UC Berkeley in 2017, and her bachelor’s degree in EECS from UC Berkeley in 2012. She has received the NSF CAREER award, the AFOSR Young Investigator award, the MIT TR35, the Google Faculty Award, and the Amazon Faculty Research Award.
Date and Time:
Thursday, March 17, 2022
1:00 - 2:30 PM EST
Recording: https://youtu.be/QO_K1WpDRTo