Abstract

The goal of learning from demonstration, or imitation learning, is to teach a model to generalize to unseen tasks from the available demonstrations. This ability is important for stable robot performance in a chaotic environment such as a kitchen, compared with a more structured setting such as a factory assembly line. By leaving task learning to the algorithm, human teleoperators can dictate a robot's actions without any programming knowledge and improve overall productivity in a variety of settings. Because manually collecting gripper trajectories is difficult, a successful application of learning from demonstration must be able to learn from a small number of examples while still predicting trajectories with high accuracy. Inspired by the success of transformer models on natural language tasks such as translation and text generation, we adapt the transformer architecture for trajectory prediction. While previous works have trained end-to-end models that take images and context and generate control outputs, they rely on a massive quantity of demonstrations and detailed annotations. To make training feasible with only a small number of demonstrations, we created a training pipeline that uses a DeepLabCut model for object position prediction, followed by a Task-Parameterized Transformer model that learns the demonstrated trajectories, supplemented with data augmentation that allows the model to overcome the constraint of limited data. The resulting model predicts gripper trajectories with better accuracy than previous works in trajectory prediction.
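As a rough illustration of the kind of pipeline described above, the sketch below shows a minimal keypoint-conditioned transformer that maps detected object positions (for example, keypoints produced by DeepLabCut) to a fixed-horizon gripper trajectory. All class names, dimensions, and hyperparameters are illustrative assumptions and do not reproduce the presenter's Task-Parameterized Transformer.

    # Illustrative sketch only: object keypoints in, gripper trajectory out.
    import torch
    import torch.nn as nn

    class TrajectoryTransformer(nn.Module):
        def __init__(self, keypoint_dim=2, traj_dim=3, d_model=64, horizon=50):
            super().__init__()
            self.keypoint_embed = nn.Linear(keypoint_dim, d_model)  # encode object keypoints
            self.query_embed = nn.Embedding(horizon, d_model)       # one learned query per timestep
            self.transformer = nn.Transformer(
                d_model=d_model, nhead=4,
                num_encoder_layers=2, num_decoder_layers=2,
                batch_first=True,
            )
            self.head = nn.Linear(d_model, traj_dim)                # decode a gripper pose per step

        def forward(self, keypoints):
            # keypoints: (batch, num_keypoints, keypoint_dim), e.g. from a keypoint detector
            memory = self.keypoint_embed(keypoints)
            queries = self.query_embed.weight.unsqueeze(0).expand(keypoints.size(0), -1, -1)
            decoded = self.transformer(memory, queries)
            return self.head(decoded)                               # (batch, horizon, traj_dim)

    # Example: 4 detected image-plane keypoints -> a 50-step 3-D gripper trajectory.
    model = TrajectoryTransformer()
    dummy_keypoints = torch.randn(1, 4, 2)
    trajectory = model(dummy_keypoints)  # shape: (1, 50, 3)

In this sketch, data augmentation would be applied to the keypoint inputs and demonstrated trajectories (e.g., jittering object positions) before training, which is one plausible way to stretch a small demonstration set.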

Presenter

Yinghan Chen, MASc candidate in Systems Design Engineering

Attend in person in E5-6006 or on Zoom

Zoom link:
https://uwaterloo.zoom.us/j/97666225290?pwd=eEpqR1VLZmdOdGlHajZFMjM0QlExZz09
Passcode: 354068

Attending this seminar will count towards the graduate student seminar attendance milestone!