Candidate: Sushrut Bhalla
Title: Deep Multi Agent Reinforcement Learning for Autonomous Driving
Date: January 7, 2020
Time: 9:00am
Place: EIT 3142
Supervisor(s): Crowley, Mark
Abstract:
Deep learning and back-propagation have been successfully used to perform centralized training with communication protocols among multiple agents in a cooperative environment. In this work, we present techniques for centralized training in multi-agent deep reinforcement learning (MARL), using the model-free Deep Q-Network (DQN) as the baseline model together with communication between agents.
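To make the DQN baseline concrete, the sketch below shows the standard one-step bootstrapped target that a DQN-style learner regresses toward; it is a generic illustration of the DQN update, not code from the thesis, and the function name and array shapes are our own assumptions.

```python
import numpy as np

def dqn_targets(rewards, next_q_values, dones, gamma=0.99):
    """Standard one-step DQN targets: r + gamma * max_a' Q(s', a').

    rewards:       (batch,) array of rewards
    next_q_values: (batch, n_actions) Q-values from the target network
    dones:         (batch,) boolean array, True where the episode ended

    Terminal transitions bootstrap to zero, so their target is just the reward.
    (Illustrative sketch only; not the thesis implementation.)
    """
    max_next_q = next_q_values.max(axis=1)
    return rewards + gamma * max_next_q * (1.0 - dones.astype(np.float64))

# Example: a two-transition batch, the second of which is terminal.
targets = dqn_targets(
    rewards=np.array([1.0, 0.0]),
    next_q_values=np.array([[0.5, 2.0], [1.0, 0.0]]),
    dones=np.array([False, True]),
)
# targets -> [1.0 + 0.99 * 2.0, 0.0] = [2.98, 0.0]
```

In the multi-agent setting each agent maintains such Q-values over its own observations plus received messages; the target computation itself is unchanged.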
We present two novel, scalable, centralized MARL training techniques (MA-MeSN and MA-BoN), which separate the message-learning module from the policy module. Separating these modules leads to faster convergence in complex domains such as autonomous driving simulators. We then present a memory module that yields a decentralized cooperative policy for execution, thereby addressing the noise and bandwidth bottlenecks of real-time communication channels. This work compares our centralized and decentralized training algorithms, both theoretically and empirically, to current research in the field of MARL. We also present and release a new OpenAI Gym environment for multi-agent research, which simulates multiple autonomous cars driving cooperatively on a highway. We compare the performance of our centralized algorithms to the existing state-of-the-art algorithms DIAL and IMS, based on the cumulative reward achieved per episode; MA-MeSN and MA-BoN achieve at least 263% of the cumulative reward achieved by DIAL and IMS. Finally, an ablation study of the scalability of MA-BoN shows a linear increase in inference time and number of trainable parameters, compared to a quadratic increase for DIAL.
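For readers unfamiliar with multi-agent Gym environments, the toy class below sketches the kind of interface such an environment exposes: `step()` takes one action per agent and returns per-agent observations and rewards. This is a hypothetical minimal example for illustration only, not the released highway environment, and all names and dynamics here are our own assumptions.

```python
import numpy as np

class ToyMultiAgentEnv:
    """Hypothetical Gym-style multi-agent interface (not the thesis environment).

    Each of n_agents cars has a 1-D position along a lane; action 0 = hold,
    action 1 = advance one cell. step() takes one action per agent, mirroring
    how a cooperative multi-agent env extends the single-agent Gym API.
    """
    def __init__(self, n_agents=2, goal=5):
        self.n_agents = n_agents
        self.goal = goal
        self.positions = None

    def reset(self):
        self.positions = np.zeros(self.n_agents)
        return self.positions.copy()  # joint observation: one position per agent

    def step(self, actions):
        self.positions += np.asarray(actions, dtype=np.float64)
        # +1 reward for each car that has reached the goal position
        rewards = np.where(self.positions >= self.goal, 1.0, 0.0)
        # cooperative episode ends only when every car has arrived
        done = bool((self.positions >= self.goal).all())
        return self.positions.copy(), rewards, done, {}

# Example rollout: both agents always advance.
env = ToyMultiAgentEnv(n_agents=2, goal=5)
obs = env.reset()
done = False
while not done:
    obs, rewards, done, info = env.step([1, 1])
# After 5 steps both cars are at the goal: rewards -> [1.0, 1.0], done -> True
```

A real highway environment would replace the scalar positions with richer per-car observations (lane, speed, neighbours), but the per-agent action/observation/reward structure of the loop is the same.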