<?xml version="1.0" encoding="UTF-8"?><xml><records><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>47</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Sushrut Bhalla</style></author><author><style face="normal" font="default" size="100%">Sriram Ganapathi Subramanian</style></author><author><style face="normal" font="default" size="100%">Mark Crowley</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Deep Multi Agent Reinforcement Learning for Autonomous Driving</style></title><secondary-title><style face="normal" font="default" size="100%">Canadian Conference on Artificial Intelligence</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">2020</style></year></dates><publisher><style face="normal" font="default" size="100%">Springer, Lecture Notes in Artificial Intelligence</style></publisher><volume><style face="normal" font="default" size="100%">LNAI 12109</style></volume><pages><style face="normal" font="default" size="100%">17</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;
	Deep Learning and back-propagation have been successfully used to perform centralized training with communication protocols among multiple agents in a cooperative environment. In this work, we present techniques for centralized training of Multi-Agent Deep Reinforcement Learning (MARL) using the model-free Deep Q-Network (DQN) as the baseline model and communication between agents. We present two novel, scalable and centralized MARL training techniques (MA-MeSN, MA-BoN), which achieve faster convergence and higher cumulative reward in complex domains like autonomous driving simulators. Subsequently, we present a memory module to achieve a decentralized cooperative policy for execution, thus addressing the challenges of noise and communication bottlenecks in real-time communication channels. This work theoretically and empirically compares our centralized and decentralized training algorithms to current research in the field of MARL. We also present and release a new OpenAI-Gym environment which can be used for multi-agent research, as it simulates multiple autonomous cars driving on a highway. We compare the performance of our centralized algorithms to the existing state-of-the-art algorithms DIAL and IMS, based on the cumulative reward achieved per episode. MA-MeSN and MA-BoN achieve a cumulative reward of at least 263% of the reward achieved by DIAL and IMS. We also present an ablation study of the scalability of MA-BoN, showing that it has linear time and space complexity in the number of agents, compared to quadratic for DIAL.
&lt;/p&gt;
</style></abstract></record></records></xml>