<?xml version="1.0" encoding="UTF-8"?><xml><records><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>47</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Bhalla, Sushrut</style></author><author><style face="normal" font="default" size="100%">Subramanian, Sriram Ganapathi</style></author><author><style face="normal" font="default" size="100%">Crowley, Mark</style></author></authors><secondary-authors><author><style face="normal" font="default" size="100%">Agmon, N.</style></author><author><style face="normal" font="default" size="100%">Taylor, M.E.</style></author><author><style face="normal" font="default" size="100%">Elkind, E.</style></author><author><style face="normal" font="default" size="100%">Veloso, M.</style></author></secondary-authors></contributors><titles><title><style face="normal" font="default" size="100%">Training Cooperative Agents for Multi-Agent Reinforcement Learning</style></title><secondary-title><style face="normal" font="default" size="100%">Proc. 
of the 18th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2019)</style></secondary-title></titles><keywords><keyword><style face="normal" font="default" size="100%">Autonomous Driving</style></keyword><keyword><style face="normal" font="default" size="100%">MARL</style></keyword><keyword><style face="normal" font="default" size="100%">Multi-Agent Reinforcement Learning</style></keyword><keyword><style face="normal" font="default" size="100%">MultiAgent Systems</style></keyword><keyword><style face="normal" font="default" size="100%">reinforcement learning</style></keyword></keywords><dates><year><style face="normal" font="default" size="100%">2019</style></year></dates><pub-location><style face="normal" font="default" size="100%">Montreal, Canada</style></pub-location><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">Deep learning and back-propagation have been used successfully to perform centralized training with communication protocols among multiple agents in a cooperative environment. In this paper, we present techniques for centralized training in Multi-Agent (Deep) Reinforcement Learning (MARL), using the model-free Deep Q-Network as the baseline model together with message sharing between agents. We present a novel, scalable, centralized MARL training technique that separates the message-learning module from the policy module. Separating these modules leads to faster convergence in complex domains such as autonomous driving simulators. As a second contribution, we use the centrally trained model to bootstrap the training of distributed, independent, cooperative agent policies for execution, thereby addressing the challenges of noise and communication bottlenecks in real-time communication channels. We compare our centralized training algorithms both theoretically and empirically to current research in the field of MARL. We also present and release a new OpenAI Gym environment for multi-agent research that simulates multiple autonomous cars driving cooperatively on a highway.</style></abstract></record></records></xml>