Training Cooperative Agents for Multi-Agent Reinforcement Learning


Bhalla, S., Subramanian, S.G. & Crowley, M., 2019. Training Cooperative Agents for Multi-Agent Reinforcement Learning. In Proc. of the 18th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2019). Montreal, Canada.


Deep learning and back-propagation have been used successfully to perform centralized training with communication protocols among multiple agents in a cooperative environment. In this paper we present techniques for centralized training of Multi-Agent (Deep) Reinforcement Learning (MARL) that use the model-free Deep Q-Network as the baseline model, together with message sharing between agents. We present a novel, scalable, centralized MARL training technique that separates the message learning module from the policy module. Separating these modules helps achieve faster convergence in complex domains such as autonomous driving simulators. A second contribution uses the centrally trained model to bootstrap the training of distributed, independent, cooperative agent policies for execution, and thus addresses the challenges of noise and communication bottlenecks in real-time communication channels. This paper compares our centralized training algorithms, both theoretically and empirically, with current research in the field of MARL. We also present and release a new OpenAI Gym environment for multi-agent research, which simulates multiple autonomous cars driving cooperatively on a highway.
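
The separation of the message learning module from the policy module can be illustrated with a minimal forward-pass sketch. This is not the architecture from the paper: the class names `MessageModule` and `PolicyModule`, the single linear layer in each, the parameter sharing across agents, and the averaging of the other agents' messages are all assumptions made for illustration, and no training step is shown.

```python
import random


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases for one linear layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def linear(weights, bias, x):
    """Compute W x + b using plain Python lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


class MessageModule:
    """Hypothetical module: maps an agent's observation to a message vector."""

    def __init__(self, rng, obs_dim, msg_dim):
        self.w, self.b = init_layer(rng, msg_dim, obs_dim)

    def __call__(self, obs):
        # ReLU over a single linear layer.
        return [max(0.0, v) for v in linear(self.w, self.b, obs)]


class PolicyModule:
    """Hypothetical module: maps (observation, incoming message) to Q-values."""

    def __init__(self, rng, obs_dim, msg_dim, n_actions):
        self.w, self.b = init_layer(rng, n_actions, obs_dim + msg_dim)

    def __call__(self, obs, incoming):
        return linear(self.w, self.b, obs + incoming)


def act(observations, message_module, policy_module):
    """Greedy action per agent; requires at least two agents."""
    messages = [message_module(o) for o in observations]
    actions = []
    for i, obs in enumerate(observations):
        # Assumption: each agent receives the mean of the other agents' messages.
        others = [m for j, m in enumerate(messages) if j != i]
        incoming = [sum(vals) / len(others) for vals in zip(*others)]
        q_values = policy_module(obs, incoming)
        actions.append(max(range(len(q_values)), key=q_values.__getitem__))
    return actions
```

Because the two modules hold separate parameters, a training loop could update or freeze each one independently; how the paper actually schedules those updates is not specified in this abstract.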