Reinforcement Learning

All papers, news and topics related to Reinforcement Learning.
Subramanian, S.G. et al., 2021. Partially Observable Mean Field Reinforcement Learning. In Proc. of the 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2021). 3–7 May. London, United Kingdom: International Foundation for Autonomous Agents and Multiagent Systems, pp. 537–545.
Traditional multi-agent reinforcement learning algorithms are not scalable to environments with more than a few agents, since these algorithms are exponential in the number of agents. Recent research has introduced successful methods to scale multi-agent reinforcement learning algorithms to many-agent scenarios using mean field theory. Previous work in this field assumes that an agent has access to exact cumulative metrics regarding the mean field behaviour of the system, which it can then use to take its actions. In this paper, we relax this assumption and maintain a distribution to model the uncertainty regarding the mean field of the system. We consider two different settings for this problem. In the first setting, only agents in a fixed neighbourhood are visible, while in the second setting, the visibility of agents is determined at random based on distances. For each of these settings, we introduce a Q-learning-based algorithm that can learn effectively. We prove that this Q-learning estimate stays very close to the Nash Q-value (under a common set of assumptions) for the first setting. We also empirically show that our algorithms outperform multiple baselines in three different games in the MAgent framework, which supports large environments with many agents learning simultaneously to achieve possibly distinct goals.
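The core idea of the abstract above — replacing exact mean field statistics with a distribution updated from visible neighbours only — can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: the Dirichlet posterior over the mean action, the state/action sizes, and the Q-update are all simplifying assumptions.

```python
import numpy as np

N_ACTIONS = 3

class PartiallyObservableMFQ:
    """Sketch of mean-field Q-learning when only neighbours are visible."""

    def __init__(self, n_states, alpha=0.1, gamma=0.95):
        self.q = np.zeros((n_states, N_ACTIONS))  # Q(s, a)
        self.alpha, self.gamma = alpha, gamma
        # Dirichlet pseudo-counts modelling uncertainty over the mean action
        self.counts = np.ones(N_ACTIONS)

    def observe_neighbours(self, neighbour_actions):
        # Update the posterior using only the actions of visible neighbours,
        # rather than exact cumulative metrics over the whole population.
        for a in neighbour_actions:
            self.counts[a] += 1

    def mean_action(self):
        # Posterior mean of the population's action distribution.
        return self.counts / self.counts.sum()

    def update(self, s, a, r, s_next):
        # Standard Q-learning update; the full algorithm would also
        # condition the target on the estimated mean action.
        target = r + self.gamma * self.q[s_next].max()
        self.q[s, a] += self.alpha * (target - self.q[s, a])

agent = PartiallyObservableMFQ(n_states=5)
agent.observe_neighbours([0, 0, 2])  # three visible neighbours
mu = agent.mean_action()             # uncertain estimate of the mean field
agent.update(s=0, a=1, r=1.0, s_next=2)
```

In the fixed-neighbourhood setting, `observe_neighbours` would be called with the same set of agents each step; in the random-visibility setting, the set would be resampled based on distance.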
Artificial intelligence has been applied in wildfire science and management since the 1990s, with early applications including neural networks and expert systems. Since then the field has rapidly progressed congruently with the wide adoption of machine learning (ML) methods in the environmental sciences. Here, we present a scoping review of ML applications in wildfire science and management. Our overall objective is to improve awareness of ML methods among wildfire researchers and managers, as well as illustrate the diverse and challenging range of problems in wildfire science available to ML data scientists. To that end, we first present an overview of popular ML approaches used in wildfire science to date, and then review the use of ML in wildfire science as broadly categorized into six problem domains: 1) fuels characterization, fire detection, and mapping; 2) fire weather and climate change; 3) fire occurrence, susceptibility, and risk; 4) fire behavior prediction; 5) fire effects; and 6) fire management. Furthermore, we discuss the advantages and limitations of various ML approaches relating to data size, computational requirements, generalizability, and interpretability, as well as identify opportunities for future advances in the science and management of wildfires within a data science context. In total, we identified 300 relevant publications up to the end of 2019, where the most frequently used ML methods across problem domains included random forests, MaxEnt, artificial neural networks, decision trees, support vector machines, and genetic algorithms. As such, there exist opportunities to apply more current ML methods, including deep learning and agent-based learning, in the wildfire sciences, especially in instances involving very large multivariate datasets.
We must recognize, however, that despite the ability of ML methods to learn on their own, expertise in wildfire science is necessary to ensure realistic modelling of fire processes across multiple scales, while the complexity of some ML methods, such as deep learning, requires dedicated and sophisticated knowledge of their application. Finally, we stress that the wildfire research and management communities must play an active role in providing relevant, high-quality, and freely available wildfire data for use by practitioners of ML methods.

Good News in the UWECEML Lab

May 15, 2019

As spring continues to tease us with cold and rain, we could all use some good news to cheer us up. It seems we've been storing it up recently, and there are several exciting achievements to highlight:

Read more about Good News in the UWECEML Lab
Bhalla, S., Subramanian, S.G. & Crowley, M., 2019. Training Cooperative Agents for Multi-Agent Reinforcement Learning. In Proc. of the 18th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2019). Montreal, Canada.
Deep Learning and back-propagation have been successfully used to perform centralized training with communication protocols among multiple agents in a cooperative environment. In this paper we present techniques for centralized training of Multi-Agent (Deep) Reinforcement Learning (MARL) using the model-free Deep Q-Network as the baseline model and message sharing between agents. We present a novel, scalable, centralized MARL training technique which separates the message learning module from the policy module. The separation of these modules helps in faster convergence in complex domains like autonomous driving simulators. A second contribution uses the centrally trained model to bootstrap training of distributed, independent, cooperative agent policies for execution, and thus addresses the challenges of noise and communication bottlenecks in real-time communication channels. This paper theoretically and empirically compares our centralized training algorithms to current research in the field of MARL. We also present and release a new OpenAI Gym environment which can be used for multi-agent research, as it simulates multiple autonomous cars driving cooperatively on a highway.
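The separation of the message-learning module from the policy module described above can be sketched in a few lines. This is a simplified illustration under assumed shapes, not the paper's architecture: the linear maps stand in for learned networks, and mean aggregation of neighbour messages is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, MSG_DIM, N_ACTIONS = 4, 2, 3

# Message module: maps an agent's observation to a message vector.
W_msg = rng.normal(size=(MSG_DIM, OBS_DIM))
# Policy module: maps [own observation; aggregated messages] to Q-values.
# Keeping these as separate parameter sets is the point of the separation.
W_pol = rng.normal(size=(N_ACTIONS, OBS_DIM + MSG_DIM))

def message(obs):
    return np.tanh(W_msg @ obs)

def q_values(obs, neighbour_msgs):
    agg = np.mean(neighbour_msgs, axis=0)  # aggregate incoming messages
    return W_pol @ np.concatenate([obs, agg])

# Three agents each produce a message from their own observation.
obs = [rng.normal(size=OBS_DIM) for _ in range(3)]
msgs = [message(o) for o in obs]

# Agent 0 acts on its own observation plus the others' messages.
q0 = q_values(obs[0], [msgs[1], msgs[2]])
action0 = int(np.argmax(q0))
```

During centralized training both modules are updated jointly; for distributed execution, the policy module can then be bootstrapped to act even when the real-time communication channel is noisy or bottlenecked.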
Fighting Fire with AI: Using Artificial Intelligence to Improve Modelling and Decision Making in Wildfire Management, at Banff International Research Station, Banff, Alberta, Canada, Friday, November 17, 2017:
I was invited to speak at this week-long workshop at the fabulous BIRS facility in Banff, Alberta. The workshop, entitled "Forest and Wildland Fire Management: a Risk Management Perspective", brought together a wide range of experts and stakeholders from across Canada, as well as some researchers from around the world, to discuss the latest research on forest fire management. It was an incredibly productive week that built many new connections. Read more about Fighting Fire with AI: Using Artificial Intelligence to Improve Modelling and Decision Making in Wildfire Management
Using Deep Learning and Reinforcement Learning to Tame Spatially Spreading Processes, at University of Waterloo, Wednesday, October 25, 2017

This was an invited talk for the Waterloo Institute for Complexity and Innovation (WICI) seminar series. The talk was recorded and can be watched from WICI's website here.

Abstract:

Recent advances in Artificial Intelligence and Machine Learning (AI/ML) allow us to learn predictive models and control policies for larger, more complex systems than ever before. However, some important real world domains such as...

Read more about Using Deep Learning and Reinforcement Learning to Tame Spatially Spreading Processes
BIRC Workshop On Deep Learning In Medicine, at University Hospital, London, Ontario, Canada, Monday, August 28, 2017:

This all-day workshop brought together researchers, students, and medical professionals from medical imaging, image processing, and machine learning to discuss what the new class of machine learning algorithms known collectively as Deep Learning is, how it is and could be used for medicine, and what its impacts are for medicine as a whole. The workshop was hosted by the Biomedical Imaging Research Centre (BIRC) at the University of Western Ontario. I gave an introductory...

Read more about BIRC Workshop On Deep Learning In Medicine

Spatiotemporal planning involves making choices at multiple locations in space over some planning horizon to maximize utility and satisfy various constraints. In Forest Ecosystem Management, the problem is to choose actions for thousands of locations each year, including harvesting, treating trees for fire or pests, or doing nothing. The utility models could place value on sale of lumber, ecosystem sustainability, or employment levels, and incorporate legal and logistical constraints on actions, such as avoiding large contiguous areas of clearcutting. Simulators developed by forestry researchers provide detailed dynamics but are generally inaccessible black boxes. We model spatiotemporal planning as a factored Markov decision process and present a policy gradient planning algorithm to optimize a stochastic spatial policy using simulated dynamics. It is common in environmental and resource planning to have actions at different locations be spatially interrelated; this makes representation and planning challenging. We define a global spatial policy in terms of interacting local policies defining distributions over actions at each location conditioned on actions at nearby locations. Markov chain Monte Carlo simulation is used to sample landscape policies and estimate their gradients. Evaluation is carried out on a forestry planning problem with 1,880 locations using a variety of value models and constraints.
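The global spatial policy built from neighbour-conditioned local policies can be sketched with Gibbs-style MCMC sampling over a grid of locations. This is a toy illustration under assumed parameters, not the paper's algorithm: the grid size, two-action space, preference weights, and negative coupling (which discourages adjacent cells taking the same action, loosely analogous to a clearcut constraint) are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
GRID, N_ACTIONS = 6, 2          # e.g. action 0 = do nothing, 1 = harvest
theta = np.array([0.0, -0.5])   # per-action preference parameters
coupling = -1.0                 # penalize matching neighbouring actions

def local_logits(actions, i, j):
    # Logits of each action at cell (i, j), conditioned on the current
    # actions of the four adjacent cells (the local policy).
    nbrs = [actions[x, y]
            for x, y in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
            if 0 <= x < GRID and 0 <= y < GRID]
    logits = theta.copy()
    for a in range(N_ACTIONS):
        logits[a] += coupling * sum(n == a for n in nbrs)
    return logits

def gibbs_sweep(actions):
    # One MCMC sweep: resample each cell from its local conditional
    # distribution, yielding samples from the global spatial policy.
    for i in range(GRID):
        for j in range(GRID):
            p = np.exp(local_logits(actions, i, j))
            p /= p.sum()
            actions[i, j] = rng.choice(N_ACTIONS, p=p)
    return actions

actions = rng.integers(0, N_ACTIONS, size=(GRID, GRID))
for _ in range(10):
    actions = gibbs_sweep(actions)
```

In a policy gradient planner, samples like these would be rolled through the (black-box) simulator to estimate returns, and the gradients with respect to the local policy parameters (here `theta` and `coupling`) would be estimated from the sampled configurations.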