Speaker: Marc Bellemare (University of Alberta)
Title: Pixels and Priors: Learning a Generative Model of Atari 2600 Games
The model-based approach to reinforcement learning is appealing as it allows the agent to reason about unexplored parts of its environment. However, most existing model learning techniques impose harsh restrictions on the model structure or simply do not scale well to large domains. In this talk, I will present a new algorithm for learning generative models of arbitrary Atari 2600 games. Using statistical data compression methods, this algorithm achieves a per-step cost linear in the number of pixels, making it an interesting alternative to other model learning methods. One of the novel components of this algorithm is the quad-tree factorization, which uses Bayesian model averaging to efficiently learn a variable-resolution factorization of the Atari 2600 observation space. I will provide redundancy bounds that guarantee the statistical efficiency and soundness of the approach, as well as some recent empirical results on a variety of Atari 2600 games.
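To give a feel for the variable-resolution idea, here is a minimal sketch of a quad-tree partition of a frame. It is illustrative only: the region is split recursively until a simple uniformity test passes, whereas the algorithm in the talk selects the resolution via Bayesian model averaging over factorizations rather than a hard test. The function name and stopping rule are my own assumptions, not the talk's notation.

```python
import numpy as np

def quad_tree_partition(frame, x, y, w, h, max_depth, depth=0):
    """Recursively split a frame region into quadrants until it is
    uniform or a depth limit is reached; return the leaf regions
    (x, y, w, h). Uniform areas stay coarse, busy areas are refined."""
    region = frame[y:y + h, x:x + w]
    if depth >= max_depth or w <= 1 or h <= 1 or region.min() == region.max():
        return [(x, y, w, h)]
    hw, hh = w // 2, h // 2
    leaves = []
    for (qx, qy, qw, qh) in [(x,      y,      hw,     hh),
                             (x + hw, y,      w - hw, hh),
                             (x,      y + hh, hw,     h - hh),
                             (x + hw, y + hh, w - hw, h - hh)]:
        leaves += quad_tree_partition(frame, qx, qy, qw, qh,
                                      max_depth, depth + 1)
    return leaves

# Toy 8x8 "frame": uniform background with one small 2x2 sprite.
frame = np.zeros((8, 8), dtype=np.uint8)
frame[2:4, 2:4] = 1
leaves = quad_tree_partition(frame, 0, 0, 8, 8, max_depth=3)
print(len(leaves))  # → 7: far fewer leaves than the 64 pixels
```

The payoff is that prediction cost tracks the number of leaves rather than the number of pixels, which is what makes a variable-resolution factorization attractive for 160x210 Atari frames.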