University of Waterloo
200 University Ave W, Waterloo, ON
N2L 3G1
Phone: (519) 888-4567
Staff and Faculty Directory
Contact the Department of Electrical and Computer Engineering
Candidate: Haobei Song
Title: From Reinforcement Learning to Approximate Optimal Learning
Date: August 9, 2019
Time: 2:00PM
Place: EIT 3145
Supervisor(s): Tripunitara, Mahesh
Abstract:
The reinforcement learning framework offers no closed-form analysis of the exploration/exploitation dilemma. As a consequence, there is no general theory to explain data efficiency, a gap that often limits practical applications of reinforcement learning algorithms.
The exploration/exploitation dilemma is mostly handled in an ad hoc manner, and the resulting heuristics rarely transfer between problems. This thesis instead steps outside the conventional reinforcement learning framework and considers a larger, more general set of problems, namely optimal learning problems, in the hope that the exploration/exploitation dilemma can be addressed in theory, either explicitly or implicitly.
Optimal learning frameworks can be constructed on top of existing reinforcement learning frameworks. Three optimal learning formulations are proposed, each addressing the issues in one of three different reinforcement learning frameworks.
Following these formulations, three classes of approximate optimal learning algorithms are proposed, drawing on the following principles respectively:
(1) Sample from a pool of prediction neural networks to serve as the dynamics model;
(2) Approximate the Bayesian inference rule using an entangled prediction feedforward network and belief recurrent neural network;
(3) Use a memory-based recurrent neural network to extract features from observations.
Empirical evidence is provided to show the improvement achieved by the proposed algorithms.