AI seminar: Regret-based elicitation of rewards for sequential decision problems
Speaker: Kevin Regan, University of Toronto
Traditional methods for finding optimal policies in stochastic, multi-step decision environments require a precise model of both the environment dynamics and the rewards associated with taking actions and with the effects of those actions.