PhD Seminar • Artificial Intelligence | Machine Learning • Decoupling Extrinsic and Intrinsic Drives: Flexible Exploration versus Exploitation

Friday, January 30, 2026 1:30 pm - 2:30 pm EST (GMT -05:00)

Please note: This PhD seminar will take place in DC 2314 and online.

Junteng Zheng, PhD candidate
David R. Cheriton School of Computer Science

Supervisor: Professor Jeff Orchard

A persistent challenge in reinforcement learning is getting an agent to explore intelligently when external rewards are rare — like trying to learn in the dark with only occasional hints. Intrinsic reward is one of the most popular ways to give an AI “curiosity” and push it to discover useful behaviors in sparse-reward settings, but many existing approaches still struggle with efficiency and with balancing exploration and exploitation.

In this seminar, I will present Decoupled Intrinsic–Extrinsic Control (DIEC), an alternative mechanism for leveraging intrinsic motivation that avoids directly combining intrinsic and extrinsic rewards. We model a single agent as two decision-making subsystems with distinct objectives: one optimized solely for extrinsic, task-specific rewards, and the other driven exclusively by intrinsic motivation. Because these subsystems may advocate different actions, we reconcile them through a voting-based action-selection mechanism. This design lets curiosity influence behavior at the decision-making level rather than through reward shaping, providing a simple and principled framework for integrating intrinsic motivation into reinforcement learning. Empirical results demonstrate improved efficiency and effectiveness, as well as more reliable handling of the exploration–exploitation trade-off.
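To make the idea of voting-based action selection concrete, here is a minimal sketch of one plausible voting rule between two value heads. This is an illustration only, not the mechanism from the talk: the function names, the softmax vote, and the weights `w_ext` and `w_int` are all assumptions for the sake of the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions = 4

# Hypothetical Q-values for a single state: one head trained on extrinsic
# (task) reward, the other on intrinsic (curiosity) reward. Random values
# stand in for the outputs of two separately trained subsystems.
q_extrinsic = rng.normal(size=n_actions)
q_intrinsic = rng.normal(size=n_actions)

def softmax(x, temperature=1.0):
    # Numerically stable softmax: turns raw values into a soft "vote".
    z = (x - x.max()) / temperature
    e = np.exp(z)
    return e / e.sum()

def vote_action(q_ext, q_int, w_ext=1.0, w_int=1.0):
    # Each subsystem casts a soft vote (probability over actions);
    # the agent takes the action with the highest combined vote.
    # One plausible rule, not the paper's exact mechanism.
    votes = w_ext * softmax(q_ext) + w_int * softmax(q_int)
    return int(np.argmax(votes))

action = vote_action(q_extrinsic, q_intrinsic)
```

In a sketch like this, tilting the weights toward the intrinsic head would bias the agent toward exploration, while tilting toward the extrinsic head would favor exploitation, without ever summing the two reward signals themselves.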


To attend this PhD seminar in person, please go to DC 2314. You can also attend virtually on Zoom.