PhD Seminar • Artificial Intelligence • A Simple Mixture Policy Parameterization for Improving Sample Efficiency of Conditional Value at Risk Optimization

Friday, June 21, 2024 — 10:00 AM to 11:00 AM EDT

Please note: This PhD seminar will take place in DC 2585 and online.

Yudong Luo, PhD candidate
David R. Cheriton School of Computer Science

Supervisor: Professor Pascal Poupart

Reinforcement learning algorithms that use policy gradients (PG) to optimize Conditional Value at Risk (CVaR) suffer from severe sample inefficiency, which hinders their practical application. This inefficiency stems from two main factors: a focus on tail-end performance that discards many sampled trajectories, and the risk of vanishing gradients when the lower tail of the return distribution is overly flat.
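Both sources of inefficiency can be seen in a small numerical sketch. It assumes the commonly used sample-based CVaR-PG estimator, in which each trajectory in the worst alpha-fraction is weighted by its return minus the empirical Value at Risk (VaR); the abstract does not state the estimator, and the function name `cvar_pg_weights` and the return values below are hypothetical.

```python
import math
import numpy as np

def cvar_pg_weights(returns, alpha):
    """Per-trajectory weights in a sample-based CVaR policy gradient estimate.

    Assumed estimator (not taken from the abstract):
        grad ~ 1/(alpha*N) * sum_i 1[R_i <= VaR] * (R_i - VaR) * grad log p(tau_i)
    Only the scalar weight on each grad log p(tau_i) is computed here.
    """
    returns = np.asarray(returns, dtype=float)
    n = len(returns)
    k = max(1, math.ceil(alpha * n))   # number of trajectories in the alpha-tail
    var = np.sort(returns)[k - 1]      # empirical VaR: k-th smallest return
    in_tail = returns <= var
    weights = np.where(in_tail, returns - var, 0.0) / (alpha * n)
    return var, weights

# Tail with some spread: only trajectories below the VaR get a nonzero weight.
spread_var, spread_w = cvar_pg_weights([-4, -2, 5, 6, 7, 8, 9, 10], alpha=0.25)

# Flat tail: every tail return equals the VaR, so all weights are zero.
flat_var, flat_w = cvar_pg_weights([0, 0, 5, 6, 7, 8, 9, 10], alpha=0.25)

print(np.count_nonzero(spread_w), "of 8 trajectories contribute (spread tail)")
print(np.count_nonzero(flat_w), "of 8 trajectories contribute (flat tail)")
```

In the first batch most trajectories are ignored; in the second the estimated gradient is exactly zero even though the policy is far from optimal.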

To address these challenges, we propose a simple mixture policy parameterization. This method integrates a risk-neutral policy with an adjustable policy to form a risk-averse policy. With this strategy, all collected trajectories can be used for policy updates, and the vanishing-gradient issue is counteracted by encouraging higher returns through the risk-neutral component, which lifts the tail and prevents flatness. Our empirical study reveals that this mixture parameterization is uniquely effective across a variety of benchmark domains. Specifically, it excels at identifying risk-averse CVaR policies in some MuJoCo environments where the traditional CVaR-PG fails to learn a reasonable policy.
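As a rough illustration of what a mixture of two policies can look like, the sketch below combines two discrete action distributions with a convex weight. The abstract does not say how the two components are combined, so the fixed scalar weight `w`, the discrete action space, and the function names are assumptions made for illustration only, not the speaker's parameterization.

```python
import numpy as np

def mixture_action_probs(p_neutral, p_adjustable, w):
    """Action probabilities of the mixture w * neutral + (1 - w) * adjustable.

    Hypothetical form: a convex combination with a fixed weight w in [0, 1].
    """
    p_neutral = np.asarray(p_neutral, dtype=float)
    p_adjustable = np.asarray(p_adjustable, dtype=float)
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * p_neutral + (1.0 - w) * p_adjustable

def sample_action(p_neutral, p_adjustable, w, rng):
    """Draw one action index from the mixture distribution."""
    probs = mixture_action_probs(p_neutral, p_adjustable, w)
    return int(rng.choice(len(probs), p=probs))

# Made-up distributions over three actions in a single state.
probs = mixture_action_probs([0.7, 0.2, 0.1], [0.1, 0.1, 0.8], w=0.25)
action = sample_action([0.7, 0.2, 0.1], [0.1, 0.1, 0.8], 0.25,
                       np.random.default_rng(0))
print(probs, action)
```

Because every action has some probability under both components, a trajectory sampled from the mixture carries information about each of them, which is consistent with the claim that all collected trajectories can be used for updates.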


To attend this PhD seminar in person, please go to DC 2585. You can also attend virtually using Zoom at https://vectorinstitute.zoom.us/j/85188441758.

Location 
DC - William G. Davis Computer Research Centre
Hybrid: DC 2585 | Online PhD seminar
200 University Ave West
Waterloo, ON N2L 3G1
Canada