Department Seminar by Hongda Hu

Wednesday, November 10, 2021 — 4:00 PM EST

Please Note: This seminar will be given online.

Student seminar series

Hongda Hu
University of Waterloo  

Link to join seminar: Hosted on Microsoft Teams

Risk-aware multiarmed bandit algorithm

Multi-armed bandit (MAB) is a type of online learning and sequential decision-making problem that occurs when the underlying models are unknown. The classic MAB problem focuses purely on maximizing the expected reward. In my paper, I account for risk and analyze the MAB problem under the mean-variance setting. The majority of the MAB literature assumes independent arms and allows only one arm to be pulled in each round. In my paper, I drop the independence assumption, and the learner is allowed to pull multiple arms in each round so that the correlations among arms can be analyzed. I propose the risk-aware multiarmed bandit algorithm (RAMAB) and prove theoretically that it achieves logarithmic learning regret. Numerically, I show that the proposed algorithm performs well compared to the benchmark.
