Department seminar by Haoda Fu

Thursday, September 23, 2021 4:00 pm - 4:00 pm EDT (GMT -04:00)

Statistics & Biostatistics seminar series

Haoda Fu
Eli Lilly and Company

Our Recent Development on Cost Constraint Machine Learning Models

Suppose we can spend only $100 to diagnose a disease subtype in order to select the best treatment. We can measure either 10 cheap biomarkers or 2 expensive ones.
How can we pick the optimal combination to achieve the highest diagnostic accuracy?
This is a nontrivial problem. In the special case where every variable costs the same, the total cost constraint reduces to an L0 penalty, which is the best subset selection problem. Until recently, there was no good solution even for this special case: traditional algorithms could solve best subset selection only up to about 35 variables. Thanks to recent algorithmic breakthroughs in optimization research, we have modified and extended a recently developed algorithm to handle our cost-constrained problems with thousands of variables.
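As a rough illustration of the problem setup only (not the algorithm presented in the talk), the sketch below searches exhaustively over biomarker subsets under a budget. The costs, the additive `gains` scores, and the function names are hypothetical; a real diagnostic accuracy would not be additive across variables. The enumeration visits up to 2^n subsets, so it is practical only for a small number of variables.

```python
from itertools import combinations

def feasible_subsets(costs, budget):
    """Yield every subset of variable indices whose total cost fits the budget."""
    n = len(costs)
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            if sum(costs[i] for i in subset) <= budget:
                yield subset

def best_subset(costs, budget, score):
    """Exhaustive search: return the feasible subset with the highest score."""
    return max(feasible_subsets(costs, budget), key=score)

# Hypothetical toy data: per-biomarker cost and a made-up additive "accuracy gain".
costs = [10, 10, 10, 50, 50]
gains = [0.05, 0.04, 0.03, 0.20, 0.18]

def score(subset):
    return sum(gains[i] for i in subset)

chosen = best_subset(costs, budget=100, score=score)

# Equal costs: the budget constraint becomes a cap on the subset size,
# which is the best subset selection setting described above.
unit_costs = [1] * 5
sizes = {len(s) for s in feasible_subsets(unit_costs, budget=2)}
```

With unit costs and a budget of 2, the feasible sets are exactly the subsets containing at most 2 variables, which is the sense in which the cost constraint collapses to a limit on the number of selected variables.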
In this talk, we will cover the background of this problem, methods development, and theoretical results. We will also show an impressive example on dynamic programming that tells a story of how algorithms can make a difference in computing. I hope that through this talk you can get a feel for modern statistics, which combines computer science, statistics, and algorithms.