Joint Lecture: Dean's Distinguished Women in Mathematics, Statistics and Computer Science Lecture Series and David Sprott Distinguished Lecture Series

Tuesday, September 23, 2025

Veridical Data Science towards Trustworthy AI

Abstract

In this talk, I will introduce the Predictability-Computability-Stability (PCS) framework for veridical (truthful) data science, highlighting its critical role in producing reliable and actionable insights. I will share success stories from cancer detection and cardiology, showcasing how PCS principles have guided cost-effective designs and improved outcomes in these projects. Since trustworthy uncertainty quantification is indispensable for trustworthy AI, I will discuss PCS uncertainty quantification (PCS-UQ) for prediction in regression and multi-class classification. PCS-UQ consists of three steps: pred-check, bootstrap, and multiplicative calibration. In tests on 26 benchmark datasets, PCS-UQ will be shown to outperform common forms of conformal prediction in terms of interval width, subgroup coverage, and subgroup interval width. Finally, the multiplicative step in PCS-UQ will be shown to be a new form of conformal prediction.
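For readers who want a concrete picture of the three steps named in the abstract, here is a minimal sketch for one-dimensional regression. It is not the speaker's implementation. The function name `pcs_uq_sketch`, the polynomial candidate models, the parameters `top_k` and `n_boot`, and the exact form of the calibration rule are all assumptions made for illustration. For brevity, it also reuses a single validation set for both the pred-check and the calibration.

```python
import numpy as np


def pcs_uq_sketch(x_train, y_train, x_val, y_val, x_new,
                  degrees=(0, 1, 2, 3), top_k=2, n_boot=50,
                  alpha=0.1, seed=0):
    """Illustrative three-step interval construction (all names hypothetical)."""
    rng = np.random.default_rng(seed)

    # Step 1 (pred-check): keep the top_k candidate models by validation MSE.
    # Candidate models here are polynomials of the given degrees (an assumption).
    val_mse = [
        np.mean((np.polyval(np.polyfit(x_train, y_train, d), x_val) - y_val) ** 2)
        for d in degrees
    ]
    kept = [degrees[i] for i in np.argsort(val_mse)[:top_k]]

    # Step 2 (bootstrap): refit each kept model on resampled training data
    # and collect its predictions on the validation and new points.
    n = len(x_train)
    preds_val, preds_new = [], []
    for d in kept:
        for _ in range(n_boot):
            idx = rng.integers(0, n, size=n)
            coef = np.polyfit(x_train[idx], y_train[idx], d)
            preds_val.append(np.polyval(coef, x_val))
            preds_new.append(np.polyval(coef, x_new))

    def summarize(preds):
        # Pointwise lower quantile, median, and upper quantile across refits.
        preds = np.asarray(preds)
        lo = np.quantile(preds, alpha / 2, axis=0)
        mid = np.median(preds, axis=0)
        hi = np.quantile(preds, 1 - alpha / 2, axis=0)
        return lo, mid, hi

    lo_v, mid_v, hi_v = summarize(preds_val)

    # Step 3 (multiplicative calibration, assumed form): find one factor gamma
    # that stretches each interval about its median until roughly a 1 - alpha
    # share of validation responses is covered.
    eps = 1e-12  # guards against zero-width half-intervals
    upper = np.maximum(hi_v - mid_v, eps)
    lower = np.maximum(mid_v - lo_v, eps)
    need = np.where(y_val >= mid_v,
                    (y_val - mid_v) / upper,
                    (mid_v - y_val) / lower)
    k = int(np.ceil((1 - alpha) * len(need)))
    gamma = np.sort(need)[k - 1]

    lo_n, mid_n, hi_n = summarize(preds_new)
    return mid_n - gamma * (mid_n - lo_n), mid_n + gamma * (hi_n - mid_n), gamma
```

In this sketch the bootstrap spread reflects only variability across refits, so the raw intervals are typically too narrow for individual responses. The single factor `gamma` then widens them to reach the target coverage on held-out data.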

Bio

Bin Yu is CDSS Chancellor's Distinguished Professor in Statistics, EECS, Center for Computational Biology, and Senior Advisor at the Simons Institute for the Theory of Computing, all at UC Berkeley. Her research focuses on the practice and theory of statistical machine learning, veridical data science, responsible and safe AI, and solving interdisciplinary data problems in neuroscience, genomics, and precision medicine. She and her team have developed algorithms such as iterative random forests (iRF), stability-driven NMF, adaptive wavelet distillation (AWD), Contextual Decomposition for Transformers (CD-T), SPEX and ProxySPEX for interpreting deep learning models, especially for compositional interpretability.

She is a member of the National Academy of Sciences and of the American Academy of Arts and Sciences. She was a Guggenheim Fellow and President of the Institute of Mathematical Statistics (IMS). She delivered the Tukey Lecture of the Bernoulli Society and the IMS Rietz and Wald Lectures, and received the Distinguished Achievement Award and Lecture (formerly Fisher Lecture) of COPSS (Committee of Presidents of Statistical Societies). She holds an Honorary Doctorate from the University of Lausanne. She is on the Editorial Board of the Proceedings of the National Academy of Sciences (PNAS) and is a co-editor of the Harvard Data Science Review (HDSR).