Department Seminar
Lin Zhang
Room: M3 3127
Statistical inference and representation learning in genetics and single-cell genomics
Both statistical models and machine learning-based algorithms play vital roles in analyzing biomedical data. Although both approaches have proven powerful in unveiling biological systems, statistical methods have a long-standing focus on inference, while machine learning-based methods concentrate on prediction. In this talk, I will present my recent work on statistical methods and machine learning approaches for analyzing genetic and genomic data: specifically, (i) a novel class of retrospective regression-based association tests that offer a different perspective on the ubiquitous genome-wide association studies, which relate genetic factors (and other risk factors) to health outcomes, and (ii) machine learning-based models for large-scale single-cell data, which address the biological need for representation learning on high-dimensional, noisy single-cell multi-omics data. Successful applications notwithstanding, the optimal choice between statistical methods and machine learning approaches for analyzing genetic and genomic data remains debatable, which calls for further research that integrates the two approaches for robust and interpretable biomedical research.