Seminar by Jun Young Park

Tuesday, October 21, 2025 10:00 am - 11:00 am EDT (GMT -04:00)

Statistics and Biostatistics seminar series

Jun Young Park
University of Toronto

Room: M3 3127


Preparing good data for more reproducible science in multi-site neuroimaging studies

Neuroimaging data provide rich information about the human brain, including its anatomy and function. Because such data are often collected across multiple study sites, differences in scanner types, acquisition parameters, and preprocessing pipelines can introduce substantial unwanted variation, so a larger sample size does not necessarily guarantee higher reproducibility. Inspired by the “batch effect” problem in -omics research, several statistical methods have been proposed to produce “batch-free” datasets by harmonizing heterogeneous means and variances across sites. Yet it remains unclear how to effectively account for heterogeneous covariance structures in high-dimensional neuroimaging data. In this talk, I will present statistical methods that leverage the unique characteristics of imaging data (e.g., high dimensionality, spatial dependence in cortical thickness, network structure in functional MRI) to construct parametric models for site-specific covariance and to develop scalable methods for practical use. Real-data applications demonstrate that the proposed approaches outperform existing methods, offering a practical path toward increased reproducibility.