Seminar by Xiudi Li

Thursday, January 4, 2024 10:00 am - 11:00 am EST (GMT -05:00)

Department seminar

Xiudi Li
Harvard University

Room: M3 3127


Robust and efficient statistical inference using data from diverse real-world sources

Leveraging the wealth of data from diverse real-world sources provides new opportunities to derive generalizable evidence efficiently. Yet, unique challenges may arise in the integrative analysis of multi-source data, including heterogeneity across data sources and data-sharing constraints. In this talk, I will discuss two projects for robust and efficient statistical inference using data from multiple sources. In the first part of the talk, I will discuss a robust inference framework for federated learning of multi-source data, enabling statistical inference for the prevailing model, defined as the one matching the majority of the sites. I propose a novel sampling method to address the additional variation arising from the selection of eligible sites and devise uniformly valid confidence intervals. The proposed method is also communication-efficient, privacy-preserving, and broadly applicable. In the second part, I will present a general framework for using external data available at the planning stage of a clinical trial to identify, and make statistical inference about, the efficiency gain from covariate adjustment in that future trial. I propose efficient estimators that allow for the incorporation of flexible statistical learning tools and develop statistical inference procedures to accompany the proposed estimators. The methods in these two projects are applied in the context of COVID-19 to study (1) risk factors associated with COVID-19 mortality and (2) the efficiency gain from adjusting for baseline covariates in COVID-19 therapeutic trials.