Data Systems Seminar Series • Eliminating Spurious Dependencies in Data: From Cleaning to Private Data Generation

Monday, March 16, 2026 10:30 am - 11:30 am EDT (GMT -04:00)

Please note: This seminar will take place in DC 1302.

Mostafa Milani, Assistant Professor
Department of Computer Science, Western University

Statistical dependencies embedded in data can reflect historical bias, measurement errors, or confounding effects. When such dependencies link sensitive attributes to outcomes in unintended ways, machine learning models trained on the data may inherit unfair or unstable behavior. Many fairness interventions operate at the model level and therefore leave the underlying data unchanged.

This talk presents a data-centric approach that addresses unwanted dependencies directly at the level of data processing by enforcing conditional independence (CI) constraints. Two complementary settings are considered. First, a probabilistic data cleaning framework is introduced that corrects datasets violating CI constraints by learning an optimal transport map. This map modifies the empirical data distribution as little as possible while removing specified conditional dependencies. Second, a method is presented for enforcing CI during differentially private synthetic data generation by constraining the structure learning stage of private graphical models. This prevents the synthetic data from encoding prohibited dependency paths, while preserving both privacy guarantees and predictive utility. Together, these works demonstrate how fairness constraints can be formulated as structural constraints on statistical dependencies, and how they can be enforced both in observed data and in privacy-preserving data release.
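
For readers unfamiliar with CI constraints, the following is a minimal illustrative sketch, not the speaker's method. It assumes discrete data and a constraint of the form "sensitive attribute S is independent of outcome Y given an admissible attribute A", and it measures how strongly a dataset violates that constraint using empirical conditional mutual information. The function name and the toy rows are hypothetical.

```python
from collections import Counter
from math import log


def conditional_mutual_information(rows):
    """Empirical I(S; Y | A) in nats for an iterable of (s, y, a) tuples.

    A value of 0 means the empirical distribution satisfies the CI
    constraint "S independent of Y given A"; larger values mean a
    stronger violation.
    """
    rows = list(rows)
    n = len(rows)
    n_sya = Counter(rows)
    n_sa = Counter((s, a) for s, _, a in rows)
    n_ya = Counter((y, a) for _, y, a in rows)
    n_a = Counter(a for _, _, a in rows)

    cmi = 0.0
    for (s, y, a), c in n_sya.items():
        # p(s,y,a) * log( p(s,y|a) / (p(s|a) * p(y|a)) )
        cmi += (c / n) * log(c * n_a[a] / (n_sa[(s, a)] * n_ya[(y, a)]))
    return cmi


# Toy data (hypothetical): within each value of A, S and Y are independent.
satisfying = [(s, y, a) for a in (0, 1) for s in (0, 1) for y in (0, 1)]

# Toy data (hypothetical): Y copies S, so the constraint is violated.
violating = [(0, 0, 0), (1, 1, 0), (0, 0, 1), (1, 1, 1)]

cmi_ok = conditional_mutual_information(satisfying)
cmi_bad = conditional_mutual_information(violating)
```

In this sketch, a cleaning or generation method that enforces the constraint would aim to drive this quantity toward zero while changing the data distribution as little as possible.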


Bio: Mostafa Milani is an Assistant Professor in the Department of Computer Science at Western University. His research focuses on data quality, data cleaning, and trustworthy data systems, with an emphasis on fairness and privacy in structured data.

He previously completed postdoctoral research at the University of British Columbia and McMaster University. He has served on the program committees of leading conferences, including SIGMOD, VLDB, ICDE, and FAccT, and has taken on organizational roles such as Communications Chair at SIGMOD and Registration Chair at ICDE.