Master’s Thesis Presentation • Algorithms and Complexity • A Bias-Variance-Privacy Trilemma for Statistical Estimation

Friday, July 28, 2023 2:00 pm - 3:00 pm EDT (GMT -04:00)

Please note: This master’s thesis presentation will take place in DC 3317 and online.

Matthew Regehr, Master’s candidate
David R. Cheriton School of Computer Science

Supervisors: Professors Gautam Kamath and Shai Ben-David

The canonical algorithm for differentially private mean estimation is to first clip the samples to a bounded range and then add noise to their empirical mean. Clipping controls the sensitivity and, hence, the variance of the noise that we add for privacy. But clipping also introduces statistical bias. We prove that this tradeoff is inherent: no algorithm can simultaneously have low bias, low variance, and low privacy loss for arbitrary distributions.
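The clip-then-noise estimator described above can be sketched as follows. This is an illustrative implementation, not the algorithm analyzed in the thesis; the names `private_clipped_mean`, `clip_range`, and `epsilon` are assumptions, and the Laplace mechanism is one standard choice of privacy noise.

```python
import numpy as np

def private_clipped_mean(samples, clip_range, epsilon, rng=None):
    """Sketch of the canonical DP mean estimator: clip, average, add noise.

    Names and parameters here are illustrative, not from the talk.
    """
    rng = np.random.default_rng() if rng is None else rng
    lo, hi = clip_range
    n = len(samples)
    # Clipping bounds each sample's influence but biases the mean
    # whenever the distribution has mass outside [lo, hi].
    clipped = np.clip(samples, lo, hi)
    # Sensitivity of the clipped empirical mean to one sample: (hi - lo) / n.
    sensitivity = (hi - lo) / n
    # Laplace noise calibrated to sensitivity / epsilon gives epsilon-DP.
    noise = rng.laplace(scale=sensitivity / epsilon)
    return clipped.mean() + noise
```

Note how the tradeoff appears directly: widening `[lo, hi]` shrinks the clipping bias but inflates the sensitivity, and therefore the variance of the added noise.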

On the positive side, we show that unbiased mean estimation is possible under approximate differential privacy if we assume that the distribution is symmetric. Relaxing to approximate differential privacy is necessary: we show that, even when the data is sampled from a Gaussian, unbiased mean estimation is impossible under pure or concentrated differential privacy.