Master’s Thesis Presentation • Cryptography, Security, and Privacy (CrySP) • Ancestry Deconvolution via Differential Privacy

Wednesday, January 21, 2026 9:00 am - 10:00 am EST (GMT -05:00)

Please note: This master’s thesis presentation will take place online.

Raiyan Chowdhury, Master’s candidate
David R. Cheriton School of Computer Science

Supervisor: Professor Florian Kerschbaum

This thesis presents the first study of ancestry determination under differential privacy (DP). Direct-to-consumer genomics companies, such as 23andMe, offer ancestry testing to millions of individuals, yet remain vulnerable to severe data breaches. Such incidents are especially concerning because genomic data is uniquely identifying, highly correlated, and permanent once exposed. In October 2023, 23andMe disclosed a breach that compromised the genetic profiles of an estimated 6.9 million users, underscoring the urgent need for stronger privacy guarantees in genomic analysis.

In this work, we investigate the application of DP to ancestry deconvolution. Using the 1000 Genomes dataset and Gnomix, a state-of-the-art ancestry inference model, we evaluate how privatizing single nucleotide polymorphism (SNP) data affects ancestry classification accuracy. We implement both naïve and correlation-aware local differential privacy (LDP) mechanisms across varying privacy budgets, enabling a systematic study of the privacy-utility trade-off in ancestry inference.
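To make the idea of a naïve LDP mechanism concrete, the following is a minimal sketch, not the thesis's implementation. It assumes each SNP is encoded as a genotype in {0, 1, 2} (count of alternate alleles) and perturbs every SNP independently with generalized (k-ary) randomized response, a standard LDP primitive. The function names and the per-SNP budget parameter are hypothetical, chosen only for illustration.

```python
import math
import random


def krr_probabilities(epsilon, k=3):
    """Return (p_keep, p_other) for k-ary randomized response.

    p_keep is the probability of reporting the true value; p_other is the
    probability of reporting each specific other value. Their ratio is
    exp(epsilon), which is what gives epsilon-LDP for a single SNP.
    """
    e = math.exp(epsilon)
    return e / (e + k - 1), 1.0 / (e + k - 1)


def privatize_genotype(g, epsilon, rng, k=3):
    """Perturb one genotype g in {0, ..., k-1} (assumed encoding)."""
    p_keep, _ = krr_probabilities(epsilon, k)
    if rng.random() < p_keep:
        return g
    # Otherwise report one of the other k-1 values uniformly at random.
    return rng.choice([v for v in range(k) if v != g])


def privatize_sequence(genotypes, epsilon_per_snp, rng):
    """Naive baseline: perturb every SNP independently.

    By sequential composition the total budget is
    len(genotypes) * epsilon_per_snp, and correlation between
    neighbouring SNPs (linkage disequilibrium) is ignored.
    """
    return [privatize_genotype(g, epsilon_per_snp, rng) for g in genotypes]


if __name__ == "__main__":
    rng = random.Random(0)
    genotypes = [0, 1, 2, 2, 1, 0, 0, 1]  # toy data, not real SNPs
    print(privatize_sequence(genotypes, epsilon_per_snp=1.0, rng=rng))
```

Because each SNP is perturbed on its own, the noise grows quickly as the per-SNP budget shrinks; a correlation-aware mechanism would instead exploit the dependence between nearby SNPs.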

Our results demonstrate that while naïve DP perturbations significantly degrade accuracy, correlation-aware LDP mechanisms preserve substantially more predictive power by accounting for linkage disequilibrium (LD). This thesis establishes a foundation for private ancestry deconvolution, providing an empirical benchmark of state-of-the-art DP methods in genomics and highlighting both the challenges and potential of integrating DP into ancestry testing.


Attend this master’s thesis presentation virtually on Zoom.