Please note: This PhD seminar will take place in DC 2310 and online.
Nils Lukas, PhD candidate
David R. Cheriton School of Computer Science
Supervisor: Professor Florian Kerschbaum
Language models (LMs) have been shown to leak information about their training data through sentence-level membership inference and reconstruction attacks. The risk of LMs leaking Personally Identifiable Information (PII) has received less attention, in part because of the false assumption that dataset curation techniques such as scrubbing are sufficient to prevent PII leakage. Scrubbing reduces but does not prevent the risk of PII leakage. We introduce game-based definitions for three types of PII leakage via black-box extraction, inference, and reconstruction attacks that require only API access to the LM.
Our main contributions are (i) novel attacks that extract up to 10× more PII sequences than existing attacks, (ii) evidence that sentence-level differential privacy reduces the risk of PII disclosure but still leaks about 3% of PII sequences, and (iii) a subtle connection between record-level membership inference and PII reconstruction. I will also summarize related work and outline future directions for privacy attacks against LMs.
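To give a rough sense of what black-box PII extraction means in practice, the sketch below samples freely from a language model and tags PII-like strings in its output. This is a minimal illustration only, not the attack presented in the seminar: GPT-2 via the Hugging Face pipeline stands in for the API-accessible LM, and two simple regexes stand in for a real PII tagger.

```python
# Minimal sketch of black-box PII extraction (illustrative only, not the
# attack presented in the seminar). Assumptions: GPT-2 stands in for the
# API-accessible LM, and two regexes stand in for a real PII tagger.
import re
from collections import Counter

from transformers import pipeline, set_seed

set_seed(0)
generator = pipeline("text-generation", model="gpt2")  # black-box: text in, text out

# Hypothetical stand-in patterns; a real attack would use an NER-style PII tagger.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def extract_pii(num_samples: int = 100, max_new_tokens: int = 64) -> Counter:
    """Sample (near-)unconditionally from the model and count PII-like strings."""
    found: Counter = Counter()
    for _ in range(num_samples):
        # "<|endoftext|>" is GPT-2's special token, used here as an empty prompt.
        out = generator("<|endoftext|>", max_new_tokens=max_new_tokens, do_sample=True)
        text = out[0]["generated_text"]
        for kind, pattern in PII_PATTERNS.items():
            for match in pattern.findall(text):
                found[(kind, match)] += 1
    return found


if __name__ == "__main__":
    for (kind, value), count in extract_pii().most_common(10):
        print(f"{kind}: {value} (seen {count}x)")
```

The only capability assumed is the ability to sample text from the model, which is what makes the setting black-box.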
To attend this PhD seminar in person, please go to DC 2310.