Tutte seminar - Karthekeyan Chandraskearan

Friday, October 18, 2013 3:30 pm - 4:30 pm EDT (GMT -04:00)

Faster private release of marginals on small databases

Speaker: Professor Karthekeyan Chandraskearan
Affiliation: Harvard University
Room: Mathematics and Computer Building (MC) 5158

Abstract:

The goal of privacy-preserving data analysis is to enable rich statistical analyses on the database while protecting the privacy of the individual records. One of the most important classes of statistics on a dataset is its marginals. The answer to a k-way marginal query is the fraction of database's records with a given value in each of a given set of up to k attributes. In this work, we focus on the notion of differential privacy to answer marginal queries. Early work in differential privacy gave algorithms that are private provided the number of records in the database is comparable to (or larger than) the number of queries. Although a remarkable line of work improved this requirement to achieve privacy on small databases, this was at the expense of an exponential blow up in the run time. 

In this work, we give an algorithm with improved run time (sub-exponential) for privately answering marginal queries on small databases. Our algorithm uses a new approximate representation of the database that is inspired by tools from learning theory. This representation is derived by approximating the disjunction on low Hamming weight inputs using low-degree polynomials with coefficients of bounded L1-norm. We show new upper and lower bounds for such polynomials. Our lower bound matches our upper bound and thus, exhibits the tightness of our approach when polynomials are used to derive the approximate representation of the database.
 
Joint work with J. Thaler, J. Ullman, and A. Wan.