Faster private release of marginals on small databases
Speaker: | Professor Karthekeyan Chandraskearan |
---|---|
Affiliation: | Harvard University |
Room: | Mathematics and Computer Building (MC) 5158 |
Abstract:
The goal of privacy-preserving data analysis is to enable rich statistical analyses on the database while protecting the privacy of the individual records. One of the most important classes of statistics on a dataset is its marginals. The answer to a k-way marginal query is the fraction of database's records with a given value in each of a given set of up to k attributes. In this work, we focus on the notion of differential privacy to answer marginal queries. Early work in differential privacy gave algorithms that are private provided the number of records in the database is comparable to (or larger than) the number of queries. Although a remarkable line of work improved this requirement to achieve privacy on small databases, this was at the expense of an exponential blow up in the run time.
In
this
work,
we
give
an
algorithm
with
improved
run
time
(sub-exponential)
for
privately
answering
marginal
queries
on
small
databases.
Our
algorithm
uses
a
new
approximate
representation
of
the
database
that
is
inspired
by
tools
from
learning
theory.
This
representation
is
derived
by
approximating
the
disjunction
on
low
Hamming
weight
inputs
using
low-degree
polynomials
with
coefficients
of
bounded
L1-norm.
We
show
new
upper
and
lower
bounds
for
such
polynomials.
Our
lower
bound
matches
our
upper
bound
and
thus,
exhibits
the
tightness
of
our
approach
when
polynomials
are
used
to
derive
the
approximate
representation
of
the
database.
Joint
work
with
J.
Thaler,
J.
Ullman,
and
A.
Wan.