Wednesday, May 7, 2014 2:30 pm - 2:30 pm EDT (GMT -04:00)
Speaker: | Matteo Riondato, Brown University |
Abstract: |
Frequent itemset mining is one of the key tasks in knowledge discovery from databases. The cost of mining algorithms for this problem depends on the number of itemsets and on the size of the dataset. In this talk I will present three algorithms that cut the dependency on the dataset size by using random samples of the dataset to extract approximations of the collection of frequent itemsets with guaranteed high quality. The algorithms use advanced data structures, VC-dimension, and MapReduce.
Bio: Matteo Riondato is graduating with a PhD in computer science from Brown University. His dissertation, titled "Sampling-based Randomized Algorithms for Big Data Analytics," explores the connection between statistical learning theory and data mining. In his research, he tries to bridge the gap between data analytics, database systems, and theory by exploiting the power of modern statistics and probability in new ways that are efficient for modern problems, modern systems, and real data. |