DB Meeting - Fast Mining of Frequent Itemsets through Sampling | Data Systems Group

Wednesday, May 7, 2014 2:30 pm - 2:30 pm EDT (GMT -04:00)

Speaker:

Matteo Riondato, Brown University

Abstract:

Frequent itemset mining is one of the key tasks in knowledge discovery from databases. The cost of mining algorithms for this problem depends on the number of itemsets and on the size of the dataset. In this talk I will present three algorithms that cut the dependency on the dataset size by using random samples of the dataset to extract high-quality approximations of the collection of frequent itemsets, with guarantees on that quality. The algorithms use advanced data structures, VC-dimension, and MapReduce.

Bio:

Matteo Riondato is graduating with a PhD in computer science from Brown University. His dissertation, titled "Sampling-based Randomized Algorithms for Big Data Analytics", explores the connection between statistical learning theory and data mining. In his research, he tries to bridge the gap between data analytics, database systems, and theory by exploiting the power of modern statistics and probability in new ways that are efficient for modern problems, modern systems, and real data.

Location Information

Location Address: DC - William G. Davis Computer Research Centre
200 University Avenue West
Room 1331
Waterloo, ON, CA N2L 3G1

Link to map: https://www.google.ca/maps/place/Davis+Centre+Library/@43.4720375,-80.5457127,17z/data=!4m8!1m2!2m1!1sdc+uw!3m4!1s0x882bf401b75d49ef:0xbfcdb8b1e5ec45c4!8m2!3d43.472714!4d-80.5421904?dcr=0