Big data analytics

The Mathematics faculty offers an unmatched environment that unites research in statistics and computer science on the analysis and management of big data. Novel database solutions are integrated with innovative approaches to machine learning from statistics and artificial intelligence, of value for applications across government and business, to name just two of the sectors that benefit. Big data analytics is at work daily in the presentation of content on the web pages we visit, for example in the personalized advertising introduced by companies such as Google. Key researchers come from the Statistics and Actuarial Science department, the Department of Combinatorics and Optimization, and the Artificial Intelligence, Computational Statistics, Databases, and Algorithms and Complexity research groups in the Cheriton School of Computer Science.

Ali Ghodsi of the Statistics and Actuarial Science department develops statistical machine learning methods for analyzing large-scale datasets, with applications to dimensionality reduction, data mining, bioinformatics, computer vision and sequential decision making.

Wayne Oldford of the Statistics and Actuarial Science department conducts research on the visualization and analysis of high-dimensional data. His work includes connections to external partners such as Primal Fusion and the Ocean Tracking Network.

Matthias Schonlau from the Statistics and Actuarial Science department has research interests that include the automated analysis of surveys. He joined UWaterloo after a 14-year career in industry, where he developed software for problems such as prediction tasks in data mining.

Mu Zhu from the Statistics and Actuarial Science department conducts research that combines interests in multivariate analysis, nonparametric estimation and machine learning. Current projects focus on matrix factorization techniques for recommender systems and ensemble learning for variable selection.

Steve Vavasis from the Department of Combinatorics and Optimization works on convex optimization and its application to problems in data mining and machine learning. He teaches a graduate course on optimization for big data.

Pascal Poupart of the Computational Statistics and Artificial Intelligence groups in the Cheriton School of Computer Science is a machine learning researcher with expertise in the development of approximate scalable algorithms for Partially Observable Markov Decision Processes (POMDPs). This research enables effective reasoning under uncertainty for intelligent systems operating with big data in a variety of application areas including assistive technologies for the elderly.

Shai Ben-David of the Cheriton School of Computer Science is a leading researcher in theoretical machine learning and in the critical subproblem of clustering. His analysis of data mining algorithms is of particular value for developing effective solutions for managing big data.

Jesse Hoey and Dan Lizotte conduct research in the Cheriton School at the intersection of Artificial Intelligence and Health Informatics. Jesse Hoey directs the School's Computational Health Informatics Laboratory and focuses on research that enables planning and acting under uncertainty in large-scale applications. Dan Lizotte contributes to clinical decision support through a combination of statistical and machine learning methodology, working with massive data sets.

Tamer Özsu of the Cheriton School's Database Group is interested in challenges such as effectively scaling graph database management systems to support online social networks, enabling effective parallel processing of database operators in big data environments, and integrating consistency mechanisms for data centers supporting large-scale utility computing.

Ian Munro of the Algorithms and Complexity Group and the Databases Group of the Cheriton School studies the application of algorithmic techniques to the efficient routing of data requests and to the multi-level caches that are increasingly found in systems managing big data.

Perhaps the largest data set currently being explored by our researchers is that of Ming Li from the Cheriton School of Computer Science, who is developing algorithms for question answering from voice-based interfaces, involving many terabytes of data.