Welcome to the Data Systems Group

The Data Systems Group at the University of Waterloo's Cheriton School of Computer Science builds innovative, high-impact platforms, systems, and applications for processing, managing, analyzing, and searching the vast collections of data that are integral to modern information societies. These are colloquially known as “big data” technologies.

Our capabilities span the full spectrum from unstructured text collections to relational data, and everything in between, including semi-structured sources such as time series, log data, graphs, and other data types. We work at multiple layers of the software stack, ranging from storage management and execution platforms to user-facing applications and studies of user behaviour.

Our research tackles all phases of the information lifecycle, from ingest and cleaning to inference and decision support.

News

Professor Renée J. Miller has been named the Canada Excellence Research Chair in Data Intelligence. She is currently a University Distinguished Professor at the Khoury College of Computer Sciences at Northeastern University. She will join the University of Waterloo in June 2024 as the Cheriton School of Computer Science’s first Canada Excellence Research Chair, bringing her expertise in data science to the School’s world-class Data Systems Group.

Recent PhD graduate Michael Abebe has received the 2023 Cheriton Distinguished Dissertation Award. Now in its fifth year, the award was established to recognize excellence in computer science doctoral research. In addition to the recognition, recipients receive a cash prize of $1,000.

Cheriton students Jeremy Chen, Yuqing Huang, and Mushi Wang, together with Professors Semih Salihoğlu and Ken Salem, have received a 2022 ACM SIGMOD Research Highlight Award for their paper “Accurate summary-based cardinality estimation through the lens of cardinality estimation graphs.” The research had earlier received the Best Experiment, Analysis and Benchmark Award at VLDB 2022, the 48th International Conference on Very Large Data Bases, where it was originally presented.