Ihab Ilyas named ACM Fellow

Wednesday, January 13, 2021

Professor Ihab Ilyas has been named a 2020 ACM Fellow for his contributions to data cleaning and data integration.

The Association for Computing Machinery is the world’s largest educational and scientific computing society, uniting computing educators, researchers and professionals to inspire dialogue, share resources and address the field’s challenges. ACM fellowships are conferred on the top 1 percent of the association’s members, and the prestigious recognition indicates excellence in technical, professional and leadership contributions that advance computing, promote the exchange of ideas, and further ACM’s objectives.

This year the Association for Computing Machinery named 95 members as ACM Fellows for wide-ranging and fundamental contributions in artificial intelligence, cloud computing, computer graphics, computational biology, data science, human-computer interaction, software engineering, theoretical computer science, and virtual reality, among other areas. Fellows are nominated by their peers, with nominations reviewed by a distinguished selection committee.

“Congratulations to Ihab on receiving this much-deserved recognition from ACM,” said Raouf Boutaba, Professor and Director of the Cheriton School of Computer Science. “His fundamental contributions to data cleaning and data integration have had significant and lasting impacts, both in shaping the direction of data systems research and in the development of technologies adopted by industry.”

Professor Ilyas completed his PhD at Purdue University in 2004. He joined the Cheriton School of Computer Science as a faculty member later that year and is a core member of the School’s Data Systems Group. He has made fundamental contributions to database technology, in particular to rank-aware query processing, uncertain data management, and data cleaning. Early in his career, he pioneered the notion of rank-aware query processing, providing new scan and join operators that rank query results. He proposed an effective cost-based optimization framework that integrates these processing operators in relational database engines.

Ihab Ilyas is a Professor in the Cheriton School of Computer Science and the NSERC-Thomson Reuters Research Chair on Data Quality. His research focuses on big data and database systems, with special interest in data quality and integration, managing uncertain data, machine learning for data curation, and information extraction.

Professor Ilyas and his cadre of students were the first to define the problem of ranking uncertain data, where the record membership, the score values or both are uncertain. This work launched a new line of research in the database community to better understand the interplay between data uncertainty and ranking requirements by users.

“I’m honoured to receive ACM’s recognition as a Fellow and thank the Cheriton School of Computer Science for supporting my nomination,” Professor Ilyas said. “This recognition would not be possible without the diligent work of my awesome graduate students over the years as well as that of my talented colleagues and collaborators. I am grateful to all of them.”

Since 2009, Professor Ilyas has focused on data quality and the technical challenges in building data-cleaning systems. His group introduced novel practical algorithms and system prototypes. This work circumvents the limitations of previous data-cleaning solutions that either narrowly focused on single types of data errors or simply ignored many real-life considerations that prevented their adoption.

He led the construction of NADEEF, the first extensible, open-source data-cleaning system that allows users to declare a heterogeneous set of integrity constraints to a backend that holistically detects data violations and suggests repairs for them. His group also introduced FastDC, a state-of-the-art algorithm for mining denial constraints from dirty data; denial constraints capture most business rules in practice. With his colleagues, Theodoros Rekatsinas and Christopher Ré, and his former PhD student, Xu Chu, he introduced HoloClean, an open-source statistical inference engine to impute, clean and enrich data. HoloClean compiles both statistical signals and declarative constraints as learning features, and has proved superior to all previous data repair proposals. The work has been commercialized and deployed in multiple large enterprises, strong proof of its real-world impact.

Professor Ilyas has published his contributions in leading journals, including ACM Transactions on Database Systems, VLDB Journal, and the Proceedings of the VLDB Endowment, and at top database conferences, including ACM SIGMOD, VLDB, and the IEEE International Conference on Data Engineering. He coauthored Data Cleaning, an ACM book published in July 2019 that serves as a reference for researchers and practitioners interested in data quality and data cleaning. He has also coauthored several influential surveys: “A survey of top-k query processing techniques in relational database systems,” “Probabilistic ranking techniques in relational databases: synthesis lectures on data management,” and “Trends in cleaning relational data: consistency and deduplication.”

Professor Ilyas is an elected member of the VLDB Endowment Board of Trustees and an elected SIGMOD Vice Chair.

Professor Ilyas co-founded two companies based on his research: Inductiv, a Waterloo-based start-up, now part of Apple, that uses AI for structured data cleaning, and Tamr, a start-up focusing on large-scale data integration and cleaning.

Over his career, Professor Ilyas has been recognized with multiple awards. He won a Government of Ontario Early Researcher Award in 2008, was an IBM CAS Fellow from 2006 to 2010, held a Cheriton Faculty Fellowship at the University of Waterloo from 2013 to 2016, received the Google Faculty Award in 2014, and was named an ACM Distinguished Scientist in 2014. Since 2018, he has held the Thomson Reuters-NSERC Industrial Research Chair in Data Cleaning. In 2020, he was named a Faculty Affiliate at the Vector Institute.