Cheriton School of Computer Science Professor Ihab Ilyas has been named a Fellow of the Institute of Electrical and Electronics Engineers for his contributions to data integration, data cleaning and rank-aware query processing.
IEEE Fellowships are a prestigious professional recognition and an important career achievement. Fellow is the highest grade of IEEE membership and is conferred on those with an outstanding record of accomplishments. Each year, the total number of IEEE members recognized as Fellows does not exceed one-tenth of one per cent of the Institute’s total voting membership.
“Congratulations to Ihab on being named an IEEE Fellow,” said Raouf Boutaba, Professor and Director of the Cheriton School of Computer Science. “Ihab is well known to computer scientists in both academia and industry because of his contributions to scalable automatic error detection, to data cleaning, and to imputation of dirty structured data. He has pioneered scalable automatic data cleaning through the systems he, his students and collaborators have developed that use state-of-the-art machine learning models.”
Professor Ilyas is the seventh faculty member at the Cheriton School of Computer Science to receive the prestigious recognition of IEEE Fellow, following Professors N. Asokan, Raouf Boutaba, J. Alan George, Ming Li, M. Tamer Özsu, and Srinivasan Keshav, who is an adjunct at the Cheriton School of Computer Science and is now at Cambridge.
IEEE is the world’s largest technical professional organization dedicated to advancing technology for the benefit of humanity. IEEE and its members inspire a global community through its highly cited publications, conferences, technology standards, and professional and educational activities.
More on Professor Ilyas’s contributions
Scalable automatic error detection, cleaning and imputation of dirty structured data
Professor Ilyas has addressed multiple technical challenges in scalable automatic error detection, cleaning and imputation of dirty structured data. His research presents data errors as a noisy channel with a probabilistic model to generate original clean data, and a probabilistic realization model that pollutes that data. Several key results on mining the constraints and on the learnability of these machine-learning models’ parameters using only the observed dirty data have helped to create pragmatic and scalable solutions. These machine-learning solutions incorporated modern techniques such as self-supervision, data augmentation, embedding, and schema-level attention mechanisms to build learnable complex error detection and repair models. Key insights include how violations of business rules and integrity constraints can be incorporated into these machine-learning models, which has allowed decades of logical cleaning research to be incorporated in modern and scalable techniques.
Professor Ilyas’s research in data cleaning has been recognized by academia and industry alike. He and his former PhD student Xu Chu, now faculty at Georgia Tech, coauthored Data Cleaning, among the most downloaded books in the ACM Books series. He has given many invited keynote addresses and presentations at top institutions and venues. His start-up Tamr, among the top companies in data integration and preparation, has raised more than $70 million and serves dozens of Fortune 500 companies. And Inductiv, Professor Ilyas’s start-up that uses machine learning to automate the task of identifying and correcting errors in data, was acquired by Apple Inc. in 2020. Inductiv’s technology is based on HoloClean, a next generation of machine-learning techniques to clean data that began in 2017 as a collaborative academic project led by Professor Ilyas and his colleagues Professors Theodoros Rekatsinas at the University of Wisconsin-Madison and Christopher Ré at Stanford University.
Rank-aware query processing
Professor Ilyas has also integrated rank-aware querying into database technologies to allow effective retrieval in large data sets such as those in multimedia and video databases. He developed algorithms and techniques that markedly changed how database systems handle ranking and user preferences in processing queries.
His rank-join algorithm has been the state-of-the-art physical query operator for producing query answers ranked by user preference without computing the entire answer set. He introduced RankSQL, the first end-to-end rank-aware query engine, based on novel ranked relational algebra semantics and built on top of PostgreSQL. In addition, he was the first to introduce the problem of ranking uncertain data and provided the first meaningful semantics for the interplay between uncertainty and score-based ranking. He and his research group were the first to define the problem of ranking where either the record membership or the score values, or both, are uncertain. Professor Ilyas has also addressed uncertainty in the ranking function itself. His many papers in this area have defined a new line of research in the database community and have provided valuable insight and several practical semantics for producing the most probable top-k records with respect to user preferences.
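For readers curious how a join can return its best answers without computing the whole answer set, the following is a minimal illustrative sketch of the general rank-join idea. It is not Professor Ilyas's implementation or the RankSQL operator. It assumes two inputs already sorted by score in descending order, an equality join on a key, and a combined score that is the sum of the two input scores. The function name `rank_join_top_k` and the sample data are hypothetical.

```python
import heapq
from collections import defaultdict


def rank_join_top_k(left, right, k):
    """Illustrative top-k rank join (assumed setup, not a published operator).

    left, right: lists of (key, score), each sorted by score descending.
    Returns (top results as (key, combined_score), number of tuples read).
    """
    if not left or not right:
        return [], 0
    seen_l, seen_r = defaultdict(list), defaultdict(list)
    heap = []      # max-heap of join results, stored as (-score, key)
    results = []
    i = j = 0      # how many tuples have been read from each input
    while len(results) < k:
        left_done, right_done = i >= len(left), j >= len(right)
        if left_done and right_done:
            threshold = float("-inf")  # nothing unseen remains
        else:
            # Upper bound on the score of any join result not yet produced:
            # it must use an unseen tuple from at least one input.
            last_l = left[i - 1][1] if i > 0 else left[0][1]
            last_r = right[j - 1][1] if j > 0 else right[0][1]
            threshold = max(left[0][1] + last_r, last_l + right[0][1])
        # A buffered result is safe to emit once no unseen result can beat it.
        while heap and len(results) < k and -heap[0][0] >= threshold:
            neg_score, key = heapq.heappop(heap)
            results.append((key, -neg_score))
        if len(results) >= k or (left_done and right_done):
            break
        # Read one more tuple, alternating between the two inputs.
        if right_done or (not left_done and i <= j):
            key, score = left[i]
            i += 1
            seen_l[key].append(score)
            partners = seen_r[key]
        else:
            key, score = right[j]
            j += 1
            seen_r[key].append(score)
            partners = seen_l[key]
        for other in partners:
            heapq.heappush(heap, (-(score + other), key))
    return results, i + j


# Hypothetical sample data: integer scores, sorted descending.
left = [("a", 90), ("b", 80), ("c", 30), ("d", 10)]
right = [("b", 95), ("a", 70), ("d", 20), ("c", 10)]
top, reads = rank_join_top_k(left, right, 2)
print(top, reads)  # [('b', 175), ('a', 160)] after reading 5 of the 8 tuples
```

The early stop comes from the threshold: once the best buffered result scores at least as high as anything that could still be formed from unread tuples, it can be returned immediately.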