Please note: This master’s thesis presentation will be given online.
Omar Attia, Master’s candidate
David R. Cheriton School of Computer Science
Supervisor: Professor Ihab Ilyas
Machine learning data repair systems (e.g., HoloClean) have achieved state-of-the-art performance for the data repair problem on many datasets. However, these systems still face significant challenges when applied to sparse datasets.
In this work, we study the challenges that such datasets pose for machine learning data repair systems. We suggest dataset-independent methods to mitigate the effects of data sparseness. Finally, we present our results on a large, sparse real-world dataset: Census.
To join this master’s thesis presentation on Zoom, please go to https://us04web.zoom.us/j/9515296655?pwd=c2NOYTUzS3I3QU1GQlRndmN3dXNJQT09.