PhD Seminar - Big data cleaning
Speaker: Xu Chu
Abstract: Data quality is one of the most important problems in data management and data science, since dirty data often leads to inaccurate data analytics results and wrong business decisions.
Amira Ghenai, PhD candidate
David R. Cheriton School of Computer Science
People regularly use web search engines to investigate the efficacy of medical treatments. Search results can contain documents with incorrect information that contradicts the currently established medical understanding of whether a treatment helps with a health issue. If people are influenced by the incorrect information found in search results, they can make harmful decisions about the appropriate treatment.
Kareem El Gebaly, PhD candidate
David R. Cheriton School of Computer Science
Analyzing relational data typically involves tasks that help the analyst gain familiarity with or insight into the data and draw findings or conclusions from it. This process is usually carried out by data experts (data scientists) who share their output with a potentially less expert audience (everyone).
Brad Glasbergen, PhD candidate
David R. Cheriton School of Computer Science
Michael Abebe, PhD candidate
David R. Cheriton School of Computer Science
Xi He, PhD candidate
Computer Science Department, Duke University
Jennifer Widom
Frederick Emmons Terman Dean, School of Engineering
Fletcher Jones Professor, Computer Science and Electrical Engineering
Stanford University
Babar Naveed Memon, Master’s candidate
David R. Cheriton School of Computer Science
Remote Direct Memory Access (RDMA) can be used to implement a shared-storage abstraction or a shared-nothing abstraction for distributed applications. We argue that the shared-storage abstraction is overkill for loosely coupled applications and that the shared-nothing abstraction does not leverage all the benefits of RDMA.
Anil Pacaci, PhD candidate
David R. Cheriton School of Computer Science
Haotian Zhang, PhD candidate
David R. Cheriton School of Computer Science