Events

Panos K. Chrysanthis, University of Pittsburgh

Abstract: Online analytics, in most advanced scientific, business, and defense applications, rely heavily on the efficient execution of large numbers of Aggregate Continuous Queries (ACQs). ACQs continuously aggregate streaming data and periodically produce results such as max or average over a given window of the latest data. It has been shown that ACQ processing benefits from incremental evaluation, which stores and reuses calculations performed over the unchanged parts of the window rather than re-evaluating the entire window after each update.
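As a rough illustration of the incremental evaluation idea described in the abstract, the sketch below keeps one partial aggregate per slide and combines those partials to produce each window result, instead of rescanning the raw items. It is a minimal sketch of one common pane-style approach, not the algorithms presented in the talk. The function name `incremental_average`, its parameters, and the count-based window are assumptions made for the example.

```python
from collections import deque


def incremental_average(stream, range_size, slide):
    """Yield the average of the latest `range_size` items, once per `slide` items.

    Hypothetical helper for illustration: window sizes are counted in items,
    and `range_size` is assumed to be a multiple of `slide`.
    """
    if slide <= 0 or range_size % slide != 0:
        raise ValueError("this sketch assumes range_size is a positive multiple of slide")

    # One (sum, count) partial per slide; the deque drops the oldest partial
    # automatically once the window is full.
    partials = deque(maxlen=range_size // slide)
    pane_sum, pane_count = 0.0, 0

    for value in stream:
        pane_sum += value
        pane_count += 1
        if pane_count == slide:
            partials.append((pane_sum, pane_count))
            pane_sum, pane_count = 0.0, 0
            if len(partials) == partials.maxlen:
                # Reuse the stored partials rather than re-reading raw items.
                total = sum(s for s, _ in partials)
                count = sum(c for _, c in partials)
                yield total / count


results = list(incremental_average([1, 2, 3, 4, 5, 6], range_size=4, slide=2))
print(results)
```

With a range of 4 and a slide of 2, each window result combines two stored partials, so every input item is aggregated only once even though it contributes to two overlapping windows.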

Wednesday, December 12, 2018 12:15 pm - 12:15 pm EST (GMT -05:00)

PhD Seminar • GAL: Graph-Aware Layout for Disk-Resident Graph Databases

Zeynep Korkmaz, PhD candidate
David R. Cheriton School of Computer Science

Analysis on graphs has a powerful impact on solving many social and scientific problems, and applications often perform expensive traversals on large-scale graphs. Caching approaches on top of persistent storage are among the classical solutions to handle high request throughput. However, graph processing applications have poor access locality, and caching algorithms do not improve disk I/O sufficiently. We present GAL, a graph-aware layout for disk-resident graph databases that generates a storage layout for large-scale graphs on disk with the objective of increasing locality of disk blocks and reducing the number of I/O operations for transactional workloads.

Thursday, December 13, 2018 9:00 am - 9:00 am EST (GMT -05:00)

PhD Seminar • Dynamic Sampling used in TREC Core 2018

Haotian Zhang, PhD candidate
David R. Cheriton School of Computer Science

Dynamic sampling (DS) was applied to create a sampled set of relevance judgments in our participation in the TREC Common Core Track 2018. One goal was to test the effectiveness and efficiency of this technique with a set of non-expert, secondary relevance assessors. We consider NIST assessors to be the experts and the primary assessors. Another goal was to make available to other researchers a sampled set of relevance judgments (prels) and thus allow the estimation of retrieval metrics that have the potential to be more robust than the standard NIST-provided relevance judgments (qrels). In addition to creating the prels, we also submitted several runs based on our manual judging and the models produced by our HiCAL system.

Wednesday, February 13, 2019 12:15 pm - 12:15 pm EST (GMT -05:00)

PhD Seminar • Accuracy-Aware Differentially Private Data Exploration

Chang Ge, PhD candidate
David R. Cheriton School of Computer Science

Organizations are increasingly interested in allowing external data scientists to explore their sensitive datasets. Due to the popularity of differential privacy, data owners want the data exploration to ensure provable privacy guarantees. However, current systems for differentially private query answering place an inordinate burden on data analysts to understand differential privacy, manage their privacy budget, and even implement new algorithms for noisy query answering. Moreover, current systems do not provide any guarantees to the data analyst on the quantity they care about, namely the accuracy of query answers.

Wednesday, February 27, 2019 12:15 pm - 12:15 pm EST (GMT -05:00)

PhD Seminar • DimmStore: Tackling Memory Power Footprint of Database Systems

Alexey Karyakin, PhD candidate
David R. Cheriton School of Computer Science

Energy consumed by the main memory in existing database systems does not effectively scale down with lower system utilization, both in terms of actual memory usage and load conditions. At the same time, main memory represents a sizable portion of the total server energy footprint, which makes it an outlier as the rest of the system moves towards energy proportionality.

We introduce DimmStore, a prototype main-memory database system that addresses the problem of memory energy consumption.

Haotian Zhang, PhD candidate
David R. Cheriton School of Computer Science

James She, Department of Electronic and Computer Engineering
Hong Kong University of Science and Technology

Wednesday, May 29, 2019 12:15 pm - 12:15 pm EDT (GMT -04:00)

PhD Seminar • Predictable and Consistent Information Extraction

Besat Kassaie, PhD candidate
David R. Cheriton School of Computer Science

Wednesday, June 12, 2019 12:15 pm - 12:15 pm EDT (GMT -04:00)

PhD Seminar • HoloDetect: Few-Shot Learning for Error Detection

Alireza Heidari, PhD candidate
David R. Cheriton School of Computer Science

We introduce a few-shot learning framework for error detection. We show that data augmentation (a form of weak supervision) is key to training high-quality, ML-based error detection models that require minimal human involvement.

DSG Seminar Series • Algorithms and Optimizations for Incremental Window-Based Aggregations

PhD Seminar • GAL: Graph-Aware Layout for Disk-Resident Graph Databases

PhD Seminar • Dynamic Sampling used in TREC Core 2018

PhD Seminar • Accuracy-Aware Differentially Private Data Exploration

PhD Seminar • DimmStore: Tackling Memory Power Footprint of Database Systems

PhD Seminar • Sampling Strategies and Active Learning for Volume Estimation

PhD Defence • Increasing the Efficiency of High-Recall Information Retrieval

Seminar • Data Science and Social Computing for Emerging Social Media and Multimedia Applications

PhD Seminar • Predictable and Consistent Information Extraction

PhD Seminar • HoloDetect: Few-Shot Learning for Error Detection