MMath Thesis Presentation • Cardinality Estimation in Streaming Graph Data Management Systems

Friday, February 16, 2024 11:00 am - 11:00 am EST (GMT -05:00)

Kerem Akillioglu, MMath candidate
David R. Cheriton School of Computer Science

Graph processing has become an increasingly popular paradigm for data management systems. At the same time, there is strong demand for specialized stream processing systems that can handle the continual flow and inherent dynamism of streaming data. Yet existing technologies lack a standardized, general-purpose query framework designed specifically for streaming graphs. This gap underscores the need for a more comprehensive solution for processing and analyzing streaming graph data efficiently in real time. Building such a solution depends critically on improving the query processing pipeline, in particular cardinality estimation and query optimization, both of which are key to system performance.

This thesis proposes GraphSketch, a novel cardinality estimation technique tailored for streaming graph database management systems (GDBMS). GraphSketch is a sketch-based framework designed to concisely summarize streaming graphs, enabling both accurate and efficient cardinality estimation. The thesis presents the theoretical foundations of GraphSketch, outlining its conceptual design and the specific methods used in its construction. It also discusses the suitability of GraphSketch for streaming systems, highlighting its support for incremental updates, which are pivotal to maintaining efficiency in the rapidly evolving environment of streaming data.
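
For readers unfamiliar with sketch-based summaries, the general idea of incremental updates can be illustrated with a minimal example. The code below is not GraphSketch: the abstract does not describe its internals, so a standard Count-Min sketch is used as a stand-in, and every name in it (CountMinSketch, update, estimate, the edge labels) is a hypothetical choice made for illustration. It assumes a stream of labelled edges (source, label, target) and estimates how many edges currently carry a given label, which is the kind of quantity a query optimizer might ask a cardinality estimator for.

```python
import hashlib


class CountMinSketch:
    """Standard Count-Min sketch, used here only as an illustrative stand-in."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        # One row of counters per hash function.
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # Derive an independent-looking hash per row by salting with the row number.
        digest = hashlib.blake2b(
            key.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def update(self, key, delta=1):
        # Incremental update: touches one counter per row, independent of stream size.
        for row in range(self.depth):
            self.table[row][self._index(row, key)] += delta

    def estimate(self, key):
        # Collisions can only inflate a counter, so the minimum over rows is an
        # upper bound on the true count (as long as true counts stay non-negative).
        return min(
            self.table[row][self._index(row, key)] for row in range(self.depth)
        )


# Hypothetical stream of labelled edges: (source, label, target).
stream = [
    ("a", "follows", "b"),
    ("b", "follows", "c"),
    ("a", "likes", "p1"),
    ("c", "follows", "a"),
]

sketch = CountMinSketch()
for _src, label, _dst in stream:
    sketch.update(label)           # edge arrives

follows_before = sketch.estimate("follows")

sketch.update("follows", -1)       # an edge expires, e.g. it leaves the window
follows_after = sketch.estimate("follows")
```

Each arriving or expiring edge changes only a fixed number of counters, so the summary stays current without being rebuilt from the stream. The estimates never fall below the true counts (3 and then 2 for the label "follows" here), and the overestimate shrinks as width grows.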