The Data Systems Seminar Series provides a forum for presenting and discussing key issues in data systems, both current and emerging. It complements our internal meetings by welcoming insights from external colleagues.
The schedule for the 2024–25 academic year is outlined below and will be updated as additional speakers are confirmed.
Seminars are typically held on Mondays at 10:30 a.m. in room DC 1304, unless otherwise noted. Because of ongoing COVID-19 precautions, some sessions may be held virtually via Zoom; these will be clearly marked.
The talks are open to the public.
We will post the presentation videos whenever possible. Past DSG Seminar videos can be found on the DSG YouTube channel.
The Data Systems Seminar Series is supported by
Kris Shrishak |
Arijit Khan |
Tilmann Rabl |
Miao Qiao |
Philip Bernstein |
Weiran Liu |
Jeff Dalton |
26 September 2024; 2pm (Note special time)
Title: | Privacy and PETs: An Interaction with Human Rights Law |
Speaker: | Kris Shrishak, Enforce |
Abstract: | Privacy enhancing technologies (PETs) have been researched and promoted for the past few decades. Amidst greater public awareness of personal data collection and application of data protection regulations, the number of implementations of PETs has increased in the past few years. Given the promise and expectation of PETs to protect people’s privacy and the hope of researchers to see use-cases of PETs, what is the reality of PETs in today’s world? Are the privacy needs of people being met? This talk will take you through a journey of PETs, visiting data protection law and international human rights law along the way. |
Bio: | Dr. Kris Shrishak is a public interest technologist and a Senior Fellow at Enforce. He advises legislators on emerging technologies and global AI governance (including the EU AI Act). He is regularly invited to speak at the European Parliament and has testified at the Irish Parliament. His work focuses on privacy tech, anti-surveillance, emerging technologies, and algorithmic decision making. His expert commentary appears in The New York Times, The Washington Post, the BBC, the LA Times, Süddeutsche Zeitung, Politico, The Irish Times and other leading media. He has been interviewed on TV and radio, including on CNN, the BBC, Euronews and France24. He has written for Bulletin of Atomic Scientists, Nikkei Asia and Euronews, among others. He works on the kind of cryptography that allows computing on encrypted data and proving the existence of information without revealing it. These technologies, broadly known as privacy enhancing technologies (PETs), could be beneficial. However, there are risks that have not been sufficiently researched. Previously, Kris was a researcher at Technical University Darmstadt in Germany, where he worked on applied cryptography, PETs and Internet infrastructure security. |
7 October 2024; 10:30
Title: | User-friendly Explanations for Graph Neural Networks |
Speaker: | Arijit Khan, Aalborg University |
Abstract: | Graph data, e.g., social and biological networks, financial transactions, and knowledge graphs, are pervasive in the natural world, where nodes are entities with features, and edges denote relations among them. Machine learning and, more recently, graph neural networks (GNNs) have become ubiquitous, e.g., in cheminformatics, bioinformatics, fraud detection, question answering, and recommendation. However, GNNs are “black boxes”: it remains a desirable yet nontrivial task to explain the results of high-quality GNNs for domain experts. In this talk, I shall introduce our ongoing work on how data management techniques can assist in generating user-friendly, configurable, queryable, and robust explanations for graph neural networks. |
Bio: | Arijit Khan is an Associate Professor at Aalborg University, Denmark. His PhD is from the University of California, Santa Barbara, USA, and he did a post-doc in the Systems group at ETH Zurich, Switzerland. He has been an assistant professor in the School of Computer Science and Engineering, Nanyang Technological University, Singapore. His research is on data management and machine learning for the emerging problems in large graphs. He is an IEEE senior member and an ACM distinguished speaker. Arijit is the recipient of the IBM Ph.D. Fellowship (2012-13), a VLDB Distinguished Reviewer award (2022), and a SIGMOD Distinguished PC award (2024). He is the author of a book on uncertain graphs and over 80 publications in top venues including ACM SIGMOD, VLDB, IEEE TKDE, IEEE ICDE, SIAM SDM, USENIX ATC, EDBT, The Web Conference (WWW), ACM WSDM, ACM CIKM, ACM TKDD, and ACM SIGMOD Record. Dr. Khan is serving as an associate editor of IEEE TKDE 2019-2024 and ACM TKDD 2023-now, proceedings chair of EDBT 2020, IEEE ICDE TKDE poster track co-chair 2023, ACM CIKM short paper track co-chair 2024, and IEEE ICDE demonstration paper track program co-chair 2025. |
21 October 2024; 10:30 (Note special location: DC 1302)
Title: | Efficiency in Data Systems |
Speaker: | Tilmann Rabl, University of Potsdam |
Abstract: | For the longest time, acquiring new hardware resulted in significant software efficiency gains due to exponential improvements in hardware capabilities. Physical limits in hardware manufacturing have brought former niche designs into standard components, such as multiple cores and specialized circuits. Even with these new designs, hardware improvements are decreasing, while software and applications are still becoming increasingly complex and resource demanding. In this talk, we will discuss the efficiency of data systems. We will start with a general discussion of system efficiency and look at the design of efficient architectures. Incorporating estimations of hardware and power production carbon intensity, we will then discuss hardware replacement frequencies, try to establish new rules of thumb on the ideal hardware lifecycles for database deployments, and discuss implications for database development. |
Bio: | Tilmann Rabl is a professor of Data Engineering Systems at the Digital Engineering Faculty of the University of Potsdam and the Hasso Plattner Institute. His research focuses on efficiency of database and ML systems, real-time analytics, hardware-efficient data processing, and benchmarking. |
28 October 2024; 10:30
Title: | Scalable Query Processing with Graphs |
Speaker: | Miao Qiao, University of Auckland |
Abstract: | Graph-based query processing faces scalability challenges. This talk explores two facets of the problem. First, when a graph grows too large for efficient querying, can query processing algorithms exhibit strongly local properties, making the search independent of the graph’s overall size? Second, in approximate nearest neighbor search using indexes of Hierarchical Navigable Small World (HNSW) graphs, how can we compress the index while maintaining equivalent query performance, when queries include attribute-based filters? To address the first question, we examine cases where dense subgraph search admits strongly local algorithms and where it does not. For the second, we present a novel compression method that transforms the n^2 HNSW graphs into a more compact structure called the 2D segment graph, enabling lossless compression while preserving query efficiency. Theory plays a central role in both solutions, shaping the performance and feasibility of scalable graph-based querying. |
Bio: | Dr. Qiao is a Senior Lecturer in Computer Science at the University of Auckland, New Zealand, a role equivalent to Associate Professor in tenure-track systems. Her research centers on big data management, with a focus on query optimization, indexing, joins, sampling, graph analysis, and graph-based nearest neighbor search. She has advanced indexing techniques for query processing in graph databases, including shortest distance and subgraph matching queries. Her recent work on range-filtering nearest neighbor search, along with an ongoing submission on its dynamic variant, has potential applications in modern vector databases, particularly for unstructured queries. |
13 December 2024; 11:00 (Note the unusual day and time)
Title: | DDS: DPU-optimized Disaggregated Storage |
Speaker: | Philip Bernstein, Microsoft Research |
Abstract: | A DPU is a network interface card (NIC) with programmable compute and memory resources. It sits on the system bus, PCIe, which is the fastest path to access SSDs, and it directly connects to the network. It therefore can process storage requests as soon as they arrive at the NIC, rather than passing them through to the host. DPUs are widely deployed in public clouds and will soon be ubiquitous. In this talk, we’ll describe DPU-Optimized Disaggregated Storage (DDS), our software platform for offloading storage operations from a host storage server to a DPU. It reduces the cost and improves the performance of supporting a database service. DDS heavily uses DMA, zero-copy, and userspace I/O to minimize overhead and thereby improve throughput. It also introduces an offload engine that can directly execute storage requests on the DPU. For example, it can offload GetPage@LSN to the DPU of an Azure SQL Hyperscale page server. This removes all host CPU consumption (saving up to 17 cores), reduces latency by 70%, and increases throughput by 75%. This is joint work with Qizhen Zhang, Badrish Chandramouli, Jason Hu, and Yiming Zheng. |
Bio: | Philip A. Bernstein is a Distinguished Scientist in the Data Systems Group in Microsoft Research. He has published over 200 papers and two books on the theory and implementation of database systems, especially on transaction processing and data integration, and has contributed to many database products. He is a Fellow of the ACM and AAAS, a winner of the E.F. Codd SIGMOD Innovations Award, and a member of the Washington State Academy of Sciences and the National Academy of Engineering. He received a B.S. degree from Cornell and M.Sc. and Ph.D. degrees from the University of Toronto. |
29 January 2025; 12:00 (Note special time)
Title: | |
Speaker: | Weiran Liu, Alibaba Group |
Abstract: | |
Bio: |
14 April 2025; 10:30
Title: | |
Speaker: | |
Abstract: | |
Bio: |
12 May 2025; 10:30
Title: | |
Speaker: | |
Abstract: | |
Bio: |
9 June 2025; 10:30
Title: | |
Speaker: | Jeff Dalton, University of Edinburgh |
Abstract: | |
Bio: |