Data Systems Seminar Series (2021-2022)

The Data Systems Seminar Series provides a forum for presentation and discussion of interesting and current data systems issues. It complements our internal data systems meetings by bringing in external colleagues. The talks that are scheduled for this year are listed below.

The talks are usually held on a Monday at 10:30 am in room DC 1302. Exceptions are flagged. Due to COVID-19, all talks will be held virtually over Zoom until further notice.

The talks are open to the public. Please register here.

We will try to post the presentation notes whenever possible. Please click on the presentation title to access these notes.

The Data Systems Seminar Series is supported by

Yanshuai Cao
Goetz Graefe
Thomas Neumann
Natacha Crooks
David Doermann
Vasiliki Kalavri
Oana Balmau

20 September 2021; 10:30AM

Title: Cross-Domain Text-to-SQL Semantic Parsing
Speaker: Yanshuai Cao, Borealis AI
Abstract: Large-scale pre-training has enabled many NLP applications via transfer learning. However, many studies have shown that current deep learning models often rely on superficial cues and dataset biases to achieve seemingly high performance on a given dataset without proper understanding. This talk will discuss the challenges of cross-domain text-to-SQL semantic parsing and how it can be a test-bed for learning to reason in the real world. I will review recent advances in this field, including some of our work tackling the scarce data aspect of this problem. In particular, I will discuss how models encode prior knowledge about this problem's structures; how to train deep transformers on small datasets; and how to perform data augmentation when minor changes could alter the semantics. I will also showcase Turing, the natural language database interface demo built from our cross-domain text-to-SQL semantic parser.
Bio: Yanshuai Cao is a Senior Research Lead at Borealis AI, conducting R&D and building products for RBC. His research spans natural language processing, generative models, and adversarial machine learning. Yanshuai received his Ph.D. from the University of Toronto under the supervision of David J. Fleet and Aaron Hertzmann.

18 October 2021; 10:30AM

Title: Recent Advances in Transactional Concurrency Control
Speaker: Goetz Graefe, Google, Inc.
Abstract: False conflicts have given locking and serializability a reputation for poor concurrency, poor scalability, and poor system performance. Causes include unnecessarily coarse lock scopes, excessive lock durations, and simplistic lock modes. This talk surveys three published techniques that aim to address these false conflicts.
Bio: Goetz Graefe used to be a professor in Portland, OR and Boulder, CO. He served as a software architect in Microsoft's SQL Server product and as an HP Fellow at Hewlett Packard Enterprise. He has been with Google for the last five years. He wrote the Cascades query optimization framework and was awarded the 2017 ACM SIGMOD Edgar F. Codd Innovations Award. He is interested in database query optimization, query execution, indexing, stream indexing, transactions, concurrency control, logging, recovery, and availability.

15 November 2021; 10:30AM 

Title: TBD
Speaker: Thomas Neumann, Technische Universität München
Abstract: TBD
Bio: TBD

13 December 2021; 10:30AM


Title: TBD
Speaker: Natacha Crooks, University of California, Berkeley
Abstract: TBD
Bio: TBD

10 January 2022; 10:30AM


Title: TBD
Speaker: David Doermann, University at Buffalo
Abstract: TBD
Bio: TBD

25 April 2022; 10:30AM

Title: TBD
Speaker: Vasiliki Kalavri, Boston University
Abstract: TBD
Bio: TBD

30 May 2022; 10:30AM

Title: TBD
Speaker: Oana Balmau, McGill University


Bio: TBD

20 June 2022; 10:30AM 

Title: TBD
Speaker: TBD
Abstract: TBD
Bio: TBD