Please note: This PhD seminar will take place online.
Siddhartha Sahu, PhD candidate
David R. Cheriton School of Computer Science
Supervisor: Professor Semih Salihoğlu
Differential computation (DC) is a recent general technique for sharing and maintaining computation across evolving datasets for arbitrary dataflow computations, including those that contain nested recursive loops. Its generality makes it suitable for a wide range of applications, including graph analytics, which often performs iterative computations. On the flip side, such a general technique is oblivious to application-specific optimizations.
In this talk, we’ll describe how DC can be used to build large-scale data systems that benefit from computation sharing when datasets can be modeled as evolving snapshots. We’ll then compare DC with other similar systems and techniques for data processing, and present the advantages and tradeoffs. Specifically, we will describe how application-specific implementations can often result in better performance.
The Rust programming language is seeing rapid adoption as a replacement for traditional languages such as C, C++, and Java, due to its strong memory- and type-safety guarantees. Differential Dataflow, the reference implementation of DC that we used in our research, is implemented in Rust. In the second part of the talk, we’ll discuss our experience using Rust for data systems research and why it is an attractive programming language worth considering for new research projects.