Current students

As 24/7 availability becomes increasingly important for modern applications, database systems are frequently replicated in order to stay up and running in the face of database server failure. It is no longer acceptable for an application to wait for a database to recover from a log on disk --- most mission-critical applications need immediate failover to a replica.

(Editor’s note: I was unaware that Kyle Kingsbury was doing a linearizability analysis of Hazelcast when I was writing this post. Kyle’s analysis resulted in Greg Luck, Hazelcast’s CEO, to write a blog post where he cited the PACELC theorem, and came to some of the same conclusions that I came to in writing this post.

Apache Parquet and Apache ORC have become a popular file formats for storing data in the Hadoop ecosystem. Their primary value proposition revolves around their “columnar data representation format”. To quickly explain what this means: many people model their data in a set of two dimensional tables where each row corresponds to an entity, and each column an attribute about that entity. However, storage is one-dimensional --- you can only read data sequentially from memory or disk in one dimension.

In my previous blog post, I discussed the relatively new Apache Arrow project, and compared it with two similar column-oriented storage formats in ORC and Parquet. In particular, I explained how storage formats targeted for main memory have fundamental differences from storage formats targeted for disk-resident data.

April 20th, 2018 marked the end of our first iteration of DAWNBench, the first deep learning benchmark and competition that measures end-to-end performance: the time/cost required to achieve a state-of-the-art accuracy level for common deep learning tasks, as well as the latency/cost of inference at this state-of-the-art accuracy level.

Professor Tamer Özsu of the Data Systems Group, along with Professor Ed Lank of the Human-Computer Interation Group, have been named 2018 David R. Cheriton Faculty Fellows, positions they will hold until 2021. These prestigious three-year fellowships support the work of leading faculty members in the School of Computer Science and are made possible through the David R. Cheriton Endowment for Excellence in Computer Science.