Please note: This master’s thesis presentation will take place online.
Suraj Singh, Master’s candidate
David R. Cheriton School of Computer Science
Supervisors: Professors Bernard Wong, Khuzaima Daudjee
Serverless computing has become an increasingly popular paradigm for building cloud applications. There has been a recent trend of building stateful applications on top of serverless platforms in the form of workflows composed of individual functions. As functions are short-lived and state is not recoverable across function invocations, these applications typically store state that is used between functions in an external storage system. Such storage systems should enforce concurrency control, as different workflow instances may update overlapping state simultaneously. However, existing concurrency control algorithms typically incur significant latency due to locking or read/write set validation. This is undesirable, since latency is an important performance metric for workflow applications as each stage is executed sequentially. Furthermore, they can abort transactions in a manner that is oblivious to application preferences.
In this thesis, we present Arbor, a sharded dependency-tracking storage system designed for low-latency interactions with serverless workflows while ensuring serializability. Arbor introduces a two-round commit model where submitted client transactions are organized in a dependency graph.
Transactions are then processed in batches, off the critical path of client execution, allowing clients to continue executing without having to wait for Arbor to validate each transaction. As Arbor processes transactions, it organizes them into a tree where each branch is a serialized execution and conflicts result in new branches being created. It then commits one branch from this tree and prunes the rest. To minimize re-executions, Arbor chooses the longest branch by default, but application developers can implement their own policies. Pruning branches is simple with Arbor, since it can re-execute the corresponding transactions by invoking the respective functions from the serverless platform. Furthermore, Arbor is designed to be scalable. Data is partitioned by key, but the metadata of its dependency graph is replicated.
This design allows single-shard transactions in each batch to be processed independently, while multi-shard transactions are replicated and processed by each shard. Our evaluation on a cluster of machines shows that Arbor’s two-round commit model reduces transaction execution latency by a median value of 1.45x when compared to a system that uses OCC and commits transactions synchronously.