Paul M. Wythes '55, P'86 and Marcia R. Wythes P'86 Professor
Department of Computer Science, Princeton University
Machine learning (ML) is transforming how systems are designed. Numerous recent research results show that applying ML to systems problems can outperform the previous state of the art. This talk will share our experience with designing learned cache systems.
A cache is one of the most important and ubiquitous components in systems and networked infrastructure: a small, fast store that serves frequently accessed data in front of a large, inexpensive storage system. If the cache achieves a low miss ratio, the combination performs close to a large, fast memory at a cost similar to that of the inexpensive storage. Previous state-of-the-art (SOTA) cache replacement algorithms are based on heuristics; they typically work well for some access patterns and poorly for others. Even after five decades of extensive study, there is a large gap between the miss ratios of the SOTA algorithms and that of an offline optimal algorithm such as Belady's MIN.
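As a reference point, Belady's MIN can be sketched in a few lines. This is a generic illustration of the offline-optimal policy mentioned above, not code from the systems described in the talk; it assumes the full future trace is known, which is exactly why MIN is usable only as an offline bound:

```python
def belady_min_misses(trace, capacity):
    """Simulate Belady's MIN: on eviction, remove the cached object
    whose next access lies furthest in the future (offline optimal)."""
    INF = float('inf')
    # Precompute, for each position i, when trace[i] is accessed next.
    next_use = [INF] * len(trace)
    last_seen = {}
    for i in range(len(trace) - 1, -1, -1):
        next_use[i] = last_seen.get(trace[i], INF)
        last_seen[trace[i]] = i

    cache = {}   # key -> position of that key's next access
    misses = 0
    for i, key in enumerate(trace):
        if key in cache:
            cache[key] = next_use[i]   # refresh to this access's next use
            continue
        misses += 1
        if len(cache) >= capacity:
            # Evict the object whose next access is furthest away.
            victim = max(cache, key=cache.get)
            del cache[victim]
        cache[key] = next_use[i]
    return misses
```

For the trace `a b c a b d a` with a two-object cache, MIN incurs five misses; any online heuristic on the same trace can only do as well or worse.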
When ML meets such a problem, it views cache replacement as a prediction problem: minimize the miss ratio based on past access patterns. To design a practical learned cache system, we must solve three challenging problems.
The first is how to train and update an ML model to achieve good prediction accuracy as the cache system runs. The second is how to reduce the prediction overhead of each eviction to be on par with that of a heuristic algorithm: simply mimicking Belady's MIN, which evicts the object whose next access is furthest in the future, would require running predictions on all cached objects, which is infeasible in practice. The third is how to find a good tradeoff between the space used by the ML model and its data structures and the space left for cached data.
In this talk, I will describe how we solved these problems in designing two learned cache systems. Both achieve substantially lower miss ratios than the SOTA heuristic algorithms across different production workloads. Their prototype implementations achieve throughput similar to implementations based on the SOTA algorithms, with modest increases in CPU utilization.
Biography: Professor Kai Li received his doctorate from Yale University in 1986 and joined Princeton University the same year. His research interests include distributed and parallel systems, computer architecture, storage systems, knowledge base construction, and privacy preservation for machine learning. He has won seven most-influential or test-of-time paper awards in seven different areas of computer science. He is an ACM Fellow, an IEEE Fellow, a foreign member of the Chinese Academy of Engineering, a member of the Washington State Academy of Sciences, and a member of the US National Academy of Engineering.
Professor Li has worked on a wide variety of topics in computer science. In the area of distributed systems, he pioneered techniques that allow users to program with a shared-memory programming model on clusters of computers. In the area of computer architecture, he developed an efficient protected communication mechanism, which evolved into an industry standard for building efficient computer clusters. In the area of storage systems, he co-founded Data Domain, Inc., and pioneered deduplication storage systems for efficient backup and data replication for disaster recovery. The deduplication technology has allowed the industry to build a new generation of storage systems. In the area of machine learning and computer vision, he co-led the development of the ImageNet knowledge base, which propelled deep learning to become the mainstream machine learning method for computer vision as well as several other application domains.
Did you miss Kai Li’s lecture or would you like to hear it again? If so, just start the video below.
200 University Avenue West
Waterloo, ON N2L 3G1