The Association for Computing Machinery has named Matei Zaharia as the recipient of the 2025 ACM Prize in Computing for his visionary development of distributed data systems and computing infrastructure, which has enabled large-scale machine learning, analytics, and AI at global scale.
Professor Zaharia is a graduate of the University of Waterloo, where he earned a Bachelor of Mathematics in 2007 with a double major in Computer Science and Combinatorics & Optimization. He graduated with a GPA often described as “100 minus epsilon,” a value infinitesimally smaller than 100. He won Waterloo’s 2014 Young Alumni Achievement Medal. Currently, he is an Associate Professor in the Electrical Engineering and Computer Sciences Department at the University of California, Berkeley, and a Co-Founder and CTO of Databricks.

Professor Zaharia at the 2017 Cheriton Research Symposium, where he gave a lecture titled “Composable Parallel Processing in Apache Spark and Weld.”
The ACM Prize in Computing recognizes early-to-mid-career computer scientists whose work has had broad and lasting impact. The award carries a $250,000 prize, with financial support provided by an endowment from Infosys Ltd, a global leader in next-generation digital services and consulting.
Professor Zaharia’s work addressed a central challenge in computing: how to process and analyze rapidly growing volumes of data efficiently, at a scale previously accessible only to the largest technology companies. Early distributed data systems were limited in speed and poorly suited to emerging workloads such as machine learning and interactive analysis. Through a sequence of open-source systems, each targeting a distinct bottleneck, Professor Zaharia changed what any organization could do with massive datasets.
As a doctoral student at UC Berkeley, Professor Zaharia started Apache Spark, a new approach to distributed computing that keeps working data in memory across a cluster, in a fault-tolerant way, to accelerate computations. This design made Spark dramatically faster than existing frameworks for the kinds of iterative computations essential to machine learning, which pass over the same data many times, while its unified architecture allowed batch processing, streaming, graph computation, and interactive queries to run within a single system.
Spark quickly moved from research into widespread use and is now the de facto standard for large-scale data analytics, deployed across tens of thousands of organizations and integrated into major cloud platforms. Professor Zaharia’s doctoral dissertation on Spark received the ACM Doctoral Dissertation Award in 2014.