Please note: This PhD seminar will take place online.
Zeping Mao, PhD candidate
David R. Cheriton School of Computer Science
Supervisor: Professor Ming Li
Biological research has traditionally been hypothesis-driven, with individual studies typically focusing on a specific molecule, gene, or mechanism and establishing conclusions through stepwise experimental validation. While this paradigm has produced rigorous and reliable knowledge, it is often limited in scale and throughput. In contrast, recent advances in AI for biology have increasingly relied on large-scale, data-driven approaches, whose central goal is to learn predictive patterns for a particular biological task from broad datasets rather than to fully characterize an entire system.
This seminar examines the contrast between these two research paradigms and discusses how modern AI is reshaping the scientific value of large-scale biological data. It will also present a recent project aimed at addressing a key bottleneck in AI for biology: the lack of sufficiently large and well-structured experimental datasets in many domains. By developing a scalable pipeline spanning gene synthesis, protein expression, and protein characterization, this work explores how advances in wet-lab data generation can directly support progress in AI for biology.