Department Seminar | Mufan Li | Room: M3 3127
Infinite-Depth Neural Networks as Depthwise Stochastic Processes
Recent advances in neural network research have predominantly focused on infinite-width architectures, yet the complexities inherent in modelling networks with large depth call for a new theoretical framework. In this presentation, we explore an approach to modelling neural networks in the proportional infinite-depth-and-width limit, where depth and width grow together at a fixed ratio.
Naively stacking non-linearities in deep networks leads to degenerate behaviour at initialization, where feature correlations collapse as depth grows. To address this challenge and obtain a well-behaved infinite-depth limit, we introduce a new framework that treats neural networks as depthwise stochastic processes. Within this framework, the limit is characterized by a stochastic differential equation (SDE) governing the feature covariance matrix. Notably, this framework yields a remarkably accurate model of finite-size networks. Finally, we will briefly discuss several applications, including stabilizing gradients in Transformers, reducing the computational cost of hyperparameter tuning, and a new spectral result for products of random matrices.
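To give a flavour of the result (a schematic form only, not the precise statement of the talk: the drift b and diffusion coefficient Sigma below are generic placeholders rather than the specific coefficients derived in the work), covariance SDE limits of this type read

\[
  \mathrm{d}V_t \;=\; b(V_t)\,\mathrm{d}t \;+\; \Sigma(V_t)^{1/2}\,\mathrm{d}B_t,
  \qquad t \in [0,1],
\]

where V_t denotes the limiting feature covariance matrix at relative depth t (layer index divided by total depth) and B_t is a matrix-valued Brownian motion, so that the covariance evolves randomly rather than deterministically as one passes through the layers.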