Seminar by Mufan Li

Monday, February 12, 2024 10:00 am - 11:00 am EST (GMT -05:00)

Department seminar

Mufan Li
Princeton University

Room: M3 3127


Infinite-Depth Neural Networks as Depthwise Stochastic Processes 

Recent advances in neural network research have predominantly focused on infinite-width architectures, yet the complexities inherent in modelling networks of substantial depth call for a new theoretical framework. In this presentation, we explore an approach to modelling neural networks via the proportional infinite-depth-and-width limit, in which depth and width grow together at a fixed ratio.

Naively stacking non-linearities in deep networks leads to degenerate behaviour at initialization. To address this challenge and achieve a well-behaved infinite-depth limit, we introduce a fundamentally novel framework: we treat neural networks as depthwise stochastic processes. Within this framework, the limit is characterized by a stochastic differential equation (SDE) that governs the feature covariance matrix. Notably, this framework yields a very accurate model of finite-size networks. Finally, we will briefly discuss several applications, including stabilizing gradients in Transformers, reducing the computational cost of hyperparameter tuning, and a new spectral result for products of random matrices.
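To make the depthwise-stochastic-process picture concrete, the sketch below simulates one plausible instance of this regime: a randomly initialized MLP whose depth grows in proportion to its width, with a "shaped" ReLU whose slopes approach the identity as width grows. The specific choices (width n = 400, depth/width ratio T = 2, shaping strength a = 1, Gaussian weights of variance 1/n) are illustrative assumptions, not the speaker's exact setup; the point is only that the layer-by-layer feature covariance of two inputs remains random at O(1) scale, consistent with an SDE limit in depth rather than the deterministic behaviour of the infinite-width-only limit.

# Illustrative simulation: track the 2x2 feature covariance of two inputs
# through a shaped-ReLU MLP at initialization, with depth proportional to
# width. All parameter choices are assumptions made for this sketch.
import numpy as np

rng = np.random.default_rng(0)

n = 400                  # width
T = 2.0                  # depth-to-width ratio (proportional limit)
depth = int(T * n)       # depth grows in proportion to width
a = 1.0                  # shaping strength: slopes are 1 +/- a / sqrt(n)

def shaped_relu(x, width, a):
    # ReLU shaped toward the identity: slope 1 + a/sqrt(n) for x > 0 and
    # 1 - a/sqrt(n) for x < 0, so the non-linearity vanishes as n -> inf
    # but its cumulative effect over depth ~ n stays order one.
    s_plus = 1 + a / np.sqrt(width)
    s_minus = 1 - a / np.sqrt(width)
    return np.where(x > 0, s_plus * x, s_minus * x)

# Two unit-variance inputs with initial correlation about 0.5.
h = rng.standard_normal((n, 2))
h[:, 1] = 0.5 * h[:, 0] + np.sqrt(1 - 0.25) * h[:, 1]

cov_path = []
for _ in range(depth):
    z = shaped_relu(h, n, a)
    W = rng.standard_normal((n, n)) / np.sqrt(n)  # variance-1/n Gaussian weights
    h = W @ z
    cov_path.append(h.T @ h / n)                  # 2x2 feature covariance

cov_path = np.array(cov_path)
# Re-running with different seeds shows O(1) fluctuations along the whole
# depthwise trajectory: the covariance evolves like a diffusion in depth.
print(cov_path[-1])

Rerunning the script with several seeds and plotting cov_path[:, 0, 1] against layer index gives random, seed-dependent trajectories, which is the qualitative signature of the depthwise SDE description discussed in the talk.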