Location
MC 5501
Speaker
Francesco Cagnetta, Theoretical and Scientific Data Science Group at SISSA
Title
From Data Statistics to Scaling Laws: Toward a Physics of Representation Learning
Abstract
The successes of modern learning systems largely stem from their ability to learn representations: coarse-grained descriptions of data that retain predictive information while discarding irrelevant microscopic details. Approximation theory helps explain why deep architectures can represent such structure efficiently, while mechanistic interpretability has begun to reveal what these systems encode in practice. Yet we still lack a predictive theoretical framework---a “physics” of representation learning---that explains how useful representations emerge during training and how they depend on the statistical structure of the data.
In this talk, I will describe a model-based approach toward such a framework, inspired by the physics of complex systems: isolate robust structural properties of real data in analytically controlled settings, derive quantitative predictions, and test them in realistic machine-learning scenarios. As a main example, I will show how this perspective leads to a predictive theory of Neural Scaling Laws: the ubiquitous power-law relationships between a machine-learning model's performance and its training resources, such as dataset size. In particular, I will argue that the scaling exponents of modern transformer-based language models trained on real text corpora can be derived from measurable statistical properties of language. By linking learning curves to the statistical structure of the data, this approach turns scaling into a quantitative probe, and a concrete target, for a physics of representation learning. More generally, it suggests a route toward similarly predictive descriptions of other emergent properties of modern learning systems.
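For readers unfamiliar with the term, neural scaling laws are commonly summarized in the empirical literature by a schematic power law; the parametrization below is a standard illustrative form, not necessarily the one derived in the talk:

$$\mathcal{L}(N) \approx \mathcal{L}_\infty + A\,N^{-\alpha},$$

where $\mathcal{L}(N)$ is the test loss after training on $N$ examples, $\mathcal{L}_\infty$ is an irreducible loss floor, $A$ is a constant, and $\alpha > 0$ is the scaling exponent. On a log-log plot, the reducible loss $\mathcal{L}(N) - \mathcal{L}_\infty$ falls on a straight line of slope $-\alpha$, which is how such exponents are measured in practice and what a predictive theory of the kind described here aims to compute from data statistics.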