Master's Defence | Anthony Caterini, A Novel Mathematical Framework for the Analysis of Neural Networks | Applied Mathematics

MC 6460

Candidate

Anthony Caterini,
Applied Mathematics, University of Waterloo

Title

A Novel Mathematical Framework for the Analysis of Neural Networks

Abstract

Over the past decade, Deep Neural Networks (DNNs) have become very popular models for processing large amounts of data. However, we do not fully understand why DNNs are so effective. In this work, we explore one way to approach this problem: we develop a generic mathematical framework for representing neural networks. In chapter 1, we start by exploring mathematical contributions to neural networks. We can rigorously explain some properties of DNNs, but we cannot fully describe the mechanics of a generic neural network. We also note that most approaches to describing neural networks rely upon breaking down the parameters and inputs into scalars, as opposed to referencing their underlying vector spaces. Our framework operates strictly over these vector spaces, affording a more natural description of DNNs for analysis.

We then develop the generic framework in chapter 3. We describe one step of gradient descent over the space in which the parameters reside, and we can represent error backpropagation in a concise form. Besides the standard squared and cross-entropy losses, we also demonstrate that our framework extends to a higher-order loss function. After developing the generic framework, we apply it to three specific network examples in chapter 4: the Multilayer Perceptron, the Convolutional Neural Network, and the Deep Auto-Encoder. In chapter 5, we use some of the results from the previous chapters to develop a framework for Recurrent Neural Networks (RNNs). We first describe a generic RNN, which extends the earlier results, and then the specific case of the vanilla RNN. We again compute gradients directly over inner product spaces.