Optimal Approximation by ReLU MLPs of Maximal Regularity
Nov 20, 2024
4:30 PM to 5:30 PM
Location: BSB B105
Speaker: Ruiyang Hong, PhD Candidate, Mathematics and Statistics, McMaster University
Abstract: The foundations of deep learning are supported by the seemingly opposing perspectives of approximation or learning theory. The former advocates for large/expressive models that need not generalize, while the latter considers classes that generalize but may be too small/constrained to be universal approximators. Motivated by real-world deep learning implementations that are both expressive and statistically reliable, we ask: “Is there a class of neural networks that is both large enough to be universal but structured enough to generalize?” We constructively provide a positive answer to this question by identifying a highly structured class of ReLU multilayer perceptrons (MLPs), which are optimal function approximators and are statistically well-behaved. We show that any L-Lipschitz function from [0,1]^d to [-n,n] can be approximated to a uniform Ld/(2n) error on [0,1]^d by a sparsely connected L-Lipschitz ReLU MLP of width O(dn^d), depth O(log(d)), with O(dn^d) nonzero parameters, and whose weights and biases take values in {0, ±1/2}, except in the first and last layers, which instead have magnitude at most n. We achieve this by avoiding the standard approach to constructing optimal ReLU approximators, which sacrifices regularity by relying on small spikes. Instead, we introduce a new construction that perfectly fits together linear pieces using Kuhn triangulations and avoids these small spikes.

Supervisor: Dr. Anastasis Kratsios
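
As a much simpler illustration of the flavour of the result in the abstract, the sketch below realizes the piecewise-linear interpolant of an L-Lipschitz function on a uniform grid of [0,1] as a one-hidden-layer ReLU network and compares its uniform error to the d = 1 bound L/(2n). This is only an illustrative toy, not the Kuhn-triangulation construction from the talk; the test function, the grid size n, and all names in the code are chosen here for illustration.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def relu_interpolant(f, n):
    """Realize the piecewise-linear interpolant of f on the uniform grid
    {0, 1/n, ..., 1} as a one-hidden-layer ReLU network. The interpolant
    inherits f's Lipschitz constant (no spikes are introduced)."""
    xs = np.linspace(0.0, 1.0, n + 1)           # grid knots
    ys = f(xs)                                  # values of f at the knots
    slopes = np.diff(ys) / np.diff(xs)          # slope on each grid cell
    # Hidden units relu(x - x_k); output weights are the slope increments,
    # so the network equals the interpolant exactly on [0, 1].
    coeffs = np.concatenate(([slopes[0]], np.diff(slopes)))
    knots = xs[:-1]

    def net(x):
        x = np.asarray(x, dtype=float)
        return ys[0] + relu(x[..., None] - knots) @ coeffs

    return net

if __name__ == "__main__":
    L = 3.0
    f = lambda x: np.sin(L * x)                 # |f'| <= L, so f is L-Lipschitz
    n = 50
    net = relu_interpolant(f, n)
    grid = np.linspace(0.0, 1.0, 10_001)
    err = np.max(np.abs(net(grid) - f(grid)))
    print(f"uniform error = {err:.4e}, bound L/(2n) = {L / (2 * n):.4e}")
```

In this one-dimensional toy the grid plays the role that the Kuhn triangulation plays in higher dimensions: the linear pieces are fitted together exactly at the knots, so the network is as regular as the target function rather than relying on small spikes.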