Department of Mathematics,
University of California San Diego

****************************

Mathematics of Information, Data, and Signals Seminar

Boris Hanin

Princeton University

Random Fully Connected Neural Networks as Perturbatively Solvable Models

Abstract:

Fully connected networks are described, to a first approximation, by two structural parameters: a depth L and a width N. It is well known that, with some important caveats on the scale at initialization, in the regime of fixed L and infinite N, neural networks at the start of training are a free (i.e., Gaussian) field and that network optimization is kernel regression for the so-called neural tangent kernel (NTK). This is a striking and insightful simplification of infinitely overparameterized networks. However, in this infinite-width limit neural networks cannot learn data-dependent features, which is perhaps their most important empirical capability. To understand feature learning, one must therefore study networks at finite width. In this talk I will do just that. I will report on recent joint work with Dan Roberts and Sho Yaida (done at a physics level of rigor), as well as ongoing, more mathematical work, which allows one to compute, perturbatively in 1/N and recursively in L, all correlation functions of the neural network function (and its derivatives) at initialization. An important upshot is the emergence of L/N, rather than L alone, as the effective network depth. This cutoff parameter provably measures both the extent of feature learning and the distance at initialization from the large-N free theory.
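
As a rough illustration of the kind of finite-width statement at issue (a schematic paraphrase of the abstract, not taken from the speaker's materials): for a scalar network output z(x) at initialization, the leading deviation from the infinite-width Gaussian field shows up in the four-point correlation function,

\[
\mathbb{E}\bigl[z(x_1)\,z(x_2)\,z(x_3)\,z(x_4)\bigr]
  = K(x_1,x_2)\,K(x_3,x_4) + K(x_1,x_3)\,K(x_2,x_4) + K(x_1,x_4)\,K(x_2,x_3)
  + O\!\left(\frac{L}{N}\right),
\]

where K is the infinite-width covariance kernel and the first three terms are the Gaussian (Wick) contribution. That the correction is controlled by L/N, rather than by 1/N alone, is what singles out L/N as the effective depth.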

March 3, 2022

11:30 AM

https://msu.zoom.us/j/96421373881

(The passcode is the first prime number > 100)

****************************