The Neural Race Reduction: Feature learning dynamics in deep architectures
Neurosciences Building, Gunn Rotunda (E241)
Abstract: What is the relationship between task geometry, network architecture, and emergent feature learning dynamics in nonlinear deep networks? I will describe the Gated Deep Linear Network framework, which schematizes how pathways of information flow impact learning dynamics within an architecture. Because of the gating, these networks can compute nonlinear functions of their input. We derive an exact reduction and, for certain cases, exact solutions to the dynamics of learning. The reduction takes the form of a neural race with an implicit bias towards shared representations, which then govern the model’s ability to systematically generalize, multi-task, and transfer. We show how appropriate network architectures can help factorize and abstract knowledge. Together, these results begin to shed light on the links between architecture and network performance.
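The gating idea in the abstract can be illustrated with a minimal sketch (a hypothetical illustration, not the speaker's actual framework or code): each pathway in the network is purely linear, but an input-dependent gate switches pathways on or off, so the composed input-output map becomes nonlinear.

```python
import numpy as np

# Minimal sketch of a two-layer "gated deep linear network".
# All weights and the gating rule here are illustrative assumptions.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 3))  # first linear pathway
W2 = rng.standard_normal((2, 4))  # second linear pathway

def gated_forward(x):
    h = W1 @ x
    g = (h > 0).astype(float)  # gate depends on the input
    return W2 @ (g * h)        # gating makes the overall map nonlinear

# A purely linear network satisfies f(-x) = -f(x); the gated
# network generally does not, confirming it is nonlinear.
x = rng.standard_normal(3)
print(np.allclose(gated_forward(-x), -gated_forward(x)))
```

Here the gate happens to coincide with a ReLU-style switch, but the same construction covers gates set by context or task identity, which is closer to the pathway-gating the abstract describes.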
Bio: Dr Andrew Saxe is an Associate Professor and Joint Group Leader at the Gatsby Computational Neuroscience Unit and the Sainsbury Wellcome Centre at UCL. His research focuses on the theory of deep learning and its applications to phenomena in neuroscience and psychology. His work has been recognised with a Schmidt Science Polymath award and as a UK Blavatnik Awards Finalist in Life Sciences.
Hosted by: Julia Costacurta (Linderman Lab)