A Riemannian mean field formulation for two-layer neural networks with batch normalization

Authors

Abstract

We study the training dynamics of two-layer neural networks with batch normalization (BN). A network with BN is rewritten as a network without BN on a Riemannian manifold; BN's effect is thereby identified as a change of metric in parameter space. We then take the infinite-width limit and derive a mean-field formulation of the dynamics, which is shown to be a Wasserstein gradient flow. Theoretical analysis establishes well-posedness and convergence of this flow.
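The mean-field formulation described above can be sketched as follows; this is a standard two-layer mean-field setup with assumed notation, not the paper's exact equations, and the metric g below stands in for whatever metric the BN rewriting induces:

```latex
% Mean-field limit of a two-layer network: the output averages a
% feature map \sigma over a distribution \rho on parameters \theta,
%   f(x;\rho) = \int \sigma(x;\theta)\, d\rho(\theta).
% Plain mean-field training is the Wasserstein gradient flow of the risk E:
%   \partial_t \rho_t = \nabla_\theta \cdot
%       \Big( \rho_t \, \nabla_\theta \tfrac{\delta E}{\delta \rho}[\rho_t] \Big).
% With BN interpreted as a Riemannian metric g(\theta) on parameter space,
% the Euclidean gradient is replaced by the Riemannian one:
%   \partial_t \rho_t = \nabla_\theta \cdot
%       \Big( \rho_t \, g(\theta)^{-1} \nabla_\theta \tfrac{\delta E}{\delta \rho}[\rho_t] \Big).
```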


Similar articles

Riemannian approach to batch normalization

Batch Normalization (BN) has proven to be an effective algorithm for deep neural network training by normalizing the input to each neuron and reducing the internal covariate shift. The space of weight vectors in the BN layer can be naturally interpreted as a Riemannian manifold, which is invariant to linear scaling of weights. Following the intrinsic geometry of this manifold provides a new lea...
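The scale invariance underlying this manifold view can be checked numerically. The sketch below is illustrative only; `batch_norm` is a simplified, hypothetical stand-in for a BN layer with no learnable affine parameters and no epsilon:

```python
import numpy as np

def batch_norm(z):
    """Simplified batch normalization of a batch of pre-activations
    (no learnable scale/shift, no epsilon)."""
    return (z - z.mean()) / z.std()

rng = np.random.default_rng(0)
X = rng.normal(size=(16, 4))           # a batch of 16 inputs
w = np.array([0.5, -1.0, 2.0, 0.3])    # incoming weights of one neuron

# The BN output is unchanged under positive rescaling of w, because the
# batch mean and standard deviation of the pre-activations scale by the
# same factor. This is the invariance to linear scaling of weights that
# makes the weight space a Riemannian manifold.
z1 = batch_norm(X @ w)
z2 = batch_norm(X @ (3.7 * w))
print(np.allclose(z1, z2))  # True
```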


Mean field theory for asymmetric neural networks.

The computation of mean firing rates and correlations is intractable for large neural networks. For symmetric networks one can derive mean field approximations using the Taylor series expansion of the free energy as proposed by Plefka. In asymmetric networks, the concept of free energy is absent. Therefore, it is not immediately obvious how to extend this method to asymmetric networks. In this ...


A Mean Field Learning Algorithm for Unsupervised Neural Networks

We introduce a learning algorithm for unsupervised neural networks based on ideas from statistical mechanics. The algorithm is derived from a mean field approximation for large, layered sigmoid belief networks. We show how to (approximately) infer the statistics of these networks without resort to sampling. This is done by solving the mean field equations, which relate the statistics of each unit t...


Riemannian metrics for neural networks

We describe four algorithms for neural network training, each adapted to different scalability constraints. These algorithms are mathematically principled and invariant under a number of transformations in data and network representation, from which performance is thus independent. These algorithms are obtained from the setting of differential geometry, and are based on either the natural gradi...
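As one concrete instance of the natural-gradient idea mentioned here (a hedged sketch under assumed conditions, not the algorithms of that paper): for linear least squares with Gaussian noise, the Fisher matrix is X^T X / n, and a full natural-gradient step recovers the exact solution in a single update.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
theta_true = np.array([1.0, -2.0, 0.5])
y = X @ theta_true                          # noiseless targets for simplicity

theta = np.zeros(3)
for _ in range(3):
    resid = X @ theta - y
    grad = X.T @ resid / len(y)             # Euclidean gradient of the MSE/2 loss
    fisher = X.T @ X / len(y)               # Fisher information under Gaussian noise
    theta -= np.linalg.solve(fisher, grad)  # natural-gradient step (learning rate 1)

print(np.allclose(theta, theta_true))  # True
```

Because the Fisher matrix here equals the Gauss-Newton matrix, the first step already lands on the least-squares solution; the invariance to reparameterization is what the differential-geometric framing generalizes.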


Mean-field theory of fluid neural networks

Jordi Delgado and Ricard V. Solé, Departament de Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya, Campus Nord, Mòdul C6, Jordi Girona Salgado 1-3, 08034 Barcelona, Spain; Complex Systems Research Group, Departament de Física i Enginyeria Nuclear, Universitat Politècnica de Catalunya, Sor Eulàlia d'Anzizu s/n, Campus Nord, Mòdul B4, 08034 Barcelona, Spain (Received 27 May ...



Journal

Journal title: Research in the Mathematical Sciences

Year: 2022

ISSN: 2522-0144, 2197-9847

DOI: https://doi.org/10.1007/s40687-022-00344-0