Optimal errors and phase transitions in high-dimensional generalised linear models

High-dimensional generalized linear models are basic building blocks of current data analysis tools, including multilayer neural networks. They arise in signal processing, statistical inference, machine learning, communication theory, and other fields. I will explain how to rigorously establish the intrinsic information-theoretic limitations of inference and learning for a class of randomly generated instances of generalized linear models, thereby settling several old conjectures. I will show examples where one can delimit regions of parameters in which the optimal error rates are efficiently achievable with currently known algorithms. I will discuss how the proof technique, based on the recently developed adaptive interpolation method, handles the output nonlinearity and, to some extent, non-separable input distributions.
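To fix ideas, here is a minimal sketch (illustration only, not from the talk) of a randomly generated instance of a generalized linear model with a sign nonlinearity and Gaussian i.i.d. design; the names and the specific channel are assumptions for illustration.

```python
import numpy as np

# Hypothetical instance of a GLM y = phi(X w / sqrt(d)), here with
# phi = sign (a "perceptron" channel) and a Gaussian random design.
rng = np.random.default_rng(0)
n, d = 200, 100                       # samples, dimension
w_star = rng.standard_normal(d)       # hidden signal to be inferred
X = rng.standard_normal((n, d))       # random design matrix
y = np.sign(X @ w_star / np.sqrt(d))  # outputs observed through the nonlinearity

# Inference task: recover w_star from (X, y); the information-theoretic
# limits are studied in the high-dimensional limit n, d -> infinity at
# fixed sample ratio alpha = n / d.
alpha = n / d
print(alpha, X.shape, y.shape)
```

The point of such random instances is that the optimal (Bayes) error becomes a deterministic function of the ratio `alpha` and the channel in the high-dimensional limit.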

Fluctuation of the free energy of Sherrington-Kirkpatrick model with Curie-Weiss interaction: the paramagnetic regime

We consider a spin system combining a pure two-spin Sherrington-Kirkpatrick Hamiltonian with a Curie-Weiss interaction. The model in which the spins are spherically symmetric was considered by Baik and Lee and by Baik et al., who exhibited a two-dimensional phase transition with respect to the temperature and the coupling constant. In this paper we prove a result analogous to Baik and Lee in the "paramagnetic regime" when the spins are i.i.d. Rademacher. We prove that the free energy in this case is asymptotically Gaussian and can be approximated by a suitable linear spectral statistic. Unlike the spherically symmetric case, the free energy here cannot be written as a function of the eigenvalues of the corresponding interaction matrix. The method relies on a dense sub-graph conditioning technique introduced by Banerjee. The proof of the approximation by linear spectral statistics is close to Banerjee and Ma.

Direct/Inverse Hopfield model and Restricted Boltzmann Machines

Mean-field methods fail to reconstruct the parameters of the model when the dataset is clustered. This situation is found at low temperatures because of the emergence of multiple thermodynamic states. The paradigmatic Hopfield model is considered in a teacher-student scenario as a problem of unsupervised learning with Restricted Boltzmann Machines (RBMs). For different choices of the priors on units and weights, the replica-symmetric phase diagram of random RBMs is analyzed; in particular, the paramagnetic phase boundary is shown to be directly related to the optimal size of the training set necessary for good generalization. The connection between the direct and inverse problems is pointed out by showing that inference can be performed efficiently by suitably adapting both standard learning techniques and standard approaches to the direct problem.


Invitation to random tensors

On bulk deviations for the local behavior of random interlacements

In this talk we will discuss some recent large deviation asymptotics concerning the local behavior of random interlacements on Z^d, d≥3. In particular, we will describe the link with previous results concerning macroscopic holes left inside a large box by the adequately thickened connected component of the boundary of the box in the vacant set of random interlacements.

Gaussian fluctuations in directed polymers

Random walk on a simple exclusion process

In this talk we will study the asymptotic behavior of a random walk that evolves on top of a simple symmetric exclusion process. This is a nice example of a random walk in a dynamical random environment, and it presents its own challenges due to the slow mixing properties of the underlying medium. We will discuss a law of large numbers that has recently been proved for this random walk. Interestingly, we can only prove this law of large numbers for all but two exceptional densities of the exclusion process. The main technique we employ is a multi-scale renormalization derived from works in percolation theory.

The geometry of random walk isomorphisms

The classical random walk isomorphism theorems relate the local time of a random walk to the square of a Gaussian free field. I will present non-Gaussian versions of these theorems, relating hyperbolic and hemispherical sigma models (and their supersymmetric versions) to non-Markovian random walks interacting through their local time. Applications include a short proof of the Sabot-Tarres limiting formula for the vertex-reinforced jump process (VRJP) and a Mermin-Wagner theorem for hyperbolic sigma models and the VRJP. This is joint work with Tyler Helmuth and Andrew Swan.

Theory of Deep Learning 5: Information theoretic approach to deep learning theory: a test using statistical physics methods

Relying on the heuristic replica method from statistical physics, we present an estimator for entropies and mutual informations in models of deep neural networks. Using this new tool, we test numerically the relation between generalisation and information.

Theory of Deep Learning 4: Training Neural Networks in the Lazy and Mean Field Regimes

Theory of Deep Learning 3: Neural Tangent Kernel: Convergence and Generalization of Deep Neural Networks

Theory of Deep Learning 2: Over-parametrization in neural networks: an overview and a definition

We will start by reviewing some of the recent literature on the geometry of the loss function and how SGD navigates the landscape in the over-parametrized (OP) regime. Then we will see how to define over-parametrization via a sharp transition characterized by the model's ability to fit its training set. Finally, we will discuss how this critical threshold is connected to the generalization properties of the model, and argue that life beyond this threshold is (more or less) as good as it gets.

Theory of Deep Learning 1: Introduction to the main questions

In this first talk I will introduce the main theoretical questions about deep neural networks:

1. Representation - what can deep neural networks represent?

2. Optimization - why and under what circumstances can we successfully train neural networks?

3. Generalization - why do deep neural networks often generalize well, despite huge capacity?

As a preface I will review the basic models and algorithms (neural networks, (stochastic) gradient descent, ...) and some important concepts from machine learning (capacity, overfitting/underfitting, generalization, ...).
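The basic objects reviewed above can be sketched in a few lines; this is a minimal illustration (not from the lecture) of a small neural network trained by gradient descent, with train and test losses exposing the capacity/generalization questions. The task, architecture, and hyperparameters are all assumptions chosen for the example.

```python
import numpy as np

# Hypothetical toy setup: two-layer tanh network, squared loss,
# full-batch gradient descent on an XOR-like classification task.
rng = np.random.default_rng(1)
X = rng.standard_normal((64, 2)); y = np.sign(X[:, 0] * X[:, 1])    # train set
Xt = rng.standard_normal((64, 2)); yt = np.sign(Xt[:, 0] * Xt[:, 1])  # test set

h = 32                                    # hidden width ("capacity")
W1 = rng.standard_normal((2, h)) / np.sqrt(2)
W2 = rng.standard_normal(h) / np.sqrt(h)

def forward(Z):
    return np.tanh(Z @ W1) @ W2

lr = 0.1
for step in range(500):
    A = np.tanh(X @ W1)                   # hidden activations
    err = A @ W2 - y                      # residual of the squared loss
    gW2 = A.T @ err / len(X)              # backprop through the output layer
    gW1 = X.T @ ((err[:, None] * W2) * (1 - A**2)) / len(X)  # and the hidden layer
    W1 -= lr * gW1; W2 -= lr * gW2

train_loss = np.mean((forward(X) - y) ** 2)
test_loss = np.mean((forward(Xt) - yt) ** 2)
print(train_loss, test_loss)              # their gap is the generalization error
```

Replacing the full-batch gradient with a gradient on a random mini-batch turns this into stochastic gradient descent; growing `h` far beyond the training-set size is the over-parametrized regime discussed later in the series.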

TBA

TBA

Local law and eigenvector delocalization for supercritical Erdős–Rényi graphs

Joint work with Yukun He and Matteo Marcozzi.

Oriented first passage percolation on the hypercube

Cusp Universality for Wigner-type Random Matrices

High-dimensional Gaussian fields with isotropic increments seen through spin glasses

The Ginibre ensemble and Gaussian multiplicative chaos

(Joint work in progress with Paul Bourgade and Guillaume Dubach).

Ubiquity of phases in some percolation models with long-range correlations


This talk is based on joint works with A. Prévost (Köln) and P.-F. Rodriguez (Bures-sur-Yvette).
