EE event

On the Role of Channel Capacity in Learning Gaussian Mixture Models

Speaker: Elad Romanov (Stanford)
Date: Jun 3

Zoom ID: 92716427348
Password: 032264

Abstract: We study the sample complexity of learning the unknown centers of a balanced spherical Gaussian mixture model (GMM) in R^d. In particular, we are interested in the following question: what is the maximal noise level (variance) for which the sample complexity is essentially the same as when estimating the centers from labeled measurements? To that end, we restrict attention to a Bayesian formulation of the problem, where the centers are uniformly distributed on the sphere. Our main results characterize the exact noise threshold, as a function of the number of centers k and dimension d, below which the GMM learning problem is essentially as easy as learning from labeled observations, and above which it is substantially harder. The exact location of the threshold occurs at (log k)/d = (1/2) log(1 + 1/sigma^2), which is the capacity of the additive white Gaussian noise (AWGN) channel. Thinking of the set of k centers as a code, this noise threshold can be interpreted as the largest noise level for which the error probability of the code over the AWGN channel is small. Previous works on the GMM learning problem have identified the minimum distance between the centers as a key parameter in determining the statistical difficulty of learning the corresponding GMM. While our results are only proved for GMMs whose centers are uniformly distributed over the sphere, they hint that it may be the decoding error probability associated with the center constellation, viewed as a channel code, that determines the statistical difficulty of learning the corresponding GMM, rather than just the minimum distance.
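The threshold condition in the abstract can be solved explicitly for the critical noise variance: rearranging (log k)/d = (1/2) log(1 + 1/sigma^2) gives sigma^2 = 1/(k^(2/d) - 1). The sketch below (a hypothetical helper, not from the talk) computes this value and verifies that the capacity condition holds at it:

```python
import math

def critical_noise_variance(k: int, d: int) -> float:
    """Noise variance sigma^2 solving the capacity condition from the
    abstract: (log k)/d = (1/2) * log(1 + 1/sigma^2).
    Rearranging yields sigma^2 = 1 / (k**(2/d) - 1)."""
    return 1.0 / (k ** (2.0 / d) - 1.0)

# Example: k = 1024 centers in dimension d = 20.
k, d = 1024, 20
sigma2 = critical_noise_variance(k, d)  # approximately 1.0 here

# Verify the AWGN capacity condition at this noise level.
lhs = math.log(k) / d                       # rate: (log k)/d
rhs = 0.5 * math.log(1.0 + 1.0 / sigma2)    # capacity: (1/2)log(1+1/sigma^2)
```

Below this sigma^2, by the main result, learning the GMM is essentially as easy as learning from labeled observations; above it, the problem becomes substantially harder.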

Joint work with Or Ordentlich (Hebrew University) and Tamir Bendory (Tel Aviv University).

Bio: Elad Romanov is a postdoctoral researcher in the Department of Statistics at Stanford University. His research interests span signal processing, information theory, high-dimensional statistics, and, broadly, everything [(math)+(data science)].