A large range of problems in statistics, machine learning, and signal processing can be cast as fitting a structured low-rank matrix to noisy data. A popular approach to the resulting rank-constrained optimization problem is convex SDP relaxation, which runs in polynomial time in principle and enjoys desirable statistical guarantees. For large-scale problems, however, the (super-)quadratic time complexity of SDPs is often prohibitive. An attractive alternative is to run projected gradient descent directly over the space of low-rank matrices. This approach is highly scalable, but its convergence and statistical accuracy have been unclear due to non-convexity. Here we develop a unified framework characterizing the convergence of this non-convex method as well as the statistical properties of the resulting solutions. Our results provide insight into why we should expect non-convex methods to work in general, and yield global guarantees for a large number of concrete problems, including matrix completion with real and one-bit observations, matrix decomposition, robust and sparse PCA, and graph clustering. Our framework is powerful enough to accommodate nonlinear measurements, matrices of arbitrary rank, and the noisy, constrained setting with additional structure in the noise and target matrices; moreover, it does not require resampling.
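To make the projected gradient descent idea concrete, the following is a minimal sketch for one instance mentioned in the abstract, matrix completion: a gradient step on the squared loss over observed entries, followed by projection onto the set of rank-r matrices via truncated SVD. All function names, parameters, and the step size are illustrative assumptions, not details from the talk.

```python
import numpy as np

def pgd_matrix_completion(M_obs, mask, rank, step=1.0, iters=300):
    """Projected gradient descent for low-rank matrix completion (sketch).

    M_obs : observed matrix with unobserved entries set to 0
    mask  : 0/1 array marking observed entries
    rank  : target rank r of the projection
    """
    X = np.zeros_like(M_obs)
    for _ in range(iters):
        grad = mask * (X - M_obs)           # gradient of 0.5*||P_Omega(X - M)||_F^2
        Y = X - step * grad                 # gradient step
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        X = (U[:, :rank] * s[:rank]) @ Vt[:rank]  # projection onto rank-r matrices
    return X

# usage: recover a random rank-2 matrix from ~60% of its entries
rng = np.random.default_rng(0)
M = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 30))
mask = (rng.random((30, 30)) < 0.6).astype(float)
X_hat = pgd_matrix_completion(mask * M, mask, rank=2)
```

The rank-r projection is the non-convex step: the feasible set of rank-constrained matrices is non-convex, which is why convergence guarantees for this iteration are non-trivial.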
Yudong Chen is an assistant professor at the School of Operations Research and Information Engineering (ORIE), Cornell University. From 2013 to 2015 he was a postdoctoral scholar at the Department of Electrical Engineering and Computer Sciences at the University of California, Berkeley. He obtained his Ph.D. in Electrical and Computer Engineering from the University of Texas at Austin in 2013, and his M.S. and B.S. from Tsinghua University. His research interests include machine learning, high-dimensional and robust statistics, large-scale optimization, and applications in networks and transportation systems.