Information Systems Lab (ISL) Colloquium

ISL Colloquium presents "The Hidden Convex Optimization Landscape of Deep Neural Networks"

Topic: 
The Hidden Convex Optimization Landscape of Deep Neural Networks
Abstract / Description: 

The popularity of Deep Neural Networks (DNNs) continues to grow as a result of their great empirical success in a large number of machine learning tasks. However, despite their prevalence in machine learning and the dramatic surge of interest, there are major gaps in our understanding of the fundamentals of neural net models. Understanding the mechanism behind their extraordinary generalization properties remains an open problem. A significant challenge arises from the non-convexity of training DNNs. In non-convex optimization, the choice of optimization method and its internal parameters, such as initialization, mini-batching and step sizes, have a considerable effect on the quality of the learned model. This is in sharp contrast to convex optimization problems, where these optimization parameters have no effect and globally optimal solutions can be obtained in a robust, efficient, transparent and reproducible manner.

In this talk, we introduce exact convex optimization formulations of multilayer neural network training problems. We show that two- and three-layer neural networks with ReLU or polynomial activations can be globally trained via convex programs whose number of variables is polynomial in the number of training samples and the number of hidden neurons. Our results provide an equivalent characterization of neural networks as convex models in which a mixture of locally linear models is fitted to the data with sparsity-inducing convex regularization. Moreover, we show that certain standard two- and three-layer convolutional neural networks can be globally optimized in fully polynomial time. We discuss extensions to batch normalization and generative adversarial networks. Finally, we present numerical simulations verifying our claims and illustrating that standard local search heuristics such as stochastic gradient descent can be inefficient compared to the proposed convex program.
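
For readers who want the flavor of such a convex program, below is a minimal sketch in Python/CVXPY of a group-sparse convex fit over locally linear pieces for a two-layer ReLU network. The sampling of hyperplane-arrangement patterns, the tiny synthetic data, and the variable names are illustrative assumptions, not the speaker's code; the exact equivalence and enumeration guarantees are in the talk's results.

```python
# A minimal illustrative sketch (not the speaker's code): fit a two-layer ReLU
# network via a convex, group-sparse program over locally linear pieces indexed
# by sampled hyperplane-arrangement patterns D_i = diag(1[X u_i >= 0]).
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, d, m = 20, 3, 10                  # samples, input dim, sampled patterns
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)
beta = 0.1                           # regularization strength

# Sample arrangement patterns from random hyperplanes (full enumeration is
# polynomial in n for fixed d; sampling keeps this sketch short).
U = rng.standard_normal((d, m))
D = [(X @ U[:, i] >= 0).astype(float) for i in range(m)]

V = cp.Variable((d, m))              # weights of the "positive" branch
W = cp.Variable((d, m))              # weights of the "negative" branch
pred = sum(cp.multiply(D[i], X @ (V[:, i] - W[:, i])) for i in range(m))
reg = sum(cp.norm(V[:, i], 2) + cp.norm(W[:, i], 2) for i in range(m))
constraints = []
for i in range(m):
    s = 2 * D[i] - 1                 # enforce consistency with the pattern D_i
    constraints += [cp.multiply(s, X @ V[:, i]) >= 0,
                    cp.multiply(s, X @ W[:, i]) >= 0]

prob = cp.Problem(cp.Minimize(cp.sum_squares(pred - y) + beta * reg), constraints)
prob.solve()
print("convex objective value:", prob.value)
```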

Date and Time: 
Thursday, June 3, 2021 - 4:30pm

ISL Colloquium: Multi-Scale Methods for Machine Learning

Topic: 
Multi-Scale Methods for Machine Learning
Abstract / Description: 

Many modern applications require us to very quickly find relevant results from an enormous output space of potential candidates, for example, finding the best matching product from a large catalog or suggesting related search phrases on a search engine. The size of the output space for these problems can be in the millions to billions. Moreover, observational or training data is often limited for the so-called "long tail" of items in the output space. Given the inherent paucity of training data for most of the items in the output space, developing machine learning models that perform well for spaces of this size is a contemporary challenge. In this talk, I will present a multi-scale machine learning framework called Prediction for Enormous and Correlated Output Spaces (PECOS). PECOS proceeds by first building a hierarchy over the output space using unsupervised learning, and then learning a machine learning model that makes predictions at each level of the hierarchy. Finally, the multi-scale predictions are combined to obtain an overall predictive model. This leads to an inference method that scales logarithmically with the size of the output space. A key to obtaining high performance is leveraging sparsity and developing highly efficient sparse matrix routines.
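
To make the multi-scale idea concrete, here is a minimal Python sketch assuming a label tree with a linear scorer attached to each node and dense vectors for brevity (the real system relies on efficient sparse matrix routines). The class and function names are illustrative, not the PECOS API.

```python
# Illustrative sketch (not the PECOS API): beam search down a label hierarchy,
# scoring children at each level and keeping only the top-b branches, so the
# work grows with tree depth (logarithmic in the number of labels).
import numpy as np

class Node:
    def __init__(self, w, children=None, label=None):
        self.w = w                      # linear scorer for reaching this node
        self.children = children or []  # child clusters, or empty for a leaf
        self.label = label              # label id if this node is a leaf

def beam_search(root, x, beam=4):
    frontier = [(0.0, root)]
    leaves = []
    while frontier:
        expanded = []
        for score, node in frontier:
            for child in node.children:
                s = score + float(child.w @ x)      # accumulate branch scores
                (leaves if child.label is not None else expanded).append((s, child))
        frontier = sorted(expanded, key=lambda t: -t[0])[:beam]  # prune to beam width
    return sorted(leaves, key=lambda t: -t[0])       # ranked leaf labels
```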

Date and Time: 
Thursday, May 20, 2021 - 4:30pm

ISL Colloquium: Interior Point Methods for Nearly Linear Time Algorithms

Topic: 
Interior Point Methods for Nearly Linear Time Algorithms
Abstract / Description: 

Linear programming is a foundational continuous optimization problem, and special cases, e.g. maximum cardinality bipartite matching, are among the most fundamental problems in combinatorial optimization. In this talk, I will survey recent advances in solving these problems using interior point methods. I will discuss how new interior point methods can be leveraged to efficiently reduce these problems to data-structure problems, which can in turn be solved efficiently with techniques from randomized numerical linear algebra and graph theory. This talk will highlight recent joint work which showed that linear programs with d variables and n constraints can be solved with high probability to high precision in time Otilde(nd + poly(d)), and that a variety of graph problems, e.g. maximum flow and optimal transport, on n-node, m-edge graphs can be solved with high probability to high precision in time Otilde(m + n^1.5). These results constitute the first polynomial time methods for solving these problems to high precision that run in nearly linear time on sufficiently large and dense instances. No previous experience with interior point methods is required.
This talk will touch upon joint work with Jan van den Brand, Yin Tat Lee, Yang P. Liu, Danupon Nanongkai, Richard Peng, Thatchaphol Saranurak, Zhao Song, and Di Wang, including arXiv:2009.01802, arXiv:2002.02304, and arXiv:2101.05719.
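
As background for readers new to the area, here is a minimal sketch of a textbook primal log-barrier interior point method for linear programming. It is only meant to convey the central-path idea; the nearly-linear-time methods discussed in the talk hinge on specialized data structures and randomized linear algebra that this sketch does not attempt to capture.

```python
# Textbook primal log-barrier method for min c^T x s.t. Ax = b, x >= 0
# (background illustration only; not the algorithms from the papers above).
import numpy as np

def lp_barrier(A, b, c, x0, mu=10.0, shrink=0.2, tol=1e-8, newton_steps=25):
    """x0 must be strictly feasible: A @ x0 = b and x0 > 0."""
    x = x0.astype(float).copy()
    m, n = A.shape
    while mu > tol:
        for _ in range(newton_steps):
            g = c - mu / x                        # gradient of the barrier objective
            H = np.diag(mu / x**2)                # Hessian of the barrier term
            K = np.block([[H, A.T], [A, np.zeros((m, m))]])
            rhs = np.concatenate([-g, np.zeros(m)])
            dx = np.linalg.solve(K, rhs)[:n]      # equality-constrained Newton step
            t = 1.0
            while np.any(x + t * dx <= 0):        # backtrack to stay strictly positive
                t *= 0.5
            x = x + t * dx
        mu *= shrink                              # follow the central path
    return x

# Tiny usage example: min x1 + 2*x2 s.t. x1 + x2 = 1, x >= 0 (optimum at x = (1, 0)).
A = np.array([[1.0, 1.0]]); b = np.array([1.0]); c = np.array([1.0, 2.0])
print(lp_barrier(A, b, c, x0=np.array([0.5, 0.5])))
```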

Date and Time: 
Thursday, May 6, 2021 - 4:30pm

ISL Colloquium: Private Stochastic Convex Optimization

Topic: 
Private Stochastic Convex Optimization
Abstract / Description: 

I will summarize some recent works on differentially private (DP) algorithms for stochastic convex optimization: the problem of minimizing the population loss given i.i.d. samples from a distribution over convex loss functions. In the standard l2/l2 setting, we will see two approaches to getting optimal rates for this problem. We show that for a wide range of parameters, privacy causes no additional overhead in accuracy or run time. In the process, we will develop techniques for private stochastic optimization that work for other geometries. For the LASSO setting when optimizing over the l1 ball, we will see private algorithms that achieve optimal rates. Based on joint works with various subsets of Hilal Asi, Raef Bassily, Vitaly Feldman, Tomer Koren and Abhradeep Thakurta.
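
For concreteness, below is a minimal sketch of one standard baseline for this problem: noisy projected SGD with per-sample gradient clipping in the l2 setting. It illustrates the flavor of private stochastic optimization, not the specific optimal-rate algorithms from the papers; the function and parameter names are my own.

```python
# Illustrative sketch: noisy projected SGD for DP stochastic convex optimization
# in the l2/l2 setting (clip each sample gradient, add Gaussian noise, project
# back onto an l2 ball). Not the exact algorithms from the cited works.
import numpy as np

def dp_sgd(grad_fn, data, w0, steps, lr, clip, noise_mult, radius, seed=0):
    rng = np.random.default_rng(seed)
    w = w0.copy()
    for _ in range(steps):
        z = data[rng.integers(len(data))]                     # one i.i.d. sample
        g = grad_fn(w, z)
        g = g * min(1.0, clip / (np.linalg.norm(g) + 1e-12))  # l2 clipping
        g = g + rng.normal(0.0, noise_mult * clip, size=g.shape)  # calibrated noise
        w = w - lr * g
        nrm = np.linalg.norm(w)
        if nrm > radius:                                       # project onto l2 ball
            w = w * (radius / nrm)
    return w

# Usage on a toy least-squares loss: data z = (x, y), loss (w.x - y)^2 / 2.
rng = np.random.default_rng(1)
X = rng.standard_normal((1000, 5)); Y = X @ np.ones(5) + 0.1 * rng.standard_normal(1000)
grad = lambda w, z: (w @ z[0] - z[1]) * z[0]
w_hat = dp_sgd(grad, list(zip(X, Y)), np.zeros(5), steps=2000,
               lr=0.05, clip=1.0, noise_mult=0.5, radius=5.0)
print(np.round(w_hat, 2))
```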

Date and Time: 
Thursday, April 29, 2021 - 4:30pm

ISL Colloquium presents "Recent Results in Planted Assignment Problems"

Topic: 
Recent results in planted assignment problems
Abstract / Description: 

Motivated by applications such as particle tracking, network de-anonymization, and computer vision, a recent thread of research is devoted to statistical models of assignment problems, in which the data are randomly weighted graphs correlated with a latent permutation. In contrast to problems such as planted clique or the stochastic block model, the major difference here is the lack of low-rank structure, which brings forth new challenges in both statistical analysis and algorithm design.

In the first half of the talk, we discuss the linear assignment problem, where the goal is to reconstruct a perfect matching planted in a randomly weighted bipartite graph whose planted and unplanted edge weights are independently drawn from two different distributions. We determine the sharp threshold at which the optimal reconstruction error (fraction of misclassified edges) exhibits a phase transition from imperfect to perfect. Furthermore, for exponential weight distributions, this phase transition is shown to be of infinite order, confirming a conjecture of [Semerjian et al. 2020]. The negative result is proved by showing that, below the threshold, the posterior distribution concentrates away from the hidden matching, via the construction of exponentially many long augmenting cycles.
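
As a toy illustration of the estimation problem (not the authors' analysis), the snippet below plants a matching on the diagonal of a randomly weighted bipartite graph and recovers it as a minimum-weight perfect matching. The specific weight distributions and scaling are my own illustrative choices.

```python
# Toy planted-matching instance: planted edges (here, the diagonal) receive
# smaller weights on average than unplanted edges, and the estimate is the
# minimum-weight perfect matching. Illustrative parameterization only.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
n = 200
W = rng.exponential(1.0, size=(n, n))                              # unplanted weights ~ Exp(1)
W[np.arange(n), np.arange(n)] = rng.exponential(1.0 / n, size=n)   # planted weights, mean 1/n

row, col = linear_sum_assignment(W)                                # min-weight perfect matching
print("fraction of misclassified edges:", np.mean(col != np.arange(n)))
```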

In the second half of the talk, we discuss the quadratic assignment problem (graph matching), where the goal is to recover the hidden vertex correspondence between two edge-correlated Erdős–Rényi graphs. We prove that there exists a sharp threshold above which one can correctly match all but a vanishing fraction of the vertices, and below which matching any positive fraction is impossible, a phenomenon known as the "all-or-nothing" phase transition. The proof builds upon a tight characterization of the mutual information via the truncated second-moment method and an appropriate "area theorem". Achieving these thresholds with efficient algorithms remains open.
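
For readers unfamiliar with the model, the snippet below samples a pair of correlated Erdős–Rényi graphs by independently subsampling a common parent graph and relabeling one copy by a hidden permutation. The particular parameter values are illustrative assumptions.

```python
# Illustrative correlated Erdos-Renyi pair: subsample a parent graph twice and
# relabel the second copy by a hidden permutation (the object to be recovered).
import numpy as np

rng = np.random.default_rng(0)
n, p, s = 100, 0.1, 0.8                           # nodes, parent edge prob., subsampling prob.
parent = np.triu(rng.random((n, n)) < p, 1)       # parent graph (upper triangle)
A = parent & (rng.random((n, n)) < s)             # first observed graph
B = parent & (rng.random((n, n)) < s)             # second graph, correlated with A
A, B = A | A.T, B | B.T                           # symmetric adjacency matrices

pi = rng.permutation(n)                           # hidden vertex correspondence
B_obs = B[np.ix_(pi, pi)]                         # what the statistician actually observes
```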

This talk is based on joint work with Jian Ding, Jiaming Xu, Dana Yang and Sophie Yu. Preprints available at: https://arxiv.org/abs/2103.09383, https://arxiv.org/abs/2008.10097, https://arxiv.org/abs/2102.00082.

Date and Time: 
Thursday, April 22, 2021 - 4:30pm

ISL Colloquium presents "Demystifying the Efficiency of Reinforcement Learning: Two Recent Stories"

Topic: 
Demystifying the Efficiency of Reinforcement Learning: Two Recent Stories
Abstract / Description: 

Reinforcement learning (RL), which is frequently modeled as sequential learning and decision making in the face of uncertainty, has garnered growing interest in recent years due to its remarkable success in practice. In contemporary RL applications, it is increasingly common to encounter environments with prohibitively large state and action spaces, thus imposing stringent requirements on the sample and computational efficiency of the RL algorithms in use. Despite the empirical success, however, the theoretical underpinnings of many popular RL algorithms remain highly inadequate even for the tabular setting.

In this talk, we present two vignettes regarding the effectiveness of RL algorithms. The first vignette demonstrates that a perturbed model-based RL approach is minimax optimal under a generative model, without suffering from the sample-size barrier present in all prior work. The second vignette covers policy optimization in reinforcement learning. On the one hand, we demonstrate that the popular softmax policy gradient method can take exponential time to converge; on the other hand, employing natural policy gradients and enforcing entropy regularization provably achieves fast global convergence. These results cover two distinct RL paradigms, and might shed light on the efficacy of these algorithms in more complicated scenarios.
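
To give a concrete feel for the entropy-regularized approach, here is a minimal sketch of soft policy iteration on a small random tabular MDP. Treating this as a stand-in for the entropy-regularized natural policy gradient updates discussed in the talk is my own simplification for intuition, not the speakers' algorithm or analysis.

```python
# Illustrative sketch: soft (entropy-regularized) policy iteration on a random
# tabular MDP -- alternate soft policy evaluation with the Boltzmann improvement
# step pi'(a|s) proportional to exp(Q(s,a)/tau).
import numpy as np

rng = np.random.default_rng(0)
S, A, gamma, tau = 5, 3, 0.9, 0.1
P = rng.dirichlet(np.ones(S), size=(S, A))        # transition kernel P[s, a, s']
r = rng.uniform(size=(S, A))                      # reward table
pi = np.full((S, A), 1.0 / A)                     # start from the uniform policy

for _ in range(50):
    # Soft policy evaluation: fixed-point iteration for Q_tau^pi.
    Q = np.zeros((S, A))
    for _ in range(300):
        V = np.sum(pi * (Q - tau * np.log(pi)), axis=1)   # entropy-regularized value
        Q = r + gamma * (P @ V)
    # Soft policy improvement: softmax of Q / tau, computed stably.
    logits = Q / tau
    logits -= logits.max(axis=1, keepdims=True)
    pi = np.exp(logits)
    pi /= pi.sum(axis=1, keepdims=True)

print("soft-optimal policy:\n", np.round(pi, 3))
```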

Date and Time: 
Thursday, April 15, 2021 - 4:30pm

Reinforcement Learning Forum Talk: Provable Model-based Nonlinear Bandit and Reinforcement Learning

Topic: 
Provable Model-based Nonlinear Bandit and Reinforcement Learning
Abstract / Description: 

Deep model-based reinforcement learning methods have achieved state-of-the-art sample efficiency, but we lack a theoretical understanding of them. This talk will first show that convergence to a global maximum requires an exponential number of samples even for a one-layer neural net bandit problem, which is strictly easier than RL. Therefore, we propose to study convergence to local maxima. For both nonlinear bandits and RL, I will present a model-based algorithm, Virtual Ascent with Online Model Learner (ViOL), which provably converges to a local maximum with sample complexity that depends only on the sequential Rademacher complexity of the model class. Our results imply novel global or local regret bounds in several concrete settings, such as linear bandits with a finite or sparse model class, and two-layer neural net bandits.
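
The snippet below is only a toy rendering of the high-level recipe described above (learn a reward model online from bandit feedback, then ascend on the learned model). The model class, step sizes, and tanh reward are my own illustrative assumptions and do not reproduce ViOL itself.

```python
# Toy rendering of the high-level idea: fit a reward model online from bandit
# feedback, and take ascent steps on the *learned* (virtual) model rather than
# on the unknown true reward. Not the ViOL algorithm itself.
import numpy as np

rng = np.random.default_rng(0)
d = 5
theta_true = rng.standard_normal(d)
true_reward = lambda a: np.tanh(a @ theta_true)   # unknown nonlinear reward

theta_hat = np.zeros(d)                           # online model estimate
a = rng.standard_normal(d)                        # current action
for _ in range(200):
    y = true_reward(a) + 0.01 * rng.standard_normal()        # noisy bandit feedback
    # Online model learning: one gradient step on the squared prediction error.
    pred = np.tanh(a @ theta_hat)
    theta_hat -= 0.5 * (pred - y) * (1 - pred**2) * a
    # Virtual ascent: gradient step on the learned model's predicted reward.
    a += 0.1 * (1 - np.tanh(a @ theta_hat)**2) * theta_hat

print("model's predicted reward:", float(np.tanh(a @ theta_hat)))
print("true reward at the chosen action:", float(true_reward(a)))
```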

Paper: https://arxiv.org/pdf/2102.04168.pdf


Details are available at the RL Forum site, link below.

Date and Time: 
Thursday, April 15, 2021 - 2:30pm

ISL Colloquium presents "Scalable Semidefinite Programming"

Topic: 
Scalable semidefinite programming
Abstract / Description: 

Semidefinite programming (SDP) is a powerful framework from convex optimization that has striking potential for data science applications. This talk describes a provably correct randomized algorithm for solving large, weakly constrained SDP problems by economizing on storage and arithmetic costs. Numerical evidence shows that the method is effective for a range of applications, including relaxations of MaxCut, abstract phase retrieval, and quadratic assignment problems. Running on hardware equivalent to a laptop, the algorithm can handle SDP instances where the matrix variable has over 10^14 entries.

This talk will highlight the ideas behind the algorithm in a streamlined setting. The insights include a careful problem formulation, design of a bespoke optimization method, use of randomized eigenvalue computations, and use of randomized sketching methods.
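
As one concrete ingredient, here is a minimal sketch of the Nyström-type sketching idea: only the sketch Y = X Ω of the PSD matrix variable is stored, and a low-rank approximation of X is reconstructed on demand. The stabilization shift and the tiny synthetic test are illustrative choices, not the authors' code.

```python
# Illustrative Nystrom sketch for a PSD matrix variable: store only Y = X @ Omega
# and reconstruct a low-rank approximation of X when needed. Not the authors' code.
import numpy as np

def nystrom_reconstruct(Y, Omega):
    """PSD approximation of X from Y = X @ Omega."""
    nu = np.sqrt(Y.shape[0]) * np.finfo(float).eps * np.linalg.norm(Y)
    Y_shift = Y + nu * Omega                      # small shift for numerical stability
    C = np.linalg.cholesky(Omega.T @ Y_shift)     # Omega^T (X + nu I) Omega = C C^T
    B = np.linalg.solve(C, Y_shift.T).T           # B = Y_shift @ C^{-T}
    return B @ B.T - nu * np.eye(Y.shape[0])

rng = np.random.default_rng(0)
n, R = 50, 5
G = rng.standard_normal((n, R))
X = G @ G.T                                       # a rank-R PSD matrix (never stored by the solver)
Omega = rng.standard_normal((n, R + 2))           # random test matrix
Y = X @ Omega                                     # the only quantity kept in memory
X_hat = nystrom_reconstruct(Y, Omega)
print("relative reconstruction error:", np.linalg.norm(X_hat - X) / np.linalg.norm(X))
```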

Joint work with Alp Yurtsever, Olivier Fercoq, Madeleine Udell, and Volkan Cevher. Based on arXiv 1912.02949 (Scalable SDP, SIMODS 2021) and other papers (SketchyCGM in AISTATS 2017, Nyström sketch in NeurIPS 2017).

Date and Time: 
Thursday, April 8, 2021 - 4:30pm
