Information Systems Lab (ISL) Colloquium

ISL Colloquium: A Differential View of Reliable Communications

Topic: 
A Differential View of Reliable Communications
Abstract / Description: 

This talk introduces a "differential" approach to information theory. In contrast to the more traditional "elemental" approach, in which we work to understand communication networks by studying the behavior of their elements in isolation, the differential approach works to understand the impact components can have on the larger networks in which they are employed. Results achieved through this differential viewpoint highlight some startling facts about network communications, including both opportunities, where even very small changes to a communication network can have a large impact on network performance, and vulnerabilities, where small failures can cause serious harm.

Date and Time: 
Thursday, May 31, 2018 - 4:15pm
Venue: 
Packard 101

ISL Colloquium: Finite Sample Guarantees for Control of an Unknown Linear Dynamical System

Topic: 
Finite Sample Guarantees for Control of an Unknown Linear Dynamical System
Abstract / Description: 

In principle, control of a physical system is accomplished by first deriving a faithful model of the underlying dynamics from first principles, and then solving an optimal control problem with the modeled dynamics. In practice, the system may be too complex to precisely characterize, and an appealing alternative is to instead collect trajectories of the system and fit a model of the dynamics from the data. How many samples are needed for this to work? How sub-optimal is the resulting controller?

In this talk, I will shed light on these questions when the underlying dynamical system is linear and the control objective is quadratic, a classic optimal control problem known as the Linear Quadratic Regulator. Despite the simplicity of linear dynamical systems, deriving finite-time guarantees for both system identification and controller performance is non-trivial. I will first talk about our results in the "one-shot" setting, where measurements are collected offline, a model is estimated from the data, and a controller is synthesized using the estimated model together with confidence bounds. Then, I will discuss our recent work on guarantees in the online regret setting, where the noise injected into the system to learn the dynamics must be traded off against state regulation.
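
As a rough illustration of the one-shot pipeline sketched above, the following Python snippet excites an unknown linear system with random inputs, fits (A, B) by least squares, and synthesizes an LQR controller for the estimated model. The system matrices, noise level, and the certainty-equivalent controller are illustrative assumptions; the talk's actual synthesis accounts for confidence bounds on the estimates, which this sketch omits.

    # Minimal sketch: identify x_{t+1} = A x_t + B u_t + w_t from data, then do LQR
    # on the estimated model. All numbers here are illustrative assumptions.
    import numpy as np
    from scipy.linalg import solve_discrete_are

    rng = np.random.default_rng(0)
    A_true = np.array([[1.01, 0.1], [0.0, 1.01]])   # unknown to the learner
    B_true = np.array([[0.0], [1.0]])
    n, m, T = 2, 1, 500

    # Collect a trajectory driven by exploratory Gaussian inputs.
    X = np.zeros((T + 1, n))
    U = rng.normal(size=(T, m))
    for t in range(T):
        w = 0.1 * rng.normal(size=n)                 # process noise
        X[t + 1] = A_true @ X[t] + B_true @ U[t] + w

    # Least-squares system identification: regress x_{t+1} on [x_t, u_t].
    Z = np.hstack([X[:-1], U])                       # shape (T, n + m)
    Theta, *_ = np.linalg.lstsq(Z, X[1:], rcond=None)
    A_hat, B_hat = Theta[:n].T, Theta[n:].T

    # Certainty-equivalent LQR on the estimated model (cost x'Qx + u'Ru, u = -Kx).
    Q, R = np.eye(n), np.eye(m)
    P = solve_discrete_are(A_hat, B_hat, Q, R)
    K = np.linalg.solve(R + B_hat.T @ P @ B_hat, B_hat.T @ P @ A_hat)
    print("estimation error:", np.linalg.norm(A_hat - A_true))
    print("LQR gain for the estimated model:", K)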

This talk is based on joint work with Sarah Dean, Horia Mania, Nikolai Matni, and Benjamin Recht.

Date and Time: 
Thursday, May 24, 2018 - 4:15pm
Venue: 
Packard 101

ISL Colloquium & IT Forum: Random initialization and implicit regularization in nonconvex statistical estimation

Topic: 
Random initialization and implicit regularization in nonconvex statistical estimation
Abstract / Description: 

Recent years have seen a flurry of activity in designing provably efficient nonconvex procedures for solving statistical estimation / learning problems. Due to the highly nonconvex nature of the empirical loss, state-of-the-art procedures often require suitable initialization and proper regularization (e.g. trimming, regularized cost, projection) in order to guarantee fast convergence. For vanilla procedures such as gradient descent, however, prior theory is often either far from optimal or entirely lacking.

This talk is concerned with a striking phenomenon arising in two nonconvex problems (namely, phase retrieval and matrix completion): even in the absence of careful initialization, proper saddle escaping, and/or explicit regularization, gradient descent converges to the optimal solution within a logarithmic number of iterations, thus achieving near-optimal statistical and computational guarantees at once. All of this is achieved by exploiting the statistical models when analyzing optimization algorithms, via a leave-one-out approach that decouples certain statistical dependencies between the gradient descent iterates and the data. As a byproduct, for noisy matrix completion, we demonstrate that gradient descent achieves near-optimal entrywise error control.
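
As a concrete caricature of the phenomenon described above, the following sketch runs plain gradient descent on the nonconvex phase-retrieval loss from a random starting point, with no spectral initialization and no explicit regularization. The problem sizes, step size, and iteration count are illustrative assumptions.

    # Vanilla gradient descent for phase retrieval with random initialization.
    # Loss: f(x) = (1/4m) * sum_i ((a_i.x)^2 - y_i)^2, gradient as computed below.
    import numpy as np

    rng = np.random.default_rng(1)
    n, m = 100, 1000
    x_star = rng.normal(size=n)
    x_star /= np.linalg.norm(x_star)
    A = rng.normal(size=(m, n))                 # Gaussian sensing vectors a_i
    y = (A @ x_star) ** 2                       # phaseless measurements y_i = <a_i, x*>^2

    x = rng.normal(size=n) / np.sqrt(n)         # random initialization, no spectral step
    eta = 0.05                                  # constant step size (illustrative)
    for _ in range(2000):
        Ax = A @ x
        grad = (A * ((Ax ** 2 - y) * Ax)[:, None]).mean(axis=0)   # (1/m) sum r_i (a_i.x) a_i
        x -= eta * grad

    # The solution is identifiable only up to a global sign.
    err = min(np.linalg.norm(x - x_star), np.linalg.norm(x + x_star))
    print("relative error:", err / np.linalg.norm(x_star))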

Date and Time: 
Wednesday, May 23, 2018 - 4:15pm
Venue: 
Building 370

ISL Colloquium: Low-dimensional Structures and Deep Models for High-dimensional Data

Topic: 
Low-dimensional Structures and Deep Models for High-dimensional Data
Abstract / Description: 

In this talk, we will discuss a class of models and techniques that can effectively model and extract rich low-dimensional structures in high-dimensional data such as images and videos, despite nonlinear transformation, gross corruption, or severely compressed measurements. This work leverages recent advances in convex optimization from Compressive Sensing for recovering low-rank or sparse signals, which provide both strong theoretical guarantees and efficient, scalable algorithms for solving such high-dimensional combinatorial problems. We illustrate how these new mathematical models and tools could bring disruptive changes to the solution of many challenging tasks in computer vision, image processing, and pattern recognition. We will also illustrate some emerging applications of these tools to other data types such as 3D range data, web documents, image tags, bioinformatics data, and audio/music analysis. Throughout the talk, we will discuss strong connections between algorithms from Compressive Sensing and other popular data-driven methods such as Deep Neural Networks, providing some new perspectives for understanding Deep Learning.
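
As one concrete instance of the convex-optimization tools mentioned above, the following sketch recovers a low-rank matrix from a subset of its entries by iterative singular value thresholding (a proximal method for nuclear-norm regularization). The sizes, rank, sampling rate, and threshold are illustrative assumptions, not values from the talk.

    # Low-rank matrix completion via soft-impute-style singular value thresholding.
    import numpy as np

    rng = np.random.default_rng(2)
    n1, n2, r = 60, 50, 3
    M = rng.normal(size=(n1, r)) @ rng.normal(size=(r, n2))   # ground-truth low-rank matrix
    mask = rng.random((n1, n2)) < 0.4                          # 40% of entries observed

    def svt(Z, tau):
        # Soft-threshold the singular values of Z (proximal operator of tau * nuclear norm).
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        return (U * np.maximum(s - tau, 0.0)) @ Vt

    X, tau = np.zeros_like(M), 1.0
    for _ in range(300):
        # Gradient step on the observed entries, then shrink the spectrum.
        X = svt(X + mask * (M - X), tau)

    print("relative recovery error:", np.linalg.norm(X - M) / np.linalg.norm(M))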

This is joint work with John Wright of Columbia, Emmanuel Candes of Stanford, Zhouchen Lin of Peking University, Shenghua Gao of ShanghaiTech, and my former students Zhengdong Zhang of MIT, Xiao Liang of Tsinghua University, Arvind Ganesh, Zihan Zhou, Kerui Min of UIUC.

Date and Time: 
Thursday, May 3, 2018 - 4:15pm
Venue: 
Packard 101

ISL Colloquium: Reinforcement Learning: Hidden Theory, and New Super-Fast Algorithms

Topic: 
Reinforcement Learning: Hidden Theory, and New Super-Fast Algorithms
Abstract / Description: 

Stochastic Approximation algorithms are used to approximate solutions to fixed point equations that involve expectations of functions with respect to possibly unknown distributions. The most famous examples today are TD- and Q-learning algorithms. The first half of this lecture will provide an overview of stochastic approximation, with a focus on optimizing the rate of convergence. A new approach to optimizing the rate of convergence leads to the new Zap Q-learning algorithm. Analysis suggests that its transient behavior is a close match to a deterministic Newton-Raphson implementation, and numerical experiments confirm super-fast convergence.
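
For reference, the following sketch shows the kind of stochastic-approximation recursion the talk analyzes: tabular Q-learning with the classical 1/n step size on a small, randomly generated MDP (an illustrative assumption). The Zap variant discussed in the talk replaces the scalar step size with a matrix gain that tracks a Newton-Raphson flow; that modification is not shown here.

    # Tabular Q-learning as a stochastic-approximation scheme.
    import numpy as np

    rng = np.random.default_rng(3)
    nS, nA, gamma = 5, 2, 0.9
    P = rng.dirichlet(np.ones(nS), size=(nS, nA))   # P[s, a] = next-state distribution
    R = rng.random((nS, nA))                        # rewards

    Q = np.zeros((nS, nA))
    counts = np.zeros((nS, nA))
    s = 0
    for t in range(100_000):
        a = rng.integers(nA)                         # exploratory (uniform) behavior policy
        s_next = rng.choice(nS, p=P[s, a])
        counts[s, a] += 1
        alpha = 1.0 / counts[s, a]                   # classical 1/n schedule; known to be slow
        td_error = R[s, a] + gamma * Q[s_next].max() - Q[s, a]
        Q[s, a] += alpha * td_error                  # stochastic-approximation update
        s = s_next

    print("estimated optimal values:", Q.max(axis=1))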

Date and Time: 
Tuesday, May 1, 2018 - 4:30pm
Venue: 
Packard 101

ISL Seminar: Inventing Algorithms via Deep Learning

Topic: 
Inventing Algorithms via Deep Learning
Abstract / Description: 

Deep learning is a part of daily life, owing to its successes in computer vision and natural language processing. In these applications, the success of the model-free deep learning approach can be attributed to the lack of a (mathematical) generative model. In yet other applications, the data is generated by a simple model, the performance criterion is mathematically precise, and training/test samples are infinitely abundant, but the space of algorithmic choices is enormous (example: chess). Deep learning has recently shown strong promise in these problems too (example: AlphaZero). In this talk, we study two canonical problems of great scientific and engineering interest through the lens of deep learning.

The first is reliable communication over noisy media, where we successfully revisit classical open problems in information theory; we show that creatively trained and architected neural networks can beat the state of the art on the AWGN channel with noisy feedback by a 100-fold improvement in bit error rate.

The second is optimization and classification problems on graphs, where the key algorithmic challenge is scaling performance to graphs of arbitrary size. Representing graphs as randomized nonlinear dynamical systems via recurrent neural networks, we show that creative adversarial training allows one to train on small graphs and test on much larger ones (100-1000x), with approximation ratios that rival the state of the art on a variety of optimization problems across the complexity-theoretic hardness spectrum.

Apart from its obvious practical value, this study of mathematically precise problems sheds light on the mysteries of deep learning methods: the choice of training examples, architectural design decisions, and loss function / learning methodologies. Our (mostly) empirical research is conducted against the backdrop of a theoretical research program on understanding gated neural networks (e.g., attention networks, GRUs, LSTMs); we show the first provably (globally) consistent algorithms for recovering the parameters of a classical gated neural network architecture: the mixture of experts (MoE).
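
For readers unfamiliar with the mixture-of-experts model mentioned above, the following sketch shows a minimal forward pass: a softmax gating network mixes the outputs of simple experts. The dimensions and the specific gating/expert forms are illustrative assumptions; the talk's consistency results concern parameter recovery for this kind of model, not this particular code.

    # Minimal mixture-of-experts forward pass: softmax gate over simple experts.
    import numpy as np

    rng = np.random.default_rng(4)
    d, k = 10, 3                                   # input dimension, number of experts
    W_gate = rng.normal(size=(k, d))               # gating parameters
    W_expert = rng.normal(size=(k, d))             # one nonlinear expert per gate

    def moe(x):
        # y(x) = sum_i softmax(W_gate x)_i * tanh(w_expert_i . x)
        logits = W_gate @ x
        gates = np.exp(logits - logits.max())
        gates /= gates.sum()                        # softmax over experts
        expert_outputs = np.tanh(W_expert @ x)
        return gates @ expert_outputs

    x = rng.normal(size=d)
    print("MoE output:", moe(x))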

Date and Time: 
Thursday, April 26, 2018 - 4:15pm
Venue: 
Packard 101

ISL Colloquium: Reinforcement Learning without Reinforcement

Topic: 
Reinforcement Learning without Reinforcement
Abstract / Description: 

Reinforcement Learning (RL) is concerned with solving sequential decision-making problems in the presence of uncertainty. RL is really about two problems together. The first is the 'Bellman problem': finding the optimal policy given the model, which may involve large state spaces. Various approximate dynamic programming and RL schemes have been developed, but they either come with no guarantees, are not universal, or are rather slow. In fact, most RL algorithms have become synonymous with stochastic approximation (SA) schemes that are known to be rather slow. The problem is even more difficult for MDPs with continuous state (and action) spaces. We present a class of non-SA algorithms for reinforcement learning in continuous state space MDP problems based on 'empirical' ideas, which are simple, effective, and yet universal with probabilistic guarantees. The idea involves randomized kernel-based function fitting combined with 'empirical' updates. The key is the first known "probabilistic contraction analysis" method we have developed for the analysis of fairly general stochastic iterative algorithms, wherein we show convergence to a probabilistic fixed point of a sequence of random operators via a stochastic dominance argument.
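
A simplified, finite-state caricature of the 'empirical' idea is sketched below: each value-iteration sweep replaces the exact expectation in the Bellman operator with an average over freshly sampled next states. The kernel-based function fitting used for continuous state spaces in the talk is not shown, and the MDP and sample sizes are illustrative assumptions.

    # Empirical value iteration: replace E[V(s')] with an average over sampled next states.
    import numpy as np

    rng = np.random.default_rng(5)
    nS, nA, gamma, N = 20, 3, 0.9, 50
    P = rng.dirichlet(np.ones(nS), size=(nS, nA))   # true kernel, used only as a simulator
    R = rng.random((nS, nA))

    V = np.zeros(nS)
    for _ in range(100):
        Q = np.empty((nS, nA))
        for s in range(nS):
            for a in range(nA):
                samples = rng.choice(nS, size=N, p=P[s, a])      # N simulated next states
                Q[s, a] = R[s, a] + gamma * V[samples].mean()     # empirical Bellman backup
        V = Q.max(axis=1)                                         # empirical value iteration

    print("empirical value estimates:", np.round(V, 3))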

The second RL problem is the 'online learning (or Lai-Robbins) problem', which arises when the model itself is unknown. We propose a simple posterior sampling-based regret-minimization reinforcement learning algorithm for MDPs. It achieves O(sqrt(T)) regret, which is order-optimal. It not only optimally manages the "exploration versus exploitation" tradeoff but also obviates the need for expensive computation for exploration. The algorithm differs from classical adaptive control in its focus on non-asymptotic regret optimality as opposed to asymptotic stability. This seems to resolve a long-standing open problem in Reinforcement Learning.
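
The following sketch illustrates the posterior-sampling idea in a tabular, episodic setting: sample a model from the posterior at the start of each episode, act greedily for the sampled model, and update the posterior from the observed transitions. Known rewards, a stationary greedy policy, and the environment itself are simplifying, illustrative assumptions.

    # Posterior-sampling RL (Thompson sampling for MDPs), tabular episodic sketch.
    import numpy as np

    rng = np.random.default_rng(6)
    nS, nA, H, episodes = 5, 2, 10, 200
    P_true = rng.dirichlet(np.ones(nS), size=(nS, nA))
    R = rng.random((nS, nA))                                  # rewards assumed known
    counts = np.ones((nS, nA, nS))                            # Dirichlet(1,...,1) prior

    def greedy_policy(P):
        # H Bellman backups on the sampled model; return a stationary greedy policy
        # (a simplification of the finite-horizon optimal, time-dependent policy).
        V = np.zeros(nS)
        pi = np.zeros(nS, dtype=int)
        for _ in range(H):
            Q = R + P @ V                                      # Q[s, a] = R[s, a] + E[V(s')]
            pi, V = Q.argmax(axis=1), Q.max(axis=1)
        return pi

    for ep in range(episodes):
        P_sample = np.array([[rng.dirichlet(counts[s, a]) for a in range(nA)]
                             for s in range(nS)])              # sample a model from the posterior
        pi = greedy_policy(P_sample)
        s = 0
        for t in range(H):
            a = pi[s]
            s_next = rng.choice(nS, p=P_true[s, a])
            counts[s, a, s_next] += 1                          # posterior update
            s = s_next

    print("state visit counts:", counts.sum(axis=(1, 2)))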

Date and Time: 
Tuesday, April 24, 2018 - 4:00pm
Venue: 
Packard 101

ISL Colloquium: Recent Developments in Compressed Sensing

Topic: 
Recent Developments in Compressed Sensing
Abstract / Description: 

Compressed sensing refers to the reconstruction of high-dimensional but low-complexity objects from a limited number of measurements. Examples include the recovery of high-dimensional but sparse vectors, and the recovery of high-dimensional but low-rank matrices, which includes the so-called partial realization problem in linear control theory. Much of the work to date focuses on probabilistic methods, which are CPU-intensive and have high computational complexity. In contrast, deterministic methods are far faster in execution and more efficient in terms of storage. Moreover, deterministic methods draw from many branches of mathematics, including graph theory and algebraic coding theory. In this talk, a brief overview of these recent developments will be given.
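
As a reminder of the basic recovery problem, the following sketch reconstructs a sparse vector from a small number of linear measurements using orthogonal matching pursuit. A Gaussian measurement matrix is used purely for illustration; it does not reproduce the deterministic constructions that are the focus of the talk.

    # Sparse recovery from few linear measurements via orthogonal matching pursuit.
    import numpy as np

    rng = np.random.default_rng(7)
    n, m, k = 200, 60, 5
    x_true = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)
    x_true[support] = rng.normal(size=k)
    A = rng.normal(size=(m, n)) / np.sqrt(m)      # measurement matrix (illustrative, random)
    y = A @ x_true                                # m << n linear measurements

    # Greedily pick the column most correlated with the residual,
    # then re-fit by least squares on the selected support.
    residual, S = y.copy(), []
    for _ in range(k):
        S.append(int(np.argmax(np.abs(A.T @ residual))))
        coef, *_ = np.linalg.lstsq(A[:, S], y, rcond=None)
        residual = y - A[:, S] @ coef

    x_hat = np.zeros(n)
    x_hat[S] = coef
    print("support recovered:", sorted(S) == sorted(support.tolist()))
    print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))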

Date and Time: 
Thursday, April 19, 2018 - 4:15pm
Venue: 
Packard 101

IT Forum: From Gaussian Multiterminal Source Coding to Distributed Karhunen–Loève Transform

Topic: 
From Gaussian Multiterminal Source Coding to Distributed Karhunen–Loève Transform
Abstract / Description: 

Characterizing the rate-distortion region of Gaussian multiterminal source coding is a longstanding open problem in network information theory. In this talk, I will show how to obtain new conclusive results for this problem using nonlinear analysis and convex relaxation techniques. A byproduct of this line of research is an efficient algorithm for determining the optimal distributed Karhunen–Loève transform in the high-resolution regime, which partially settles a question posed by Gastpar, Dragotti, and Vetterli. I will also introduce a generalized version of the Gaussian multiterminal source coding problem where the source-encoder connections can be arbitrary. It will be demonstrated that probabilistic graphical models offer an ideal mathematical language for describing how the performance limit of a generalized Gaussian multiterminal source coding system depends on its topology, and more generally they can serve as the long-sought platform for systematically integrating the existing achievability schemes and converse arguments. The architectural implication of our work for low-latency lossy source coding will also be discussed.
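
As a point of reference, the following sketch computes the ordinary (centralized) Karhunen–Loève transform: decorrelate a Gaussian source with the eigenvectors of its covariance and keep the highest-variance coefficients. The distributed setting of the talk, in which each encoder observes only some of the components, is not implemented here; the covariance is an illustrative assumption.

    # Centralized Karhunen-Loeve transform: keep the top-variance components.
    import numpy as np

    rng = np.random.default_rng(8)
    d, keep, n_samples = 8, 3, 10_000
    L = rng.normal(size=(d, d))
    Sigma = L @ L.T / d                                   # source covariance (illustrative)
    X = rng.multivariate_normal(np.zeros(d), Sigma, size=n_samples)

    eigvals, eigvecs = np.linalg.eigh(Sigma)              # KLT basis
    order = np.argsort(eigvals)[::-1]                     # sort by decreasing variance
    T = eigvecs[:, order[:keep]]                          # keep the top-'keep' components

    X_hat = (X @ T) @ T.T                                 # transform, truncate, invert
    mse = np.mean((X - X_hat) ** 2)
    print("distortion from keeping", keep, "of", d, "coefficients:", mse)
    print("theoretical value (dropped eigenvalues / d):", eigvals[order[keep:]].sum() / d)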

This talk is based on joint work with Jia Wang, Farrokh Etezadi, and Ashish Khisti.


The Information Theory Forum (IT-Forum) at Stanford ISL is an interdisciplinary academic forum which focuses on mathematical aspects of information processing. With a primary emphasis on information theory, we also welcome researchers from signal processing, learning and statistical inference, control and optimization to deliver talks at our forum. We also warmly welcome industrial affiliates in the above fields. The forum is typically held in Packard 202 every Friday at 1:15 pm during the academic year.

The Information Theory Forum is organized by graduate students Jiantao Jiao and Yanjun Han. To suggest speakers, please contact any of the students.

Date and Time: 
Friday, April 13, 2018 - 1:15pm
Venue: 
Packard 202

ISL Special Seminar: Low- and high-dimensional computations in neural circuits

Topic: 
Low- and high-dimensional computations in neural circuits
Abstract / Description: 

Computation in the brain is distributed across large populations. Individual neurons are noisy and receive limited information, but, by acting collectively, neural populations perform a wide variety of complex computations. In this talk I will discuss two approaches to understanding these collective computations.

First, I will introduce a method to identify and decode unknown variables encoded in the activity of neural populations. While the number of neurons in a population may be large, if the population encodes a low-dimensional variable there will be low-dimensional structure in the collective activity, and the method aims to find and parameterize this low-dimensional structure. In the rodent head direction (HD) system, the method reveals a nonlinear ring manifold and allows the encoded head direction and the tuning curves of single cells to be recovered with high accuracy and without prior knowledge of what the neurons were encoding. When applied to sleep, it provides mechanistic insight into the circuit construction of the ring manifold and, during nREM sleep, reveals a new dynamical regime possibly linked to memory consolidation in the brain.

I will then address the problem of understanding genuinely high-dimensional computations in the brain, where low-dimensional structure does not exist. Modern work studying distributed algorithms on large sparse networks may provide a compelling approach to neural computation, and I will use insights from recent work on error correction to construct a novel architecture for high-capacity neural memory. Unlike previous models, which either yield weak (linear) increases in capacity with network size or exhibit poor robustness to noise, this network is able to store a number of states exponential in the network size while preserving noise robustness, thus resolving a long-standing theoretical question.

These results demonstrate new approaches for studying neural representations and computation across a variety of scales, both when low-dimensional structure is present and when computations are high-dimensional.
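
As a toy caricature of the low-dimensional-structure idea in the first part of the talk, the following sketch simulates a population of head-direction cells with cosine tuning, projects the noisy population activity onto two dimensions with PCA, and reads the encoded angle off the resulting ring. The talk's method discovers such structure without assuming tuning curves or the encoded variable; this sketch assumes both purely for illustration.

    # Toy ring-manifold example: decode a circular variable from population activity.
    import numpy as np

    rng = np.random.default_rng(9)
    n_cells, n_times = 50, 2000
    preferred = rng.uniform(0, 2 * np.pi, size=n_cells)       # preferred directions
    theta = np.linspace(0, 4 * np.pi, n_times)                 # head direction: two full turns

    # Population activity: cosine tuning plus noise.
    rates = np.cos(theta[:, None] - preferred[None, :]) + 0.3 * rng.normal(size=(n_times, n_cells))

    # PCA (via SVD of the centered activity): the population traces out a noisy ring.
    centered = rates - rates.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    ring = centered @ Vt[:2].T

    decoded = np.arctan2(ring[:, 1], ring[:, 0])               # angle along the ring
    # The decoded angle matches the true direction up to a rotation/reflection of the plane.
    corr = np.abs(np.corrcoef(np.unwrap(decoded), theta)[0, 1])
    print("correlation between decoded and true (unwrapped) angle:", round(corr, 3))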

Date and Time: 
Tuesday, March 6, 2018 - 10:00am
Venue: 
Clark S360
