
# IT-Forum

## ISL Colloquium & IT Forum: Random initialization and implicit regularization in nonconvex statistical estimation

Topic:
Random initialization and implicit regularization in nonconvex statistical estimation
Abstract / Description:

Recent years have seen a flurry of activity in designing provably efficient nonconvex procedures for solving statistical estimation and learning problems. Due to the highly nonconvex nature of the empirical loss, state-of-the-art procedures often require suitable initialization and proper regularization (e.g. trimming, regularized cost, projection) to guarantee fast convergence. For vanilla procedures such as gradient descent, however, prior theory is often either far from optimal or entirely without guarantees.

This talk is concerned with a striking phenomenon arising in two nonconvex problems (namely, phase retrieval and matrix completion): even in the absence of careful initialization, proper saddle escaping, and/or explicit regularization, gradient descent converges to the optimal solution within a logarithmic number of iterations, thus achieving near-optimal statistical and computational guarantees at once. All of this is accomplished by exploiting the statistical models when analyzing optimization algorithms, via a leave-one-out approach that enables decoupling of the statistical dependence between the gradient descent iterates and the data. As a byproduct, for noisy matrix completion we demonstrate that gradient descent achieves near-optimal entrywise error control.
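As an illustrative sketch of this phenomenon (not the authors' analysis), the toy experiment below runs vanilla gradient descent on the phase retrieval least-squares loss from a purely random initialization, with no spectral initialization or explicit regularization; the problem sizes and step-size rule are hypothetical choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 20, 200                       # signal dimension, number of measurements
x_star = rng.standard_normal(n)      # ground-truth signal (hypothetical example)
A = rng.standard_normal((m, n))      # Gaussian sensing vectors a_i
y = (A @ x_star) ** 2                # phaseless measurements y_i = (a_i^T x)^2

def grad(x):
    # gradient of f(x) = (1/4m) * sum_i ((a_i^T x)^2 - y_i)^2
    Ax = A @ x
    return A.T @ ((Ax ** 2 - y) * Ax) / m

x = rng.standard_normal(n)                   # purely random initialization
x *= np.sqrt(y.mean()) / np.linalg.norm(x)   # crude norm calibration only
eta = 0.1 / y.mean()                         # small constant step size
for _ in range(3000):
    x -= eta * grad(x)

# error up to the unavoidable global sign ambiguity
err = min(np.linalg.norm(x - x_star), np.linalg.norm(x + x_star))
print(err / np.linalg.norm(x_star))
```

With roughly 10n Gaussian measurements, plain gradient descent typically recovers the signal to high accuracy despite the nonconvex landscape, which is the behavior the talk explains.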

Date and Time:
Wednesday, May 23, 2018 - 4:15pm
Venue:
Building 370

## IT-Forum: Tight sample complexity bounds via dualizing Le Cam's method

Topic:
Tight sample complexity bounds via dualizing Le Cam's method
Abstract / Description:

In this talk we consider the general question of estimating a linear functional of a distribution from noisy samples of it. We discover that the (two-point) Le Cam lower bound is in fact achievable by optimizing the bias-variance tradeoff of an empirical-mean-type estimator. We extend the method to certain symmetric functionals of high-dimensional parametric models.

Next, we apply this general framework to two problems: population recovery and predicting the number of unseen species. In population recovery, the goal is to estimate an unknown high-dimensional distribution (in $L_\infty$-distance) from noisy samples. In the case of *erasure* noise, i.e. when each coordinate is erased with probability $\epsilon$, we discover a curious phase transition in sample complexity at $\epsilon=1/2$. In the second (classical) problem, we observe $n$ iid samples from an unknown distribution on a countable alphabet and the goal is to predict the number of new species that will be observed in the next (unseen) $tn$ samples. Again, we discover a phase transition at $t=1$. In both cases, the complete characterization of sample complexity relies on complex-analytic methods, such as Hadamard's three-lines theorem.

Joint work with Yihong Wu (Yale).
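The unseen-species prediction problem is easy to simulate. The sketch below uses a hypothetical geometric-type distribution and plain Monte Carlo to estimate the expected number of genuinely new species in the next $tn$ samples; it does not implement the talk's estimators:

```python
import numpy as np

rng = np.random.default_rng(1)
# hypothetical example distribution on a (truncated) countable alphabet
p = np.array([2.0 ** -(k + 1) for k in range(30)])
p /= p.sum()

def new_species(n, t, trials=2000):
    # Monte Carlo estimate of E[# species in the next t*n samples
    # that were not seen in the first n samples]
    counts = []
    for _ in range(trials):
        first = rng.choice(len(p), size=n, p=p)
        nxt = rng.choice(len(p), size=int(t * n), p=p)
        counts.append(len(set(nxt) - set(first)))
    return float(np.mean(counts))

print(new_species(50, 1.0))
```

The quantity grows with the horizon parameter t; the talk's result concerns how far past t = 1 this count can be predicted from the first n samples alone.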

Date and Time:
Friday, May 4, 2018 - 1:15pm
Venue:
Packard 202

## IT Forum: From Gaussian Multiterminal Source Coding to Distributed Karhunen–Loève Transform

Topic:
From Gaussian Multiterminal Source Coding to Distributed Karhunen–Loève Transform
Abstract / Description:

Characterizing the rate-distortion region of Gaussian multiterminal source coding is a longstanding open problem in network information theory. In this talk, I will show how to obtain new conclusive results for this problem using nonlinear analysis and convex relaxation techniques. A byproduct of this line of research is an efficient algorithm for determining the optimal distributed Karhunen–Loève transform in the high-resolution regime, which partially settles a question posed by Gastpar, Dragotti, and Vetterli. I will also introduce a generalized version of the Gaussian multiterminal source coding problem where the source-encoder connections can be arbitrary. It will be demonstrated that probabilistic graphical models offer an ideal mathematical language for describing how the performance limit of a generalized Gaussian multiterminal source coding system depends on its topology, and more generally they can serve as the long-sought platform for systematically integrating the existing achievability schemes and converse arguments. The architectural implication of our work for low-latency lossy source coding will also be discussed.

This talk is based on joint work with Jia Wang, Farrokh Etezadi, and Ashish Khisti.
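As background for readers, here is a minimal sketch of the centralized Karhunen–Loève transform (the object being distributed in the talk), assuming a hypothetical Gaussian source: the KLT is the eigenbasis of the source covariance, and keeping the top-k components minimizes mean-squared reconstruction error, equal to the sum of the discarded eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 6, 2
B = rng.standard_normal((d, d))
Sigma = B @ B.T                      # a hypothetical source covariance
w, V = np.linalg.eigh(Sigma)         # eigenvalues in ascending order
T = V[:, -k:]                        # top-k KLT basis vectors

X = rng.multivariate_normal(np.zeros(d), Sigma, size=10000)
Xhat = X @ T @ T.T                   # project onto top-k subspace and back
mse = np.mean((X - Xhat) ** 2)       # empirical per-coordinate distortion
print(mse, w[:-k].sum() / d)         # should agree with discarded eigenvalue mass
```

In the distributed version studied in the talk, each terminal observes only a subvector and must choose its transform without seeing the others' observations, which is what makes the problem hard.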

The Information Theory Forum (IT-Forum) at Stanford ISL is an interdisciplinary academic forum which focuses on mathematical aspects of information processing. With a primary emphasis on information theory, we also welcome researchers from signal processing, learning and statistical inference, control and optimization to deliver talks at our forum. We also warmly welcome industrial affiliates in the above fields. The forum is typically held in Packard 202 every Friday at 1:15 pm during the academic year.

The Information Theory Forum is organized by graduate students Jiantao Jiao and Yanjun Han. To suggest speakers, please contact any of the students.

Date and Time:
Friday, April 13, 2018 - 1:15pm
Venue:
Packard 202

## IT-Forum & ISL: Robust sequential change-point detection

Topic:
Robust sequential change-point detection
Abstract / Description:

Sequential change-point detection is a fundamental problem in statistics and signal processing, with broad applications in security, network monitoring, imaging, and genetics. Given a sequence of data, the goal is to detect any change in the underlying distribution as quickly as possible from the streaming data. Various algorithms have been developed, including the commonly used CUSUM procedure. However, there is still a gap when applying change-point detection methods to real problems, notably due to a lack of robustness. Classic approaches usually require exact specification of the forms of the pre- and post-change distributions, which may be quite restrictive and may not perform well with real data. On the other hand, Huber's classic robust statistics, built on least favorable distributions, are not directly applicable since they are computationally intractable in the multi-dimensional setting. In this seminar, I will present several of our recent works developing computationally efficient and robust change-point detection algorithms with certain near-optimality properties, by connecting statistical sequential analysis with (online) convex optimization.
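For reference, here is a minimal sketch of the classical CUSUM procedure mentioned above, for a known Gaussian mean shift; the distributions, change point, and threshold are illustrative assumptions, and this is the non-robust baseline rather than the talk's algorithms:

```python
import numpy as np

def cusum(x, mu0, mu1, sigma=1.0, threshold=5.0):
    """Classic CUSUM for a known mean shift mu0 -> mu1 in Gaussian data.

    Returns the first index at which the reflected cumulative
    log-likelihood-ratio statistic exceeds `threshold`, or None.
    """
    # per-sample log-likelihood ratio of N(mu1, sigma^2) vs N(mu0, sigma^2)
    llr = (mu1 - mu0) / sigma**2 * (x - (mu0 + mu1) / 2.0)
    s = 0.0
    for t, l in enumerate(llr):
        s = max(0.0, s + l)      # reflected random walk
        if s > threshold:
            return t             # alarm time
    return None

rng = np.random.default_rng(0)
pre = rng.normal(0.0, 1.0, 100)    # pre-change samples: N(0, 1)
post = rng.normal(1.0, 1.0, 100)   # post-change samples: N(1, 1), change at t = 100
alarm = cusum(np.concatenate([pre, post]), mu0=0.0, mu1=1.0)
print(alarm)
```

The alarm typically fires shortly after the change at t = 100. The robustness gap the talk addresses is visible here: the detector needs the exact pre- and post-change distributions as inputs.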

Date and Time:
Friday, March 16, 2018 - 1:15pm
Venue:
Packard 202

## IT-Forum: Restricted Isometry Property of Random Projection for Low-Dimensional Subspaces

Topic:
Restricted Isometry Property of Random Projection for Low-Dimensional Subspaces
Abstract / Description:

Dimensionality reduction is needed to lower the complexity of solving large-scale problems whose data lie on latent low-dimensional structures, as arise in machine learning and computer vision. Motivated by this need, in this talk I will introduce the Restricted Isometry Property (RIP) of Gaussian random projections for low-dimensional subspaces in $\mathbb{R}^N$, and prove that the projection Frobenius-norm distance between any two subspaces spanned by the projected data in $\mathbb{R}^n$ (for $n$ smaller than $N$) remains almost the same as the distance between the original subspaces, with probability no less than $1 - e^{-O(n)}$.

The well-known Johnson–Lindenstrauss (JL) lemma and the RIP for sparse vectors have long been foundations of sparse signal processing, including compressed sensing. As an analogue of the JL lemma and the RIP for sparse vectors, this work allows the use of random projections to reduce the ambient dimension with the theoretical guarantee that the distance between subspaces is well preserved after compression.

As a direct result of our theory, when solving the subspace clustering (SC) problem at large scale, one may run an SC algorithm on randomly compressed samples to alleviate the high computational burden while retaining a theoretical performance guarantee. Because the distance between subspaces remains almost unchanged after projection, the clustering error rate of any SC algorithm can remain as small as when clustering in the original space. Since our theory is independent of the particular SC algorithm, it may also benefit future studies on other subspace-related topics.
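A quick numerical sketch of the claimed preservation, under assumed dimensions: draw two random subspaces, project them with a Gaussian map, and compare the projection Frobenius-norm distance before and after compression:

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, d = 1000, 100, 5               # ambient dim, projected dim, subspace dim

def orth_basis(M):
    # orthonormal basis for the column span of M
    q, _ = np.linalg.qr(M)
    return q

def proj_frob_dist(U, V):
    # projection Frobenius-norm distance between span(U) and span(V)
    return np.linalg.norm(U @ U.T - V @ V.T) / np.sqrt(2)

U = orth_basis(rng.standard_normal((N, d)))
V = orth_basis(rng.standard_normal((N, d)))

G = rng.standard_normal((n, N)) / np.sqrt(n)   # Gaussian random projection
Up = orth_basis(G @ U)
Vp = orth_basis(G @ V)

d_before = proj_frob_dist(U, V)
d_after = proj_frob_dist(Up, Vp)
print(d_before, d_after)
```

Even after a 10x reduction in ambient dimension, the two distances agree to within a few percent, consistent with the RIP statement above.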

Date and Time:
Friday, February 23, 2018 - 1:15pm
Venue:
Packard 202

## IT-Forum: BATS: Network Coding in Action

Topic:
BATS: Network Coding in Action
Abstract / Description:

Multi-hop wireless networks can be found in many application scenarios, including IoT, fog computing, satellite communication, underwater communication, etc. The main challenge in such networks is the accumulation of packet loss in the wireless links. With existing technologies, the throughput decreases exponentially fast with the number of hops.
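A back-of-the-envelope illustration of that exponential decay, with a hypothetical per-hop loss rate p: without in-network coding, an end-to-end packet survives k hops with probability (1 - p)^k, whereas recoding at the relays (as network coding allows) can sustain a rate close to the per-link capacity 1 - p regardless of hop count:

```python
# Hypothetical per-hop packet loss rate.
p = 0.2

for k in (1, 5, 10, 20):
    end_to_end = (1 - p) ** k       # delivery rate with no recoding at relays
    per_link = 1 - p                # min-cut capacity of the k-hop line network
    print(f"{k:2d} hops: end-to-end {end_to_end:.4f} vs per-link capacity {per_link:.2f}")
```

At 20 hops the uncoded delivery rate has collapsed to about 1%, while the min-cut rate stays at 0.8; this gap is what BATS codes aim to close.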

In this talk, we introduce BATched Sparse codes (BATS codes) as a solution to this challenge. BATS codes are a rateless realization of network coding. Their advantages include low encoding/decoding complexity, high throughput, low latency, and low storage requirements, which makes BATS codes ideal for implementation on IoT devices with limited computing power and storage. At the end of the talk, we will show a video demonstration of BATS codes over a Wi-Fi network with 10 IoT devices acting as relay nodes.

Date and Time:
Friday, February 9, 2018 - 1:15pm
Venue:
Packard 202

## IT-Forum: Deterministic Random Matrices

Topic:
Deterministic Random Matrices
Abstract / Description:

Random matrices have become a very active area of research in recent years and have found enormous applications in modern mathematics, physics, engineering, biological modeling, and other fields. In this work, we focus on symmetric sign (±1) matrices (SSMs), originally utilized by Wigner in the mid-1950s to model the nuclei of heavy atoms. Assuming the entries of the upper triangular part to be independent ±1 with equal probabilities, Wigner showed in his pioneering works that as the matrix size grows, the empirical spectra converge to a non-random measure of semicircular shape. Later, this fundamental result was improved and substantially extended to more general families of matrices and finer spectral properties. In many physical phenomena, however, the entries of the matrices exhibit significant correlations. At the same time, almost all available analytical tools rely heavily on the independence condition, making the study of matrices with structure (dependencies) very challenging. The few existing works in this direction consider very specific setups and are limited by particular techniques, lacking a unified framework and tight information-theoretic bounds that would quantify the exact amount of structure that matrices may possess without affecting the limiting semicircular form of their spectra.

From a different perspective, in many applications one needs to simulate random objects. Because of the independence condition, generating large random matrices requires very powerful sources of randomness; the experiments are impossible to reproduce; and atypical, non-random-looking outcomes may appear with positive probability. Reliable deterministic construction of SSMs with random-looking spectra and low algorithmic and computational complexity is of particular interest due to the natural correspondence between SSMs and undirected graphs, since the latter are used extensively in combinatorial and computer-science applications, e.g. for the purposes of derandomization. Unfortunately, most existing constructions of pseudo-random graphs focus on the extreme eigenvalues and do not provide guarantees on the whole spectrum. In this work, using binary Golomb sequences, we propose a simple, completely deterministic construction of circulant SSMs whose spectra converge to the semicircular law at the same rate as in the original Wigner ensemble. We show that this construction has close to the lowest possible algorithmic complexity and is very explicit. Essentially, the algorithm requires at most $2\log(n)$ bits, implying that the real amount of randomness conveyed by the semicircular property is quite small.
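For intuition about the semicircular limit itself, one can check the law numerically on the classical random Wigner ±1 ensemble (this is the reference point, not the paper's deterministic Golomb-sequence construction):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
# Wigner symmetric sign matrix: independent +/-1 entries above the diagonal
A = rng.choice([-1.0, 1.0], size=(n, n))
A = np.triu(A, 1)
A = A + A.T + np.diag(rng.choice([-1.0, 1.0], size=n))

eigs = np.linalg.eigvalsh(A) / np.sqrt(n)   # scaled spectrum lies in approx [-2, 2]

# Semicircle-law check: the fraction of eigenvalues in [-1, 1] should be close
# to (1/2pi) * integral_{-1}^{1} sqrt(4 - x^2) dx ~= 0.609
frac = np.mean(np.abs(eigs) <= 1.0)
print(frac)
```

The construction in the talk produces the same limiting histogram from a deterministic circulant matrix, with only about 2 log(n) bits of input.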

Date and Time:
Friday, February 2, 2018 - 1:15pm
Venue:
Packard 202

## IT-Forum: Recent Advances in Algorithmic High-Dimensional Robust Statistics

Topic:
Recent Advances in Algorithmic High-Dimensional Robust Statistics
Abstract / Description:

Fitting a model to a collection of observations is one of the quintessential problems in machine learning. Since any model is only approximately valid, an estimator that is useful in practice must also be robust in the presence of model misspecification. It turns out that there is a striking tension between robustness and computational efficiency. Even for the most basic high-dimensional tasks, such as robustly computing the mean and covariance, until recently the only known estimators were either hard to compute or could only tolerate a negligible fraction of errors.

In this talk, I will survey the recent progress in algorithmic high-dimensional robust statistics. I will describe the first robust and efficiently computable estimators for several fundamental statistical tasks that were previously thought to be computationally intractable. These include robust estimation of mean and covariance in high dimensions, and robust learning of various latent variable models. The new robust estimators are scalable in practice and yield a number of applications in exploratory data analysis.
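As a toy illustration of the robustness issue (not the talk's estimators): with a constant fraction of adversarial points, the sample mean is pulled far from the truth, while even a simple coordinate-wise median stays close. Note that the coordinate-wise median's error still grows with the dimension, which is exactly what the efficient estimators discussed in the talk avoid. The contamination model below is a hypothetical example:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, eps = 50, 2000, 0.1
clean = rng.standard_normal((int((1 - eps) * n), d))   # inliers ~ N(0, I), true mean 0
outliers = np.full((int(eps * n), d), 5.0)             # adversarial cluster at (5, ..., 5)
X = np.vstack([clean, outliers])

naive = X.mean(axis=0)            # pulled roughly eps * 5 in every coordinate
cw_median = np.median(X, axis=0)  # simple robust baseline, much closer to 0

print(np.linalg.norm(naive), np.linalg.norm(cw_median))
```

With eps = 0.1 the naive mean lands at distance about 0.5 * sqrt(d) from the truth, while the coordinate-wise median stays near it; in the worst case, however, the median's error also scales with sqrt(d), motivating the dimension-independent guarantees surveyed in the talk.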

Date and Time:
Friday, January 26, 2018 - 1:15pm
Venue:
Packard 202

## IT Forum: Tight regret bounds for a latent variable model of recommendation systems

Topic:
Tight regret bounds for a latent variable model of recommendation systems
Abstract / Description:

We consider an online model for recommendation systems, with each user being recommended an item at each time-step and providing 'like' or 'dislike' feedback. A latent variable model specifies the user preferences: both users and items are clustered into types. The model captures structure in both the item and user spaces, and our focus is on simultaneous use of both structures. We analyze the situation in which the type preference matrix has i.i.d. entries. Our analysis elucidates the system operating regimes in which existing algorithms are nearly optimal, as well as highlighting the sub-optimality of using only one of item or user structure (as is done in commonly used item-item and user-user collaborative filtering). This prompts a new algorithm that is nearly optimal in essentially all parameter regimes.

Joint work with Prof. Guy Bresler.
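The latent variable model is straightforward to instantiate. A sketch with hypothetical sizes, in which an i.i.d. Bernoulli(1/2) type-preference matrix over user types and item types induces the full like/dislike matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, kU, kI = 200, 100, 4, 5   # hypothetical sizes and type counts
user_type = rng.integers(kU, size=n_users)  # cluster assignment of each user
item_type = rng.integers(kI, size=n_items)  # cluster assignment of each item

# i.i.d. Bernoulli(1/2) type-preference matrix, as in the analyzed setting
xi = rng.integers(0, 2, size=(kU, kI))

# full like/dislike matrix: user u likes item i iff xi[type(u), type(i)] == 1
likes = xi[user_type][:, item_type]
print(likes.shape)
```

An online recommender observes one entry of this matrix per user per time-step and must exploit both the row (user) and column (item) cluster structure, which is the focus of the talk.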

Date and Time:
Friday, November 10, 2017 - 1:15pm
Venue:
Packard 202

## IT-Forum: Information Theoretic Limits of Molecular Communication and System Design Using Machine Learning

Topic:
Information Theoretic Limits of Molecular Communication and System Design Using Machine Learning
Abstract / Description:

Molecular communication is a new and bio-inspired field, where chemical signals are used to transfer information instead of electromagnetic or electrical signals. In this paradigm, the transmitter releases chemicals or molecules and encodes information on some property of these signals, such as their timing or concentration. The signal then propagates through the medium between the transmitter and the receiver by different means, such as diffusion, until it arrives at the receiver, where the signal is detected and the information decoded. This new multidisciplinary field can be used for in-body communication, secrecy, networking microscale and nanoscale devices, infrastructure monitoring in smart cities and industrial complexes, as well as for underwater communications. Since these systems are fundamentally different from telecommunication systems, most techniques that have been developed over the past few decades to advance radio technology cannot be applied to them directly.

In this talk, we first explore some of the fundamental limits of molecular communication channels, evaluate how capacity scales with the number of particles released by the transmitter, and characterize the optimal input distribution. Finally, since the underlying channel models for some molecular communication systems are unknown, we demonstrate how techniques from machine learning and deep learning can be used to design components such as detection algorithms directly from transmission data, without any knowledge of the underlying channel models.
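A sketch of the data-driven design idea: the channel below is an entirely hypothetical toy model used only to generate training data, and the detector itself never sees it; a simple detector (here, centroids plus a midpoint threshold, far simpler than the deep-learning detectors in the talk) is learned from (bit, received-sample) pairs alone:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical molecular channel: a bit maps to a noisy received concentration.
# The detector never uses this function's form, only samples drawn from it.
def channel(bits):
    return np.where(bits == 1, 8.0, 3.0) + rng.gamma(2.0, 1.0, size=bits.shape)

train_bits = rng.integers(0, 2, 5000)
train_rx = channel(train_bits)

# Data-driven detector: per-symbol centroids and a midpoint threshold
c0 = train_rx[train_bits == 0].mean()
c1 = train_rx[train_bits == 1].mean()
thr = (c0 + c1) / 2.0

test_bits = rng.integers(0, 2, 5000)
test_rx = channel(test_bits)
err = np.mean((test_rx > thr).astype(int) != test_bits)
print(err)
```

The same train-from-data principle extends to sequence detectors built with neural networks when the channel has memory, which is the setting the talk addresses.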

Date and Time:
Monday, October 16, 2017 - 3:25pm to 4:25pm
Venue:
Packard 202