## EE Student Information

### The Department of Electrical Engineering supports Black Lives Matter. Read more.

• • • •

#### EE Student Information, Spring & Summer Quarters 19-20: FAQs and Updated EE Course List.

Updates will be posted on this page, as well as emailed to the EE student mail list.

As always, use your best judgement and consider your own and others' well-being at all times.

# IT-Forum

## IT-Forum: Data-Driven Policy Learning: Generalization and Optimization

Topic:
Data-Driven Policy Learning: Generalization and Optimization
Abstract / Description:

The problem of learning good treatment assignment rules from observational data lies at the heart of many challenges in data-driven decision making. While there is a growing body of literature devoted to this problem, most existing results are focused on the binary-action case (i.e., where one action corresponds to assignment to control and to assignment to treatment). In this paper, we study the offline multi-action policy learning problem with observational data and, building on the theory of efficient semi-parametric inference, propose and implement a policy learning algorithm that achieves asymptotically minimax-optimal regret. To the best of our knowledge, this is the first result of this type in the multi-action setup and provides a substantial performance improvement over the existing learning algorithms. We additionally investigate the application aspects of policy learning by working with decision trees, and discuss two different approaches for solving the key step of the learning algorithm to exact optimality, one using a mixed integer program formulation and the other using a tree-search based algorithm.

This is joint work with Susan Athey and Stefan Wager.

The Information Theory Forum (IT-Forum) at Stanford ISL is an interdisciplinary academic forum which focuses on mathematical aspects of information processing. With a primary emphasis on information theory, we also welcome researchers from signal processing, learning and statistical inference, control and optimization to deliver talks at our forum. We also warmly welcome industrial affiliates in the above fields. The forum is typically held in Packard 202 every Friday at 1:15 pm during the academic year.

The Information Theory Forum is organized by graduate students Yanjun Han and Yihui Quek. To suggest speakers, please contact any of the students.

Date and Time:
Friday, October 5, 2018 - 1:15pm
Venue:
Packard 202

## IT-Forum presents Representations, fairness, and privacy: information-theoretic tools for machine learning

Topic:
Representations, fairness, and privacy: information-theoretic tools for machine learning
Abstract / Description:

Information theory can shed light on the algorithm-independent limits of learning from data and serve as a design driver for new machine learning algorithms. In this talk, we discuss a set of flexible information-theoretic tools called the principal inertia components (PICs) that can be used to (i) understand fairness and discrimination in machine learning models, (ii) provide an estimation-theoretic view of privacy, and (iii) characterize data representations learned by complex learning models. The PICs enjoy a long history in both the statistics and information theory, and provide a fine-grained decomposition of the dependence between two random variables. We illustrate these techniques in both synthetic and real-world datasets, and discuss future research directions.

Date and Time:
Friday, September 28, 2018 - 1:15pm
Venue:
Packard 202

## IT-Forum: Reverse hypercontractivity beats measure concentration for information theoretic converses

Topic:
Reverse hypercontractivity beats measure concentration for information theoretic converses
Abstract / Description:

Concentration of measure is a collection of tools and results from analysis and probability theory that have been used in many areas of pure and applied mathematics. Arguably, the first data science application of measure concentration (under the name ''blowing-up lemma'') is the proof of strong converses in multiuser information theory by Ahlswede, G\'acs and K\"orner in 1976. Since then, measure concentration has found applications in many other information theoretic problems, most notably the converse (impossibility) results in information theory. Motivated by this, information theorists (e.g. Marton) have also contributed to the mathematical foundations of measure concentration using their information-theoretic techniques.

Now, after all the past 40 years of such progress, we found that, amusingly, measure concentration is not the right hammer for many of these information theoretic applications. We introduce a new machinery based on functional inequalities and reverse hypercontractivity which yields strict improvements in terms of sharpness of the bounds, generality of the source/channel distributions, and simplicity of the proofs. Examples covered in the talk include: 1. optimal second-order converse to common randomness generation with rate-limited communication; 2. sharpening the relay channel converse bounds by and Wu and Ozgur with much simpler proofs.

The work benefited from collaborations with Thomas Courtade, Paul Cuff, Ayfer Ozgur, Ramon van Handel, and Sergio Verd\'u.

Date and Time:
Friday, August 24, 2018 - 1:15pm
Venue:
Packard 202

Topic:
Abstract / Description:

Quantization plays a critical role in digital signal processing systems. Quantizers are typically designed to obtain an accurate digital representation of the input signal, operating independently of the system task, and are commonly implemented using serial scalar analog-to-digital converters (ADCs). This talk is concerned with hardware-limited task-based quantization, where a system utilizing a serial scalar ADC is designed to provide a suitable representation in order to allow the recovery of a parameter vector underlying the input signal. We propose hardware-limited task-based quantization systems for a fixed and finite quantization resolution, and characterize their achievable distortion. Our results illustrate the benefits of properly taking into account the underlying task in the design of the quantization scheme.

Date and Time:
Wednesday, June 13, 2018 - 1:15pm
Venue:
Packard 202

## IT Forum: Phase transitions in generalized linear models

Topic:
Phase transitions in generalized linear models
Abstract / Description:

This is joint work with Jean Barbier, Florent Krzakala, Nicolas Macris and Lenka Zdeborova.

We consider generalized linear models (GLMs) where an unknown $n$-dimensional signal vector is observed through the application of a random matrix and a non-linear (possibly probabilistic) componentwise output function.

We study the models in the high-dimensional limit, where the observations consists of $m$ points, and $m/n \to \alpha > 0$ as $n \to \infty$. This situation is ubiquitous in applications ranging from supervised machine learning to signal processing.

We will analyze the model-case when the observation matrix has i.i.d. elements and the components of the ground-truth signal are taken independently from some known distribution.

We will compute the limit of the mutual information between the signal and the observations in the large system limit. This quantity is particularly interesting because it is related to the free energy (i.e. the logarithm of the partition function) of the posterior distribution of the signal given the observations. Therefore, the study of the asymptotic mutual information allows to deduce the limit of important quantities such as the minimum mean squared error for the estimation of the signal.

We will observe some phase transition phenomena. Depending on the noise level, the distribution of the signal and the non-linear function of the GLM we may encounter various scenarios where it may be impossible / hard (only with exponential-time algorithms) / easy (with polynomial-time algorithms) to recover the signal.

Date and Time:
Friday, May 18, 2018 - 1:15pm
Venue:
Packard 202

## ISL Colloquium & IT Forum: Random initialization and implicit regularization in nonconvex statistical estimation

Topic:
Random initialization and implicit regularization in nonconvex statistical estimation
Abstract / Description:

Recent years have seen a flurry of activities in designing provably efficient nonconvex procedures for solving statistical estimation / learning problems. Due to the highly nonconvex nature of the empirical loss, state-of-the-art procedures often require suitable initialization and proper regularization (e.g. trimming, regularized cost, projection) in order to guarantee fast convergence. For vanilla procedures such as gradient descent, however, prior theory is often either far from optimal or completely lacks theoretical guarantees.

This talk is concerned with a striking phenomenon arising in two nonconvex problems (i.e. phase retrieval and matrix completion): even in the absence of careful initialization, proper saddle escaping, and/or explicit regularization, gradient descent converges to the optimal solution within a logarithmic number of iterations, thus achieving near-optimal statistical and computational guarantees at once. All of this is achieved by exploiting the statistical models in analyzing optimization algorithms, via a leave-one-out approach that enables the decoupling of certain statistical dependency between the gradient descent iterates and the data. As a byproduct, for noisy matrix completion, we demonstrate that gradient descent achieves near-optimal entrywise error control.

Date and Time:
Wednesday, May 23, 2018 - 4:15pm
Venue:
Building 370

## IT-Forum: Tight sample complexity bounds via dualizing LeCam's method

Topic:
Tight sample complexity bounds via dualizing LeCam's method
Abstract / Description:

In this talk we consider a general question of estimating linear functional of the distribution based on the noisy samples from it. We discover that the (two-point) LeCam lower bound is in fact achievable by optimizing bias-variance tradeoff of an empirical-mean type of estimator. We extend the method to certain symmetric functionals of high-dimensional parametric models.

Next, we apply this general framework to two problems: population recovery and predicting the number of unseen species. In population recovery, the goal is to estimate an unknown high-dimensional distribution (in $L_\infty$-distance) from noisy samples. In the case of \textit{erasure} noise, i.e. when each coordinate is erased with probability $\epsilon$, we discover a curious phase transition in sample complexity at $\epsilon=1/2$. In the second (classical) problem, we observe $n$ iid samples from an unknown distribution on a countable alphabet and the goal is to predict the number of new species that will be observed in the next (unseen) $tn$ samples. Again, we discover a phase transition at $t=1$. In both cases, the complete characterization of sample complexity relies on complex-analytic methods, such as Hadamard's three-lines theorem.

Joint work with Yihong Wu (Yale).

Date and Time:
Friday, May 4, 2018 - 1:15pm
Venue:
Packard 202

## IT Forum: From Gaussian Multiterminal Source Coding to Distributed Karhunen–Loève Transform

Topic:
From Gaussian Multiterminal Source Coding to Distributed Karhunen–Loève Transform
Abstract / Description:

Characterizing the rate-distortion region of Gaussian multiterminal source coding is a longstanding open problem in network information theory. In this talk, I will show how to obtain new conclusive results for this problem using nonlinear analysis and convex relaxation techniques. A byproduct of this line of research is an efficient algorithm for determining the optimal distributed Karhunen–Loève transform in the high-resolution regime, which partially settles a question posed by Gastpar, Dragotti, and Vetterli. I will also introduce a generalized version of the Gaussian multiterminal source coding problem where the source-encoder connections can be arbitrary. It will be demonstrated that probabilistic graphical models offer an ideal mathematical language for describing how the performance limit of a generalized Gaussian multiterminal source coding system depends on its topology, and more generally they can serve as the long-sought platform for systematically integrating the existing achievability schemes and converse arguments. The architectural implication of our work for low-latency lossy source coding will also be discussed.

This talk is based on joint work with Jia Wang, Farrokh Etezadi, and Ashish Khisti.

The Information Theory Forum (IT-Forum) at Stanford ISL is an interdisciplinary academic forum which focuses on mathematical aspects of information processing. With a primary emphasis on information theory, we also welcome researchers from signal processing, learning and statistical inference, control and optimization to deliver talks at our forum. We also warmly welcome industrial affiliates in the above fields. The forum is typically held in Packard 202 every Friday at 1:15 pm during the academic year.

The Information Theory Forum is organized by graduate students Jiantao Jiao and Yanjun Han. To suggest speakers, please contact any of the students.

Date and Time:
Friday, April 13, 2018 - 1:15pm
Venue:
Packard 202

## IT-Forum & ISL presents Robust sequential change-point detection

Topic:
Robust sequential change-point detection
Abstract / Description:

Sequential change-point detection is a fundamental problem in statistics and signal processing, with broad applications in security, network monitoring, imaging, and genetics. Given a sequence of data, the goal is to detect any change in the underlying distribution as quickly as possible from the streaming data. Various algorithms have been developed including the commonly used CUSUM procedure. However, there is a still a gap when applying change-point detection methods to real problems, notably, due to the lack of robustness. Classic approaches usually require exact specification of the pre and post change distributions forms, which may be quite restrictive and do not perform well with real data. On the other hand, Huber’s classic robust statistics built based on least favorable distributions are not directly applicable since they are computationally intractable in the multi-dimensional setting. In this seminar, I will present several of our recent works in developing computationally efficient and robust change-point detection algorithms with certain near optimality properties, by building a connection of statistical sequential analysis with (online) convex optimization.

The Information Theory Forum (IT-Forum) at Stanford ISL is an interdisciplinary academic forum which focuses on mathematical aspects of information processing. With a primary emphasis on information theory, we also welcome researchers from signal processing, learning and statistical inference, control and optimization to deliver talks at our forum. We also warmly welcome industrial affiliates in the above fields. The forum is typically held in Packard 202 every Friday at 1:15 pm during the academic year.

The Information Theory Forum is organized by graduate students Jiantao Jiao and Yanjun Han. To suggest speakers, please contact any of the students.

Date and Time:
Friday, March 16, 2018 - 1:15pm
Venue:
Packard 202

## IT-Forum: Restricted Isometry Property of Random Projection for Low-Dimensional Subspaces

Topic:
Restricted Isometry Property of Random Projection for Low-Dimensional Subspaces
Abstract / Description:

Dimensionality reduction is in demand to reduce the complexity of solving large-scale problems with data lying in latent low-dimensional structures in machine learning and computer version. Motivated by such need, in this talk I will introduce the Restricted Isometry Property (RIP) of Gaussian random projections for low-dimensional subspaces in R^N, and prove that the projection Frobenius norm distance between any two subspaces spanned by the projected data in R^n for n smaller than N remain almost the same as the distance between the original subspaces with probability no less than 1 - e^O(-n).

Previously the well-known Johnson-Lindenstrauss (JL) Lemma and RIP for sparse vectors have been the foundation of sparse signal processing including Compressed Sensing. As an analogy to JL Lemma and RIP for sparse vectors, this work allows the use of random projections to reduce the ambient dimension with the theoretical guarantee that the distance between subspaces after compression is well preserved.

As a direct result of our theory, when solving the subspace clustering (SC) problem at a large scale, one may conduct SC algorithm on randomly compressed samples to alleviate the high computational burden and still have theoretical performance guarantee. Because the distance between subspaces almost remains unchanged after projection, the clustering error rate of any SC algorithm may keep as small as that conducting in the original space. Considering that our theory is independent of SC algorithms, this may benefit future studies on other subspace related topics.

The Information Theory Forum (IT-Forum) at Stanford ISL is an interdisciplinary academic forum which focuses on mathematical aspects of information processing. With a primary emphasis on information theory, we also welcome researchers from signal processing, learning and statistical inference, control and optimization to deliver talks at our forum. We also warmly welcome industrial affiliates in the above fields. The forum is typically held in Packard 202 every Friday at 1:15 pm during the academic year.

The Information Theory Forum is organized by graduate students Jiantao Jiao and Yanjun Han. To suggest speakers, please contact any of the students.

Date and Time:
Friday, February 23, 2018 - 1:15pm
Venue:
Packard 202