# IT-Forum

## Estimation of entropy and differential entropy beyond i.i.d. and discrete distributions

Topic:
Estimation of entropy and differential entropy beyond i.i.d. and discrete distributions
Abstract / Description:

Recent years have witnessed significant progress in entropy and mutual information estimation, in particular in the large alphabet regime. Concretely, there exist efficiently computable estimators whose performance with n samples is essentially that of the maximum likelihood estimator with n log(n) samples, a phenomenon termed "effective sample size enlargement". Generalizations to processes with memory (estimation of the entropy rate) and continuous distributions (estimation of the differential entropy) have remained largely open. This talk is about the challenges behind those generalizations and recent progress in this direction. For estimating the entropy rate of a Markov chain, we show that when mixing is not too slow, at least S^2/log(S) samples are required to consistently estimate the entropy rate, where S is the size of the state space. In contrast, the empirical entropy rate requires S^2 samples to achieve consistency even if the Markov chain is i.i.d. We propose a general approach to achieve the S^2/log(S) sample complexity, and illustrate our results through estimating the entropy rate of the English language from the Penn Treebank (PTB) and the Google 1 Billion Word Dataset. For differential entropy estimation, we characterize the minimax behavior over Besov balls, and show that a fixed-k nearest neighbor estimator adaptively achieves the minimax rates up to logarithmic factors without knowing the smoothness of the density. The "effective sample size enlargement" phenomenon holds in both the Markov chain case and the case of continuous distributions.
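The fixed-k nearest neighbor estimator in question belongs to a classical family; a minimal sketch of its Kozachenko-Leonenko form (illustrative only, not the authors' exact adaptive construction) shows the idea: estimate local density from the distance to the k-th nearest neighbor, then average.

```python
import numpy as np
from scipy.special import digamma, gamma

def knn_entropy(x, k=3):
    """Kozachenko-Leonenko fixed-k NN differential entropy estimate (nats)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    n, d = x.shape
    # pairwise distances; exclude each point as its own neighbor
    dists = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)
    rho = np.sort(dists, axis=1)[:, k - 1]      # distance to k-th nearest neighbor
    # log volume of the d-dimensional unit ball
    log_c_d = (d / 2) * np.log(np.pi) - np.log(gamma(d / 2 + 1))
    return digamma(n) - digamma(k) + log_c_d + d * np.mean(np.log(rho))

rng = np.random.default_rng(0)
sample = rng.standard_normal(2000)
est = knn_entropy(sample, k=3)
true_h = 0.5 * np.log(2 * np.pi * np.e)   # differential entropy of N(0,1), ~1.419 nats
```

Note that k stays fixed as n grows, which is exactly the regime the talk analyzes.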

Joint work with Weihao Gao, Yanjun Han, Chuan-Zheng Lee, Pramod Viswanath, Tsachy Weissman, Yihong Wu, and Tiancheng Yu.

Date and Time:
Friday, October 13, 2017 - 1:15pm
Venue:
Packard 202

## IT-Forum: Multi-Agent Online Learning under Imperfect Information: Algorithms, Theory and Applications

Topic:
Multi-Agent Online Learning under Imperfect Information: Algorithms, Theory and Applications
Abstract / Description:

We consider a model of multi-agent online learning under imperfect information, where the reward structures of agents are given by a general continuous game. After introducing a general equilibrium stability notion for continuous games, called variational stability, we examine the well-known online mirror descent (OMD) learning algorithm and show that the "last iterate" (that is, the actual sequence of actions) of OMD converges to variationally stable Nash equilibria provided that the feedback delays faced by the agents are synchronous and bounded. We then extend the result to almost sure convergence to variationally stable Nash equilibria under both unbiased noise and synchronous and bounded delays. Subsequently, to tackle fully decentralized, asynchronous environments with unbounded feedback delays, we propose a variant of OMD which we call delayed mirror descent (DMD), and which relies on the repeated leveraging of past information. With this modification, the algorithm converges to variationally stable Nash equilibria, with no feedback synchronicity assumptions, and even when the delays grow super-linearly relative to the game's horizon. We then again extend it to the case where there are both delays and noise.
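As a toy illustration of the last-iterate convergence described above (a minimal sketch with synchronous, noiseless, undelayed feedback, not the paper's general setting): in a game whose unique equilibrium is strict, hence variationally stable, OMD with the entropic mirror map reduces to multiplicative weights, and the actual iterates converge to the equilibrium. The prisoner's dilemma payoffs below are the usual illustrative choice.

```python
import numpy as np

# Prisoner's dilemma payoffs for the row player; actions = [Cooperate, Defect].
A = np.array([[3.0, 0.0],
              [5.0, 1.0]])

def omd_entropic(x, payoff, eta):
    """One OMD ascent step on the simplex with the entropic mirror map
    (i.e., a multiplicative-weights update)."""
    y = x * np.exp(eta * payoff)
    return y / y.sum()

x1 = np.array([0.5, 0.5])   # player 1's mixed strategy
x2 = np.array([0.5, 0.5])   # player 2's mixed strategy (symmetric game)
eta = 0.1
for _ in range(500):
    u1 = A @ x2             # expected payoff of each pure action vs. opponent
    u2 = A @ x1
    x1 = omd_entropic(x1, u1, eta)
    x2 = omd_entropic(x2, u2, eta)
# Defect strictly dominates, so the last iterates approach the (D, D) equilibrium.
```

Since defection's payoff advantage is bounded below by 1 here, the log-ratio of the two weights grows linearly, so the last iterate (not just the time average) converges.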

In the second part of the talk, we present two applications of the multi-agent online learning framework. The first application is on non-convex stochastic optimization, where we characterize almost sure convergence of the well-known stochastic mirror descent algorithm to global optima for a large class of non-convex stochastic optimization problems (strictly including convex, quasi-convex and star-convex problems). Going a step further, our results also include as a special case the large-scale stochastic optimization problem, where stochastic mirror descent is applied in a distributed, asynchronous manner across multiple machines/processors. Time permitting, we will discuss how these results help (at least in part) clarify and affirm the recent successes of mirror-descent type algorithms in large-scale machine learning. The second application concerns power management on random wireless networks, where we use a game-design approach to derive robust power control algorithms that converge (almost surely) to the optimal power allocation in the presence of randomly fluctuating networks.

This is joint work with Nick Bambos, Stephen Boyd, Panayotis Mertikopoulos, Peter Glynn and Claire Tomlin.

The Information Theory Forum (IT-Forum) at Stanford ISL is an interdisciplinary academic forum which focuses on mathematical aspects of information processing. With a primary emphasis on information theory, we also welcome researchers from signal processing, learning and statistical inference, control and optimization to deliver talks at our forum. We also warmly welcome industrial affiliates in the above fields. The forum is typically held in Packard 202 every Friday at 1:15 pm during the academic year.

The Information Theory Forum is organized by graduate students Jiantao Jiao and Yanjun Han. To suggest speakers, please contact any of the students.

Date and Time:
Friday, October 6, 2017 - 1:15pm
Venue:
Packard 101

## Biology as Information Dynamics [IT-Forum]

Topic:
Biology as Information Dynamics
Abstract / Description:

If biology is the study of self-replicating entities, and we want to understand the role of information, it makes sense to see how information theory is connected to the 'replicator equation' – a simple model of population dynamics for self-replicating entities. The relevant concept of information turns out to be the information of one probability distribution relative to another, also known as the Kullback–Leibler divergence. Using this we can get a new outlook on free energy, see evolution as a learning process, and give a clearer, more general formulation of Fisher's fundamental theorem of natural selection.
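A minimal numeric sketch of this connection (a hypothetical two-type population, not an example from the talk): under discrete-time replicator dynamics, the Kullback–Leibler divergence from the attracting state to the current population acts as a Lyapunov function, shrinking at every step.

```python
import numpy as np

def replicator_step(x, f):
    """Discrete-time replicator update: each type grows in proportion
    to its fitness relative to the population mean."""
    y = x * f
    return y / y.sum()

def kl(p, q):
    """Kullback-Leibler divergence D(p || q) in nats (handles p_i = 0)."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

f = np.array([2.0, 1.0])         # type 1 replicates twice as fast as type 2
target = np.array([1.0, 0.0])    # the all-type-1 state the dynamics approach
x = np.array([0.1, 0.9])         # initial population mix
kls = []
for _ in range(20):
    kls.append(kl(target, x))    # "distance" from the attractor, in nats
    x = replicator_step(x, f)
```

Here `kl(target, x)` reduces to -log(x_1), and since the fitter type's share rises every step, the divergence decreases monotonically: exactly the "evolution as learning" picture.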

Date and Time:
Thursday, April 20, 2017 - 4:20pm
Venue:
Clark S361

## New Directions in Management Science & Engineering: A Brief History of the Virtual Lab

Topic:
New Directions in Management Science & Engineering: A Brief History of the Virtual Lab
Abstract / Description:

Lab experiments have long played an important role in behavioral science, in part because they allow for carefully designed tests of theory, and in part because randomized assignment facilitates identification of causal effects. At the same time, lab experiments have traditionally suffered from numerous constraints (e.g., short duration, small scale, unrepresentative subjects, simplistic designs) that limit their external validity. In this talk I describe how the web in general—and crowdsourcing sites like Amazon's Mechanical Turk in particular—allow researchers to create "virtual labs" in which they can conduct behavioral experiments of a scale, duration, and realism that far exceed what is possible in physical labs. To illustrate, I describe some recent experiments that showcase the advantages of virtual labs, as well as some of the limitations. I then discuss how this relatively new experimental capability may unfold in the future, along with some implications for social and behavioral science.

Date and Time:
Thursday, March 16, 2017 - 12:15pm
Venue:
Packard 101

## Minimum Rates of Approximate Sufficient Statistics [IT-Forum]

Topic:
Minimum Rates of Approximate Sufficient Statistics
Abstract / Description:

Given a sufficient statistic for a parametric family of distributions, one can estimate the parameter from the statistic alone, without access to the data itself. However, the memory size for storing the sufficient statistic may be prohibitive. Indeed, for $n$ independent data samples drawn from a $k$-nomial distribution with $d=k-1$ degrees of freedom, the length of the code scales as $d\log n+O(1)$. In many applications though, we may not have a useful notion of sufficient statistics and also may not need to reconstruct the generating distribution exactly. By adopting an information-theoretic approach in which we allow a small error in estimating the generating distribution, we construct various notions of {\em approximate sufficient statistics} and show that the code length can be reduced to $\frac{d}{2}\log n + O(1)$. We consider errors measured according to the relative entropy and variational distance criteria. For the code construction parts, we leverage Rissanen's minimum description length principle, which yields a non-vanishing error measured using the relative entropy. For the converse parts, we use Clarke and Barron's asymptotic expansion for the relative entropy of a parametrized distribution and the corresponding mixture distribution. The limitation of this method is that only a weak converse for the variational distance can be shown. We develop new techniques to achieve vanishing errors and we also prove strong converses for all our statements. The latter means that even if the code is allowed to have a non-vanishing error, its length must still be at least $\frac{d}{2}\log n$.
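Ignoring the $O(1)$ terms, the gap between the two code lengths is a clean factor of two in the leading term; a quick numeric sanity check (illustrative numbers, not from the talk):

```python
import math

def exact_bits(d, n):
    """Leading-order bits to store the exact sufficient statistic
    (the empirical counts): d * log2(n)."""
    return d * math.log2(n)

def approx_bits(d, n):
    """Leading-order bits for an approximate sufficient statistic
    achieving small reconstruction error: (d / 2) * log2(n)."""
    return (d / 2) * math.log2(n)

# e.g., a 101-nomial (d = 100 degrees of freedom) with a million samples
d, n = 100, 10**6
saving = exact_bits(d, n) - approx_bits(d, n)   # bits saved, to leading order
```

With these numbers the exact statistic needs about 1993 bits and the approximate one about 997, i.e., roughly half the storage for the same estimation task.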

This is joint work with Prof. Masahito Hayashi (Graduate School of Mathematics, Nagoya University, and Center for Quantum Technologies, NUS).

Date and Time:
Monday, April 17, 2017 - 1:15pm
Venue:
Packard 202

## Information Theory, Geometry, and Cover's Open Problem [IT-Forum]

Topic:
Information Theory, Geometry, and Cover's Open Problem
Abstract / Description:

Formulating the problem of determining the communication capacity of point-to-point channels as a problem in high-dimensional geometry is one of Shannon's most important insights, one that led to the conception of information theory. However, such geometric insights have been limited to the point-to-point case, and have not been effectively utilized to attack network problems. In this talk, we present our recent work which develops a geometric approach to make progress on one of the central problems in network information theory, namely the capacity of the relay channel. In particular, consider a memoryless relay channel, where the channel from the relay to the destination is an isolated bit pipe of capacity C0. Let C(C0) denote the capacity of this channel as a function of C0. What is the critical value of C0 such that C(C0) first equals C(infinity)? This is a long-standing open problem posed by Cover and named "The Capacity of the Relay Channel" in Open Problems in Communication and Computation, Springer-Verlag, 1987. In this talk, we answer this question in the Gaussian case and show that C(C0) cannot equal C(infinity) unless C0 = infinity, regardless of the SNR of the Gaussian channels, while the cut-set bound would suggest that C(infinity) can be achieved at finite C0. The key step in our proof is a strengthening of the isoperimetric inequality on a high-dimensional sphere, which we use to develop a packing argument on a spherical cap that resembles Shannon's sphere packing idea for point-to-point channels.

Joint work with Leighton Barnes and Ayfer Ozgur.

Date and Time:
Friday, February 24, 2017 - 1:15pm
Venue:
Packard 202

## Codes and card tricks: Magic for adversarial crowds [IT-Forum]

Topic:
Codes and card tricks: Magic for adversarial crowds
Abstract / Description:

Rated by Ron Graham as a top-10 mathematical card trick of the 20th century, Diaconis' mind reader is a magic trick involving interaction with five collaborative volunteers. Inspired by it, we perform a similar card trick in this talk, upgraded to tolerate bluffing volunteers. The theory behind this trick will be used to develop fundamental limits as well as code constructions for faster delay estimation in positioning systems.

This is a joint work with Sihuang Hu and Ofer Shayevitz (https://arxiv.org/abs/1605.09038).

Date and Time:
Friday, January 27, 2017 - 1:15pm
Venue:
Packard 202

## On Two Problems in Coded Statistical Inference [IT-Forum]

Topic:
On Two Problems in Coded Statistical Inference
Abstract / Description:

While statistical inference and information theory are deeply related fields, problems which lie at the intersection of both disciplines usually fall between two stools, and lack definitive answers. In this talk, I will discuss recent advances in two such problems.

In the first part of the talk, I will discuss a distributed hypothesis testing problem, in which the hypotheses regard the joint statistics of two sequences, one available to the decision function directly (as side information), while the other is conveyed through a limited-rate link. The goal is to design a system which obtains the optimal trade-off between the false-alarm and misdetection exponents. I will define a notion of "channel detection codes", and show that the optimal exponents of the distributed hypothesis testing problem are directly related to the exponents of these codes. Then, I will discuss a few bounds on the exponents of channel detection codes, as well as prospective improvements. This approach has two merits over previous works: it is suitable for any pair of memoryless joint distributions, and it provides bounds on the entire false-alarm/misdetection curve, rather than just bounds on its boundary points (Stein's exponent).

In the second part of the talk (time permitting), I will discuss a parameter estimation problem over an additive Gaussian noise channel with bandlimited input. When one is allowed to design both the modulator and the estimator, the absolute $\alpha$-th moment of the estimation error can decrease exponentially with the transmission time. I will discuss several new upper (converse) bounds for the optimal decrease rate.

Joint work with Yuval Kochman (Hebrew University) and Neri Merhav (Technion).

Date and Time:
Friday, February 17, 2017 - 1:15pm
Venue:
Packard 202

## Semantic security versus active adversaries and wiretap channels with random parameters [IT-Forum]

Topic:
Semantic security versus active adversaries and wiretap channels with random parameters
Abstract / Description:

Physical Layer Security (PLS) guarantees protection against computationally-unlimited eavesdroppers without using a key. These guarantees come at the price of an unrealistic assumption that the eavesdropper's channel is fully known to the legitimate parties. Furthermore, typical PLS metrics are incompatible with the features of the data they are designed to protect. For these reasons, PLS has found limited use in practice despite its various benefits. By means of a novel and stronger version of Wyner's soft-covering lemma, we upgrade information-theoretic security proofs to the stronger and more practically viable semantic-security metric, while removing the 'known eavesdropper channel' assumption. As applications we derive the semantic-security capacity of the type constrained arbitrarily varying wiretap channel (WTC), and as its special case, solve the problem of the WTC of type II with a noisy main channel -- a problem posed by Ozarow and Wyner that had been open since 1984. The scenario where the state sequence is random (rather than arbitrary) is also considered. We construct a simple semantically-secure superposition code that strictly outperforms the best previously known achievable rates. The construction implicitly includes a key agreement phase (by means of the random and i.i.d. state sequence) that is crucial for the aforementioned improvement.

Date and Time:
Friday, February 10, 2017 - 1:15pm
Venue:
Packard 202

## Information-theoretic tradeoffs in control [IT-Forum]

Topic:
Information-theoretic tradeoffs in control
Abstract / Description:

Consider a distributed control problem with a communication channel connecting the observer of a linear stochastic system to the controller. The goal of the controller is to minimize a quadratic cost function in the state variables and control signal, known as the linear quadratic regulator (LQR). We study the fundamental tradeoff between the communication rate r bits/sec and the limsup of the expected cost b.

We consider an information-theoretic rate-cost function, which quantifies the minimum mutual information between the channel input and output, given the past, that is compatible with a target LQR cost. We provide a lower (converse) bound to the rate-cost function, which applies as long as the system noise has a probability density function, and which holds for a general class of codes that can take full advantage of the memory of the data observed so far and that are not constrained to have any particular structure. The rate-cost function has operational significance in multiple scenarios of interest: among others, it allows us to lower bound the minimum communication rate for fixed and variable length quantization, and for control over a noisy channel.

Perhaps surprisingly, the bound can be approached by a simple variable-length lattice quantization scheme, as long as the system noise satisfies a smoothness condition. The quantization scheme only quantizes the innovation, that is, the difference between the controller's belief about the current state and the encoder's state estimate. To prove that this simple scheme is almost as good as the optimum if the target cost is not too large, we derive a new nonasymptotic upper bound on the entropy of a lattice quantizer in terms of the Shannon lower bound to rate-distortion function and a smoothness parameter of the source.
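A toy scalar sketch of the innovation-quantization idea (a uniform one-dimensional lattice with a fixed step and certainty-equivalent control; the parameters are illustrative and this is not the paper's scheme or its bounds): only the difference between the true state and the controller's belief crosses the link, yet the unstable plant stays regulated.

```python
import numpy as np

# Unstable scalar plant x_{t+1} = a*x_t + u_t + w_t, controlled over a quantized link.
a, T, step = 2.0, 1000, 0.5
rng = np.random.default_rng(1)
x, x_hat = 0.0, 0.0              # true state; controller's belief of the state
costs = []
for _ in range(T):
    innovation = x - x_hat                    # what the controller doesn't yet know
    q = step * round(innovation / step)       # uniform (1-D lattice) quantizer
    x_hat += q                                # controller refines its belief
    u = -a * x_hat                            # certainty-equivalent control
    w = rng.normal(0.0, 0.1)                  # process noise
    x = a * x + u + w
    x_hat = a * x_hat + u                     # belief evolves under the same dynamics
    costs.append(x**2)
avg_cost = float(np.mean(costs))              # time-averaged quadratic state cost
```

Because the quantization residual is bounded by step/2, the next innovation is at most a*(step/2) plus noise, so the state (and hence the cost) stays bounded despite a > 1; shrinking the step trades a higher rate for a lower cost, which is the tradeoff the rate-cost function captures.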

Date and Time:
Friday, March 3, 2017 - 1:15pm
Venue:
Packard 202