# IT-Forum

## IT-Forum

Topic:
On choice and communication in random matching markets
Abstract / Description:

We study the structure of stable matchings in two-sided random matching markets. First we show that even the slightest imbalance yields an essentially unique stable matching explaining why unique stable matching are empirically ubiquitous. This raises the question of how the composition of the market determines the set of potential matches. Towards this goal we study the communication requirements for reaching a stable matching in tiered random markets. We find that in such markets, a small amount of communication is needed from each agent to reach a stable matching with high probability. In particular, we construct a communication protocol that finds a stable matching with $O(\log^2 n)$ bits of communication on average per agent, and $O(\sqrt{n}\polylog (n))$ bits in worst case for an agent. Our results are tight in the sense that any communication protocol requires at least these costs, both on average and for the worst case agent.

Our construction reveals an illuminating structure of stable matchings. Agents ''apply'' to their favorite agents in a ''target tier'', which is the worst tier they can guarantee to be matched into. On the other hand, each agent maybe reached out by other agents from her top potential tier(s). Our results suggest that agents do not need to know their complete preferences, but only their favorite potential matches in their target tier and their preferences among agents who reach out to them.

Based on joint works with (i) Yash Kanoria and Jacob Leshno, and (ii) with Mark Braverman, Yash Kanoria and Peng Shi.

The Information Theory Forum (IT-Forum) at Stanford ISL is an interdisciplinary academic forum which focuses on mathematical aspects of information processing. With a primary emphasis on information theory, we also welcome researchers from signal processing, learning and statistical inference, control and optimization to deliver talks at our forum. We also warmly welcome industrial affiliates in the above fields. The forum is typically held in Packard 202 every Friday at 1:00 pm during the academic year.

The Information Theory Forum is organized by graduate students Jiantao Jiao and Yanjun Han. To suggest speakers, please contact any of the students.

Date and Time:
Friday, May 6, 2016 - 1:00pm to 2:00pm
Venue:
Packard 202

## IT-Forum

Topic:
TBA
Abstract / Description:

Data in the form of noisy pairwise comparisons arises in many domains such as sports, crowdsourcing and operations research. Some fundamental inferential questions include estimating a total or partial order of the items, or estimating the set of underlying pairwise-comparison probabilities. Prior works on these topics often rely on restrictive "parametric" models, that often provide poor fits to the data. Instead, we consider much richer models that rely on "permutations" rather than parameters. We establish tight statistical (information-theoretic) guarantees on estimation under these models, showing that this increased flexibility in the models yields a significantly higher robustness to the estimation procedures, while remarkably, increasing the worst case risk by only logarithmic factors. We also discuss various associated computational challenges.

Joint work with Sivaraman Balakrishnan, Adityanand Guntuboyina and Martin J. Wainwright

The Information Theory Forum (IT-Forum) at Stanford ISL is an interdisciplinary academic forum which focuses on mathematical aspects of information processing. With a primary emphasis on information theory, we also welcome researchers from signal processing, learning and statistical inference, control and optimization to deliver talks at our forum. We also warmly welcome industrial affiliates in the above fields. The forum is typically held in Packard 202 every Friday at 1:00 pm during the academic year.

The Information Theory Forum is organized by graduate students Jiantao Jiao and Yanjun Han. To suggest speakers, please contact any of the students.

Date and Time:
Friday, April 29, 2016 - 1:00pm to 2:00pm
Venue:
Packard 202

## IT-Forum

Topic:
Neural universal discrete denoising
Abstract / Description:

In this talk, I will present a novel framework of applying deep neural network (DNN) to the universal discrete denoising problem. DNN has recently shown remarkable performance improvements in diverse applications, and most of the success are based on the supervised learning framework. While successful in many applications, it is not straightforward to apply such framework to the universal discrete denoising problem, in which a denoiser tries to estimate an unknown finite-valued clean data based on its noisy observation. The reason is because the ground-truth label for a denoiser is the clean data subject to the estimation and is clearly not available for training a denoiser. In this work, I follow the framework of DUDE (Discrete Universal DEnoiser) and devise a novel way of training a DNN as a discrete denoiser solely based on the given noisy observation. The key idea is to utilize an unbiased estimate of the true loss of a denoiser and define a novel objective function for DNN based on the "pseudo-labels". The resulting scheme is dubbed as Neural DUDE, and the experimental results show that Neural DUDE significantly outperforms the original DUDE, which is the state-of-the-art on several discrete denoising problems. Furthermore, we show that Neural DUDE overcomes the critical limitation of DUDE, namely, it is much more robust to the choice of the hyper-parameter and has a concrete way of choosing the best hyper-parameter for given data. Such property makes Neural DUDE an attractive choice in practice. Finally, I will conclude with some potential future research directions, such as extending the framework to the denoising of continuous-valued data.

The Information Theory Forum (IT-Forum) at Stanford ISL is an interdisciplinary academic forum which focuses on mathematical aspects of information processing. With a primary emphasis on information theory, we also welcome researchers from signal processing, learning and statistical inference, control and optimization to deliver talks at our forum. We also warmly welcome industrial affiliates in the above fields. The forum is typically held in Packard 202 every Friday at 1:00 pm during the academic year.

The Information Theory Forum is organized by graduate students Jiantao Jiao and Yanjun Han. To suggest speakers, please contact any of the students.

Date and Time:
Friday, April 22, 2016 - 1:00pm to 2:00pm
Venue:
Packard 202

## IT-Forum

Topic:
Securing massive MIMO at the Physical Layer
Abstract / Description:

Massive MIMO is one of the highlights of the envisioned 5G communication systems. In the massive MIMO paradigm, the base station is equipped with a number of antennas, typically much larger than the number of users served. While many issues behind the design of multicellular massive MIMO systems have been studied extensively, security of massive MIMO has not been addressed in most part. In this talk, we provide a brief introduction to physical layer security and massive MIMO before we discuss major vulnerabilities of massive MIMO as well as potential defense strategies.

To that end, we consider a single-cell downlink massive MIMO system in the presence of attackers capable of simultaneous jamming and eavesdropping. We show that, classical attacks of data jamming and data eavesdropping can be simultaneously rendered useless: as the number of antennas grow, with proper precoding and power control, mobiles can simultaneously achieve (1) full equivocation without the need to use wiretap encoding under data eavesdropping and (2) no-attack achievable rates under data jamming. However, we introduce a new attack strategy that involves jamming pilot signals and eavesdropping in succession and show significant reductions in the achievable secrecy rate, even in the asymptotic regime in the number of antennas. To counter this attack, we develop a defense strategy in which we use a secret key to encrypt the pilot sequence assignments, rather than encrypt the data. We show that hiding the training signal assignments from the attacker enables the mobiles to achieve secure degrees of freedom, identical to the achievable degrees of freedom under no attack.

Date and Time:
Friday, April 15, 2016 - 1:00pm to 2:00pm
Venue:
Packard 202

## IT-Forum

Topic:
Low-rank Estimation: Why Non-convex Gradient Descent Works
Abstract / Description:

A large range of problems in statistics, machine learning and signal processing can be cast as fitting a structured low-rank matrix to noisy data. A popular approach to the resulting rank-constrained optimization problem is convex SDP relaxation, which runs in polynomial time in principle and enjoys desirable statistical guarantees. For large-scale problems, however, the (super-)quadratic time complexity of SDPs is often too high. An attractive alternative is to run projected gradient descent directly over the space of low-rank matrices. This approach is highly scalable, but its convergence and statistical accuracy were unclear due to non-convexity. Here we develop a unified framework characterizing the convergence of this non-convex method as well as the statistical properties of the resulting solutions. Our results provide insights into why we should expect non-convex methods to work in general, and yield global guarantees for a large number of concrete problems, including matrix completion with real and one-bit observations, matrix decomposition, robust and sparse PCA and graph clustering. Our framework is powerful enough to accommodate nonlinear measurements, matrices with arbitrary rank, and the noisy, constrained setting with additional structures in the noise and target matrices, and moreover does not require resampling.

Date and Time:
Friday, April 1, 2016 - 1:00pm to 2:00pm
Venue:
Packard 202

## IT-Forum

Topic:
When your big data seems too small: accurate inferences beyond the empirical distribution
Abstract / Description:

We discuss two problems related to the general challenge of making accurate inferences about a complex distribution, in the regime in which the amount of data (i.e the sample size) is too small for the empirical distribution of the samples to be an accurate representation of the underlying distribution. The first problem is the task of accurately estimating the eigenvalues of the covariance matrix of a distribution--the "population spectrum". (These eigenvalues contain basic information about the distribution, including the presence/lack of low-dimensional structure in the distribution and the applicability of many higher-level machine learning and multivariate statistical tools.) As we show, even in the regime where the sample size is linear or sublinear in the dimensionality of the distribution, and hence the eigenvalues and eigenvectors of the empirical covariance matrix are misleading, accurate approximations to the true population spectrum are possible. In the second portion of the talk, we discuss the problem of recovering a low-rank approximation to a matrix of probabilities P, given access to an observed matrix of "counts" obtained via independent samples from the distribution defined by P. This problem can be viewed as a generalization of "community detection", and is relevant to several recent machine learning efforts, including the work on constructing "word embeddings".

Date and Time:
Friday, February 12, 2016 - 1:00pm to 2:00pm
Venue:
Packard 202

## IT-Forum

Topic:
Asymptotically tight bounds on the depth of estimated context trees
Abstract / Description:

We study the maximum depth of context tree estimates, i.e., the maximum Markov order attainable by an estimated tree model given an input sequence of length n. We consider two classes of estimators:

1) Penalized maximum likelihood (PML) estimators where a context tree T is obtained by minimizing a cost of the form -log P_T(x^n) + f(n)|S_T|, where P_T(x^n) is the ML probability of the input sequence x^n under a tree model T, S_T is the set of states defined by T, and f(n) is an increasing (penalization) function of n (the popular BIC estimator corresponds to f(n)= (A-1)/2 log n, where A is the size of the input alphabet).

2) MDL estimators based on the KT probability assignment. In each case we derive an asymptotic upper bound, n^{1/2 + o(1)}, and we exhibit explicit input sequences that show that this bound is asymptotically tight up to the term o(1) in the exponent.

It is based on joint work with Gadiel Seroussi.

The Information Theory Forum (IT-Forum) at Stanford ISL is an interdisciplinary academic forum which focuses on mathematical aspects of information processing. With a primary emphasis on information theory, we also welcome researchers from signal processing, learning and statistical inference, control and optimization to deliver talks at our forum. We also warmly welcome industrial affiliates in the above fields. The forum is typically held in Packard 202 every Friday at 1:00 pm during the academic year.

The Information Theory Forum is organized by graduate students Jiantao Jiao and Kartik Venkat. To suggest speakers, please contact any of the students.

Date and Time:
Friday, February 5, 2016 - 1:00pm to 2:00pm
Venue:
Packard 202

## IT-Forum

Topic:
Computational Limits in Statistical Inference: Hidden Cliques and Sum of Squares
Abstract / Description:

Characterizing the computational complexity of statistical inference problems is an outstanding open problem. This is gaining increasing importance given the ubiquity of large scale data analysis and algorithms in application domains as diverse as genomics, healthcare, finance and social sciences. The hidden clique problem is a prototypical example of an inference problem wherein computational constraints dramatically alter the inference ability. The hidden clique problem posits to find a "hidden" or "planted" clique (a densely connected set of vertices) within an otherwise random Erdos-Renyi graph. Unfortunately, standard reduction tools from theoretical computer science to demonstrate hardness fail to capture statistical nature of the hidden clique problem. I will describe a framework based on the powerful Sum-of-Squares (SOS) technique for characterizing computational difficulty in the hidden clique and other related problems. The analysis of the SOS algorithms requires controlling the spectra of random matrices of a non-standard type. I will describe a set of results that represent the state-of-the-art in proving lower bounds for the hidden clique and related problems.

The Information Theory Forum (IT-Forum) at Stanford ISL is an interdisciplinary academic forum which focuses on mathematical aspects of information processing. With a primary emphasis on information theory, we also welcome researchers from signal processing, learning and statistical inference, control and optimization to deliver talks at our forum. We also warmly welcome industrial affiliates in the above fields. The forum is typically held in Packard 202 every Friday at 1:00 pm during the academic year.

The Information Theory Forum is organized by graduate students Jiantao Jiao and Kartik Venkat. To suggest speakers, please contact any of the students.

Date and Time:
Friday, January 22, 2016 - 1:00pm to 2:00pm
Venue:
Packard 202

## IT-Forum

Topic:
Multiple Testing and Adaptive Estimation via the Sorted L-One Norm
Abstract / Description:

In many real-world statistical problems, we observe a large number of potentially explanatory variables of which a majority may be irrelevant. For this type of problem, controlling the false discovery rate (FDR) guarantees that most of the discoveries are truly explanatory and thus replicable. In this talk, we propose a new method named SLOPE to control the FDR in sparse high-dimensional linear regression. This computationally efficient procedure works by regularizing the fitted coefficients according to their ranks: the higher the rank, the larger the penalty. This is analogous to the Benjamini-Hochberg procedure, which compares more significant p-values with more stringent thresholds. Whenever the columns of the design matrix are not strongly correlated, we show empirically that SLOPE obtains FDR control at a reasonable level while offering substantial power.

Although SLOPE is developed from a multiple testing viewpoint, we show the surprising result that it achieves optimal squared errors under Gaussian random designs over a wide range of sparsity classes. An appealing feature is that SLOPE does not require any knowledge of the degree of sparsity. This adaptivity to unknown sparsity has to do with the FDR control, which strikes the right balance between bias and variance. The proof of this result presents several elements not found in the high-dimensional statistics literature.

The Information Theory Forum (IT-Forum) at Stanford ISL is an interdisciplinary academic forum which focuses on mathematical aspects of information processing. With a primary emphasis on information theory, we also welcome researchers from signal processing, learning and statistical inference, control and optimization to deliver talks at our forum. We also warmly welcome industrial affiliates in the above fields. The forum is typically held in Packard 202 every Friday at 1:00 pm during the academic year.

The Information Theory Forum is organized by graduate students Jiantao Jiao and Kartik Venkat. To suggest speakers, please contact any of the students.

Date and Time:
Friday, January 15, 2016 - 1:00pm to 2:00pm
Venue:
Packard 202

## IT-Forum

Topic:
Improving on the Cut-Set Bound via Geometric Analysis of Typical Sets
Abstract / Description:

The cut-set bound developed by (Cover and El Gamal, 1979) has since remained the best known upper bound on the capacity of the Gaussian relay channel. We develop a new upper bound on the capacity of the Gaussian primitive relay channel which is tighter than the cut-set bound. Combined with a simple tensorization argument proposed by (Courtade and Ozgur, 2015), our result also implies that the current capacity approximations for Gaussian relay networks, which have linear gap to the cut-set bound in the number of nodes, are order-optimal and leads to a lower bound on the pre-constant.

Our approach for developing the new bound significantly deviates from the standard information-theoretic approach for proving upper bounds on the capacity of multi-user channels. We build on measure concentration to analyze the geometric probabilistic relations between the typical sets of the n-letter random variables associated with a reliable code for communicating over the channel. These relations translate to new entropy inequalities between the n-letter random variables involved. Our approach can be extended to the discrete memoryless relay channel, and in particular, can be used to obtain surprising insights on a long-standing open problem posed by (Cover, 1987), namely, what is the minimum required relay-to-destination communication rate in order for the capacity of the relay channel to be equal to that of the broadcast cut.

Date and Time:
Friday, January 8, 2016 - 1:00pm to 2:00pm
Venue:
Packard 202