IT-Forum

ISL Colloquium and IT-Forum present "A Notion of Entropy for Sparse Marked Graphs and its Applications in Graphical Data Compression"

Topic: 
A Notion of Entropy for Sparse Marked Graphs and its Applications in Graphical Data Compression
Abstract / Description: 

Many modern data sources arising from social networks, biological data, and the like are best viewed as indexed by combinatorial structures such as graphs, rather than as time series. The local weak limit theory for sparse graphs, also known as the objective method, due to Benjamini, Schramm, Aldous, Steele, Lyons, and others, provides a framework in which one can make sense of a stationary process indexed by graphs. The theory of time series is the engine driving an enormous range of applications in areas such as control theory, communications, information theory, and signal processing. It is to be expected that a theory of stationary stochastic processes indexed by combinatorial structures, in particular graphs, would eventually have a similarly wide-ranging impact.

Employing the above framework, we introduce a notion of entropy for probability distributions on rooted graphs. This generalizes the notion of entropy introduced by Bordenave and Caputo to graphs that carry marks on their vertices and edges. Such marks can represent information in real-world data. For instance, in a social network graph where each node represents an individual and edges represent friendships, a vertex mark can represent the type of an individual, while edge marks can represent data shared between friends. This notion of entropy can be considered a natural counterpart of the Shannon entropy rate in the world of graphical data. We illustrate this by introducing a universal compression scheme for marked graphical data. Furthermore, we introduce an algorithm that performs such compression with low complexity.
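
For context: the unmarked entropy of Bordenave and Caputo that the talk generalizes has, as we recall it (our paraphrase; the precise marked definition is given in the talk), the schematic form

    \Sigma(\rho) = \lim_{\varepsilon \to 0} \limsup_{n \to \infty} \frac{1}{n} \Big( \log \#\{ G : d(U(G), \rho) < \varepsilon \} - m_n \log n \Big),

where the count runs over graphs on n vertices with m_n edges, U(G) is the empirical distribution of rooted neighborhoods of G, m_n/n converges to half the average degree of \rho, and d metrizes local weak convergence. The subtraction of m_n \log n is what keeps the quantity finite in the sparse regime.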

This talk is based on joint work with Venkat Anantharam.

Date and Time: 
Friday, December 6, 2019 - 1:15pm
Venue: 
Packard 202

ISL & IT Forum present "An information-theoretic solution to random access communication"

Topic: 
An information-theoretic solution to random access communication
Abstract / Description: 

The emergence of networks of many devices in the context of cyber-physical systems underscores the need for novel solutions for communication over random access channels. Most protocols currently in place rely on collision avoidance and are woefully ill-suited to handling high variation in the number and variety of communicators. We ask whether it is possible, in a scenario where no one knows how many transmitters are active, for the receiver to almost always recover the messages sent by all active transmitters. Surprisingly, we find that not only is reliable decoding possible in this regime, but it is possible to attain both the capacity and the dispersion of the multiple access channel in operation. Our strategy relies on "semi-rateless" codes: decoding attempts are made at a sparse set of pre-determined decoding times, and the decoder is tasked with recovering from the channel output the messages of the active transmitters. Once the decoder determines that it can decode reliably, a positive acknowledgement (ACK) is sent to the encoders, indicating the end of that communication epoch and the start of a new one. The feedback ensures that the collection of active transmitters remains fixed during each epoch. These semi-rateless codes work for both channel and source coding; for source coding, our semi-rateless scheme achieves the optimal performance up to the first three order terms in the Gaussian approximation of the optimal rate. The analysis includes a refinement of the random coding argument ensuring the existence of a deterministic code that meets multiple constraints simultaneously, a random-coding union achievability bound for multiple access, and a sharp converse bound for multiple access source coding via a connection to composite hypothesis testing.
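
To make the epoch structure concrete, here is a deliberately crude toy simulation (our illustration only, with made-up numbers; it is not the speakers' coding scheme): decoding is attempted only at a sparse grid of pre-determined times, and the ACK that ends an epoch arrives later when more transmitters happen to be active, so the effective per-transmitter rate adapts to the unknown activity level.

    # Toy sketch of the semi-rateless epoch structure. Illustrative only: the
    # decoding times, per-transmitter rate, and decodability test are invented.
    DECODING_TIMES = [100, 220, 360, 520, 700]  # sparse, pre-determined grid
    RATE_PER_TX = 50.0   # toy "information" each active transmitter must deliver
    CAPACITY = 1.0       # toy information delivered per channel use

    def run_epoch(num_active):
        """Return the time of the first successful decoding attempt, else None."""
        for n in DECODING_TIMES:
            # Crude decodability proxy: by time n the channel has delivered
            # about CAPACITY * n units, shared among the active transmitters.
            if CAPACITY * n >= num_active * RATE_PER_TX:
                return n  # decoder succeeds and broadcasts an ACK; epoch ends
        return None       # outage: even the last decoding time was insufficient

    # More active transmitters -> the ACK comes later, so the per-transmitter
    # rate adapts even though nobody knows how many transmitters are active.
    for k in range(8):
        print(k, "active ->", run_epoch(k))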

Based on joint work (arXiv:1801.09018, arXiv:1902.03366) with M. Effros, R. C. Yavas, and S. Chen.

Date and Time: 
Friday, November 22, 2019 - 1:15pm
Venue: 
Packard 202

Nonparametric generative modeling via optimal transport and diffusions with provable guarantees

Topic: 
Nonparametric generative modeling via optimal transport and diffusions with provable guarantees
Abstract / Description: 

Building upon the recent theory that established the connection between implicit generative modeling (IGM) and optimal transport, in this study we propose a novel parameter-free algorithm for learning the underlying distributions of complicated datasets and sampling from them. The proposed algorithm is based on a functional optimization problem, which aims to find a measure that is as close as possible to the data distribution while remaining expressive enough for generative modeling purposes. We formulate the problem as a gradient flow in the space of probability measures. The connections between gradient flows and stochastic differential equations allow us to develop a computationally efficient algorithm for solving the optimization problem. We provide a formal theoretical analysis in which we prove finite-time error guarantees for the proposed algorithm. Our experimental results support our theory and show that our algorithm is able to successfully capture the structure of different types of data distributions.

The talk will be based on the following paper:
"Sliced-Wasserstein Flows: Nonparametric Generative Modeling via Optimal Transport and Diffusions", ICML 2019

Date and Time: 
Friday, November 8, 2019 - 1:00pm
Venue: 
Allen 101X

IT Forum & ISL Colloquium present "The Case Against Reality"

Topic: 
The Case Against Reality
Abstract / Description: 

If I have a visual experience that I describe as a red tomato a meter away, then I am inclined to believe that there is, in fact, a red tomato a meter away, even if I close my eyes. I believe that my perceptions are, in the normal case, veridical—that they accurately depict aspects of the real world. But is my belief supported by our best science? In particular: Does evolution by natural selection favor veridical perceptions? Many scientists and philosophers claim that it does. But this claim, though plausible, has not been properly tested. In this talk I present a new theorem: Veridical perceptual systems are never more fit than non-veridical perceptual systems of equal complexity that are simply tuned to the relevant fitness functions. This entails that perception is not a window on reality; it is more like a windows interface on your laptop. Spacetime is your desktop, and physical objects, like apples and neurons, are simply icons on this desktop. I discuss this interface theory of perception and its implications for one of the most puzzling unsolved problems in science: the relationship between brain activity and conscious experiences.

Date and Time: 
Friday, November 8, 2019 - 1:15pm
Venue: 
Packard 202

ISL & IT Forum present "Dictionary Compression and its Applications"

Topic: 
Dictionary Compression and its Applications
Abstract / Description: 

Dictionary compression is a well-known technique that promises to solve the problem of compressing small inputs. However, it has only become available to implementers relatively recently, as newer compression algorithms began shipping dictionary builders alongside their main codecs. Because of this short history, the complexities of deploying the technique at larger scales are only starting to be appreciated. Understanding these difficulties, and finding ways to overcome them, is key to the target system's performance and reliability. Yet the prize is big: dictionary compression not only improves compression ratios, it also offers the potential to redesign systems around its capabilities.

We'll cover the benefits, trade-offs, and operational difficulties of dictionary compression, as well as its important second-order impacts on systems adopting it.
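
As a concrete illustration of the workflow described above, the following sketch uses the python-zstandard bindings (API calls as we understand them; consult the library documentation). A dictionary is trained on many small, similar records and then shared by every compression and decompression call, which is exactly the operational coupling the talk warns about.

    # Sketch of the dictionary-compression workflow with python-zstandard.
    import zstandard as zstd

    # Many small, similar records: the regime where per-input compression
    # has too little history to work with and a shared dictionary shines.
    samples = [('{"user_id": %d, "status": "ok", "latency_ms": %d}'
                % (i, i % 97)).encode() for i in range(10_000)]

    dict_data = zstd.train_dictionary(16_384, samples)  # train a 16 KiB dictionary

    cctx = zstd.ZstdCompressor(dict_data=dict_data)
    dctx = zstd.ZstdDecompressor(dict_data=dict_data)

    record = samples[42]
    compressed = cctx.compress(record)
    assert dctx.decompress(compressed) == record
    print(len(record), "->", len(compressed), "bytes")

    # Operational caveat from the talk: the dictionary is shared state. Every
    # stored record is readable only with the exact dictionary that compressed
    # it, so dictionary versioning and distribution become part of the design.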

Date and Time: 
Friday, November 1, 2019 - 1:15pm
Venue: 
Packard 202

ISL Colloquium & IT Forum present "Generalized Resilience and Robust Statistics"

Topic: 
Generalized Resilience and Robust Statistics
Abstract / Description: 

Robust statistics traditionally focuses on outliers, or perturbations in total variation distance. However, a dataset could be corrupted in many other ways, such as systematic measurement errors and missing covariates. We generalize the robust statistics approach to consider perturbations under any Wasserstein distance, and show that robust estimation is possible whenever a distribution's population statistics are robust under a certain family of friendly perturbations. This generalizes a property called resilience previously employed in the special case of mean estimation with outliers. We justify the generalized resilience property by showing that it holds under moment or hypercontractive conditions. Even in the total variation case, these subsume conditions in the literature for mean estimation, regression, and covariance estimation; the resulting analysis simplifies and sometimes improves these known results in both population limit and finite-sample rate. Our robust estimators are based on minimum distance (MD) functionals (Donoho and Liu, 1988), which project onto a set of distributions under a discrepancy related to the perturbation. We present two approaches for designing MD estimators with good finite-sample rates: weakening the discrepancy and expanding the set of distributions. We also present connections to Gao et al. (2019)'s recent analysis of generative adversarial networks for robust estimation.
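
For orientation, the resilience property for mean estimation with outliers mentioned above can be stated as follows (our paraphrase): a distribution p is (\sigma, \varepsilon)-resilient if the mean of every (1 - \varepsilon)-fraction of its mass stays close to the full mean,

    \| \mathbb{E}_r[X] - \mathbb{E}_p[X] \| \le \sigma \quad \text{for all } r \text{ with } r \le \frac{p}{1 - \varepsilon}.

The generalization in the talk replaces the deletions implicit in this definition with other families of friendly perturbations, matched to the chosen Wasserstein distance.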

Joint work with Banghua Zhu and Jacob Steinhardt.

Date and Time: 
Friday, October 11, 2019 - 1:15pm
Venue: 
Packard 202

IT-Forum and ISL Colloquium SPECIAL SEMINAR: "Towards an Average-case Complexity of High-dimensional Statistics"

Topic: 
Towards an Average-case Complexity of High-dimensional Statistics
Abstract / Description: 

The prototypical high-dimensional statistical estimation problem entails finding a structured signal in noise. These problems have traditionally been studied in isolation, with researchers aiming to develop statistically and computationally efficient algorithms, as well as to understand the fundamental limits governing the interplay between statistical and computational cost. In this talk I will describe a line of work that yields average-case reductions directly between a number of central high-dimensional statistics problems, relating two problems by transforming one into the other. It turns out that several problems described by robust formulations can be addressed by one set of techniques, and we will focus on these in the talk. In this direction, we obtain the following average-case lower bounds based on the planted clique conjecture: a statistical-computational gap in robust sparse mean estimation, a detection-recovery gap in community detection, and a universality principle for computational-statistical gaps in sparse mixture estimation. In addition to showing strong computational lower bounds tight against what is achievable by efficient algorithms, the methodology gives insight into the common features shared by different high-dimensional statistics problems with similar computational behavior.

Joint work with Matthew Brennan.
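
For context (our gloss, not part of the abstract): the planted clique conjecture concerns the detection problem

    H_0 : G \sim \mathcal{G}(n, \tfrac{1}{2}) \quad \text{vs.} \quad H_1 : G \sim \mathcal{G}(n, \tfrac{1}{2}) \text{ plus a clique on } k \text{ random vertices},

and asserts that no polynomial-time algorithm reliably distinguishes the two when k = o(\sqrt{n}), even though exhaustive search succeeds once k exceeds roughly 2 \log_2 n. Average-case reductions from this single conjecture transfer such hardness to the estimation problems listed above.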


Special joint seminar: IT-Forum and The Information Systems Laboratory Colloquium (ISLC)

Date and Time: 
Friday, November 15, 2019 - 1:15pm
Venue: 
Packard 202

ISL Colloquium presents "Denoising and Regularization via Exploiting the Structural Bias of Convolutional Generators"

Topic: 
Denoising and Regularization via Exploiting the Structural Bias of Convolutional Generators
Abstract / Description: 

Convolutional Neural Networks (CNNs) have emerged as highly successful tools for image generation, recovery, and restoration. This success is often attributed to large amounts of training data.
In contrast, a number of recent experimental results suggest that a major contributing factor to this success is that convolutional networks impose strong prior assumptions about natural images. A surprising experiment that highlights this structural bias towards simple, natural images is that one can remove various kinds of noise and corruptions from a corrupted natural image by simply fitting (via gradient descent) a randomly initialized, over-parameterized convolutional generator to this single image. While this over-parameterized model can eventually fit the corrupted image perfectly, surprisingly, after a few iterations of gradient descent one obtains the uncorrupted image, without using any training data. This intriguing phenomenon has enabled state-of-the-art CNN-based denoising as well as regularization in linear inverse problems such as compressive sensing.
In this talk we take a step towards explaining this experimental phenomenon by attributing it to particular architectural choices of convolutional networks. We then characterize the dynamics of fitting a two-layer convolutional generator to a noisy signal and prove that early-stopped gradient descent denoises/regularizes. This result relies on showing that convolutional generators fit the structured part of an image significantly faster than the corrupted portion.

Based on joint work with Paul Hand and Mahdi Soltanolkotabi.
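
The single-image experiment described above is easy to reproduce in spirit. The following is a minimal sketch in the style of deep-image-prior / deep-decoder setups (our illustration, not the authors' code; the architecture and hyperparameters are placeholders): fit a randomly initialized convolutional generator to one corrupted image and stop gradient descent early.

    # Minimal sketch of the single-image fitting experiment (illustrative only).
    import torch
    import torch.nn as nn

    def make_generator(width=64, out_channels=3, n_up=4):
        # Over-parameterized convolutional generator: repeated
        # upsample -> conv -> ReLU -> batch-norm blocks.
        layers = []
        for _ in range(n_up):
            layers += [nn.Upsample(scale_factor=2, mode="bilinear",
                                   align_corners=False),
                       nn.Conv2d(width, width, 3, padding=1),
                       nn.ReLU(),
                       nn.BatchNorm2d(width)]
        layers += [nn.Conv2d(width, out_channels, 3, padding=1), nn.Sigmoid()]
        return nn.Sequential(*layers)

    torch.manual_seed(0)
    noisy = torch.rand(1, 3, 256, 256)  # stand-in for one corrupted natural image
    z = torch.randn(1, 64, 16, 16)      # fixed random input tensor
    g = make_generator()
    opt = torch.optim.Adam(g.parameters(), lr=1e-3)

    for step in range(1800):            # early stopping is the point: run long
        opt.zero_grad()                 # enough to fit the image's structure,
        loss = ((g(z) - noisy) ** 2).mean()  # stop before fitting the noise too
        loss.backward()
        opt.step()

    denoised = g(z).detach()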

The Information Systems Laboratory Colloquium (ISLC) is typically held in Packard 101 every Thursday at 4:30 pm during the academic year. Coffee and refreshments are served at 4:00 pm in the second-floor kitchen of the Packard Building.

The Colloquium is organized by graduate students Joachim Neu, Tavor Baharav and Kabir Chandrasekher. To suggest speakers, please contact any of the students.

To receive email notifications of seminars, you can join the ISL mailing list.

Date and Time: 
Thursday, October 3, 2019 - 4:30pm
Venue: 
Packard 101

IT Forum & ISL Colloquium present "Proving Coding Theorems using Poisson Processes"

Topic: 
Proving Coding Theorems using Poisson Processes
Abstract / Description: 

In information theory, coding theorems are usually proved in the asymptotic regime where the blocklength tends to infinity. While there are techniques for finite-blocklength analysis, they are often more complex than their asymptotic counterparts. In this talk, we study the use of Poisson processes in proving coding theorems, which not only gives sharp finite-blocklength results, but also yields significantly shorter proofs than conventional asymptotic techniques in some settings. Instead of using fixed-size random codebooks, we construct the codebook as a Poisson process. We present a lemma, called the Poisson matching lemma, which can replace both packing and covering lemmas in proofs based on typicality. We then demonstrate its use in settings such as channel coding with channel state information at the encoder, lossy source coding with side information at the decoder, joint source-channel coding, broadcast channels, and distributed lossy source coding. This shows that the Poisson matching lemma is a viable alternative to typicality for most problems in network information theory.
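
At the heart of this approach is the Poisson functional representation, which the following toy illustrates (our discrete-alphabet sketch, not the speaker's proof): codewords arrive as a rate-1 Poisson process with i.i.d. marks drawn from a codebook distribution Q, and selecting the point that minimizes T_k / (dP/dQ)(Z_k) yields a mark distributed according to the target P.

    # Toy sketch of the Poisson functional representation behind the matching
    # lemma (our discrete-alphabet illustration).
    import numpy as np

    rng = np.random.default_rng(0)
    Q = np.array([0.25, 0.25, 0.25, 0.25])  # codebook (proposal) distribution
    P = np.array([0.70, 0.10, 0.10, 0.10])  # target distribution

    def sample_via_poisson(n_points=10_000):
        # Arrival times of a rate-1 Poisson process, each point carrying an
        # i.i.d. mark ~ Q; truncating the process to n_points makes the law
        # of the output only approximately (here: essentially exactly) P.
        times = np.cumsum(rng.exponential(size=n_points))
        marks = rng.choice(len(Q), size=n_points, p=Q)
        # Select the point minimizing T_k / (dP/dQ)(Z_k); its mark follows P.
        k = np.argmin(times / (P[marks] / Q[marks]))
        return marks[k]

    samples = [sample_via_poisson() for _ in range(5000)]
    print(np.bincount(samples, minlength=len(P)) / 5000)  # ~ [0.7, 0.1, 0.1, 0.1]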

The talk is based on joint work with Prof. Venkat Anantharam (UC Berkeley).

Date and Time: 
Friday, September 27, 2019 - 1:15pm
Venue: 
Packard 202

ISL & IT Forum present "Human-Interpretable Concept Learning via Information Lattices"

Topic: 
Human-Interpretable Concept Learning via Information Lattices
Abstract / Description: 

Is it possible to learn the laws of music theory directly from raw sheet music, in the same human-interpretable form as a music theory textbook? How little prior knowledge need be encoded to do so? We consider these and similar questions in other topical domains while developing a general framework for automatic concept learning. The basic idea is an iterative discovery algorithm that has a student-teacher architecture and operates on a generalization of Shannon's information lattice, which itself encodes a hierarchy of abstractions and is algorithmically constructed from group-theoretic foundations. In particular, learning this hierarchy of invariant concepts involves iterative optimization of Bayesian surprise and entropy. This is a first step towards a principled and cognitive approach to automatic concept learning and knowledge discovery. We further discuss applications in computational creativity, and limit theorems for creativity.
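
For reference (our gloss, not part of the abstract): Bayesian surprise is commonly formalized as the Kullback-Leibler divergence between posterior and prior beliefs,

    \mathrm{Surprise}(D) = D_{\mathrm{KL}}\big( P(\cdot \mid D) \,\|\, P(\cdot) \big),

so each round of the student-teacher loop can be read as seeking the candidate abstraction whose confirmation would most revise the learner's current belief state.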

Date and Time: 
Friday, September 6, 2019 - 1:15pm
Venue: 
Packard 202
