Information Systems Lab (ISL) Colloquium

ISL & IT Forum present "Building a DNA information storage system from the bottom up"

Topic: 
Building a DNA information storage system from the bottom up
Abstract / Description: 

DNA has emerged as a compelling data storage medium due to its density, longevity, and eternal relevance compared to current memory technologies. However, the high price of synthesizing DNA (list price $3,500 per megabyte) remains a major bottleneck for adoption of this promising storage solution. In this talk, I will present our work towards breaking down this barrier using enzymatic DNA synthesize and a tailored codec for robust data retrieval. I will also touch upon some fundamental considerations when designing DNA information storage systems.

Date and Time: 
Friday, June 21, 2019 - 1:15pm
Venue: 
Packard 202

ISL & IT Forum present "Tensor networks and quantum Markov chain"

Topic: 
Tensor networks and quantum Markov chain
Abstract / Description: 

Tensor networks can often accurately approximate various systems that are of interest to physicists, but the computational cost of contracting the network is often prohibitively demanding. Quantum computer can resolve this bottleneck, but because the size of the computation scales with the size of the network, the existing quantum computers may appear to be far too noisy for this task. We prove a nontrivial error bound on the outcome of the computation that does not scale with the size of the network. This is possible because certain tensor networks are secretly implementing a quantum analogue of a (classical) Markovian dynamics. The long-time stability of the Markovian dynamics gives rise to a nontrivial error bound on the outcome of the computation. This suggests that there may be practical problems of interest that are amenable to relatively noisy quantum computers.

Date and Time: 
Friday, June 7, 2019 - 1:15pm
Venue: 
Packard 202

ISL & Stats present Stability and uncertainty quantification

Topic: 
Bridging convex and nonconvex optimization in noisy matrix completion: Stability and uncertainty quantification
Abstract / Description: 

This talk is concerned with noisy matrix completion: given partial and corrupted entries of a large low-rank matrix, how to estimate and infer the underlying matrix? Arguably one of the most popular paradigms to tackle this problem is convex relaxation, which achieves remarkable efficacy in practice. However, the statistical stability guarantees of this approach is still far from optimal in the noisy setting, falling short of explaining the empirical success. Moreover, it is generally very challenging to pin down the distributions of the convex solution, which presents a major roadblock in assessing the uncertainty, or "confidence", for the obtained estimates–a crucial task at the core of statistical inference. 

Our recent work makes progress towards understanding stability and uncertainty quantification for noisy matrix completion. When the rank of the unknown matrix is a constant: (1) we demonstrate that convex programming achieves near-optimal estimation errors vis-'avis random noise; (2) we develop a de-biased estimator that admits accurate distributional characterizations, thus enabling asymptotically optimal inference. All of this is enabled by bridging convex relaxation with the nonconvex approach, a seemingly distinct algorithmic paradigm that is provably robust against noise.


This is joint work with Cong Ma, Yuling Yan, Yuejie Chi, and Jianqing Fan.

Date and Time: 
Tuesday, May 28, 2019 - 4:30pm
Venue: 
Herrin Hall Room T175

Reinforcement Learning & ISL present "Do Deep Generative Models Know What They Don't Know?"

Topic: 
Do Deep Generative Models Know What They Don't Know?
Abstract / Description: 

A neural network deployed in the wild may be asked to make predictions for inputs that were drawn from a different distribution than that of the training data. A plethora of work has demonstrated that it is easy to find or synthesize inputs for which a neural network is highly confident yet wrong. Generative models are widely viewed to be robust to such mistaken confidence as modeling the density of the input features can be used to detect novel, out-of-distribution inputs.

In this talk, we challenge this assumption. We find that the density learned by deep generative models (flow-based models, VAEs, and PixelCNNs) cannot distinguish images of common objects such as dogs, trucks, and horses (i.e. CIFAR-10) from those of house numbers (i.e. SVHN), assigning a higher likelihood to the latter when the model is trained on the former. Moreover, we find evidence of this phenomenon when pairing several popular image data sets: FashionMNIST vs MNIST, CelebA vs SVHN, ImageNet vs CIFAR-10 / CIFAR-100 / SVHN. To investigate this curious behavior, we focus analysis on flow-based generative models in particular since they are trained and evaluated via the exact marginal likelihood. We find such behavior persists even when we restrict the flows to constant-volume transformations. These transformations admit some theoretical analysis, and we show that the difference in likelihoods can be explained by the location and variances of the data and the model curvature. Our results caution against using the density estimates from deep generative models to identify inputs similar to the training distribution until their behavior for out-of-distribution inputs is better understood.

Date and Time: 
Tuesday, May 28, 2019 - 4:00pm
Venue: 
Packard 101

ISL & IT Forum present "String reconstruction problems inspired by problems in multiomics data processing"

Topic: 
String reconstruction problems inspired by problems in multiomics data processing
Abstract / Description: 

String reconstruction problems frequently arise in many areas of genomic data processing molecular storage, and synthetic biology. In the most general setting, they may be described as follows: one is given a single or multiple copies of a coded or uncoded string, and the copies are subsequently subjected to some form of (random) processing such as fragmentation or repeated transmission through a noise-inducing channel. The goal of the reconstruction method is to obtain an exact or approximate version of the string based on the processed outputs. Examples of string reconstruction questions include reconstruction from noisy traces, reconstruction from substrings and k-decks and reconstruction from compositional substring information. We review the above and some related problems and then proceed to describe coding methods that lead to strings that can be reconstructed exactly from their noisy traces, substrings and compositions. In particular, we focus on DNA profile codes, hybrid reconstruction from traces and uniquely reconstructable code designs. In the process, we introduce new questions in the areas of restricted de Bruin graphs, counting of rational points in polytopes, and string replacement methods.

This is a joint work with Ryan Gabrys, Han Mao Kiah, Srilakshmi Pattabiraman and Gregory Puleo.

Date and Time: 
Friday, May 31, 2019 - 1:15pm
Venue: 
Packard 202

IT Forum & ISL Colloquium present "Extracting information from cells"

Topic: 
Extracting information from cells
Abstract / Description: 

Genomic approaches to studying the molecular biology of the cell have advanced considerably over the pas few years, and it is now possible to make high-throughput measurements of molecules in individual cells. I will discuss some of the considerations that must be taken into account in designing experiments, and subsequently in extracting the maximum information from them (optimally).

Date and Time: 
Friday, May 24, 2019 - 1:15pm
Venue: 
Packard 202

ISL Colloquium presents "A structural result for Personalized PageRank and its algorithmic consequences"

Topic: 
A structural result for Personalized PageRank and its algorithmic consequences
Abstract / Description: 

Many systems, including the Internet, social networks, and the power grid, can be represented as graphs. When analyzing graphs, it is often useful to compute scores describing the relative importance or distance between nodes. One example is Personalized PageRank (PPR), which assigns to each node v a vector whose i-th entry describes the importance of the i-th node from the perspective of v. PPR has proven useful in many applications, such as recommending who users should follow on social networks (if this i-th entry is large, v may be interested in following the i-th user). Unfortunately, computing n such PPR vectors (where n is the number of nodes) is infeasible for many graphs of interest.

In this work, we argue that the situation is not so dire. Our main result shows that the dimensionality of the set of PPR vectors scales sublinearly in n with high probability, for a certain class of random graphs and for a notion of dimensionality similar to rank. Put differently, we argue that the effective dimension of this set is much less than n, despite the fact that the matrix containing these vectors has rank n. Furthermore, we show this dimensionality measure relates closely to the complexity of a PPR estimation scheme that was proposed (but not analyzed) by Jeh and Widom. This allows us to argue that accurately estimating all n PPR vectors amounts to computing a vanishing fraction of the n^2 vector elements (when the technical assumptions of our main result are satisfied). Finally, we demonstrate empirically that similar conclusions hold when considering real-world networks, despite the assumptions of our theory not holding.

This is joint work with Daniel Vial, University of Michigan and will appear in ACM Sigmetrics 2019.

Date and Time: 
Thursday, May 16, 2019 - 4:15pm
Venue: 
Packard 101

ISL Colloquium presents "Optimization Aspects of Temporal Abstraction in Reinforcement Learning"

Topic: 
Optimization Aspects of Temporal Abstraction in Reinforcement Learning
Abstract / Description: 

Temporal abstraction refers to the idea that complicated sequential decision making problems can sometimes be simplified by considering the "big picture" first. In this talk, I will give an overview of some of my work on learning such temporal abstractions end-to-end within the "option-critic" architecture (Bacon et al., 2017). I will then explain how other related hierarchical RL frameworks, such as Feudal RL by Dayan and Hinton (1993), can also be approached under the same option-critic architecture. However, we will see that that this formulation leads to a so-called "bilevel" optimization problem. While this is a more difficult problem, the good news is that the literature on bilevel optimization is rich and many of its tools have yet to be re-discovered by our community. I will finally show how "iterative differentiation" techniques (Griewankand Walther, 2008) can be applied to our problem while providing a new interpretation to the "inverse RL" approach of Rust (1988).

Date and Time: 
Thursday, May 2, 2019 - 4:15pm
Venue: 
Packard 101

ISL Colloquium presents "Decentralized control over unreliable communication links"

Topic: 
Decentralized control over unreliable communication links
Abstract / Description: 

Decentralized control problems have been a topic of significant research interest due to their relevance to multi-agent systems and large-scale distributed systems.The design of optimal decentralized control strategies has been investigated under various models for inter-controller communication such as graph-based communication models and communication with delays. A common feature of much of the prior work is that the underlying communication structure of the decentralized system is assumed to be fixed and unchanging. For example, several works assume a fixed communication graph among controllers whose edges describe perfect communication links between controllers. Similarly, when the communication graph incorporates delays, the delays are assumed to be fixed and known. This is a key limitation since in many situations communication among controllers may suffer from imperfections such as random packet loss and random packet delays. These imperfections introduce a new layer of uncertainty in the information structure that is not present in the models considered in prior work. In this talk, we will describe a decentralized LQG control problem where some of the communication links suffer from random packet loss. We will first identify optimal decentralized control strategies for finite horizon version of our problem. We will then discuss the infinite horizon problem and show that there are critical thresholds for packet loss probabilities above which no strategy can achieve finite cost and below which optimal strategies can be explicitly identified.

Date and Time: 
Thursday, April 18, 2019 - 4:15pm
Venue: 
Packard 101

#StanfordToo: A Conversation about Sexual Harassment in Our Academic Spaces

Topic: 
#StanfordToo: A Conversation about Sexual Harassment in Our Academic Spaces
Abstract / Description: 

Individuals of all genders invited to be a part of:
#StanfordToo: A Conversation about Sexual Harassment in Our Academic Spaces, where we will feature real stories of harassment at Stanford academic STEM in a conversation with Provost Drell, Dean Minor (SoM), and Dean Graham (SE3). We will have plenty of time for audience discussion on how we can take concrete action to dismantle this culture and actively work towards a more inclusive Stanford for everyone. While our emphasis is on STEM fields, we welcome and encourage participation from students, postdocs, staff, and faculty of all academic disciplines and backgrounds.

Date and Time: 
Friday, April 19, 2019 - 3:30pm
Venue: 
STLC 111

Pages

Subscribe to RSS - Information Systems Lab (ISL) Colloquium