Information Systems Lab (ISL) Colloquium

ISL Colloquium presents "Self-Programming Networks: Applications to Financial Trading Systems"

Topic: 
Self-Programming Networks: Applications to Financial Trading Systems
Abstract / Description: 

We describe Self-Programming Networks (SPNs), an ongoing research effort at Stanford for making cloud computing networks autonomous; that is, to enable the networks to sense and monitor themselves, and program and control themselves. We describe the goals and the architecture of SPNs and present two key outcomes: (i) Huygens, for scalable and accurate clock synchronization, and (ii) Simon, for fine-grained network telemetry using observations from the network’s edge. We also describe the relevance of this work to existing financial trading systems and demonstrate how, in future, it enables financial trading systems in the Cloud.


The Information Systems Laboratory Colloquium (ISLC) is typically held in Packard 101 every Thursday at 4:30 pm during the academic year. Coffee and refreshments are served at 4pm in the second floor kitchen of Packard Bldg.

The Colloquium is organized by graduate students Joachim Neu, Tavor Baharav and Kabir Chandrasekher. To suggest speakers, please contact any of the students.

Date and Time: 
Thursday, January 23, 2020 - 4:30pm
Venue: 
Packard 101

ISL Colloquium presents "Theoretical Reflections on Quantum Supremacy"

Topic: 
Theoretical Reflections on Quantum Supremacy
Abstract / Description: 

The recent demonstration of quantum supremacy by Google is a first step towards the era of small to medium scale quantum computation. In this talk I will explain what the experiment accomplished and the theoretical work it is based on, as well as what it did not accomplish and the many theoretical and practical challenges that remain. I will also describe recent breakthroughs in the design of protocols for the testing and benchmarking of quantum computers, a task that has deep computational and philosophical implications. Specifically, this leads to protocols for scalable and verifiable quantum supremacy, certifiable quantum random generation and verification of quantum computation.

The talk will be aimed at a broad audience.


The Information Systems Laboratory Colloquium (ISLC) is typically held in Packard 101 every Thursday at 4:30 pm during the academic year. Coffee and refreshments are served at 4pm in the second floor kitchen of Packard Bldg.

The Colloquium is organized by graduate students Joachim Neu, Tavor Baharav and Kabir Chandrasekher. To suggest speakers, please contact any of the students.

Date and Time: 
Thursday, January 16, 2020 - 4:30pm
Venue: 
Packard 101

ISL Colloquium presents "The Robustness Problem"

Topic: 
The Robustness Problem
Abstract / Description: 

Despite impressive performance on many benchmarks, state-of-the-art machine learning algorithms have been shown to be extremely brittle on out-of-distribution inputs. While there has been a focus in recent years on robustness to small lp-perturbations, this talk will discuss robustness to more general types of corruptions. We will investigate several questions related to robustness: Why are current models so brittle? Is recent work on lp-robustness making progress towards robustness to distribution shift? How should we best measure model robustness to ensure that models can be safely deployed in complex dynamic environments? Additionally, we will present experiments showing how models latch onto spurious correlations in image data, and how data augmentation shifts model bias towards different features in the data, resulting in trade-offs in the robustness properties of the model.


The Information Systems Laboratory Colloquium (ISLC) is typically held in Packard 101 every Thursday at 4:30 pm during the academic year. Coffee and refreshments are served at 4pm in the second floor kitchen of Packard Bldg.

The Colloquium is organized by graduate students Joachim Neu, Tavor Baharav and Kabir Chandrasekher. To suggest speakers, please contact any of the students.

Date and Time: 
Thursday, January 9, 2020 - 4:30pm
Venue: 
Packard 101

ISL & IT Forum present "Higher Criticism for discriminating frequency-tables and testing authorship"

Topic: 
Higher Criticism for discriminating frequency-tables and testing authorship
Abstract / Description: 

The Higher Criticism (HC) test is a useful tool for detecting the presence of a signal spread across a vast number of features, especially in the sparse setting when only few features are useful while the rest contain only noise. We adapt the HC test to the two-sample setting of detecting changes between two frequency tables. We apply this adaptation to authorship attribution challenges, where the goal is to identify the author of a document using other documents whose authorship is known. The method is simple yet performs well without handcrafting and tuning. Furthermore, as an inherent side effect, the HC calculation identifies a subset of discriminating words, which allow additional interpretation of the results. Our examples include authorship in the Federalist Papers and machine-generated texts.

We take two approaches to analyze the success of our method. First, we show that, in practice, the discriminating words identified by the test: have low variance across documents belonging to a corpus of homogeneous authorship. We conclude that in testing a new document against the corpus of an author, HC is mostly affected by words characteristic of that author and is relatively unaffected by topic structure. Finally, we analyze the power of the test in discriminating two multinomial distributions under sparse and weak perturbations model. We show that our test has maximal power in a wide range of the model parameters, even though these parameters are unknown to the user.

Date and Time: 
Friday, January 10, 2020 - 1:15pm
Venue: 
Packard 202

ISL Colloquium presents "Implicit Regularization for Optimal Sparse Recovery"

Topic: 
Implicit Regularization for Optimal Sparse Recovery
Abstract / Description: 

Ridge regression is a fundamental paradigm in machine learning and statistics, and it has long been known to be closely connected to the implicit regularization properties of gradient descent methods, cf. early stopping. Over the past decade, this connection has sparked research into a variety of directions aimed at developing computationally efficiency estimators, including acceleration, mini-batching, averaging, sketching, sub-sampling, preconditioning, and decentralization. Sparse recovery is another cornerstone of modern statistics and learning frameworks. Yet, here the connection to implicit regularization is not as well developed. Most results in the literature only involve limit statements (holding at convergence, for infinitesimal step sizes), apply to regimes with no (or limited) noise, and do not focus on computational efficiency. In this talk, we address the following question: Can we establish an implicit regularization theory for gradient descent to yield optimal sparse recovery in noisy settings, achieving minimax rates with the same cost of reading the data? We will highlight the key ideas to obtain the first results in this direction, along with a few surprising findings.

Date and Time: 
Friday, December 6, 2019 - 11:00am to Saturday, December 7, 2019 - 10:55am
Venue: 
Packard 202

ISL Colloquium and IT-Forum present "A Notion of Entropy for Sparse Marked Graphs and its Applications in Graphical Data Compression"

Topic: 
A Notion of Entropy for Sparse Marked Graphs and its Applications in Graphical Data Compression
Abstract / Description: 

Many modern data sources arising from social networks, biological data, etc. are best viewed as indexed by combinatorial structures such as graphs, rather than time series. The local weak limit theory for sparse graphs, also known as the objective method, due to Benjamini, Schramm, Aldous, Steele, Lyons and others, provides a framework which enables one to make sense of a stationary process indexed by graphs. The theory of time series is the engine driving an enormous range of applications in areas such as control theory, communications, information theory and signal processing. It is to be expected that a theory of stationary stochastic processes indexed by combinatorial structures, in particular graphs, would eventually have a similarly wide-ranging impact.

Employing the above framework, we introduce a notion of entropy for probability distributions on rooted graphs. This is a generalization of the notion of entropy introduced by Bordenave and Caputo to graphs which carry marks on their vertices and edges. Such marks can represent information on real-world data. For instance, in a social network graph where each node represents an individual and edges represent friendships, a vertex mark represents the type of an individual, while edge marks represent shared data between friends. The above notion of entropy can be considered as a natural counterpart for the Shannon entropy rate in the world of graphical data. We illustrate this by introducing a universal compression scheme for marked graphical data. Furthermore, we introduce an algorithm that can perform such a compression with low complexity.

This talk is based on joint work with Venkat Anantharam.

Date and Time: 
Friday, December 6, 2019 - 1:15pm
Venue: 
Packard 202

ISL & IT Forum present "An information-theoretic solution to random access communication"

Topic: 
An information-theoretic solution to random access communication
Abstract / Description: 

The emergence of networks of many devices in the context of cyber-physical systems underscores the need for novel solutions for communication over random access channels. Most protocols currently in place rely on collision avoidance and are woefully ill-suited to handling high variation in the number and variety of communicators. We ask the question of whether it is possible, in a scenario where no one knows how many transmitters are active, for the receiver to almost always recover the messages sent by all active transmitters. Surprisingly, we find that not only is reliable decoding possible in this regime, but it is possible to attain both the capacity and the dispersion of the multiple access channel in operation. Our strategy relies on ''semi-rateless'' codes: decoding attempts are made at a sparse set of pre-determined decoding times, and the decoder is tasked with recovering from the channel output the messages of the active transmitters. Once the decoder determines that reliable decoding can be made, a positive acknowledgement (ACK) is sent to the encoders, indicating the end of that communication epoch and the start of a new one. The feedback ensures that the collection of active transmitters remains fixed during each epoch. These semi-rateless codes work for both channel and source coding; for source coding, our semi-rateless scheme achieves the optimal performance up to the first three order terms in the Gaussian approximation of the optimal rate. The analysis includes a refinement of the random coding argument ensuring the existence of a deterministic code that meets multiple constraints simultaneously, a random-coding union achievability bound for multiple access, and a sharp converse bound for multiple access source coding via a connection to composite hypothesis testing.

Based on joint works (ArXiv:1801.09018, ArXiv:1902.03366) with M. Effros, R. C. Yavas, S. Chen.

Date and Time: 
Friday, November 22, 2019 - 1:15pm
Venue: 
Packard 202

Nonparametric generative modeling via optimal transport and diffusions with provable guarantees

Topic: 
Nonparametric generative modeling via optimal transport and diffusions with provable guarantees
Abstract / Description: 

By building upon the recent theory that established the connection between implicit generative modeling (IGM) and optimal transport, in this study, we propose a novel parameter-free algorithm for learning the underlying distributions of complicated datasets and sampling from them. The proposed algorithm is based on a functional optimization problem, which aims at finding a measure that is close to the data distribution as much as possible and also expressive enough for generative modeling purposes. We formulate the problem as a gradient flow in the space of probability measures. The connections between gradient flows and stochastic differential equations let us develop a computationally efficient algorithm for solving the optimization problem. We provide formal theoretical analysis where we prove finite-time error guarantees for the proposed algorithm. Our experimental results support our theory and show that our algorithm is able to successfully capture the structure of different types of data distributions.

The talk will be based on the following paper:
"Sliced-Wasserstein Flows: Nonparametric Generative Modeling via Optimal Transport and Diffusions", ICML 2019

Date and Time: 
Friday, November 8, 2019 - 1:00pm to Saturday, November 9, 2019 - 12:55pm
Venue: 
Allen 101X

IT Forum & ISL Colloquium presents "The Case Against Reality"

Topic: 
The Case Against Reality
Abstract / Description: 

If I have a visual experience that I describe as a red tomato a meter away, then I am inclined to believe that there is, in fact, a red tomato a meter away, even if I close my eyes. I believe that my perceptions are, in the normal case, veridical—that they accurately depict aspects of the real world. But is my belief supported by our best science? In particular: Does evolution by natural selection favor veridical perceptions? Many scientists and philosophers claim that it does. But this claim, though plausible, has not been properly tested. In this talk I present a new theorem: Veridical perceptual systems are never more fit than non-veridical perceptual systems of equal complexity that are simply tuned to the relevant fitness functions. This entails that perception is not a window on reality; it is more like a windows interface on your laptop. Spacetime is your desktop, and physical objects, like apples and neurons, are simply icons on this desktop. I discuss this interface theory of perception and its implications for one of the most puzzling unsolved problems in science: the relationship between brain activity and conscious experiences.

Date and Time: 
Friday, November 8, 2019 - 1:15pm
Venue: 
Packard 202

ISL & IT Forum present " Dictionary Compression and its Applications"

Topic: 
Dictionary Compression and its Applications
Abstract / Description: 

Dictionary compression is a known technique, promising to solve the problem of compressing small inputs. However, it has been only available to implementers since relatively recently, as newer compression algorithms shipped dictionary builders alongside their main codec. Due to this recent timeframe, complexities around deploying this solution at larger scales only start to be appreciated. Understanding these difficulties, and finding ways to harness them, is key to target system's performance and reliability. Yet, the price is big, as dictionary compression not only improves compression, it also offers the potential to redesign systems around their capabilities.

We'll cover the benefits, trade-off and operational difficulties of dictionary compression, as well as their important second-order impacts for systems adopting it.

Date and Time: 
Friday, November 1, 2019 - 1:15pm
Venue: 
Packard 202

Pages

Subscribe to RSS - Information Systems Lab (ISL) Colloquium