EE Student Information

The Department of Electrical Engineering supports Black Lives Matter. Read more.

• • • • •

EE Student Information, Spring Quarter through Academic Year 2020-2021: FAQs and Updated EE Course List.

Updates will be posted on this page, as well as emailed to the EE student mail list.

Please see Stanford University Health Alerts for course and travel updates.

As always, use your best judgement and consider your own and others' well-being at all times.

Information Systems Lab (ISL) Colloquium

ISL Colloquium presents "The Robustness Problem"

Topic: 
The Robustness Problem
Abstract / Description: 

Despite impressive performance on many benchmarks, state-of-the-art machine learning algorithms have been shown to be extremely brittle on out-of-distribution inputs. While there has been a focus in recent years on robustness to small lp-perturbations, this talk will discuss robustness to more general types of corruptions. We will investigate several questions related to robustness: Why are current models so brittle? Is recent work on lp-robustness making progress towards robustness to distribution shift? How should we best measure model robustness to ensure that models can be safely deployed in complex dynamic environments? Additionally, we will present experiments showing how models latch onto spurious correlations in image data, and how data augmentation shifts model bias towards different features in the data, resulting in trade-offs in the robustness properties of the model.


The Information Systems Laboratory Colloquium (ISLC) is typically held in Packard 101 every Thursday at 4:30 pm during the academic year. Coffee and refreshments are served at 4pm in the second floor kitchen of Packard Bldg.

The Colloquium is organized by graduate students Joachim Neu, Tavor Baharav and Kabir Chandrasekher. To suggest speakers, please contact any of the students.

Date and Time: 
Thursday, January 9, 2020 - 4:30pm
Venue: 
Packard 101

ISL & IT Forum present "Higher Criticism for discriminating frequency-tables and testing authorship"

Topic: 
Higher Criticism for discriminating frequency-tables and testing authorship
Abstract / Description: 

The Higher Criticism (HC) test is a useful tool for detecting the presence of a signal spread across a vast number of features, especially in the sparse setting when only few features are useful while the rest contain only noise. We adapt the HC test to the two-sample setting of detecting changes between two frequency tables. We apply this adaptation to authorship attribution challenges, where the goal is to identify the author of a document using other documents whose authorship is known. The method is simple yet performs well without handcrafting and tuning. Furthermore, as an inherent side effect, the HC calculation identifies a subset of discriminating words, which allow additional interpretation of the results. Our examples include authorship in the Federalist Papers and machine-generated texts.

We take two approaches to analyze the success of our method. First, we show that, in practice, the discriminating words identified by the test: have low variance across documents belonging to a corpus of homogeneous authorship. We conclude that in testing a new document against the corpus of an author, HC is mostly affected by words characteristic of that author and is relatively unaffected by topic structure. Finally, we analyze the power of the test in discriminating two multinomial distributions under sparse and weak perturbations model. We show that our test has maximal power in a wide range of the model parameters, even though these parameters are unknown to the user.

Date and Time: 
Friday, January 10, 2020 - 1:15pm
Venue: 
Packard 202

ISL Colloquium presents "Implicit Regularization for Optimal Sparse Recovery"

Topic: 
Implicit Regularization for Optimal Sparse Recovery
Abstract / Description: 

Ridge regression is a fundamental paradigm in machine learning and statistics, and it has long been known to be closely connected to the implicit regularization properties of gradient descent methods, cf. early stopping. Over the past decade, this connection has sparked research into a variety of directions aimed at developing computationally efficiency estimators, including acceleration, mini-batching, averaging, sketching, sub-sampling, preconditioning, and decentralization. Sparse recovery is another cornerstone of modern statistics and learning frameworks. Yet, here the connection to implicit regularization is not as well developed. Most results in the literature only involve limit statements (holding at convergence, for infinitesimal step sizes), apply to regimes with no (or limited) noise, and do not focus on computational efficiency. In this talk, we address the following question: Can we establish an implicit regularization theory for gradient descent to yield optimal sparse recovery in noisy settings, achieving minimax rates with the same cost of reading the data? We will highlight the key ideas to obtain the first results in this direction, along with a few surprising findings.

Date and Time: 
Friday, December 6, 2019 - 11:00am
Venue: 
Packard 202

ISL Colloquium and IT-Forum present "A Notion of Entropy for Sparse Marked Graphs and its Applications in Graphical Data Compression"

Topic: 
A Notion of Entropy for Sparse Marked Graphs and its Applications in Graphical Data Compression
Abstract / Description: 

Many modern data sources arising from social networks, biological data, etc. are best viewed as indexed by combinatorial structures such as graphs, rather than time series. The local weak limit theory for sparse graphs, also known as the objective method, due to Benjamini, Schramm, Aldous, Steele, Lyons and others, provides a framework which enables one to make sense of a stationary process indexed by graphs. The theory of time series is the engine driving an enormous range of applications in areas such as control theory, communications, information theory and signal processing. It is to be expected that a theory of stationary stochastic processes indexed by combinatorial structures, in particular graphs, would eventually have a similarly wide-ranging impact.

Employing the above framework, we introduce a notion of entropy for probability distributions on rooted graphs. This is a generalization of the notion of entropy introduced by Bordenave and Caputo to graphs which carry marks on their vertices and edges. Such marks can represent information on real-world data. For instance, in a social network graph where each node represents an individual and edges represent friendships, a vertex mark represents the type of an individual, while edge marks represent shared data between friends. The above notion of entropy can be considered as a natural counterpart for the Shannon entropy rate in the world of graphical data. We illustrate this by introducing a universal compression scheme for marked graphical data. Furthermore, we introduce an algorithm that can perform such a compression with low complexity.

This talk is based on joint work with Venkat Anantharam.

Date and Time: 
Friday, December 6, 2019 - 1:15pm
Venue: 
Packard 202

ISL & IT Forum present "An information-theoretic solution to random access communication"

Topic: 
An information-theoretic solution to random access communication
Abstract / Description: 

The emergence of networks of many devices in the context of cyber-physical systems underscores the need for novel solutions for communication over random access channels. Most protocols currently in place rely on collision avoidance and are woefully ill-suited to handling high variation in the number and variety of communicators. We ask the question of whether it is possible, in a scenario where no one knows how many transmitters are active, for the receiver to almost always recover the messages sent by all active transmitters. Surprisingly, we find that not only is reliable decoding possible in this regime, but it is possible to attain both the capacity and the dispersion of the multiple access channel in operation. Our strategy relies on ''semi-rateless'' codes: decoding attempts are made at a sparse set of pre-determined decoding times, and the decoder is tasked with recovering from the channel output the messages of the active transmitters. Once the decoder determines that reliable decoding can be made, a positive acknowledgement (ACK) is sent to the encoders, indicating the end of that communication epoch and the start of a new one. The feedback ensures that the collection of active transmitters remains fixed during each epoch. These semi-rateless codes work for both channel and source coding; for source coding, our semi-rateless scheme achieves the optimal performance up to the first three order terms in the Gaussian approximation of the optimal rate. The analysis includes a refinement of the random coding argument ensuring the existence of a deterministic code that meets multiple constraints simultaneously, a random-coding union achievability bound for multiple access, and a sharp converse bound for multiple access source coding via a connection to composite hypothesis testing.

Based on joint works (ArXiv:1801.09018, ArXiv:1902.03366) with M. Effros, R. C. Yavas, S. Chen.

Date and Time: 
Friday, November 22, 2019 - 1:15pm
Venue: 
Packard 202

Nonparametric generative modeling via optimal transport and diffusions with provable guarantees

Topic: 
Nonparametric generative modeling via optimal transport and diffusions with provable guarantees
Abstract / Description: 

By building upon the recent theory that established the connection between implicit generative modeling (IGM) and optimal transport, in this study, we propose a novel parameter-free algorithm for learning the underlying distributions of complicated datasets and sampling from them. The proposed algorithm is based on a functional optimization problem, which aims at finding a measure that is close to the data distribution as much as possible and also expressive enough for generative modeling purposes. We formulate the problem as a gradient flow in the space of probability measures. The connections between gradient flows and stochastic differential equations let us develop a computationally efficient algorithm for solving the optimization problem. We provide formal theoretical analysis where we prove finite-time error guarantees for the proposed algorithm. Our experimental results support our theory and show that our algorithm is able to successfully capture the structure of different types of data distributions.

The talk will be based on the following paper:
"Sliced-Wasserstein Flows: Nonparametric Generative Modeling via Optimal Transport and Diffusions", ICML 2019

Date and Time: 
Friday, November 8, 2019 - 1:00pm
Venue: 
Allen 101X

IT Forum & ISL Colloquium presents "The Case Against Reality"

Topic: 
The Case Against Reality
Abstract / Description: 

If I have a visual experience that I describe as a red tomato a meter away, then I am inclined to believe that there is, in fact, a red tomato a meter away, even if I close my eyes. I believe that my perceptions are, in the normal case, veridical—that they accurately depict aspects of the real world. But is my belief supported by our best science? In particular: Does evolution by natural selection favor veridical perceptions? Many scientists and philosophers claim that it does. But this claim, though plausible, has not been properly tested. In this talk I present a new theorem: Veridical perceptual systems are never more fit than non-veridical perceptual systems of equal complexity that are simply tuned to the relevant fitness functions. This entails that perception is not a window on reality; it is more like a windows interface on your laptop. Spacetime is your desktop, and physical objects, like apples and neurons, are simply icons on this desktop. I discuss this interface theory of perception and its implications for one of the most puzzling unsolved problems in science: the relationship between brain activity and conscious experiences.

Date and Time: 
Friday, November 8, 2019 - 1:15pm
Venue: 
Packard 202

ISL & IT Forum present " Dictionary Compression and its Applications"

Topic: 
Dictionary Compression and its Applications
Abstract / Description: 

Dictionary compression is a known technique, promising to solve the problem of compressing small inputs. However, it has been only available to implementers since relatively recently, as newer compression algorithms shipped dictionary builders alongside their main codec. Due to this recent timeframe, complexities around deploying this solution at larger scales only start to be appreciated. Understanding these difficulties, and finding ways to harness them, is key to target system's performance and reliability. Yet, the price is big, as dictionary compression not only improves compression, it also offers the potential to redesign systems around their capabilities.

We'll cover the benefits, trade-off and operational difficulties of dictionary compression, as well as their important second-order impacts for systems adopting it.

Date and Time: 
Friday, November 1, 2019 - 1:15pm
Venue: 
Packard 202

ISL Colloquium & IT Forum present "Generalized Resilience and Robust Statistics"

Topic: 
Generalized Resilience and Robust Statistics
Abstract / Description: 

Robust statistics traditionally focuses on outliers, or perturbations in total variation distance. However, a dataset could be corrupted in many other ways, such as systematic measurement errors and missing covariates. We generalize the robust statistics approach to consider perturbations under any Wasserstein distance, and show that robust estimation is possible whenever a distribution's population statistics are robust under a certain family of friendly perturbations. This generalizes a property called resilience previously employed in the special case of mean estimation with outliers. We justify the generalized resilience property by showing that it holds under moment or hypercontractive conditions. Even in the total variation case, these subsume conditions in the literature for mean estimation, regression, and covariance estimation; the resulting analysis simplifies and sometimes improves these known results in both population limit and finite-sample rate. Our robust estimators are based on minimum distance (MD) functionals (Donoho and Liu, 1988), which project onto a set of distributions under a discrepancy related to the perturbation. We present two approaches for designing MD estimators with good finite- sample rates: weakening the discrepancy and expanding the set of distributions. We also present connections to Gao et al. (2019)'s recent analysis of generative adversarial networks for robust estimation.

Joint work with Banghua Zhu and Jacob Steinhardt

Date and Time: 
Friday, October 11, 2019 - 1:15pm
Venue: 
Packard 202

ISL Colloquium welcomes Fanny Yang

Topic: 
TBA
Abstract / Description: 

TBA


The Information Systems Laboratory Colloquium (ISLC)

is typically held in Packard 101 every Thursday at 4:30 pm during the academic year. Coffee and refreshments are served at 4pm in the second floor kitchen of Packard Bldg.

The Colloquium is organized by graduate students Joachim Neu, Tavor Baharav and Kabir Chandrasekher. To suggest speakers, please contact any of the students.

To receive email notifications of seminars you can join the ISL mailing list.

Date and Time: 
Thursday, December 5, 2019 - 4:30pm
Venue: 
Packard 101

Pages

Subscribe to RSS - Information Systems Lab (ISL) Colloquium