Statistics and Probability Seminars

Statistics Seminar: Inference, Computation, and Visualization for Convex Clustering and Biclustering

Topic: 
Inference, Computation, and Visualization for Convex Clustering and Biclustering
Abstract / Description: 

Hierarchical clustering enjoys wide popularity because of its fast computation, ease of interpretation, and appealing visualizations via the dendogram and cluster heatmap. Recently, several have proposed and studied convex clustering and biclustering which, similar in spirit to hierarchical clustering, achieve cluster merges via convex fusion penalties. While these techniques enjoy superior statistical performance, they suffer from slower computation and are not generally conducive to representation as a dendogram. In the first part of the talk, we present new convex (bi)clustering methods and fast algorithms that inherit all of the advantages of hierarchical clustering. Specifically, we develop a new fast approximation and variation of the convex (bi)clustering solution path that can be represented as a dendogram or cluster heatmap. Also, as one tuning parameter indexes the sequence of convex (bi)clustering solutions, we can use these to develop interactive and dynamic visualization strategies that allow one to watch data form groups as the tuning parameter varies. In the second part of this talk, we consider how to conduct inference for convex clustering solutions that addresses questions like: Are there clusters in my data set? Or, should two clusters be merged into one? To achieve this, we develop a new geometric representation of Hotelling's T2-test that allows us to use the selective inference paradigm to test multivariate hypotheses for the first time. We can use this approach to test hypotheses and calculate confidence ellipsoids on the cluster means resulting from convex clustering. We apply these techniques to examples from text mining and cancer genomics.

This is joint work with John Nagorski and Frederick Campbell.


The Statistics Seminars for Winter Quarter will be held in Room 380Y of the Sloan Mathematics Center in the Main Quad at 4:30pm on Tuesdays. 

Date and Time: 
Tuesday, March 13, 2018 - 4:30pm
Venue: 
Sloan Mathematics Building, Room 380Y

Statistics Seminar: Understanding rare events in models of statistical mechanics

Topic: 
Understanding rare events in models of statistical mechanics
Abstract / Description: 

Statistical mechanics models are ubiquitous at the interface of probability theory, information theory, and inference problems in high dimensions. To develop a refined understanding of such models, one often needs to study not only typical fluctuation theory but also the realm of atypical events. In this talk, we will focus on sparse networks and polymer models on lattices. In particular we will consider the rare events that a sparse random network has an atypical number of certain local structures, and that a polymer in random media has atypical weight. The random geometry associated with typical instances of these rare events is an important topic of inquiry: this geometry can involve merely local structures, or more global ones. We will discuss recent solutions to certain longstanding questions and connections to stochastic block models, exponential random graphs, eigenvalues of random matrices, and fundamental growth models.

Date and Time: 
Tuesday, January 30, 2018 - 4:30pm
Venue: 
Sloan Mathematics Building, Room 380Y

New Directions in Management Science & Engineering: A Brief History of the Virtual Lab

Topic: 
New Directions in Management Science & Engineering: A Brief History of the Virtual Lab
Abstract / Description: 

Lab experiments have long played an important role in behavioral science, in part because they allow for carefully designed tests of theory, and in part because randomized assignment facilitates identification of causal effects. At the same time, lab experiments have traditionally suffered from numerous constraints (e.g. short duration, small-scale, unrepresentative subjects, simplistic design, etc.) that limit their external validity. In this talk I describe how the web in general—and crowdsourcing sites like Amazon's Mechanical Turk in particular—allow researchers to create "virtual labs" in which they can conduct behavioral experiments of a scale, duration, and realism that far exceed what is possible in physical labs. To illustrate, I describe some recent experiments that showcase the advantages of virtual labs, as well as some of the limitations. I then discuss how this relatively new experimental capability may unfold in the future, along with some implications for social and behavioral science.

Date and Time: 
Thursday, March 16, 2017 - 12:15pm
Venue: 
Packard 101

Statistics Seminar

Topic: 
Brownian Regularity for the Airy Line Ensemble
Abstract / Description: 

The Airy line ensemble is a positive-integer indexed ordered system of continuous random curves on the real line whose finite dimensional distributions are given by the multi-line Airy process. It is a natural object in the KPZ universality class: for example, its highest curve, the Airy2 process, describes after the subtraction of a parabola the limiting law of the scaled weight of a geodesic running from the origin to a variable point on an anti-diagonal line in such problems as Poissonian last passage percolation. The Airy line ensemble enjoys a simple and explicit spatial Markov property, the Brownian Gibbs property.


In this talk, I will discuss how this resampling property may be used to analyse the Airy line ensemble. Arising results include a close comparison between the ensemble's curves after affine shift and Brownian bridge. The Brownian Gibbs technique is also used to compute the value of a natural exponent describing the decay in probability for the existence of several near geodesics with common endpoints in Brownian last passage percolation, where the notion of "near" refers to a small deficit in scaled geodesic weight, with the parameter specifying this nearness tending to zero.

Date and Time: 
Monday, September 26, 2016 - 4:30pm
Venue: 
Sequoia Hall, room 200

Claude E. Shannon's 100th Birthday

Topic: 
Centennial year of the 'Father of the Information Age'
Abstract / Description: 

From UCLA Shannon Centennial Celebration website:

Claude Shannon was an American mathematician, electrical engineer, and cryptographer known as "the father of information theory". Shannon founded information theory and is perhaps equally well known for founding both digital computer and digital circuit design theory. Shannon also laid the foundations of cryptography and did basic work on code breaking and secure telecommunications.

 

Events taking place around the world are listed at IEEE Information Theory Society.

Date and Time: 
Saturday, April 30, 2016 - 12:00pm
Venue: 
N/A

Pages

Subscribe to RSS - Statistics and Probability Seminars