Statistics and Probability Seminars

Statistics Department Seminar presents "Backfitting for large-scale crossed random effects regressions"

Topic: 
Backfitting for large-scale crossed random effects regressions
Abstract / Description: 

Large-scale genomic and electronic commerce data sets often have a crossed random effects structure, arising from genotypes x environments or customers x products. Naive methods of handling such data will produce inferences that do not generalize. Regression models that properly account for crossed random effects can be very expensive to compute. The cost of both generalized least squares and Gibbs sampling can easily grow as N^(3/2) (or worse) for N observations. Papaspiliopoulos, Roberts and Zanella (2020) present a collapsed Gibbs sampler that costs O(N), but under an extremely stringent sampling model. We propose a backfitting algorithm to compute a generalized least squares estimate and prove that it costs O(N) under greatly relaxed though still strict sampling assumptions. Empirically, the backfitting algorithm costs O(N) under further relaxed assumptions. We illustrate the new algorithm on a ratings data set from Stitch Fix.
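
To make the O(N)-per-pass idea concrete, here is a minimal Python sketch of backfitting for the simplest crossed model, y_ij = mu + a_i + b_j + e_ij, with no covariates. It is an illustration only and not the speakers' algorithm: the names `backfit` and `penalized_rss` are hypothetical, and the shrinkage constants `lam_a` and `lam_b` (the ratios of error variance to random-effect variance) are assumed known rather than estimated. Each sweep visits every observation a fixed number of times, so its cost is linear in the number of observations.

```python
def penalized_rss(obs, mu, a, b, lam_a, lam_b):
    """Penalized least-squares criterion that the sweeps below decrease."""
    rss = sum((y - mu - a[i] - b[j]) ** 2 for i, j, y in obs)
    pen_a = lam_a * sum(v * v for v in a.values())
    pen_b = lam_b * sum(v * v for v in b.values())
    return rss + pen_a + pen_b


def backfit(obs, lam_a, lam_b, sweeps=50):
    """obs is a list of (row_id, col_id, y) triples; returns (mu, a, b, history)."""
    rows = {i for i, _, _ in obs}
    cols = {j for _, j, _ in obs}
    a = {i: 0.0 for i in rows}
    b = {j: 0.0 for j in cols}
    mu = 0.0
    history = []
    for _ in range(sweeps):
        # Intercept: plain mean of the residuals with row and column effects removed.
        mu = sum(y - a[i] - b[j] for i, j, y in obs) / len(obs)

        # Row effects: shrunken mean of residuals within each row.
        total = {i: 0.0 for i in rows}
        count = {i: 0 for i in rows}
        for i, j, y in obs:
            total[i] += y - mu - b[j]
            count[i] += 1
        a = {i: total[i] / (count[i] + lam_a) for i in rows}

        # Column effects: shrunken mean of residuals within each column.
        total = {j: 0.0 for j in cols}
        count = {j: 0 for j in cols}
        for i, j, y in obs:
            total[j] += y - mu - a[i]
            count[j] += 1
        b = {j: total[j] / (count[j] + lam_b) for j in cols}

        history.append(penalized_rss(obs, mu, a, b, lam_a, lam_b))
    return mu, a, b, history
```

Each of the three updates minimizes the criterion exactly over one block of parameters, so the values in `history` cannot increase from sweep to sweep. How many sweeps are needed depends on the sampling pattern, which is the question the abstract addresses.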

This is based on joint work with Swarnadip Ghosh and Trevor Hastie of Stanford University.

Date and Time: 
Tuesday, October 13, 2020 - 4:30pm

Statistics Department Seminar presents "Berry–Esseen bounds for Chernoff-type nonstandard asymptotics in isotonic regression"

Topic: 
Berry–Esseen bounds for Chernoff-type nonstandard asymptotics in isotonic regression
Abstract / Description: 

A Chernoff-type distribution is a non-normal distribution defined by the slope at zero of the greatest convex minorant of a two-sided Brownian motion with a polynomial drift. While a Chernoff-type distribution appears as the distributional limit in many nonregular estimation problems, the accuracy of Chernoff-type approximations has been largely unknown. In this talk, I will discuss Berry–Esseen bounds for Chernoff-type limit distributions in the canonical nonregular statistical estimation problem of isotonic (or monotone) regression. The derived Berry–Esseen bounds match those of the oracle local average estimator with optimal bandwidth in each scenario of possibly different Chernoff-type asymptotics, up to multiplicative logarithmic factors. Our method of proof differs from standard techniques on Berry–Esseen bounds, and relies on new localization techniques in isotonic regression and an anti-concentration inequality for the supremum of a Brownian motion with a Lipschitz drift.

This talk is based on joint work with Qiyang Han.

Date and Time: 
Tuesday, October 6, 2020 - 4:30pm

Probability Seminar presents "Non-stationary fluctuations for some non-integrable models"

Topic: 
Non-stationary fluctuations for some non-integrable models
Abstract / Description: 

The Kardar-Parisi-Zhang (KPZ) equation is a conjecturally universal model for dynamics of fluctuating interfaces such as fires and epidemic fronts. The universality was originally justified by Kardar, Parisi, and Zhang via non-rigorous renormalization group calculations. In this talk, we introduce some mathematically rigorous results and take a step towards this universality in the context of some non-integrable interacting particle systems outside their respective invariant measures.

Date and Time: 
Monday, November 16, 2020 - 4:00pm to Tuesday, November 17, 2020 - 3:55pm

Probability Seminar presents "Fast and memory-optimal dimension reduction using Kac's walk"

Topic: 
Fast and memory-optimal dimension reduction using Kac's walk
Abstract / Description: 

Introduced in the 1950s by Mark Kac as a toy model for a one-dimensional Boltzmann gas, the Kac walk is the following simple and well-studied Markov chain on the special orthogonal group: at every time step, sample two distinct uniform coordinates $i,j$ and a uniform angle $\theta$, and rotate in the $(i,j)$-plane by $\theta$. In this talk, I will discuss how the Kac walk can be used for the purpose of dimensionality reduction, specifically, for the design of linear transformations with optimal Johnson–Lindenstrauss and Restricted Isometry properties, and which support memory-optimal fast matrix-vector multiplication. I will also discuss the performance of a variant of the Kac walk, for which $\theta = \pi/4$ at every time step.
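
The single step described in the abstract is short enough to write out. The following is a minimal Python sketch that applies the walk to a vector rather than tracking the full rotation matrix. The function names `kac_step`, `kac_walk` and `norm`, and the seeding scheme, are illustrative assumptions and are not taken from the talk. The angle is assumed uniform on [0, 2π); passing `theta=math.pi / 4` gives the fixed-angle variant mentioned at the end of the abstract.

```python
import math
import random


def kac_step(x, rng, theta=None):
    """Apply one Kac-walk step to the list x in place and return it."""
    i, j = rng.sample(range(len(x)), 2)  # two distinct uniform coordinates
    if theta is None:
        theta = rng.uniform(0.0, 2.0 * math.pi)  # uniform angle
    c, s = math.cos(theta), math.sin(theta)
    xi, xj = x[i], x[j]
    # Rotate by theta in the (i, j)-plane; all other coordinates are untouched.
    x[i] = c * xi - s * xj
    x[j] = s * xi + c * xj
    return x


def kac_walk(x, steps, seed=0, theta=None):
    """Run `steps` Kac-walk steps on a copy of x."""
    rng = random.Random(seed)
    y = list(x)
    for _ in range(steps):
        kac_step(y, rng, theta)
    return y


def norm(x):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(v * v for v in x))
```

Each step touches only two coordinates and is a rotation, so the Euclidean norm of the vector is preserved up to rounding error. The sketch does not attempt the dimension-reduction construction discussed in the talk.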

This is joint work with Natesh S. Pillai (Harvard), Ashwin Sah (MIT), Mehtaab Sawhney (MIT), and Aaron Smith (U Ottawa).

Date and Time: 
Monday, October 26, 2020 - 4:00pm to Tuesday, October 27, 2020 - 3:55pm
Venue: 
Zoom

Probability Seminar presents "Replica symmetry breaking for random regular NAE-SAT"

Topic: 
Replica symmetry breaking for random regular NAE-SAT
Abstract / Description: 

In a wide class of random constraint satisfaction problems, ideas from statistical physics predict that there is a rich set of phase transitions governed by one-step replica symmetry breaking (1RSB). In particular, it is conjectured that there is a condensation regime below the satisfiability threshold, where the solution space condenses into large clusters. We establish this phenomenon for the random regular NAE-SAT model by showing that most of the solutions lie in a bounded number of clusters and the overlap of two independent solutions concentrates on two points. Central to the proof is to calculate the moments of the number of clusters whose size is in an O(1) window.

This is joint work with Danny Nam and Allan Sly.

Date and Time: 
Monday, October 19, 2020 - 4:00pm to Tuesday, October 20, 2020 - 3:55pm
Venue: 
Zoom

Probability Seminar presents "Emergence of communication through reinforcement learning with invention"

Topic: 
Emergence of communication through reinforcement learning with invention
Abstract / Description: 

Reinforcement learning in a two-player Lewis signaling game is a simple model to study the emergence of communication in cooperative multi-agent systems. When there are a fixed number of states and signals there is a positive probability that a successful communication system does not emerge. If the learning dynamics are modified to include invention – rather than fixing the number of signals, at each step there is always a chance to introduce a new signal – then the system converges to successful signaling almost surely. The reinforcement process can be modeled as an interacting urn system, and the proof uses a combination of stochastic approximation techniques and comparison with simpler urn models.
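
As a rough illustration of the urn dynamics, the Python sketch below simulates one possible version of a signaling game with invention. The abstract does not give the exact update rules, so every detail here is an assumption: states are equally likely, the sender keeps one urn per state containing an "invent" option of weight `invent_weight`, a newly invented signal is kept only if the receiver's uniformly random guess succeeds, and each success adds one unit of weight to the urns that were used. The name `simulate` and its parameters are hypothetical.

```python
import random


def weighted_choice(weights, rng):
    """Draw a key of the dict `weights` with probability proportional to its value."""
    keys = list(weights)
    return rng.choices(keys, weights=[weights[k] for k in keys], k=1)[0]


def simulate(num_states=2, rounds=5000, seed=0, invent_weight=1.0):
    rng = random.Random(seed)
    new = "new"  # marker for the "invent a new signal" option
    # Sender: one urn per state, mapping signal id -> weight.
    sender = [{new: invent_weight} for _ in range(num_states)]
    # Receiver: one urn per known signal, mapping action -> weight.
    receiver = {}
    successes = []
    for _ in range(rounds):
        state = rng.randrange(num_states)
        signal = weighted_choice(sender[state], rng)
        if signal == new:
            # The receiver has never seen this signal, so it guesses uniformly.
            action = rng.randrange(num_states)
            if action == state:
                signal_id = len(receiver)
                receiver[signal_id] = {act: 1.0 for act in range(num_states)}
                receiver[signal_id][action] += 1.0
                sender[state][signal_id] = 1.0
        else:
            action = weighted_choice(receiver[signal], rng)
            if action == state:
                # Reinforce the choices that led to success.
                sender[state][signal] += 1.0
                receiver[signal][action] += 1.0
        successes.append(action == state)
    return sender, receiver, successes
```

Comparing the success rate over the first and last few hundred entries of `successes` gives a quick empirical look at whether signaling improves under these assumed rules. It is not a substitute for the almost-sure convergence result stated in the abstract.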

Date and Time: 
Monday, October 12, 2020 - 4:00pm to Tuesday, October 13, 2020 - 3:55pm
Venue: 
Zoom

Probability Seminar presents "Random matrix statistics through pseudo-randomness"

Topic: 
Random matrix statistics through pseudo-randomness
Abstract / Description: 

We introduce the $N\times N$ random matrices $X_{j,k}=\exp(2\pi i \sum_{q=1}^d \omega_{j,q} k^q)$ with i.i.d. random variables $\omega_{j,q}$ for $1\leq j\leq N$ and $1\leq q\leq d$, where $d$ is a fixed integer. We prove that the distribution of their singular values converges to the local Marchenko-Pastur law at scales $N^{-\theta_d}$ for an explicit, small $\theta_d>0$, as long as $d\geq 18$. To our knowledge, this is the first instance of a random matrix ensemble that is explicitly defined in terms of only $O(N)$ random variables exhibiting a universal local spectral law. Our main technical contribution is to derive concentration bounds for the Stieltjes transform that simultaneously take into account stochastic and oscillatory cancellations. Important ingredients in our proof are strong estimates on the number of solutions to Diophantine equations (in the form of Vinogradov's main conjecture recently proved by Bourgain-Demeter-Guth) and a pigeonhole argument that combines the Ward identity with an algebraic uniqueness condition for Diophantine equations derived from the Newton-Girard identities.
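
The definition of $X_{j,k}$ can be turned directly into code. The Python sketch below builds the matrix from $N d$ random numbers. Two details are assumptions that the abstract does not state: the $\omega_{j,q}$ are drawn uniformly from $[0,1)$, and the column index $k$ runs over $1,\dots,N$. The function name `pseudo_random_matrix` is hypothetical.

```python
import cmath
import random


def pseudo_random_matrix(n, d, seed=0):
    """Return (omega, X), where X[j][k-1] = exp(2*pi*i * sum_q omega[j][q-1] * k**q)."""
    rng = random.Random(seed)
    # Only n * d random numbers are drawn: one polynomial of degree d per row.
    omega = [[rng.random() for _ in range(d)] for _ in range(n)]
    X = []
    for j in range(n):
        row = []
        for k in range(1, n + 1):
            phase = sum(omega[j][q - 1] * k ** q for q in range(1, d + 1))
            row.append(cmath.exp(2j * cmath.pi * phase))
        X.append(row)
    return omega, X
```

Every entry has modulus one, and all randomness in a row comes from its $d$ coefficients. In double precision the phase loses accuracy once $k^q$ becomes large, so this sketch is only suitable for small $n$ and $d$, well below the $d\geq 18$ regime of the theorem.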

This is joint work with Marius Lemm.

Date and Time: 
Monday, October 5, 2020 - 4:00pm to Tuesday, October 6, 2020 - 3:55pm
Venue: 
Zoom

Probability Seminar & Applied Math present "Shapes of equilibrium capillary drops on a rough surface"

Topic: 
Shapes of equilibrium capillary drops on a rough surface
Abstract / Description: 

I will discuss some simplified models for the shape of liquid droplets on rough solid surfaces. These are elliptic free boundary problems with oscillatory coefficients. I will talk about the large-scale effects of small-scale surface roughness, e.g., contact line pinning, hysteresis, and formation of flat parts (facets) in the contact line, and how to understand these by homogenization theory. I will also mention a connection with the continuum scaling limit of an abelian sandpile-type model and some results in that setting.

Date and Time: 
Wednesday, September 30, 2020 - 12:00pm to Thursday, October 1, 2020 - 11:55am
Venue: 
Zoom
