Statistics and Probability Seminars

Probability Seminar: Nonlinear large deviations: Mean-field and beyond

Topic: 
Nonlinear large deviations: Mean-field and beyond
Abstract / Description: 

Large deviations of nonlinear functions of adjacency matrices of sparse random graphs have attracted considerable interest over the last decade. Popular examples include subgraph counts and extreme eigenvalues. In the first half of the talk, we will discuss how the upper tail large deviation problem for subgraph counts in a random regular graph can be reduced to a variational problem, and how to solve the resulting optimization problem. Next, we consider the Erdős–Rényi graph $\mathcal{G}(n,p)$ in the regime of $p$ where the largest eigenvalue is governed by localized statistics, such as high-degree vertices. In particular, for fixed $r \geq 1$, we will discuss the joint upper and lower tail probabilities of the top $r$ eigenvalues.

This talk is based on joint works with Bhaswar B. Bhattacharya, Amir Dembo, and Shirshendu Ganguly.

Date and Time: 
Monday, October 11, 2021 - 4:00pm
Venue: 
Sequoia 200

Probability Seminar: Hit and run algorithms and Mallows permutation models with L1 and L2 distances

Topic: 
Hit and run algorithms and Mallows permutation models with L1 and L2 distances
Abstract / Description: 

Introduced by Mallows in statistical ranking theory, the Mallows permutation model is a class of non-uniform probability measures on the symmetric group that are biased towards the identity. The general model depends on a distance metric that can be chosen from a host of metrics on permutations. In this talk, I will focus on Mallows permutation models with L1 and L2 distances, known respectively as Spearman's footrule and Spearman's rank correlation in the statistics literature. These models have been widely applied in statistics and physics, but have mostly resisted analysis because the normalizing constants are "uncomputable". In the first part of the talk, I will explain how we can sample from both the L1 and L2 models using hit and run algorithms, a unifying class of Markov chain Monte Carlo algorithms that includes the celebrated Swendsen–Wang algorithm. A natural question from probabilistic combinatorics is: if we pick a random permutation from either of the models, what does it "look like"? This may involve various features of the permutation, such as the band structure and the length of the longest increasing subsequence. In the second part of the talk, I will explain how multi-scale analysis and hit and run algorithms can be used to prove theorems regarding such questions.
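
As a rough illustration of the model described in this abstract (not the speaker's method), the sketch below enumerates a small symmetric group by brute force and weights each permutation by exp(-beta * distance to the identity). The parameter name beta, the 0-based indexing, the use of the squared L2 distance, and all function names are assumptions made for the example. Enumeration is only feasible for very small n, which is the reason sampling algorithms such as hit and run are needed.

```python
import math
from itertools import permutations


def footrule(sigma):
    """L1 distance to the identity (Spearman's footrule)."""
    return sum(abs(s - i) for i, s in enumerate(sigma))


def spearman_l2(sigma):
    """Squared L2 distance to the identity (assumed form of the L2 model)."""
    return sum((s - i) ** 2 for i, s in enumerate(sigma))


def mallows_pmf(n, beta, distance):
    """Exact Mallows probabilities on S_n by enumeration (small n only).

    Assumed parameterization: P(sigma) is proportional to
    exp(-beta * distance(sigma)), with beta > 0.
    """
    perms = list(permutations(range(n)))
    weights = [math.exp(-beta * distance(p)) for p in perms]
    z = sum(weights)  # normalizing constant, computed by brute force
    return {p: w / z for p, w in zip(perms, weights)}


pmf = mallows_pmf(4, beta=0.5, distance=footrule)
most_likely = max(pmf, key=pmf.get)
print(most_likely)  # the identity permutation has the largest probability
```

For beta > 0 the identity is the unique permutation at distance zero, so it receives the largest weight, matching the bias towards the identity mentioned in the abstract.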

Date and Time: 
Monday, September 27, 2021 - 4:00pm
Venue: 
Sequoia 200

Workshop in Biostatistics: Big data from tiny microbes across Earth’s ecosystems

Topic: 
Big data from tiny microbes across Earth’s ecosystems
Abstract / Description: 

Genome-resolved metagenomics has enabled unprecedented insights into the ecology and evolution of environmental and host-associated microbiomes. This powerful approach is scalable and was applied to over 10,000 metagenomes collected from diverse habitats to generate an extensive catalog of microbial diversity. In collaboration with a large research consortium, we highlight how this genomic catalog can support the discovery of new biosynthetic gene clusters and the association of environmental viruses with their microbial hosts.

Date and Time: 
Thursday, September 30, 2021 - 1:30pm

Workshop in Biostatistics: Genetic variants across human populations—how similarities and differences play a role in our understanding of the genetic basis of traits

Topic: 
Genetic variants across human populations—how similarities and differences play a role in our understanding of the genetic basis of traits
Abstract / Description: 

Identifying which genetic variants influence medically relevant phenotypes is an important task both for therapeutic development and for risk prediction. In the last decade, genome-wide association studies have been the most widely used instrument to tackle this question. One challenge they encounter is the interplay between genetic variability and the structure of human populations. In this talk, we will focus on some opportunities that arise when one collects data from diverse populations and present statistical methods that allow us to leverage them.

The presentation will be based on joint work with M. Sesia, S. Li, Z. Ren, Y. Romano and E. Candes.

Suggested Reading:
● "What Happens When Geneticists Talk Sloppily About Race"
● "FDR control in GWAS with population structure"
● "Causal inference by using invariant prediction: identification and confidence intervals"


*Because the Biostatistics Workshop doubles as a class, the current university response to the pandemic requires us to restrict in-person attendance to Stanford students, faculty and staff. We hope to be able to revise these restrictions soon and welcome back our entire biostatistics workshop community.

Date and Time: 
Thursday, September 23, 2021 - 1:30pm
Venue: 
MSOB X303* + Zoom
