Statistics and Probability Seminars

Statistics Department Seminar presents "The blessings of multiple causes"

Topic: 
The blessings of multiple causes
Abstract / Description: 

Causal inference from observational data is a vital problem, but it comes with strong assumptions. Most methods assume that we observe all confounders, variables that affect both the causal variables and the outcome variables. But whether we have observed all confounders is a famously untestable assumption. We describe the deconfounder, a way to do causal inference from observational data allowing for unobserved confounding. How does the deconfounder work? The deconfounder is designed for problems of multiple causal inferences: scientific studies that involve many causes whose effects are simultaneously of interest. The deconfounder uses the correlation among causes as evidence for unobserved confounders, combining unsupervised machine learning and predictive model checking to perform causal inference. We study the theoretical requirements for the deconfounder to provide unbiased causal estimates, along with its limitations and tradeoffs. We demonstrate the deconfounder on real-world data and simulation studies.

Date and Time: 
Thursday, February 13, 2020 - 4:30pm
Venue: 
Sloan Mathematics Center, Room 380C

Statistics Department Seminar presents "Fréchet change point detection"

Topic: 
Fréchet change point detection
Abstract / Description: 

Change point detection is a popular tool for identifying locations in a data sequence where the data distribution changes abruptly, and it has been widely studied for Euclidean data. Modern data are very often non-Euclidean, for example distribution-valued data or network data. Change point detection is challenging when the underlying data space is a metric space that lacks basic algebraic operations such as addition of data points and scalar multiplication.

 
In this talk, I propose a method to infer the presence and location of change points in the distribution of a sequence of independent data taking values in a general metric space. Change points are viewed as locations at which the distribution of the data sequence changes abruptly in terms of either its Fréchet mean or Fréchet variance or both. The proposed method is based on comparisons of Fréchet variances before and after putative change point locations. First, I will establish that under the null hypothesis of no change point the limit distribution of the proposed scan function is the square of a standardized Brownian bridge. It is well known that such convergence is rather slow in moderate to high dimensions. For more accurate results in finite sample applications, I will provide a theoretically justified bootstrap-based scheme for testing the presence of change points. Next, I will show that when a change point exists, (1) the proposed test is consistent under contiguous alternatives and (2) the estimated location of the change point is consistent. All of the above results hold for a broad class of metric spaces under mild entropy conditions. Examples include the space of univariate probability distributions and the space of graph Laplacians for networks. I will illustrate the efficacy of the proposed approach in empirical studies and in real-data applications with sequences of maternal fertility distributions. Finally, I will talk about some future extensions and other related research directions, for instance, when one has samples of dynamic metric space data.
 
This talk is based on joint work with Professor Hans-Georg Müller.
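
The idea of comparing Fréchet variances before and after a putative change point can be illustrated with a small sketch. This is not the speaker's scan statistic: it is a simplified toy version with no standardization or bootstrap. The function names and the `min_seg` parameter are hypothetical, and the Fréchet mean is approximated by restricting the minimizer to the observed sample points.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def frechet_variance(points: Sequence[T], dist: Callable[[T, T], float]) -> float:
    """Sample Fréchet variance: the minimum over candidate centers of the
    mean squared distance to the points.

    Assumption: the minimizer is searched only among the sample points,
    which approximates the Fréchet mean in a general metric space.
    """
    return min(
        sum(dist(center, p) ** 2 for p in points) / len(points)
        for center in points
    )


def scan_change_point(
    data: Sequence[T], dist: Callable[[T, T], float], min_seg: int = 5
) -> Tuple[int, float]:
    """Return the split index that most reduces the Fréchet variance.

    For each split t, the pooled variance is compared with the
    size-weighted variances of data[:t] and data[t:].
    """
    n = len(data)
    pooled = frechet_variance(data, dist)
    best_t, best_gain = -1, float("-inf")
    for t in range(min_seg, n - min_seg + 1):
        left, right = data[:t], data[t:]
        within = (t / n) * frechet_variance(left, dist) + (
            (n - t) / n
        ) * frechet_variance(right, dist)
        gain = pooled - within
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain


# Toy example: real numbers with the absolute-difference metric and a
# level shift after the 50th observation.
data: List[float] = [0.0] * 50 + [1.0] * 50
t_hat, gain = scan_change_point(data, lambda a, b: abs(a - b))
```

Because the function takes any distance `dist`, the same sketch applies to other metric spaces, for example distributions compared by a Wasserstein distance, as long as a pairwise distance can be evaluated.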
Date and Time: 
Tuesday, February 4, 2020 - 4:00pm
Venue: 
Sloan Mathematics Center, Room 380C

Statistics Department Seminar welcomes Sourav Chatterjee

Topic: 
Feature Ordering by Conditional Independence
Abstract / Description: 

I will talk about a coefficient of conditional dependence between two random variables Y and Z given a set of other variables X1, ..., Xp, based on an i.i.d. sample. The coefficient has a long list of desirable properties, the most important of which is that under absolutely no distributional assumptions, it converges to a limit in [0, 1], where the limit is 0 if and only if Y and Z are conditionally independent given X1, ..., Xp, and is 1 if and only if Y is equal to a measurable function of Z given X1, ..., Xp. I will then present a new variable selection algorithm based on this statistic, called Feature Ordering by Conditional Independence (FOCI), which is model-free, has no tuning parameters, and is provably consistent under sparsity assumptions.

This is based on joint work with Mona Azadkia.


The Statistics Seminars for Winter Quarter will be held in Room 380C of Sloan Mathematics Center in the Main Quad at 4:30pm on Tuesdays. Refreshments are served at 4pm in the Lounge on the first floor of Sequoia Hall.

Date and Time: 
Tuesday, February 25, 2020 - 4:30pm
Venue: 
Sloan Mathematics Center, Room 380C

Statistics Department Seminar presents "Causal learning: Excursions in double robustness"

Topic: 
Causal learning: Excursions in double robustness
Abstract / Description: 

Recent progress in machine learning provides many potentially effective tools to learn estimates or make predictions from datasets of ever-increasing sizes. Can we trust such tools in clinical and highly-sensitive systems? If a learning algorithm predicts an effect of a new policy to be positive, what guarantees do we have concerning the accuracy of this prediction? The talk introduces new statistical ideas to ensure that the learned estimates satisfy some fundamental properties: especially causality and robustness. The talk will discuss potential connections and departures between causality and robustness.


Date and Time: 
Tuesday, February 18, 2020 - 4:30pm
Venue: 
Sloan Mathematics Center, Room 380C

Probability Seminar presents "On the edge-statistics conjecture"

Topic: 
On the edge-statistics conjecture
Abstract / Description: 

Suppose we are given integers $k \ge 1$ and $0 < \ell < \binom{k}{2}$. When sampling a $k$-vertex subset uniformly at random from a (very large) $n$-vertex graph $G$, how large can the probability be that there are exactly $\ell$ edges within the sampled $k$-vertex subset? Let $\mathrm{ind}(k, \ell)$ be the limit of this maximum possible probability as $n$ goes to infinity. Alon, Hefetz, Krivelevich and Tyomkyn conjectured that $\mathrm{ind}(k, \ell) \le e^{-1} + o(1)$ for all $k \ge 1$ and $0 < \ell < \binom{k}{2}$. The constant $e^{-1}$ in this conjecture is best-possible, since for $\ell = 1$ and $\ell = k - 1$ one can easily show that $\mathrm{ind}(k, \ell) \ge e^{-1} - o(1)$. Kwan, Sudakov and Tran proved the conjecture in the case $\Omega(k) \le \ell \le \binom{k}{2} - \Omega(k)$. In joint work with Jacob Fox, we solved the remaining cases of the conjecture. This talk will discuss our results, as well as our proof for the case $\ell = 1$ (which is one of the cases in which the conjecture is tight).

Date and Time: 
Monday, February 24, 2020 - 4:00pm
Venue: 
Sequoia Hall Room 200

Probability Seminar presents "Improving constant in L1 Poincaré inequality on Hamming cube and related subjects"

Topic: 
Improving constant in L1 Poincaré inequality on Hamming cube and related subjects
Abstract / Description: 

We improve the constant π/2 in the L1-Poincaré inequality on the Hamming cube. For Gaussian space the sharp constant in the L1 inequality is known, and it is the square root of π/2 (Maurey–Pisier). For the Hamming cube the sharp constant is not known, and the square root of π/2 gives an estimate from below for this sharp constant. On the other hand, Ben Efraim and Lust-Piquard have shown an estimate from above: π/2 without the square root. There are at least two other independent proofs of the same estimate from above. Since those proofs are very different from the proof of Ben Efraim and Lust-Piquard but gave the same constant, that might have indicated that their constant is sharp. But here we give a better estimate from above, showing that the Poincaré constant $C_1$ is strictly smaller than the Ben Efraim and Lust-Piquard constant π/2. It is still not clear whether $C_1 > \sqrt{\pi/2}$. Ben Efraim and Lust-Piquard used non-commutative harmonic analysis to prove their estimate. Our approach is different. We discuss this circle of questions, their relation with isoperimetric inequalities on the Hamming cube and with the Margoulis sharp threshold network theorem.

Date and Time: 
Monday, February 10, 2020 - 4:30pm
Venue: 
Sequoia Hall Room 200

EXTRA Probability Seminar presents "Topological phase transitions in random geometric complexes"

Topic: 
Topological phase transitions in random geometric complexes
Abstract / Description: 

Connectivity and percolation are two well-studied phenomena in random graphs. In this talk we will discuss higher-dimensional analogues of connectivity and percolation that occur in random simplicial complexes. Simplicial complexes are a natural generalization of graphs that consist of vertices, edges, triangles, tetrahedra, and higher-dimensional simplexes. We will mainly focus on random geometric complexes. These complexes are generated by taking the vertices to be a random point process, and adding simplexes according to their geometric configuration. Our generalized notions of connectivity and percolation use the language of homology, an algebraic-topological structure representing cycles of different dimensions. In this talk we will discuss recent results analyzing phase transitions related to these topological phenomena.

Date and Time: 
Monday, February 10, 2020 - 3:15pm
Venue: 
Sequoia Hall Room 200

Probability Seminar presents "Fluctuations in mean field Ising models"

Topic: 
Fluctuations in mean field Ising models
Abstract / Description: 

We study fluctuations of the magnetization (average of spins) in an Ising model on a sequence of "well-connected" approximately $d_n$-regular graphs on $n$ vertices. We show that if $d_n \gg \sqrt{n}$, then the fluctuations are universal, and the same as those of the Curie-Weiss model, in the entire ferromagnetic parameter regime. We then give a counterexample to show that $d_n \gg \sqrt{n}$ is actually tight, in the sense that the limiting distribution changes if $d_n \sim \sqrt{n}$ except in the high temperature regime. By refining our argument, we show that in the high temperature regime universality holds for $d_n \gg n^{1/3}$. As a by-product of our proof technique, we prove rates of convergence, as well as exponential concentration for the sum of spins, and tight estimates for several statistics of interest.

This is based on joint work with Nabarun Deb at Columbia University.

Date and Time: 
Monday, February 3, 2020 - 4:00pm
Venue: 
Sequoia Hall Room 200
