Statistics and Probability Seminars

Statistics Seminar: How much data is sufficient to learn high-performing algorithms?

Topic: 
How much data is sufficient to learn high-performing algorithms?
Abstract / Description: 

Algorithms often have tunable parameters that considerably affect their runtime and solution quality. A growing body of research has demonstrated that data-driven algorithm design can yield significant gains on both measures. Data-driven algorithm design takes a training set of problem instances sampled from an unknown, application-specific distribution and returns a parameter setting with strong average performance on that training set. We provide a broadly applicable theory for deriving generalization guarantees for data-driven algorithm design; these guarantees bound the difference between the algorithm's expected performance and its average performance over the training set.

The challenge is that for many combinatorial algorithms, performance is a volatile function of the parameters: slightly perturbing the parameters can cause a cascade of changes in the algorithm's behavior. Prior research has proved generalization bounds through case-by-case analyses of parameterized greedy algorithms, clustering algorithms, integer programming algorithms, and selling mechanisms. We uncover a unifying structure that lets us prove very general guarantees while still recovering the bounds from prior research. Our guarantees apply whenever an algorithm's performance is a piecewise-constant, piecewise-linear, or, more generally, piecewise-structured function of its parameters. As we demonstrate, our theory also implies novel bounds for dynamic programming algorithms used in computational biology and voting mechanisms from economics.
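
As a rough illustration of the setting described in this abstract (not the speakers' method or code), the sketch below tunes one parameter of a greedy knapsack heuristic on sampled instances. Everything specific here is an assumption chosen for the example: the scoring rule value / size**rho, the uniform instance distribution, the grid of candidate values, and all function names. It picks the rho with the best average value on a training set, compares that average with the average on fresh instances as a stand-in for expected performance, and counts how many distinct values a single instance produces as rho varies.

```python
import random


def greedy_value(items, capacity, rho):
    """Total value packed by a greedy pass that ranks items by value / size**rho."""
    order = sorted(items, key=lambda it: it[0] / it[1] ** rho, reverse=True)
    total, used = 0.0, 0.0
    for value, size in order:
        if used + size <= capacity:  # take the item only if it still fits
            total += value
            used += size
    return total


def sample_instance(rng, n_items=10, capacity=2.0):
    """Hypothetical instance distribution: values and sizes uniform on [0.1, 1]."""
    items = [(rng.uniform(0.1, 1.0), rng.uniform(0.1, 1.0)) for _ in range(n_items)]
    return items, capacity


def average_value(instances, rho):
    """Average greedy performance of parameter rho over a set of instances."""
    return sum(greedy_value(items, cap, rho) for items, cap in instances) / len(instances)


rng = random.Random(0)
train = [sample_instance(rng) for _ in range(200)]
fresh = [sample_instance(rng) for _ in range(2000)]

# Choose the parameter with the best average performance on the training set.
grid = [i / 10 for i in range(31)]  # candidate rho values 0.0, 0.1, ..., 3.0
best_rho = max(grid, key=lambda r: average_value(train, r))

# Empirical stand-in for the generalization gap: training average versus
# the average on instances not seen during tuning.
train_avg = average_value(train, best_rho)
fresh_avg = average_value(fresh, best_rho)
gap = abs(train_avg - fresh_avg)
print(f"best rho = {best_rho}, train = {train_avg:.3f}, fresh = {fresh_avg:.3f}, gap = {gap:.3f}")

# Piecewise-constant structure on one instance: the packed value can change
# only where two items swap rank, so few distinct values appear on a fine grid.
items, cap = train[0]
fine = [i / 1000 for i in range(3001)]
distinct_values = {greedy_value(items, cap, r) for r in fine}
print(f"{len(distinct_values)} distinct values over {len(fine)} settings of rho")
```

In this toy example two items can swap rank at no more than one value of rho, so an instance with 10 items has at most 45 breakpoints and its performance is constant between them. That is the kind of piecewise structure the abstract refers to; the theory presented in the talk is what turns such structure into sample-complexity bounds, which this sketch does not attempt to compute.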

This talk is based on joint work with Nina Balcan, Dan DeBlasio, Travis Dick, Carl Kingsford, and Tuomas Sandholm.

Date and Time: 
Tuesday, July 20, 2021 - 4:30pm

Statistics Department Seminar: High-dimensional and nonparametric estimation under 'local' information constraints

Topic: 
High-dimensional and nonparametric estimation under 'local' information constraints
Abstract / Description: 

Details unavailable at time of publishing. Please check Statistics pages for updated information.

Date and Time: 
Tuesday, June 29, 2021 - 4:30pm

Statistics Department Seminar: Balancing covariates in randomized experiments with the Gram-Schmidt Walk design

Topic: 
Balancing covariates in randomized experiments with the Gram-Schmidt Walk design
Abstract / Description: 

Details unavailable at time of publishing. Please check Statistics pages for updated information.

Date and Time: 
Tuesday, June 15, 2021 - 4:30pm

Statistics Department Seminar: Provable guarantees for self-supervised deep learning with spectral contrastive loss

Topic: 
Provable guarantees for self-supervised deep learning with spectral contrastive loss
Abstract / Description: 

Details unavailable at time of publishing. Please check Statistics pages for updated information.

Date and Time: 
Tuesday, June 1, 2021 - 4:30pm
