
Predicting and optimizing the behavior of large-scale models

Summary
Andrew Ilyas (Stanford Statistics)
Sloan 380Y
Oct 22
Content

In this talk, we study the problem of estimating (and optimizing) the counterfactual behavior of large-scale predictive models. We start by focusing on "data counterfactuals", where the goal is to estimate how modifying a training dataset affects the outputs of the resulting machine learning model. For many classes of statistical models, the influence function is a powerful tool for solving this problem. Yet the (supposed) intractability of the influence function for large-scale predictive models has necessitated a parallel line of work in machine learning; we begin with an overview of that work.
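
For readers unfamiliar with the idea, the following is a minimal sketch of a data counterfactual on a toy problem. It is an illustration only, not the method presented in the talk. It assumes a one-parameter least-squares model with no intercept, and all function names (`fit_slope`, `influence_estimate`, `retrain_effect`) are hypothetical. It compares the classical first-order influence-function estimate of removing one training point against actually refitting without that point.

```python
import random


def fit_slope(xs, ys):
    """Least-squares slope for the no-intercept model y ~ theta * x."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / sxx


def influence_estimate(xs, ys, i):
    """First-order (influence-function) estimate of theta_without_i - theta."""
    theta = fit_slope(xs, ys)
    # Second derivative of the summed loss 0.5 * (theta * x - y)^2 over all points.
    hessian = sum(x * x for x in xs)
    # Gradient of point i's own loss, evaluated at the fitted theta.
    grad_i = xs[i] * (xs[i] * theta - ys[i])
    return grad_i / hessian


def retrain_effect(xs, ys, i):
    """Ground truth: refit without point i and report the change in theta."""
    theta = fit_slope(xs, ys)
    xs_without = xs[:i] + xs[i + 1:]
    ys_without = ys[:i] + ys[i + 1:]
    return fit_slope(xs_without, ys_without) - theta


# Synthetic data (assumed for illustration): true slope 2.0 plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
ys = [2.0 * x + rng.gauss(0.0, 0.1) for x in xs]

estimates = [influence_estimate(xs, ys, i) for i in range(len(xs))]
actuals = [retrain_effect(xs, ys, i) for i in range(len(xs))]

# Largest absolute disagreement between the estimate and actual retraining.
max_gap = max(abs(e - a) for e, a in zip(estimates, actuals))
print("largest gap between estimate and retraining:", max_gap)
```

In this toy setting the estimate differs from exact retraining only by a leverage factor close to one, so the two agree closely. The difficulty the abstract refers to arises at scale, where the Hessian is far too large to form or invert directly.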

We then introduce a method that (almost) perfectly estimates how changes to training data change large-scale model behavior. Key to the method are an algorithm for computing the exact influence function and a diagnostic for ensuring its utility at the scale of large machine learning models. This method unlocks new applied possibilities and raises open questions, which we discuss in conclusion.