Algorithms for statistics on corrupted or heavy-tailed data have seen a flurry of activity in the last few years. (Indeed, even the last few months!) I will survey some recent developments, and then zoom in on joint work with Yihe Dong and Jerry Li in which we focus on translating the progress in polynomial-time algorithms into something practical. In particular, we obtain the first nearly-linear time algorithm for robust mean estimation in high dimensions, where the goal is to estimate the mean of a random vector from independent samples of which a constant fraction have been maliciously corrupted. Our algorithm is sufficiently practical that our implementation scales to thousands of dimensions and tens to hundreds of thousands of samples on laptop hardware; I will discuss some experimental validations of our theoretical results.
Based on "Quantum Entropy Scoring for Fast Robust Mean Estimation and Improved Outlier Detection," to appear in NeurIPS 2019. https://arxiv.org/pdf/1906.11366.pdf
The Information Systems Laboratory Colloquium (ISLC) is typically held in Packard 101 every Thursday at 4:30 pm during the academic year. Coffee and refreshments are served at 4:00 pm in the second-floor kitchen of the Packard Building.
The Colloquium is organized by graduate students Joachim Neu, Tavor Baharav, and Kabir Chandrasekher. To suggest speakers, please contact any of the organizers.
To receive email notifications of seminars you can join the ISL mailing list.