Many modern applications must quickly find relevant results in an enormous output space of potential candidates, for example, finding the best matching product in a large catalog or suggesting related search phrases on a search engine. The output space for these problems can contain millions to billions of items. Moreover, observational or training data is often limited for the many items in the so-called "long tail" of the output space. Given this inherent paucity of training data for most items, developing machine learning models that perform well for output spaces of this size is a contemporary challenge. In this talk, I will present a multi-scale machine learning framework called Prediction for Enormous and Correlated Output Spaces (PECOS). PECOS proceeds by first building a hierarchy over the output space using unsupervised learning, and then learning a machine learning model that makes predictions at each level of the hierarchy. Finally, the multi-scale predictions are combined to obtain an overall predictive model. This leads to an inference method that scales logarithmically with the size of the output space. A key to obtaining high performance is leveraging sparsity and developing highly efficient sparse matrix routines.
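The three steps described above (build a hierarchy, predict at each level, combine the predictions) can be illustrated with a minimal Python sketch. This is not the PECOS implementation. It assumes each output item has a dense embedding, builds the hierarchy by recursive balanced two-way splits along the top principal direction, and stands in for the learned per-level models with a simple centroid dot-product scorer. The names `build_tree`, `beam_search`, and `log_sigmoid`, and the random embeddings, are hypothetical.

```python
import numpy as np

def build_tree(label_emb, ids=None):
    """Unsupervised hierarchy: split items in two along their top principal direction."""
    if ids is None:
        ids = np.arange(label_emb.shape[0])
    node = {"ids": ids, "centroid": label_emb[ids].mean(axis=0), "children": []}
    if len(ids) > 1:
        centered = label_emb[ids] - node["centroid"]
        # The first right-singular vector is the direction of largest variance.
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        order = np.argsort(centered @ vt[0], kind="stable")
        half = len(ids) // 2
        node["children"] = [
            build_tree(label_emb, ids[order[:half]]),
            build_tree(label_emb, ids[order[half:]]),
        ]
    return node

def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    return -np.logaddexp(0.0, -z)

def beam_search(tree, x, beam=2):
    """Descend the hierarchy, keeping the `beam` best nodes at each level.

    Assumption: a node's score is the dot product of its centroid with the
    query; a real system would use a model trained for each level.
    Scores are combined by summing log-probabilities along the path.
    """
    frontier = [(0.0, tree)]
    leaves = []
    scored = 0  # number of nodes evaluated, to show the cost of inference
    while frontier:
        candidates = []
        for path_score, node in frontier:
            for child in node["children"]:
                scored += 1
                score = path_score + log_sigmoid(child["centroid"] @ x)
                candidates.append((score, child))
        candidates.sort(key=lambda c: c[0], reverse=True)
        frontier = []
        for score, node in candidates[:beam]:
            if node["children"]:
                frontier.append((score, node))
            else:
                leaves.append((score, node))
    leaves.sort(key=lambda c: c[0], reverse=True)
    results = [(int(node["ids"][0]), float(score)) for score, node in leaves]
    return results, scored

# Hypothetical data: 64 items with 8-dimensional embeddings.
rng = np.random.default_rng(0)
label_emb = rng.normal(size=(64, 8))
tree = build_tree(label_emb)
results, scored = beam_search(tree, label_emb[5], beam=2)
print(results, scored)
```

With 64 items and a beam of 2, each of the 6 levels scores at most 4 nodes, so the search evaluates at most 24 nodes instead of all 64 items. That per-level bound is what makes inference scale logarithmically with the size of the output space in this sketch.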
Bio: Inderjit Dhillon is the Gottesman Family Centennial Professor of Computer Science and Mathematics at UT Austin, where he is also the Director of the ICES Center for Big Data Analytics. He is currently on leave from UT Austin and heads the Amazon Research Lab in Berkeley, California, where he is developing and deploying state-of-the-art machine learning methods for Amazon Search. His main research interests are in big data, deep learning, machine learning, network analysis, linear algebra and optimization. He received his B.Tech. degree from IIT Bombay and his Ph.D. from UC Berkeley. Inderjit has received several awards, including the ICES Distinguished Research Award, the SIAM Outstanding Paper Prize, the Moncrief Grand Challenge Award, the SIAM Linear Algebra Prize, the University Research Excellence Award, and the NSF CAREER Award. He has published over 200 journal and conference papers, and has served on the Editorial Boards of the Journal of Machine Learning Research, the IEEE Transactions on Pattern Analysis and Machine Intelligence, Foundations and Trends in Machine Learning and the SIAM Journal on Matrix Analysis and Applications. Inderjit is an ACM Fellow, an IEEE Fellow, a SIAM Fellow and an AAAS Fellow.