The performance of machine learning systems depends critically on tuning parameters that are difficult to set by standard optimization techniques. Such "hyperparameters" (including model architecture, regularization, and learning rates) are often tuned in an outer loop by black-box search methods that evaluate performance on a holdout set. We formulate hyperparameter tuning as a pure-exploration problem of deciding how many resources to allocate to each hyperparameter configuration. I will introduce our Hyperband algorithm for this framework, together with a theoretical analysis that demonstrates its ability to adapt to uncertain convergence rates and to the dependency of the validation loss on the hyperparameters. I will close with several experimental validations of Hyperband, including experiments on training deep networks in which Hyperband outperforms state-of-the-art Bayesian optimization methods by an order of magnitude.
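To make the resource-allocation view concrete, the following is a minimal sketch of successive halving, an early-stopping routine of the kind Hyperband builds on. It is an illustration under stated assumptions, not the algorithm as presented in the talk: the function names, the `eta=3` elimination factor, and the toy loss are all hypothetical, and the full Hyperband procedure additionally runs several such rounds with different trade-offs between the number of configurations and the budget given to each.

```python
import math
import random


def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """Evaluate all surviving configs at the current budget, keep the best
    1/eta of them, and multiply the per-config budget by eta, until a
    single config remains. Lower scores are better."""
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Rank the survivors by their loss at the current budget.
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        # Keep the top 1/eta fraction (at least one config).
        survivors = ranked[: max(1, len(ranked) // eta)]
        # Spend more resources on the configs that remain.
        budget *= eta
    return survivors[0]


def toy_loss(lr, budget):
    # Hypothetical stand-in for a validation loss: best near lr = 1e-2,
    # and decreasing as the training budget grows. In this toy the budget
    # term is the same for every config, so rankings never change.
    return (math.log10(lr) + 2) ** 2 + 1.0 / budget


rng = random.Random(0)
# 27 candidate learning rates sampled log-uniformly from [1e-4, 1].
candidates = [10 ** rng.uniform(-4, 0) for _ in range(27)]
best_lr = successive_halving(candidates, toy_loss)
print(f"selected learning rate: {best_lr:.4g}")
```

With 27 candidates and `eta=3`, the sketch evaluates 27 configurations at budget 1, 9 at budget 3, and 3 at budget 9, so most of the total budget goes to the configurations that still look promising.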
The Information Theory Forum (IT-Forum) at Stanford ISL is an interdisciplinary academic forum that focuses on mathematical aspects of information processing. While our primary emphasis is on information theory, we also welcome researchers from signal processing, learning and statistical inference, and control and optimization to deliver talks at our forum. We also warmly welcome industrial affiliates in these fields. The forum is typically held in Packard 202 every Friday at 1:00 pm during the academic year.
The Information Theory Forum is organized by graduate students Jiantao Jiao and Yanjun Han. To suggest speakers, please contact either of the organizers.