The problem of learning good treatment assignment rules from observational data lies at the heart of many challenges in data-driven decision making. While there is a growing body of literature devoted to this problem, most existing results focus on the binary-action case (i.e., where one action corresponds to assignment to control and the other to assignment to treatment). In this paper, we study the offline multi-action policy learning problem with observational data and, building on the theory of efficient semi-parametric inference, propose and implement a policy learning algorithm that achieves asymptotically minimax-optimal regret. To the best of our knowledge, this is the first result of this type in the multi-action setup, and it provides a substantial performance improvement over existing learning algorithms. We additionally investigate the practical aspects of policy learning by working with decision trees, and discuss two different approaches for solving the key step of the learning algorithm to exact optimality: one using a mixed integer program formulation and the other using a tree-search based algorithm.
This is joint work with Susan Athey and Stefan Wager.
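As a rough illustration of what solving the tree-based step "to exact optimality" can mean, the sketch below exhaustively searches depth-1 trees (a single split) for the policy that maximizes the total of per-sample, per-action reward scores. It is not the algorithm from the talk: the scores are assumed to be computed beforehand (for example, by a doubly robust estimator), and the names `best_action` and `fit_stump` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def best_action(scores: Sequence[Sequence[float]], idx: List[int]) -> Tuple[int, float]:
    """Return the action with the largest summed score over rows `idx`."""
    n_actions = len(scores[0])
    totals = [sum(scores[i][a] for i in idx) for a in range(n_actions)]
    a = max(range(n_actions), key=lambda k: totals[k])
    return a, totals[a]


def fit_stump(
    X: Sequence[Sequence[float]], scores: Sequence[Sequence[float]]
) -> Tuple[float, Optional[int], Optional[float], int, int]:
    """Exhaustive search over depth-1 trees.

    X:      n rows of p features.
    scores: n rows of per-action reward scores (assumed precomputed).
    Returns (total, feature, threshold, left_action, right_action);
    feature and threshold are None when no split beats a constant policy.
    """
    n, p = len(X), len(X[0])
    all_idx = list(range(n))
    a, total = best_action(scores, all_idx)
    best = (total, None, None, a, a)  # baseline: same action for everyone
    for j in range(p):
        for t in sorted({row[j] for row in X}):
            left = [i for i in all_idx if X[i][j] <= t]
            right = [i for i in all_idx if X[i][j] > t]
            if not right:  # threshold at the maximum value: not a real split
                continue
            la, lv = best_action(scores, left)
            ra, rv = best_action(scores, right)
            if lv + rv > best[0]:
                best = (lv + rv, j, t, la, ra)
    return best


# Toy data: one feature, three actions.
X = [[0.0], [1.0], [2.0], [3.0]]
scores = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 2.0], [0.0, 3.0, 0.0]]
result = fit_stump(X, scores)
```

On the toy data, the search assigns action 0 when the feature is at most 1.0 and action 1 otherwise. Deeper trees would apply the same search recursively, which is where the cost of exact optimization grows quickly.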
The Information Theory Forum (IT-Forum) at Stanford ISL is an interdisciplinary academic forum which focuses on mathematical aspects of information processing. With a primary emphasis on information theory, we also welcome researchers from signal processing, learning and statistical inference, control and optimization to deliver talks at our forum. We also warmly welcome industrial affiliates in the above fields. The forum is typically held in Packard 202 every Friday at 1:15 pm during the academic year.
The Information Theory Forum is organized by graduate students Yanjun Han and Yihui Quek. To suggest speakers, please contact either of the organizers.
Zhengyuan Zhou is a 6th-year PhD candidate in Electrical Engineering. He received a B.E. in Electrical Engineering and Computer Sciences and a B.A. in Mathematics from UC Berkeley. His research interests include learning, optimization, control, game theory, and applied probability.