Reasoning at multiple levels of temporal abstraction is a key ability for artificial intelligence. In reinforcement learning, this is often instantiated through the options framework. Options allow agents to make predictions and to operate at different levels of abstraction within an environment. Nevertheless, when a reasonable set of options is not known beforehand, there is no definitive answer to which options an agent should consider. Recently, a new paradigm for option discovery has been introduced, based on the successor representation (SR), which defines state generalization in terms of how similar the states' successors are. In this talk I'll discuss the existing methods in this paradigm, providing a big-picture look at how the SR can be used in the options framework. I'll present methods for discovering "bottleneck" options, as well as options that improve an agent's exploration capabilities. I'll also discuss the option keyboard, which uses the SR to extend a finite set of options to a combinatorially large counterpart without additional learning.
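For readers unfamiliar with the SR mentioned above, here is a minimal sketch of its closed-form computation for a fixed policy in a small tabular MDP. The 4-state ring environment and the uniform-random policy are illustrative assumptions, not examples from the talk; the talk's discovery methods build on this object in more elaborate ways.

```python
import numpy as np

# Under a fixed policy pi with state-to-state transition matrix P_pi,
# the successor representation is the expected discounted occupancy
#   Psi(s, s') = E[ sum_t gamma^t * 1(s_t = s') | s_0 = s ],
# which in matrix form is Psi = (I - gamma * P_pi)^{-1}.

n_states = 4   # illustrative: a 4-state ring
gamma = 0.9    # discount factor

# Random walk on the ring: step left or right with probability 1/2 each.
P = np.zeros((n_states, n_states))
for s in range(n_states):
    P[s, (s - 1) % n_states] = 0.5
    P[s, (s + 1) % n_states] = 0.5

# Closed-form SR for this policy.
sr = np.linalg.inv(np.eye(n_states) - gamma * P)

# Each row of `sr` describes a state by its expected discounted future
# occupancies; states with similar rows are "similar" in the SR sense
# that underlies this family of option-discovery methods.
print(np.round(sr, 2))
```

In practice the SR is typically learned incrementally by temporal-difference updates rather than matrix inversion, but the closed form above is what those updates converge to.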
Marlos C. Machado is a research scientist at Google Brain in Montreal, Canada. Marlos holds a B.Sc. and an M.Sc. in Computer Science from the Universidade Federal de Minas Gerais, Brazil, and a Ph.D. in Computing Science from the University of Alberta, Canada. During his Ph.D., Marlos also interned at Microsoft Research, IBM Research, and DeepMind. His research interests lie broadly in artificial intelligence, with a particular focus on reinforcement learning and an emphasis on generalization, exploration, and options.