Traditional reinforcement learning (RL) algorithms are purely data-driven and operate without any a priori knowledge of the available actions, the system's state transition dynamics, or its cost/reward function. This generality allows them to solve a wide variety of problems, but it comes at a price: because the algorithms learn inefficiently from their interactions with the environment, they struggle to meet the critical requirements of emerging wireless applications. In this presentation, we describe foundational advances in system-aware RL that are achieved by systematically integrating basic system models into the learning process. These solutions use real-time data in conjunction with basic knowledge of the underlying communication system, and can achieve orders-of-magnitude improvements in key performance metrics, such as sample, compute, and memory complexity, compared to well-established RL benchmarks. Integration of this framework with deep RL, and its further acceleration via stochastic computing and hardware optimization, are also discussed.
Nicholas Mastronarde received the Ph.D. degree in electrical engineering from the University of California, Los Angeles, CA, USA, in 2011. He is currently an Associate Professor with the Department of Electrical Engineering, University at Buffalo, Buffalo, NY, USA. His research interests include reinforcement learning, Markov decision processes, and resource allocation and scheduling in wireless networks.
Jacob Chakareski is Panasonic Professor of Computing at NJIT. He completed his Ph.D. degree in Electrical and Computer Engineering at Rice University and Stanford University. His research interests span networked virtual and augmented reality, UAV-IoT sensing and networking, fast reinforcement learning, 5G wireless edge computing/caching, and ubiquitous immersive communication.