Information Systems Lab Colloquium: Information Relaxations and Duality in Stochastic Dynamic Programs
In this talk, we discuss the information relaxation approach for obtaining bounds on the performance of optimal policies in stochastic dynamic programs (DP). This approach involves relaxing the DP information structure and incorporating a penalty that punishes the use of additional information. We first provide an overview of some basic theory for the general approach. We then discuss how to apply the method in two broad classes of problems. For DPs with a convex structure, we show how to use gradient penalties and convex optimization to apply the method, and illustrate with an application in network revenue management. For infinite horizon MDPs, we show how to apply the method with change of measure techniques, and illustrate with an application in service allocation for a multiclass queue with convex delay costs. As we discuss, in both cases, the method provides tighter bounds than bounds from other relaxation methods, such as Lagrangian relaxations.