Markov decision processes (MDPs) capture some of the most important aspects of decision making under uncertainty, and as such they are at the heart of many efforts to address it. However, MDPs are "flat", with no structure, and as a result planning and learning in MDPs with multidimensional state spaces, which are common in applications, are provably intractable. Yet reinforcement learning methods have been quite successful in providing strong solutions to some of these seemingly intractable problems. In this talk I will present my view of how to think about these successes, through a framework whose key idea is to give algorithms hints that can create backdoors for cracking otherwise intractable problems. The talk will then categorize hints according to whether they can indeed succeed at this, focusing on the special case where the hints take the form of constraints on what value functions look like, in the context of planning with generative models, also known as simulation optimization. As we shall see, seemingly minor differences between hints can cause some to work while others fail.