5.3.1 Overview

Course subject(s): Module 5. Introduction to Reinforcement Learning

In this subsection we discuss why tabular RL does not scale to interesting applications and demonstrate how Q-learning can be scaled up to play video games. We then explain how neural networks represent continuous policies and how these policies can be trained with the REINFORCE algorithm.

After completing this subsection, you should be able to:

discuss the extension of Q-learning with function approximation.
reproduce the REINFORCE algorithm’s loss.
explain how REINFORCE finds the optimal policy.
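
As a point of orientation for the second objective, here is a minimal sketch of one common form of the REINFORCE loss for a single episode: the negative sum over time steps of the log-probability of the chosen action multiplied by the discounted return from that step. The function names, the discount factor gamma and the example numbers are assumptions made for illustration, and the course material may use a variant (for example with a baseline, or averaged over several episodes). The sketch only evaluates the scalar loss; in practice the log-probabilities come from a neural network and the loss is differentiated with an automatic-differentiation library.

```python
import math
from typing import List


def discounted_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Return G_t = r_t + gamma * r_{t+1} + ... for every step t of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each return reuses the return of the following step.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def reinforce_loss(log_probs: List[float], rewards: List[float], gamma: float = 0.99) -> float:
    """Surrogate loss -sum_t log pi(a_t | s_t) * G_t for a single episode."""
    if len(log_probs) != len(rewards):
        raise ValueError("log_probs and rewards must have the same length")
    returns = discounted_returns(rewards, gamma)
    return -sum(lp * g for lp, g in zip(log_probs, returns))


# Hypothetical two-step episode: the policy assigned probabilities 0.5 and 0.25
# to the actions that were actually taken, and each step gave reward 1.
example_log_probs = [math.log(0.5), math.log(0.25)]
example_rewards = [1.0, 1.0]
example_loss = reinforce_loss(example_log_probs, example_rewards, gamma=0.5)
```

Minimising this loss by gradient descent raises the log-probability of actions that were followed by high returns, which is the sense in which REINFORCE improves the policy.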

AI Skills: Introduction to Unsupervised, Deep and Reinforcement Learning by TU Delft OpenCourseWare is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Based on a work at https://online-learning.tudelft.nl/courses/ai-skills-introduction-to-unsupervised-deep-and-reinforcement-learning/