Deep Reinforcement Learning
This course is taken almost verbatim from CS 294-112 Deep Reinforcement Learning – Sergey Levine's course at UC Berkeley. We follow his course's formulation and selection of papers, with Levine's permission. This is a section of CS 6101, Exploration of Computer Science Research, at NUS. CS 6101 is a 4 modular credit pass/fail module that helps incoming graduate students obtain background in an area with an instructor's support. It is designed as a "lab rotation" to familiarise students with the methods and practice of research in a particular research area.
24 students in 11 teams
1378 guests registered
Due to a mismatch between simulation and real-world domains (commonly termed the 'reality gap'), robotic agents trained in simulation often fail to perform successfully when transferred to the real world. This problem may be addressed through domain randomisation, in which a single universal policy is trained to adapt to varying environmental dynamics. However, existing methods do not prescribe a priori the order in which tasks are presented to the learner. We present a curriculum-based approach in which the difficulty level of training corresponds to the number of variables that are randomised. Experiments are conducted on the inverted pendulum, a classic control problem.
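The curriculum idea above can be sketched as follows. This is a minimal illustration, not the project's actual code: the parameter names and ranges are hypothetical stand-ins for whatever the pendulum simulator exposes, and difficulty level k simply randomises the first k parameters while the rest stay at their defaults.

```python
import random

# Hypothetical pendulum parameters, ordered by curriculum stage.
PARAMS = ["pole_length", "pole_mass", "cart_mass", "friction"]
DEFAULTS = {"pole_length": 0.5, "pole_mass": 0.1, "cart_mass": 1.0, "friction": 0.0}
RANGES = {"pole_length": (0.25, 1.0), "pole_mass": (0.05, 0.5),
          "cart_mass": (0.5, 2.0), "friction": (0.0, 0.1)}

def sample_env_params(difficulty):
    """Difficulty k randomises the first k parameters; the rest keep defaults.

    At difficulty 0 the environment is fully deterministic; at
    difficulty len(PARAMS) every dynamics variable is randomised.
    """
    params = dict(DEFAULTS)
    for name in PARAMS[:difficulty]:
        lo, hi = RANGES[name]
        params[name] = random.uniform(lo, hi)
    return params
```

During training, the curriculum would increase `difficulty` once the policy's success rate at the current stage passes a threshold, so the learner faces one new source of variation at a time.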
Model-based reinforcement learning refers to learning a model of the environment by taking actions and observing the results, including the next state and the immediate reward, and thereby indirectly learning the optimal behaviour. The model predicts the outcome of an action and is used to replace or supplement interaction with the environment when learning the optimal policy. Model-based reinforcement learning consists of two main parts: learning a dynamics model, and using a controller to plan and execute actions that minimise a cost function. In this project, we explore model-based reinforcement learning in terms of dynamics models and control logic.
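The two parts named above (a learned dynamics model and a cost-minimising controller) can be sketched as follows. This is an illustrative toy, not the project's implementation: the dynamics model is a least-squares linear fit standing in for a neural network, and the controller is simple random-shooting model-predictive control over a one-dimensional action.

```python
import numpy as np

def fit_dynamics(states, actions, next_states):
    """Fit a linear dynamics model s' ~= [s, a] @ A by least squares.

    A linear model is a stand-in here; in practice a neural network
    would be trained on the same (state, action, next_state) tuples.
    """
    X = np.hstack([states, actions])
    A, *_ = np.linalg.lstsq(X, next_states, rcond=None)
    return A

def plan_action(model, state, cost_fn, n_candidates=100, horizon=5, seed=0):
    """Random-shooting MPC: sample action sequences, roll each one out
    through the learned model, sum the cost along the way, and return
    the first action of the cheapest sequence."""
    rng = np.random.default_rng(seed)
    best_cost, best_action = np.inf, None
    for _ in range(n_candidates):
        s = np.asarray(state, dtype=float)
        seq = rng.uniform(-1.0, 1.0, size=(horizon, 1))  # 1-D actions
        cost = 0.0
        for a in seq:
            s = np.hstack([s, a]) @ model  # model-predicted next state
            cost += cost_fn(s)
        if cost < best_cost:
            best_cost, best_action = cost, seq[0]
    return best_action
```

The controller never queries the real environment while planning; it relies entirely on the model's rollouts, which is exactly the replace-or-supplement role described above.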