Schedule
| Date | Lecture | Readings | Logistics |
|---|---|---|---|
| M 08/28 | Lecture #1: Introduction to Reinforcement and Representation Learning [ slides ] | | |
| W 08/30 | Lecture #2: Multi-armed Bandits [ slides ] | | |
| F 09/01 | Recitation #1: Neural Nets, TensorFlow & Keras, OpenAI Gym, Bandits [ slides ] | | |
| M 09/04 | Labor Day - No Classes | | |
| W 09/06 | Lecture #3: Markov Decision Processes, Value Iteration, Policy Iteration [ slides ] | | HW1 out (tentative) |
| F 09/08 | Recitation #2: Bandits, MDPs & HW1 [ slides ] | | |
| M 09/11 | Lecture #4: Monte Carlo Learning and Temporal Difference Learning [ slides ] | | |
| W 09/13 | Lecture #5: Monte Carlo Learning and Temporal Difference Learning (cont.) [ slides ] | | |
| F 09/15 | Lecture #6: Planning, Monte Carlo Tree Search [ slides ] | | |
| M 09/18 | Lecture #7: Function approximation in prediction and control, Deep Q-learning [ slides ] | | |
| W 09/20 | Lecture #8: Policy gradients, REINFORCE, Actor-Critic methods [ slides ] | | |
| F 09/22 | Lecture #9: Natural PG, PPO, TRPO [ slides ] | | |
| M 09/25 | Lecture #10: Deterministic Policy gradient, re-parametrized PG [ slides ] | | HW1 due 11:59pm, HW2 out (tentative) |
| W 09/27 | Lecture #11: Evolutionary methods for policy search [ slides ] | | |
| F 09/29 | Recitation #3: MCTS, TD Learning, Deep Q-learning, HW2 [ slides ] | | |
| M 10/02 | Recitation #4: Quiz 1 Review [ slides ] | | |
| W 10/04 | Recitation #5: Large OH/Quiz 1 Recitation [ slides ] | | |
| F 10/06 | Quiz 1 | | |
| M 10/09 | Lecture #12: Imitation learning, behavior cloning [ slides ] | | |
| W 10/11 | Lecture #13: Imitation learning with generative models [ slides \| slides 2 ] | | HW2 due 11:59pm |
| F 10/13 | Recitation #6: Solutions to Quiz 1 [ slides ] | | |
| M 10/16 | Fall Break - No Classes | | |
| W 10/18 | Fall Break - No Classes | | |
| F 10/20 | Fall Break - No Classes | | |
| M 10/23 | Lecture #14: Imitation learning with generative models (cont.), multigoal imitation learning and reinforcement learning [ slides ] | | HW3 out (tentative) |
| W 10/25 | Lecture #15: AlphaGo, AlphaGo Zero, AlphaZero [ slides \| slides 2 ] | | |
| F 10/27 | Recitation #7: HW3, Gaussian Processes, Bayesian Optimization [ slides \| slides 2 ] | | |
| M 10/30 | Lecture #16: MBRL in explicit and observable low-dimensional state spaces [ slides ] | | |
| W 11/01 | Lecture #17: MBRL from sensory input, planning in sensory space, planning in a latent state space [ slides ] | | |
| F 11/03 | Lecture #18: MBRL (cont.), stochastic latent dynamics models [ slides ] | | |
| M 11/06 | Recitation #8: Quiz 2 Review & HW4 [ slides ] | | HW4 out (tentative), HW3 due 11:59pm |
| W 11/08 | Lecture #19: Intelligent Exploration [ slides ] | | |
| F 11/10 | Quiz 2 | | |
| M 11/13 | Lecture #20: Offline RL, Learning by Observation [ slides ] | | |
| W 11/15 | Lecture #21 (Aviral Kumar): Offline RL [ slides ] | | |
| F 11/17 | Recitation #9: Homework 5 [ slides ] | | HW5 out (tentative), HW4 due 11:59pm |
| M 11/20 | Lecture #22: Sim2Real Transfer [ slides ] | | |
| W 11/22 | Thanksgiving Break - No Classes | | |
| F 11/24 | Thanksgiving Break - No Classes | | |
| M 11/27 | Lecture #23: Visual Imitation Learning [ slides ] | | |
| W 11/29 | Lecture #24: Language and Robot Control [ slides ] | | |
| F 12/01 | Recitation #10: Recitation [ slides ] | | |
| M 12/04 | Lecture #25: Self-Supervised Visual Learning [ slides ] | | HW5 due 11:59pm |
| W 12/06 | Lecture #26: Multimodal Policies, Control and 3D Spatial Representations [ slides \| slides 2 ] | | |
| F 12/08 | Recitation #11: Quiz 3 Review Part II [ slides ] | | |
| 12/12 | Quiz 3, 5:30pm - 8:30pm, SH 105 (Scaife Hall) | | |