Week | Date | Lecture | Material | Assignment |
1 | Aug 20 | Introduction (Part A) | | |
1 | Aug 22 | Introduction (Multi-armed Bandits) | Sutton's book, Chapter 2 | |
2 | Aug 27 | Markov Decision Processes (Part A) | | |
2 | Aug 29 | Markov Decision Processes (Part B) | | |
3 | Sep 3 | Solving MDPs: Dynamic Programming (Part A) | | |
3 | Sep 5 | Solving MDPs: Dynamic Programming (Part B) | | |
4 | Sep 10 | Project 1: Lab | | HW1 due (Sep 13) |
4 | Sep 12 | Model-Free Prediction: Monte Carlo and Temporal Difference (Part A) | | |
5 | Sep 17 | Model-Free Prediction: Monte Carlo and Temporal Difference (Part B) | | |
5 | Sep 19 | Model-Free Control: Off-Policy & On-Policy (Part A) | | |
| Sep 20 | | | Project 1 due |
6 | Sep 24 | Model-Free Control: Off-Policy & On-Policy (Part B) | | |
6 | Sep 26 | Value Function Approximation (Part A) | | |
7 | Oct 1 | Value Function Approximation (Part B) | | HW2 due (Oct 2) |
7 | Oct 3 | Policy Gradient Methods (Part A) | | |
| Oct 4 | | | Selection of Final Project Topics/Groups |
8 | Oct 8 | Fall Break - No Class! | | |
8 | Oct 10 | Policy Gradient Methods (Part B) | | |
9 | Oct 15 | Project 2: Lab | | |
9 | Oct 17 | Actor-Critic Algorithms (Part A) | | |
10 | Oct 22 | Actor-Critic Algorithms (Part B) | | |
10 | Oct 24 | Deep RL: DQN, PPO, TRPO, DDPG | | |
| Oct 25 | | | Project 2 due |
11 | Oct 29 | Exploration and Exploitation (Part A) | | |
11 | Oct 31 | Guest Lecture: Research in RL | | HW3 due |
12 | Nov 5 | No Class! | | |
12 | Nov 7 | Exploration and Exploitation (Part B) | | |
13 | Nov 12 | Final Project: Lab | | |
13 | Nov 14 | Optimal Control and Planning (Part A) | | |
14 | Nov 19 | Optimal Control and Planning (Part B) | | |
14 | Nov 21 | Guest Lecture: Research in RL | | Final Project code due |
15 | Nov 26 | Final Project Presentations | | |
15 | Nov 28 | Happy Thanksgiving! | | |
16 | Dec 3 | Closing Remarks | | Final Project report due |