Assignments (Spring 2021)

Assignments

This is a new class and there is no textbook. We will post relevant reading material. The assignments are as follows, and will be released one week before they are due:

HW1: Bandits and Contextual Bandits
HW2: Value and Policy Iteration
HW3: Deep Q-Learning
HW4: Policy Gradients
HW5: Reward Function Design
HW6: Sim-to-Real
HW7: Learning from Demonstrations
HW8: Model-based Learning
HW9: Offline Reinforcement Learning (Optional)

Submission of Assignments

We'll be using Gradescope for problem set submission and grading. Each problem set is weighted equally. The login code for this class will be posted on Piazza. Please create an account and add yourself to the class using that code only if you are taking the class for credit. Grading will rely on review of the submitted code and writeup; more details will be provided when assignments are released. Each assignment is due one week after its release. Late submissions will be penalized 10% for every 24 hours past the deadline.
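As a rough illustration of the late penalty arithmetic, here is a minimal sketch. The function name `late_multiplier` is hypothetical, and two details are assumptions that the policy does not spell out: every started 24-hour period counts as a full period, and the penalty is subtracted from the fraction of the earned score that is kept, never going below zero. The course staff's actual computation may differ.

```python
import math
from datetime import datetime

PENALTY_PER_PERIOD = 0.10  # 10% per 24 hours, as stated in the late policy


def late_multiplier(due: datetime, submitted: datetime) -> float:
    """Return the fraction of the earned score kept after the late penalty.

    Assumptions (not stated in the policy): a started 24-hour period
    counts as a full period, and the multiplier is floored at 0.
    """
    seconds_late = (submitted - due).total_seconds()
    if seconds_late <= 0:
        return 1.0  # on time or early: no penalty
    periods = math.ceil(seconds_late / (24 * 3600))
    return max(0.0, 1.0 - PENALTY_PER_PERIOD * periods)
```

Under these assumptions, a submission that arrives 25 hours after the deadline falls into its second 24-hour period and would keep about 80% of its earned score.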

Collaboration Policy

Collaboration is encouraged, but the work you submit for assignments is expected to be entirely your own. That is, the writing and code must be yours, and you must fully understand everything that you hand in. Discussing the details of how to solve a problem is fine, but you must write the solution yourself. To avoid plagiarizing, you shouldn't be looking at someone else's solution while you write down your own. If you collaborated significantly (use your own discretion for "significantly") on a problem, list the people you collaborated with next to your solution.

Final Projects

The final project will be your opportunity to explore some of the topics introduced in the course more deeply.

Your project should be related to the course.
You are welcome to work on the project in a team (3 people at most).

Milestone	Deadline
Project proposal	03/18/2021
Project midterm report	04/15/2021
Project final report	05/21/2021