ICTS: Reinforcement Learning Bootcamp, Fall 2025 (Aug 4 - Aug 7)
- Lecturer: Gaurav Mahajan (gaurav.mahajan@yale.edu)
- Lecture Notes: (draft notes)
Description
The course will cover the basics of reinforcement learning theory. We will start by implementing simple gradient-based algorithms in PyTorch and using them to solve standard control problems like CartPole and the Atari 2600 game Pong. Along the way, we will explore how to optimize both the sample complexity (the number of interactions with the environment) and the computational complexity (GPU hours) needed to learn an optimal policy.
Lectures
Day 1: Basics of Reinforcement Learning (notes)
- Exploration vs Exploitation, and Credit Assignment
- Markov Decision Process, Value Functions
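As a concrete illustration of value functions on a Markov decision process, here is a minimal sketch of value iteration in plain Python. The two-state MDP, its rewards, and the discount factor 0.5 are made up for this example and are not taken from the lecture notes.

```python
# Tiny deterministic MDP (hypothetical, for illustration only).
# TRANSITIONS[state][action] = (next_state, reward)
TRANSITIONS = {
    0: {0: (0, 0.0), 1: (1, 1.0)},
    1: {0: (0, 0.0), 1: (1, 2.0)},
}
GAMMA = 0.5  # discount factor (assumed value)


def value_iteration(transitions, gamma, num_iters=100):
    """Repeatedly apply the Bellman optimality backup, starting from V = 0."""
    values = {s: 0.0 for s in transitions}
    for _ in range(num_iters):
        values = {
            s: max(r + gamma * values[s_next] for (s_next, r) in actions.values())
            for s, actions in transitions.items()
        }
    return values


def greedy_policy(transitions, gamma, values):
    """In each state, pick the action with the largest one-step lookahead value."""
    return {
        s: max(actions, key=lambda a: actions[a][1] + gamma * values[actions[a][0]])
        for s, actions in transitions.items()
    }


V = value_iteration(TRANSITIONS, GAMMA)
PI = greedy_policy(TRANSITIONS, GAMMA, V)
print(V)   # values approach V(0) = 3 and V(1) = 4
print(PI)  # the greedy policy takes action 1 in both states
```

Staying in state 1 forever is worth 2 / (1 - 0.5) = 4, and moving there from state 0 is worth 1 + 0.5 * 4 = 3, which is what the iteration converges to.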
Day 2: Policy Gradient Methods (notes)
Setup Instructions
Step 1: Create a virtual environment
    python3 -m venv .venv
    source .venv/bin/activate      # on Linux/macOS
    .venv\Scripts\activate.bat     # on Windows
Step 2: Install required packages
    pip install --upgrade pip
    pip install torch
    pip install "gymnasium[classic-control]"
Step 3: Verify installation
    python -c "import torch; print(torch.__version__)"
    python -c "import gymnasium as gym; env = gym.make('CartPole-v1'); print(env)"
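If one of the checks fails, it can help to see which package is missing and whether the virtual environment is actually active. The script below is a small sketch using only the standard library; the function name `check_packages` is hypothetical and not part of the course material.

```python
import importlib.util
import sys


def check_packages(names):
    """Return {package_name: True/False} depending on whether it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Packages installed in Step 2.
    status = check_packages(["torch", "gymnasium"])
    for name, found in status.items():
        print(f"{name}: {'found' if found else 'MISSING'}")
    # Inside a venv, sys.prefix points at the environment, not the base install.
    in_venv = sys.prefix != sys.base_prefix
    print(f"inside a virtual environment: {in_venv}")
```

This only checks that the packages can be located; the commands in Step 3 remain the real test that they import and run.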
- Environments: CartPole and Pong
- Vanilla Policy Gradient Algorithm
- Implementing in Python (cartpole.py)
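To show the shape of the vanilla policy gradient update without any dependencies, here is a minimal sketch on a two-armed bandit with a softmax policy. It is not the course's cartpole.py: the bandit, the learning rate, and the step count are assumptions chosen for illustration, and the gradient is written out by hand rather than computed by PyTorch.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_bandit(rewards, steps=2000, lr=0.1, seed=0):
    """Vanilla policy gradient (REINFORCE) on a hypothetical multi-armed bandit.

    rewards[a] is the deterministic reward of arm a.
    Returns the final action probabilities of the softmax policy.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(rewards)  # one logit per arm
    for _ in range(steps):
        probs = softmax(theta)
        # Sample an action from the current policy and observe its reward.
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        reward = rewards[action]
        # d/d theta[k] of log pi(action) equals 1[k == action] - probs[k].
        for k in range(len(theta)):
            grad_log_pi = (1.0 if k == action else 0.0) - probs[k]
            theta[k] += lr * reward * grad_log_pi  # stochastic gradient ascent
    return softmax(theta)


final_probs = train_bandit(rewards=[0.0, 1.0])
print(final_probs)  # most of the probability mass ends up on arm 1
```

On CartPole the same update applies per time step, with the single reward replaced by the return collected from that step onward and the hand-written gradient replaced by automatic differentiation.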
Day 3: Data-Efficient RL
- Bellman Equations and Optimism
- Algorithm
- Optimization Constraint in Linear Form
- Exploration: Bounding the Number of Rounds
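For reference, the Bellman optimality equations can be written as below in the discounted setting. The notation (reward r, transition kernel P, discount factor \gamma) is an assumption made here and may differ from the notation or the horizon convention used in the lecture notes.

```latex
\begin{align}
  Q^*(s, a) &= r(s, a) + \gamma \, \mathbb{E}_{s' \sim P(\cdot \mid s, a)}
               \Big[ \max_{a'} Q^*(s', a') \Big], \\
  V^*(s)    &= \max_{a} Q^*(s, a).
\end{align}
```

A policy that picks an action maximizing Q^*(s, a) in every state s is optimal.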
Day 4: Computational Complexity
- Complexity Problems
- Linear Infinite-Horizon MDP
- Linear Finite-Horizon MDP