Monte Carlo Prediction

Estimating value functions from sampled episodes

Temporal Difference Learning

Bootstrapping value estimates from experience

Model-free Control

Learning optimal policies without a model