Monte Carlo Prediction
Estimating value functions from sampled episodes
Temporal Difference Learning
Bootstrapping value estimates from experience
Model-free Control
Learning optimal policies without a model
Monte Carlo prediction, temporal difference learning, and model-free control.