Andrej Karpathy’s ReinforceJS demo runs policy evaluation, policy iteration, and value iteration on a gridworld MDP directly in your browser. Use it as a companion to the algorithm pages above: watch state values propagate with each sweep, see the greedy policy change as the values update, and switch between evaluation-only and full iteration to compare how the two behave.
Source: cs.stanford.edu/people/karpathy/reinforcejs/gridworld_dp.html. If the embedded frame above is blocked by your browser or network, open the link in a new tab.
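For readers who want to reproduce the same behavior offline, here is a minimal Python sketch of the sweeps the demo animates. It is not the ReinforceJS code: the 4x4 grid, the single goal at (3, 3), the reward of -1 per move, the discount of 0.9, and every function name are assumptions chosen for illustration.

```python
# Illustrative 4x4 gridworld. Layout, rewards, and discount are assumptions,
# not the settings used by the ReinforceJS demo.
N = 4
GOAL = (3, 3)          # terminal state, value fixed at 0
GAMMA = 0.9            # discount factor
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
STATES = [(r, c) for r in range(N) for c in range(N)]


def step(state, action):
    """Deterministic move; bumping into a wall leaves the state unchanged."""
    dr, dc = ACTIONS[action]
    r, c = state[0] + dr, state[1] + dc
    nxt = (r, c) if 0 <= r < N and 0 <= c < N else state
    return nxt, -1.0  # every move costs 1


def q_value(V, state, action):
    """One-step lookahead: immediate reward plus discounted next-state value."""
    nxt, reward = step(state, action)
    return reward + GAMMA * V[nxt]


def evaluation_sweep(V):
    """One synchronous policy-evaluation sweep for the uniform random policy."""
    return {
        s: 0.0 if s == GOAL
        else sum(q_value(V, s, a) for a in ACTIONS) / len(ACTIONS)
        for s in STATES
    }


def value_iteration_sweep(V):
    """One synchronous value-iteration sweep: back up the best action."""
    return {
        s: 0.0 if s == GOAL else max(q_value(V, s, a) for a in ACTIONS)
        for s in STATES
    }


def greedy_policy(V):
    """Pick, in each non-terminal state, an action with the highest lookahead."""
    return {
        s: max(ACTIONS, key=lambda a: q_value(V, s, a))
        for s in STATES if s != GOAL
    }


def run(sweep, tol=1e-9, max_sweeps=10_000):
    """Repeat a sweep from all-zero values until the largest change is below tol."""
    V = {s: 0.0 for s in STATES}
    for k in range(1, max_sweeps + 1):
        new_V = sweep(V)
        delta = max(abs(new_V[s] - V[s]) for s in STATES)
        V = new_V
        if delta < tol:
            return V, k
    return V, max_sweeps
```

Calling `run(evaluation_sweep)` corresponds to the evaluation-only mode (values of a fixed random policy), while `run(value_iteration_sweep)` followed by `greedy_policy(V)` corresponds to full iteration. Printing the returned value table after a few sweeps of each shows the difference in how quickly information about the goal spreads across the grid.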