Trace an episode table by hand - On Policy Distillation | Zoonk
Trace an episode table by hand
Fill in a small step table by hand, one row per step, with columns for observation, action, reward, next observation, and done status. Use the table to trace a full episode, from the first step to the row where done is true, without yet using trajectories, returns, or probability calculations.
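To check a hand-filled table, it can help to generate the same rows in code. The sketch below is only an illustration under stated assumptions: the corridor environment, the `step` and `trace_episode` helpers, and the chosen action list are all hypothetical and are not part of the lesson. It records one row per step and stops at the first row where done is true.

```python
# Hypothetical toy environment (an assumption for illustration):
# a corridor with cells 0..3, where reaching cell 3 ends the episode.
GOAL = 3

def step(observation, action):
    """Return (reward, next_observation, done) for a single step."""
    move = 1 if action == "right" else -1
    # Clamp the position so the agent cannot leave the corridor.
    next_observation = min(max(observation + move, 0), GOAL)
    done = next_observation == GOAL
    reward = 1 if done else 0  # reward only on reaching the goal
    return reward, next_observation, done

def trace_episode(start, actions):
    """Build the step table: one row per step, stopping once done is True."""
    table = []
    observation = start
    for action in actions:
        reward, next_observation, done = step(observation, action)
        table.append((observation, action, reward, next_observation, done))
        if done:
            break
        # The next row's observation is this row's next observation.
        observation = next_observation
    return table

rows = trace_episode(start=0, actions=["right", "left", "right", "right", "right"])

print("step | obs | action | reward | next_obs | done")
for i, (obs, action, reward, next_obs, done) in enumerate(rows):
    print(f"{i:>4} | {obs:>3} | {action:<6} | {reward:>6} | {next_obs:>8} | {done}")
```

When tracing by hand, two checks mirror what the code does: each row's next observation should match the observation in the following row, and only the last row should have done set to true.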