19. Distill policies that remember
Apply distillation to policies with memory, attention, and long context. This chapter covers recurrent policies, transformer policies, sequence decision models, and the extra care needed when the teacher’s behavior depends on history.
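To make the history dependence concrete, here is a minimal sketch in plain Python. It is a toy illustration under stated assumptions, not the chapter's method: the per-step KL(teacher || student) loss, the two-action setup, and every name (`sequence_distill_loss`, `teacher_step`, `memoryless_student`, `recurrent_student`) are hypothetical choices made for this example. Teacher and student are unrolled over the same trajectory, each carrying its own state, and the loss is averaged over time steps.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl(p, q):
    # KL(p || q) for two discrete distributions with full support.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def sequence_distill_loss(teacher_step, student_step, observations,
                          teacher_state, student_state):
    # Mean per-step KL(teacher || student) along one trajectory.
    # Each *_step(obs, state) returns (logits, new_state), so both
    # policies see the same observation history in the same order.
    total = 0.0
    for obs in observations:
        t_logits, teacher_state = teacher_step(obs, teacher_state)
        s_logits, student_state = student_step(obs, student_state)
        total += kl(softmax(t_logits), softmax(s_logits))
    return total / len(observations)

# Hypothetical teacher: prefers the action equal to the PREVIOUS
# observation, so its behavior depends on history, not just on obs.
def teacher_step(obs, prev_obs):
    logits = [2.0 if action == prev_obs else 0.0 for action in (0, 1)]
    return logits, obs  # the new state remembers the current observation

# Hypothetical student without memory: it can only react to the current obs.
def memoryless_student(obs, state):
    logits = [2.0 if action == obs else 0.0 for action in (0, 1)]
    return logits, state

# Hypothetical student with the same one-step memory as the teacher.
def recurrent_student(obs, prev_obs):
    return teacher_step(obs, prev_obs)

observations = [0, 1, 1, 0, 1]
loss_memoryless = sequence_distill_loss(
    teacher_step, memoryless_student, observations, 0, None)
loss_recurrent = sequence_distill_loss(
    teacher_step, recurrent_student, observations, 0, 0)
```

In this toy setup the memoryless student cannot drive the loss to zero, because the teacher's action distribution changes with the previous observation while the student sees only the current one. The student with matching memory reproduces the teacher exactly.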