23. Prove the student is good enough
Measure whether the distilled policy really works, using returns, success rates, robustness tests, calibration, latency, and cost. This chapter also covers statistical confidence, ablations, and fair comparisons against the teacher.
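
As a taste of what "statistical confidence" and "fair comparison" can mean in practice, here is a minimal sketch of a paired percentile bootstrap. It assumes the student and teacher were evaluated on the same episodes (same seeds or tasks) so their results can be differenced one-to-one. The function name `paired_bootstrap_ci`, its parameters, and the sample data are hypothetical and chosen only for illustration; other interval methods would be equally valid.

```python
import random
import statistics


def paired_bootstrap_ci(student, teacher, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of (student - teacher).

    Assumes student[i] and teacher[i] come from the same episode or seed.
    Returns (mean_difference, lower_bound, upper_bound).
    """
    if len(student) != len(teacher) or not student:
        raise ValueError("need equal-length, non-empty paired results")

    # Per-episode differences remove variance shared by both policies.
    diffs = [s - t for s, t in zip(student, teacher)]
    n = len(diffs)
    rng = random.Random(seed)  # fixed seed so the interval is reproducible

    # Resample episodes with replacement and record each resample's mean.
    means = []
    for _ in range(n_resamples):
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()

    # Read the interval off the empirical quantiles of the resampled means.
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return statistics.fmean(diffs), lower, upper


if __name__ == "__main__":
    # Made-up success flags (1 = success) on ten shared evaluation episodes.
    student_success = [1, 1, 0, 1, 1, 0, 1, 1, 1, 0]
    teacher_success = [1, 1, 1, 1, 1, 0, 1, 1, 1, 1]
    mean_diff, lower, upper = paired_bootstrap_ci(student_success, teacher_success)
    print(f"student - teacher: {mean_diff:+.2f} (95% CI {lower:+.2f} to {upper:+.2f})")
```

If the interval sits entirely below zero, the student is measurably worse than the teacher on this metric. If it straddles zero, the evaluation set may simply be too small to tell the two apart, which is a reason to collect more episodes, not a proof of parity.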