23. Synthetic data for training and testing
Use model-generated data to improve training, evaluation, and task coverage. This chapter covers synthetic instructions, self-play, data filtering, teacher-student setups, and the risks of feedback loops and low-diversity data.