May I ask how much the training cost you?
I hope there is something in the report for everyone. We included a fair bit on the actual training and data infrastructure, which usually isn't written about much and which I think will be interesting to people here. There's more that didn't fit, so I'm happy to answer questions!
We recommend training off the undistilled Raw checkpoint and then applying the resulting LoRA to the Turbo model for inference.
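
For anyone unsure what "train on Raw, apply to Turbo" means mechanically, here is a rough toy sketch in plain Python. It assumes the common LoRA convention of adding (alpha / rank) * B·A to a layer's weight matrix. The 2x2 matrices, the names `raw_weight`, `turbo_weight`, `lora_a`, `lora_b`, and the helper `apply_lora` are all made up for illustration and are not taken from the actual models or tooling.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, lora_a, lora_b, alpha, rank):
    """Return weight + (alpha / rank) * (lora_b @ lora_a)."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy stand-ins for one layer of each checkpoint (hypothetical values).
raw_weight = [[1.0, 0.0], [0.0, 1.0]]    # undistilled checkpoint: train against this
turbo_weight = [[0.9, 0.1], [0.0, 1.1]]  # distilled checkpoint: run inference with this

# Rank-1 adapter; pretend these factors were learned with raw_weight frozen.
lora_b = [[1.0], [2.0]]   # shape 2x1
lora_a = [[0.5, -0.5]]    # shape 1x2

# The same low-rank update can be added to either set of weights,
# because both layers have the same shape.
raw_with_lora = apply_lora(raw_weight, lora_a, lora_b, alpha=1.0, rank=1)
turbo_with_lora = apply_lora(turbo_weight, lora_a, lora_b, alpha=1.0, rank=1)
```

The point of the sketch is only that the adapter is a separate additive update, so it can be learned against one checkpoint and added to another with matching layer shapes. How well it transfers depends on how close the two checkpoints are, which the toy numbers here say nothing about.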