Training on TPUs
Note: Most of the strategies introduced in the other training sections (such as mixed precision training or gradient accumulation) are generic and apply to training models in general, so make sure to have a look at them before diving into this section.
This document will be completed soon with information on how to train on TPUs.