Load pretrained instances with an AutoClass
With so many different Transformer architectures, it can be challenging to create one for your checkpoint. As part of the Transformers core philosophy to make the library easy, simple and flexible to use, an AutoClass
automatically infers and loads the correct architecture from a given checkpoint. The from_pretrained()
method lets you quickly load a pretrained model for any architecture so you don't have to devote time and resources to train a model from scratch. Producing this type of checkpoint-agnostic code means that if your code works for one checkpoint, it will work with another checkpoint - as long as it was trained for a similar task - even if the architecture is different.
Remember, architecture refers to the skeleton of the model and checkpoints are the weights for a given architecture. For example, BERT is an architecture, while bert-base-uncased
is a checkpoint. Model is a general term that can mean either architecture or checkpoint.
In this tutorial, learn to:
Load a pretrained tokenizer.
Load a pretrained image processor.
Load a pretrained feature extractor.
Load a pretrained processor.
Load a pretrained model.
Nearly every NLP task begins with a tokenizer. A tokenizer converts your input into a format that can be processed by the model.
Load a tokenizer with AutoTokenizer.from_pretrained():
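A minimal sketch, assuming the transformers package and the bert-base-uncased checkpoint mentioned above (any tokenizer checkpoint name from the Hub can be substituted):

```py
from transformers import AutoTokenizer

# AutoTokenizer reads the checkpoint's config and returns the matching tokenizer class
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
```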
Then tokenize your input as shown below:
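The sentence below is only an illustrative input; any string or list of strings works:

```py
# Calling the tokenizer returns a dict with input_ids, attention_mask and,
# for BERT-style checkpoints, token_type_ids
encoding = tokenizer("In a hole in the ground there lived a hobbit.")
print(encoding)
```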
For vision tasks, an image processor processes the image into the correct input format. Load an image processor with AutoImageProcessor.from_pretrained():
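For example, assuming the google/vit-base-patch16-224 checkpoint (an arbitrary vision checkpoint chosen for illustration):

```py
from transformers import AutoImageProcessor

image_processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224")
```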
For audio tasks, a feature extractor processes the audio signal into the correct input format. Load a feature extractor with AutoFeatureExtractor.from_pretrained():
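For example, assuming the facebook/wav2vec2-base-960h checkpoint (again an arbitrary choice; any audio checkpoint works the same way):

```py
from transformers import AutoFeatureExtractor

feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")
```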
Multimodal tasks require a processor that combines two types of preprocessing tools. For example, the LayoutLMv2 model requires an image processor to handle images and a tokenizer to handle text; a processor combines both of them. Load a processor with AutoProcessor.from_pretrained():
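For example, assuming the microsoft/layoutlmv2-base-uncased checkpoint (illustrative; any checkpoint whose model needs both an image processor and a tokenizer works):

```py
from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained("microsoft/layoutlmv2-base-uncased")
```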
PyTorch
Finally, the AutoModelFor classes let you load a pretrained model for a given task (see the task summary for a complete list of available tasks). For example, load a model for sequence classification with AutoModelForSequenceClassification.from_pretrained():
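A sketch assuming the distilbert-base-uncased checkpoint (illustrative; any checkpoint compatible with sequence classification can be used):

```py
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
```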
Easily reuse the same checkpoint to load an architecture for a different task:
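Continuing with the same illustrative checkpoint, now loaded with a token classification head:

```py
from transformers import AutoModelForTokenClassification

model = AutoModelForTokenClassification.from_pretrained("distilbert-base-uncased")
```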
For PyTorch models, the from_pretrained() method uses torch.load(), which internally uses pickle and is known to be insecure. In general, never load a model that could have come from an untrusted source, or that could have been tampered with. This security risk is partially mitigated for public models hosted on the BOINC AI Hub, which are scanned for malware at each commit. See the Hub documentation for best practices like signed commit verification with GPG.

TensorFlow and Flax checkpoints are not affected, and can be loaded within PyTorch architectures using the from_tf and from_flax kwargs for the from_pretrained method to circumvent this issue.

Generally, we recommend using the AutoTokenizer class and the AutoModelFor class to load pretrained instances of models. This will ensure you load the correct architecture every time. In the next tutorial, learn how to use your newly loaded tokenizer, image processor, feature extractor and processor to preprocess a dataset for fine-tuning.
TensorFlow
Finally, the TFAutoModelFor classes let you load a pretrained model for a given task (see the task summary for a complete list of available tasks). For example, load a model for sequence classification with TFAutoModelForSequenceClassification.from_pretrained():
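A sketch with the same illustrative distilbert-base-uncased checkpoint:

```py
from transformers import TFAutoModelForSequenceClassification

model = TFAutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
```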
Easily reuse the same checkpoint to load an architecture for a different task:
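And the same illustrative checkpoint loaded with a token classification head:

```py
from transformers import TFAutoModelForTokenClassification

model = TFAutoModelForTokenClassification.from_pretrained("distilbert-base-uncased")
```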
Generally, we recommend using the AutoTokenizer class and the TFAutoModelFor class to load pretrained instances of models. This will ensure you load the correct architecture every time. In the next tutorial, learn how to use your newly loaded tokenizer, image processor, feature extractor and processor to preprocess a dataset for fine-tuning.