Quicktour

🌍 PEFT contains parameter-efficient finetuning methods for training large pretrained models. The traditional paradigm is to finetune all of a model’s parameters for each downstream task, but this is becoming exceedingly costly and impractical because of the enormous number of parameters in models today. Instead, it is more efficient to train a smaller number of prompt parameters or use a reparametrization method like low-rank adaptation (LoRA) to reduce the number of trainable parameters.

This quicktour will show you 🌍 PEFT’s main features and help you train large pretrained models that would typically be inaccessible on consumer devices. You’ll see how to train the 1.2B parameter bigscience/mt0-large model with LoRA to generate a classification label and use it for inference.

PeftConfig

Each 🌍 PEFT method is defined by a PeftConfig class that stores all the important parameters for building a PeftModel.

Because you’re going to use LoRA, you’ll need to load and create a LoraConfig class. Within LoraConfig, specify the following parameters:

  • the task_type, or sequence-to-sequence language modeling in this case

  • inference_mode, whether you’re using the model for inference or not

  • r, the dimension of the low-rank matrices

  • lora_alpha, the scaling factor for the low-rank matrices

  • lora_dropout, the dropout probability of the LoRA layers

```python
from peft import LoraConfig, TaskType

peft_config = LoraConfig(task_type=TaskType.SEQ_2_SEQ_LM, inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1)
```

💡 See the LoraConfig reference for more details about other parameters you can adjust.

PeftModel

A PeftModel is created by the get_peft_model() function. It takes a base model - which you can load from the 🌍 Transformers library - and the PeftConfig containing the instructions for how to configure a model for a specific 🌍 PEFT method.

Start by loading the base model you want to finetune.

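A minimal sketch of this step, assuming the bigscience/mt0-large checkpoint mentioned above (any 🌍 Transformers sequence-to-sequence model works the same way):

```python
from transformers import AutoModelForSeq2SeqLM

# Load the 1.2B parameter base model used throughout this quicktour
model = AutoModelForSeq2SeqLM.from_pretrained("bigscience/mt0-large")
```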

Wrap your base model and peft_config with the get_peft_model() function to create a PeftModel. To get a sense of the number of trainable parameters in your model, use the print_trainable_parameters() method. In this case, you’re only training 0.19% of the model’s parameters! 🤏

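A sketch of the wrapping step; the printed counts are approximate and follow from the 1.2B base model and the LoraConfig above:

```python
from peft import get_peft_model

model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
# prints something like:
# trainable params: 2,359,296 || all params: 1,231,940,608 || trainable%: 0.19
```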

That is it 🎉! Now you can train the model using the 🌍 Transformers Trainer, 🌍 Accelerate, or any custom PyTorch training loop.
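For instance, a bare-bones custom PyTorch loop could look like the sketch below; the single input/label pair and the learning rate are illustrative placeholders, not part of the original recipe:

```python
import torch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/mt0-large")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)  # illustrative learning rate

# Toy example: one tokenized input/label pair standing in for a real preprocessed dataset
inputs = tokenizer("Tweet text: the streetlight on my block is still broken. Label:", return_tensors="pt")
labels = tokenizer("complaint", return_tensors="pt").input_ids

model.train()
outputs = model(**inputs, labels=labels)  # forward pass computes the loss from the labels
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
```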

Save and load a model

After your model is finished training, you can save your model to a directory using the save_pretrained function. You can also save your model to the Hub (make sure you log in to your BOINC AI account first) with the push_to_hub function.

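A sketch of both options; the output directory and Hub repo name are placeholders:

```python
model.save_pretrained("output_dir")

# or push the trained adapter to the Hub (repo name is a placeholder)
model.push_to_hub("your-username/mt0-large-lora")
```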

This only saves the incremental 🌍 PEFT weights that were trained, meaning it is super efficient to store, transfer, and load. For example, this bigscience/T0_3B model trained with LoRA on the twitter_complaints subset of the RAFT dataset only contains two files: adapter_config.json and adapter_model.bin. The latter file is just 19MB!

Easily load your model for inference using the from_pretrained function:

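A sketch of the loading step; the path below points at the directory saved above, but a Hub repo id works the same way:

```python
from transformers import AutoModelForSeq2SeqLM
from peft import PeftConfig, PeftModel

peft_model_id = "output_dir"  # local adapter directory or a Hub repo id

# The adapter config records which base model it was trained on
config = PeftConfig.from_pretrained(peft_model_id)
base_model = AutoModelForSeq2SeqLM.from_pretrained(config.base_model_name_or_path)
model = PeftModel.from_pretrained(base_model, peft_model_id)
```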

Easy loading with Auto classes

If you have saved your adapter locally or on the Hub, you can leverage the AutoPeftModelForxxx classes and load any PEFT model with a single line of code:

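For example, for the sequence-to-sequence adapter saved above (the path is a placeholder):

```python
from peft import AutoPeftModelForSeq2SeqLM

# The base model is resolved from the adapter's config, so a single call is enough
model = AutoPeftModelForSeq2SeqLM.from_pretrained("output_dir")
```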

Currently, supported auto classes are: AutoPeftModelForCausalLM, AutoPeftModelForSequenceClassification, AutoPeftModelForSeq2SeqLM, AutoPeftModelForTokenClassification, AutoPeftModelForQuestionAnswering and AutoPeftModelForFeatureExtraction. For other tasks (e.g. Whisper, StableDiffusion), you can load the model with:

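A sketch using the generic AutoPeftModel class; the adapter repo id below is illustrative:

```python
from peft import AutoPeftModel

# Falls back to the generic class when no task-specific auto class applies
model = AutoPeftModel.from_pretrained("smangrul/openai-whisper-large-v2-LORA-colab")
```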

Next steps

Now that you’ve seen how to train a model with one of the 🌍 PEFT methods, we encourage you to try out some of the other methods like prompt tuning. The steps are very similar to the ones shown in this quicktour: prepare a PeftConfig for a 🌍 PEFT method, and use get_peft_model to create a PeftModel from the configuration and base model. Then you can train it however you like!

Feel free to also take a look at the task guides if you’re interested in training a model with a 🌍 PEFT method for a specific task such as semantic segmentation, multilingual automatic speech recognition, DreamBooth, and token classification.
