Quicktour

๐ŸŒ PEFT contains parameter-efficient finetuning methods for training large pretrained models. The traditional paradigm is to finetune all of a modelโ€™s parameters for each downstream task, but this is becoming exceedingly costly and impractical because of the enormous number of parameters in models today. Instead, it is more efficient to train a smaller number of prompt parameters or use a reparametrization method like low-rank adaptation (LoRA) to reduce the number of trainable parameters.

This quicktour will show you PEFT's main features and help you train large pretrained models that would typically be inaccessible on consumer devices. You'll see how to train the 1.2B parameter bigscience/mt0-large model with LoRA to generate a classification label and use it for inference.

PeftConfig

Each ๐ŸŒ PEFT method is defined by a PeftConfig class that stores all the important parameters for building a PeftModel.

Because you're going to use LoRA, you'll need to load and create a LoraConfig class. Within LoraConfig, specify the following parameters:

  • the task_type, or sequence-to-sequence language modeling in this case

  • inference_mode, whether you're using the model for inference or not

  • r, the dimension of the low-rank matrices

  • lora_alpha, the scaling factor for the low-rank matrices

  • lora_dropout, the dropout probability of the LoRA layers

from peft import LoraConfig, TaskType

# LoRA configuration for a sequence-to-sequence language modeling task
peft_config = LoraConfig(task_type=TaskType.SEQ_2_SEQ_LM, inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1)

💡 See the LoraConfig reference for more details about other parameters you can adjust.

PeftModel

A PeftModel is created by the get_peft_model() function. It takes a base model, which you can load from the Transformers library, and the PeftConfig containing the instructions for how to configure a model for a specific PEFT method.

Start by loading the base model you want to finetune.
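
For example, you might load the bigscience/mt0-large checkpoint mentioned above with the Transformers AutoModelForSeq2SeqLM class (a minimal sketch):

from transformers import AutoModelForSeq2SeqLM

# load the 1.2B parameter base model that will be finetuned with LoRA
model = AutoModelForSeq2SeqLM.from_pretrained("bigscience/mt0-large")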

Wrap your base model and peft_config with the get_peft_model function to create a PeftModel. To get a sense of the number of trainable parameters in your model, use the print_trainable_parameters method. In this case, you're only training 0.19% of the model's parameters!
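
A minimal sketch of this step, building on the model and peft_config objects from above:

from peft import get_peft_model

# wrap the base model with the LoRA adapter defined by peft_config
model = get_peft_model(model, peft_config)

# reports the trainable parameter count and percentage (about 0.19% here)
model.print_trainable_parameters()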

That is it 🎉! Now you can train the model using the Transformers Trainer, Accelerate, or any custom PyTorch training loop.
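
As one possible setup, here is a sketch of a Trainer run on the PeftModel created above; the toy dataset, tokenizer usage, and hyperparameters are illustrative assumptions, not prescriptions from this quicktour:

from datasets import Dataset
from transformers import AutoTokenizer, DataCollatorForSeq2Seq, Trainer, TrainingArguments

tokenizer = AutoTokenizer.from_pretrained("bigscience/mt0-large")

# hypothetical toy examples standing in for a real classification dataset
raw_dataset = Dataset.from_dict({
    "text": ["The package arrived two weeks late.", "Great service, thank you!"],
    "label": ["complaint", "no complaint"],
})

def preprocess(example):
    # tokenize the input text and the target label for sequence-to-sequence training
    model_inputs = tokenizer(example["text"], truncation=True)
    model_inputs["labels"] = tokenizer(example["label"], truncation=True)["input_ids"]
    return model_inputs

train_dataset = raw_dataset.map(preprocess, remove_columns=["text", "label"])

training_args = TrainingArguments(
    output_dir="mt0-large-lora",   # hypothetical output directory
    learning_rate=1e-3,
    per_device_train_batch_size=8,
    num_train_epochs=3,
)

trainer = Trainer(
    model=model,                   # the PeftModel created above
    args=training_args,
    train_dataset=train_dataset,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()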

Save and load a model

After your model is finished training, you can save it to a directory using the save_pretrained function. You can also push it to the Hub (make sure you log in to your Hugging Face account first) with the push_to_hub function.
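
A sketch of both options; the output directory and Hub repository name below are hypothetical:

# save the adapter weights locally
model.save_pretrained("output_dir")

# or push them to the Hub (hypothetical repository name)
model.push_to_hub("my-username/bigscience-mt0-large-lora")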

This only saves the incremental PEFT weights that were trained, meaning it is super efficient to store, transfer, and load. For example, this bigscience/T0_3B model trained with LoRA on the twitter_complaints subset of the RAFT dataset only contains two files: adapter_config.json and adapter_model.bin. The latter file is just 19MB!

Easily load your model for inference using the from_pretrained function:
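
A sketch of this step, reusing the hypothetical repository name from the save step above:

from transformers import AutoModelForSeq2SeqLM
from peft import PeftConfig, PeftModel

# hypothetical repository name from the save step above
peft_model_id = "my-username/bigscience-mt0-large-lora"

# read the adapter config to find the base model it was trained on
config = PeftConfig.from_pretrained(peft_model_id)
base_model = AutoModelForSeq2SeqLM.from_pretrained(config.base_model_name_or_path)

# attach the trained LoRA weights to the base model
model = PeftModel.from_pretrained(base_model, peft_model_id)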

Easy loading with Auto classes

If you have saved your adapter locally or on the Hub, you can leverage the AutoPeftModelForxxx classes and load any PEFT model with a single line of code:
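
For example, with the sequence-to-sequence auto class and the same hypothetical repository:

from peft import AutoPeftModelForSeq2SeqLM

# loads the base model and attaches the adapter in one call
model = AutoPeftModelForSeq2SeqLM.from_pretrained("my-username/bigscience-mt0-large-lora")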

Currently, supported auto classes are: AutoPeftModelForCausalLM, AutoPeftModelForSequenceClassification, AutoPeftModelForSeq2SeqLM, AutoPeftModelForTokenClassification, AutoPeftModelForQuestionAnswering and AutoPeftModelForFeatureExtraction. For other tasks (e.g. Whisper, StableDiffusion), you can load the model with:
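
A sketch with the generic PeftModel class; the Whisper base checkpoint and adapter repository below are hypothetical examples:

from transformers import WhisperForConditionalGeneration
from peft import PeftModel

# hypothetical base model and adapter repository
base_model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v2")
model = PeftModel.from_pretrained(base_model, "my-username/whisper-large-v2-lora")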

Next steps

Now that you've seen how to train a model with one of the PEFT methods, we encourage you to try out some of the other methods like prompt tuning. The steps are very similar to the ones shown in this quicktour: prepare a PeftConfig for a PEFT method, and use get_peft_model() to create a PeftModel from the configuration and base model. Then you can train it however you like!

Feel free to also take a look at the task guides if you're interested in training a model with a PEFT method for a specific task such as semantic segmentation, multilingual automatic speech recognition, DreamBooth, and token classification.
