Image classification using LoRA

This guide demonstrates how to use LoRA, a low-rank approximation technique, to fine-tune an image classification model. By using LoRA from 🤗 PEFT, we can reduce the number of trainable parameters in the model to only 0.77% of the original.

LoRA achieves this reduction by adding low-rank "update matrices" to specific blocks of the model, such as the attention blocks. During fine-tuning, only these matrices are trained, while the original model parameters are left unchanged. At inference time, the update matrices are merged with the original model parameters to produce the final classification result.
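As a rough illustration of the idea (a NumPy sketch, not PEFT's implementation), the snippet below builds one frozen weight matrix and two small matrices whose product forms the update. The layer size, the rank, the alpha / r scaling, and all variable names are assumptions chosen for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes, for illustration only.
d_in, d_out, r, alpha = 768, 768, 16, 16

W = rng.normal(size=(d_out, d_in))      # frozen pre-trained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable low-rank factor
B = rng.normal(size=(d_out, r)) * 0.01  # trainable low-rank factor (nonzero here to mimic a trained adapter)
scale = alpha / r

x = rng.normal(size=(d_in,))

# During fine-tuning: frozen path plus the low-rank update path.
y_adapter = W @ x + scale * (B @ (A @ x))

# At inference: fold the update into the weight once, then use a single matmul.
W_merged = W + scale * (B @ A)
y_merged = W_merged @ x

print(np.allclose(y_adapter, y_merged))
print(A.size + B.size, "trainable values vs", W.size, "frozen values")
```

Both paths give the same output, which is why merging adds no extra cost at inference time, while the two small matrices hold far fewer values than the full weight.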

For more information on LoRA, please refer to the original LoRA paper.

Install dependencies

Install the libraries required for model training:

!pip install transformers accelerate evaluate datasets peft -q

Check the versions of all required libraries to make sure you are up to date:

import transformers
import accelerate
import peft

print(f"Transformers version: {transformers.__version__}")
print(f"Accelerate version: {accelerate.__version__}")
print(f"PEFT version: {peft.__version__}")
"Transformers version: 4.27.4"
"Accelerate version: 0.18.0"
"PEFT version: 0.2.0"

Authenticate to share your model

To share the fine-tuned model at the end of the training with the community, authenticate using your 🤗 token. You can obtain your token from your account settings.

Select a model checkpoint to fine-tune

Choose a model checkpoint from any of the model architectures supported for image classification. When in doubt, refer to the image classification task guide in the 🤗 Transformers documentation.

Load a dataset

To keep this example's runtime short, let's only load the first 5000 instances from the training set of the Food-101 dataset:

Dataset preparation

To prepare the dataset for training and evaluation, create label2id and id2label dictionaries. These will come in handy when performing inference and for metadata information:
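A minimal sketch of the two dictionaries, assuming the class names are available as a list. The three names below are a shortened, hypothetical list; in practice the names would come from the dataset's label feature.

```python
# Hypothetical, shortened list of class names (assumption for illustration).
labels = ["apple_pie", "baby_back_ribs", "beignets"]

# Map each class name to an integer id, and each id back to its name.
label2id = {label: i for i, label in enumerate(labels)}
id2label = {i: label for i, label in enumerate(labels)}

print(label2id["beignets"])
print(id2label[2])
```

The model works with integer ids, so id2label is what turns a predicted index back into a readable class name at inference time.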

Next, load the image processor of the model you're fine-tuning:

The image_processor contains useful information on which size the training and evaluation images should be resized to, as well as values that should be used to normalize the pixel values. Using the image_processor, prepare transformation functions for the datasets. These functions will include data augmentation and pixel scaling:
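A minimal NumPy sketch of what such transformation functions might look like. The mean and standard deviation values, the choice of a random horizontal flip as the augmentation, and the function names are all assumptions; in a real pipeline the statistics would be read from the image processor, images would also be resized, and a library such as torchvision would typically do this work.

```python
import numpy as np

# Assumed normalization statistics (per RGB channel), for illustration only.
MEAN = np.array([0.5, 0.5, 0.5])
STD = np.array([0.5, 0.5, 0.5])


def normalize(image_uint8):
    """Scale an HxWx3 uint8 image to [0, 1], then standardize each channel."""
    scaled = image_uint8.astype(np.float32) / 255.0
    return (scaled - MEAN) / STD


def train_transform(image_uint8, rng):
    """Training transform: random horizontal flip, then normalization."""
    if rng.random() < 0.5:
        image_uint8 = image_uint8[:, ::-1, :]  # flip along the width axis
    return normalize(image_uint8)


def val_transform(image_uint8):
    """Validation transform: normalization only, no augmentation."""
    return normalize(image_uint8)


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
out = train_transform(image, rng)
print(out.shape, float(out.min()), float(out.max()))
```

Augmentation is applied only to the training split so that validation scores are measured on unmodified images.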

Split the dataset into training and validation sets:

Finally, set the transformation functions for the datasets accordingly:

Load and prepare a model

Before loading the model, let's define a helper function to check the total number of parameters a model has, as well as how many of them are trainable.
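One possible shape for such a helper is sketched below. The function names are hypothetical, and the code assumes a PyTorch-style model whose parameters() method yields tensors exposing numel() and requires_grad.

```python
def count_parameters(model):
    """Return (trainable, total) parameter counts for a PyTorch-style model.

    Assumes model.parameters() yields tensors that expose numel() and
    requires_grad, as torch.nn.Module parameters do.
    """
    trainable = 0
    total = 0
    for param in model.parameters():
        n = param.numel()
        total += n
        if param.requires_grad:
            trainable += n
    return trainable, total


def print_trainable_parameters(model):
    """Print the counts and the trainable share as a percentage."""
    trainable, total = count_parameters(model)
    print(
        f"trainable params: {trainable} || all params: {total} "
        f"|| trainable%: {100 * trainable / total:.2f}"
    )
```

Calling the helper once on the base model and once on the wrapped model makes the reduction in trainable parameters easy to compare.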

It's important to initialize the original model correctly, as it will be used as the base for the PeftModel you'll actually fine-tune. Specify label2id and id2label so that AutoModelForImageClassification can append a classification head to the underlying model, adapted for this dataset. You should see the following output:

Before creating a PeftModel, you can check the number of trainable parameters in the original model:

Next, use get_peft_model to wrap the base model so that "update" matrices are added to the respective places.

Let's unpack what's going on here. To use LoRA, you need to specify the target modules in LoraConfig so that get_peft_model() knows which modules inside our model need to be amended with LoRA matrices. In this example, we're only interested in targeting the query and value matrices of the attention blocks of the base model. Since the parameters corresponding to these matrices are named "query" and "value", respectively, we specify them accordingly in the target_modules argument of LoraConfig.

We also specify modules_to_save. After wrapping the base model with get_peft_model() along with the config, we get a new model where only the LoRA parameters (the so-called "update matrices") are trainable, while the pre-trained parameters are kept frozen. However, we want the classifier parameters to be trained too when fine-tuning the base model on our custom dataset, and listing them in modules_to_save ensures this. It also ensures that these modules are serialized alongside the LoRA trainable parameters when using utilities like save_pretrained() and push_to_hub().
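A configuration along these lines might look as follows. This is a sketch rather than the guide's exact settings: the numeric values are illustrative assumptions, and the module name "classifier" assumes that the classification head of the chosen model carries that name.

```python
from peft import LoraConfig

# Numeric values are illustrative assumptions, not settings taken from this guide.
config = LoraConfig(
    r=16,                               # rank of the update matrices
    lora_alpha=16,                      # scaling factor for the update
    target_modules=["query", "value"],  # attention projections that receive LoRA matrices
    lora_dropout=0.1,                   # dropout applied on the LoRA path
    bias="none",                        # keep all bias parameters frozen
    modules_to_save=["classifier"],     # assumed head name: trained and saved with the adapter
)
```

Passing the base model and this config to get_peft_model() returns the wrapped model used for fine-tuning.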

Here's what the other parameters mean:

  • r: The rank (inner dimension) of the LoRA update matrices.

  • lora_alpha: The scaling factor applied to the LoRA update.

  • bias: Specifies whether the bias parameters should be trained. "none" means that none of the bias parameters will be trained.

r sets the number of trainable parameters that LoRA adds, while the scaling factor (alpha) controls the strength of the update without changing the parameter count. Together they give you the flexibility to balance a trade-off between end performance and compute efficiency.

By looking at the number of trainable parameters, you can see how many parameters we're actually training. Since the goal is to achieve parameter-efficient fine-tuning, you should expect to see fewer trainable parameters in the lora_model in comparison to the original model, which is indeed the case here.
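The count can be estimated by hand. The sketch below assumes a ViT-Base-sized encoder (12 layers, hidden size 768), LoRA applied to two square projections per layer, rank 16, and a fully trained linear head for the 101 Food-101 classes. The function name and these settings are assumptions for illustration.

```python
def lora_trainable_params(hidden_size, num_layers, rank, num_classes,
                          targets_per_layer=2):
    """Count trained parameters: LoRA factors plus a full linear head."""
    # Each adapted projection gets A (rank x hidden) and B (hidden x rank).
    per_projection = 2 * hidden_size * rank
    lora = num_layers * targets_per_layer * per_projection
    # Linear head: weight (num_classes x hidden) plus one bias per class.
    head = hidden_size * num_classes + num_classes
    return lora + head


# Assumed setup: ViT-Base-like encoder, rank 16, 101 classes.
n = lora_trainable_params(hidden_size=768, num_layers=12, rank=16, num_classes=101)
print(n)  # 667493

# Against roughly 86.5 million total parameters (an assumed, approximate figure).
print(f"{100 * n / 86_500_000:.2f}%")  # 0.77%

# Halving the rank halves only the LoRA part; the head is unchanged.
print(lora_trainable_params(768, 12, 8, 101))  # 372581
```

Under these assumptions the result is about 0.77% of the model, which lines up with the reduction quoted at the start of this guide. The scaling factor does not appear in the formula because it adds no parameters.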

Define training arguments

For model fine-tuning, use Trainer. It accepts several arguments which you can wrap using TrainingArguments.

Compared to non-PEFT methods, you can use a larger batch size since there are fewer parameters to train. You can also set a larger learning rate than the usual one for full fine-tuning (1e-5, for example).

This can potentially also reduce the need to conduct expensive hyperparameter tuning experiments.

Prepare evaluation metric

The compute_metrics function takes a named tuple as input. It has two fields: predictions, which are the logits of the model as NumPy arrays, and label_ids, which are the ground-truth labels as NumPy arrays.
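A minimal sketch of such a function, computing top-1 accuracy directly with NumPy. The choice of accuracy as the metric and the EvalPred stand-in are assumptions for this example; a metrics library could be used instead.

```python
from collections import namedtuple

import numpy as np


def compute_metrics(eval_pred):
    """Compute top-1 accuracy from logits and integer labels."""
    predictions = np.argmax(eval_pred.predictions, axis=1)
    accuracy = float(np.mean(predictions == eval_pred.label_ids))
    return {"accuracy": accuracy}


# Hypothetical stand-in for the named tuple that Trainer passes in.
EvalPred = namedtuple("EvalPred", ["predictions", "label_ids"])

logits = np.array([
    [0.1, 2.0, -1.0],
    [1.5, 0.2, 0.3],
    [0.0, 0.1, 3.0],
    [2.0, 1.0, 0.0],
])
labels = np.array([1, 0, 2, 1])

result = compute_metrics(EvalPred(logits, labels))
print(result)  # {'accuracy': 0.75}
```

Three of the four predicted classes match their labels, which gives the 0.75 accuracy.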

Define collation function

A collation function is used by Trainer to gather a batch of training and evaluation examples and prepare them in a format that is acceptable by the underlying model.
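The batching logic can be sketched as below. This is a NumPy stand-in that shows the shapes involved: with Trainer, the same function would typically use torch.stack and torch.tensor in place of np.stack and np.array. The dictionary keys are assumptions about how the examples and the model input are named.

```python
import numpy as np


def collate_fn(examples):
    """Gather a list of per-example dicts into a single batch dict."""
    # Stack C x H x W arrays into one N x C x H x W array.
    pixel_values = np.stack([example["pixel_values"] for example in examples])
    # Collect the integer labels into one 1-D array.
    labels = np.array([example["label"] for example in examples])
    return {"pixel_values": pixel_values, "labels": labels}


# Four dummy examples with assumed 3 x 224 x 224 inputs.
examples = [
    {"pixel_values": np.zeros((3, 224, 224), dtype=np.float32), "label": i}
    for i in range(4)
]
batch = collate_fn(examples)
print(batch["pixel_values"].shape, batch["labels"].tolist())
```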

Train and evaluate

Bring everything together - model, training arguments, data, collation function, etc. Then, start the training!

In just a few minutes, the fine-tuned model shows 96% validation accuracy even on this small subset of the training dataset.

Share your model and run inference

Once the fine-tuning is done, share the LoRA parameters with the community like so:

When calling push_to_hub on the lora_model, only the LoRA parameters along with any modules specified in modules_to_save are saved. Take a look at the trained LoRA parameters: you'll see that they take up only 2.6 MB! This greatly helps with portability, especially when fine-tuning a very large model (such as BLOOM).

Next, let's see how to load the LoRA updated parameters along with our base model for inference. When you wrap a base model with PeftModel, modifications are done in-place. To mitigate any concerns that might stem from in-place modifications, initialize the base model just like you did earlier and construct the inference model.

Let's now fetch an example image for inference.

image of beignets

First, instantiate an image_processor from the underlying model repo.

Then, prepare the example for inference.

Finally, run inference!
