Tuners

Each tuner (or PEFT method) has a configuration and model.
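
As a quick orientation before the per-tuner reference below, here is a minimal sketch of the usual workflow: pair a tuner's configuration with a base model via get_peft_model. The base model name and target_modules value are illustrative, not prescriptive.

>>> from transformers import AutoModelForCausalLM
>>> from peft import LoraConfig, get_peft_model

>>> base_model = AutoModelForCausalLM.from_pretrained("gpt2")  # illustrative base model
>>> config = LoraConfig(r=8, lora_alpha=16, target_modules=["c_attn"], task_type="CAUSAL_LM")
>>> peft_model = get_peft_model(base_model, config)
>>> peft_model.print_trainable_parameters()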

LoRA

For finetuning a model with LoRA.

class peft.LoraConfig

( peft_type: typing.Union[str, peft.utils.peft_types.PeftType] = None, auto_mapping: typing.Optional[dict] = None, base_model_name_or_path: str = None, revision: str = None, task_type: typing.Union[str, peft.utils.peft_types.TaskType] = None, inference_mode: bool = False, r: int = 8, target_modules: typing.Union[str, typing.List[str], NoneType] = None, lora_alpha: int = 8, lora_dropout: float = 0.0, fan_in_fan_out: bool = False, bias: str = 'none', modules_to_save: typing.Optional[typing.List[str]] = None, init_lora_weights: bool = True, layers_to_transform: typing.Union[int, typing.List[int], NoneType] = None, layers_pattern: typing.Union[str, typing.List[str], NoneType] = None, rank_pattern: typing.Optional[dict] = <factory>, alpha_pattern: typing.Optional[dict] = <factory> )

Parameters

  • r (int) — Lora attention dimension.

  • target_modules (Union[List[str],str]) — The names of the modules to apply Lora to.

  • lora_alpha (int) — The alpha parameter for Lora scaling.

  • lora_dropout (float) — The dropout probability for Lora layers.

  • fan_in_fan_out (bool) — Set this to True if the layer to replace stores weight like (fan_in, fan_out). For example, gpt-2 uses Conv1D which stores weights like (fan_in, fan_out) and hence this should be set to True.

  • bias (str) — Bias type for Lora. Can be ‘none’, ‘all’ or ‘lora_only’. If ‘all’ or ‘lora_only’, the corresponding biases will be updated during training. Be aware that this means that, even when disabling the adapters, the model will not produce the same output as the base model would have without adaptation.

  • modules_to_save (List[str]) — List of modules apart from LoRA layers to be set as trainable and saved in the final checkpoint.

  • layers_to_transform (Union[List[int],int]) — The layer indexes to transform. If this argument is specified, the LoRA transformations are applied only to the layer indexes in this list. If a single integer is passed, the transformation is applied to the layer at that index.

  • layers_pattern (str) — The layer pattern name, used only if layers_to_transform is different from None and if the layer pattern is not in the common layers pattern.

  • rank_pattern (dict) — The mapping from layer names or regular expressions to ranks which are different from the default rank specified by r.

  • alpha_pattern (dict) — The mapping from layer names or regular expressions to alphas which are different from the default alpha specified by lora_alpha.

This is the configuration class to store the configuration of a LoraModel.
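
As an illustration of the rank_pattern and alpha_pattern parameters, here is a hedged sketch of a LoraConfig that overrides the defaults for specific modules; the module names are illustrative and depend on the base model architecture.

>>> from peft import LoraConfig

>>> config = LoraConfig(
...     r=8,
...     lora_alpha=16,
...     target_modules=["q_proj", "v_proj"],
...     lora_dropout=0.05,
...     bias="none",
...     rank_pattern={"q_proj": 16},   # modules matching "q_proj" use rank 16 instead of the default r=8
...     alpha_pattern={"q_proj": 32},  # and alpha 32 instead of the default lora_alpha=16
... )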

class peft.LoraModel

( model, config, adapter_name ) → torch.nn.Module

Parameters

  • model (PreTrainedModel) — The model to be adapted.

  • config (LoraConfig) — The configuration of the Lora model.

  • adapter_name (str) — The name of the adapter, defaults to "default".

Returns

torch.nn.Module

The Lora model.

Creates a Low Rank Adapter (LoRA) model from a pretrained transformers model.

Example:


>>> from transformers import AutoModelForSeq2SeqLM
>>> from peft import LoraModel, LoraConfig

>>> config = LoraConfig(
...     task_type="SEQ_2_SEQ_LM",
...     r=8,
...     lora_alpha=32,
...     target_modules=["q", "v"],
...     lora_dropout=0.01,
... )

>>> model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")
>>> lora_model = LoraModel(model, config, "default")


>>> import torch
>>> import transformers
>>> from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training

>>> # `tokenizer` is assumed to be a tokenizer already loaded for the base model,
>>> # and `rank` the local process rank when training on multiple GPUs.
>>> target_modules = ["q_proj", "k_proj", "v_proj", "out_proj", "fc_in", "fc_out", "wte"]
>>> config = LoraConfig(
...     r=4, lora_alpha=16, target_modules=target_modules, lora_dropout=0.1, bias="none", task_type="CAUSAL_LM"
... )

>>> model = transformers.GPTJForCausalLM.from_pretrained(
...     "kakaobrain/kogpt",
...     revision="KoGPT6B-ryan1.5b-float16",  # or float32 version: revision=KoGPT6B-ryan1.5b
...     pad_token_id=tokenizer.eos_token_id,
...     use_cache=False,
...     device_map={"": rank},
...     torch_dtype=torch.float16,
...     load_in_8bit=True,
... )
>>> model = prepare_model_for_int8_training(model)
>>> lora_model = get_peft_model(model, config)

Attributes:

  • model (PreTrainedModel) — The model to be adapted.

  • peft_config (LoraConfig) — The configuration of the Lora model.

add_weighted_adapter

( adapters, weights, adapter_name, combination_type = 'svd', svd_rank = None, svd_clamp = None, svd_full_matrices = True, svd_driver = None )

Parameters

  • adapters (list) — List of adapter names to be merged.

  • weights (list) — List of weights for each adapter.

  • adapter_name (str) — Name of the new adapter.

  • combination_type (str) — Type of merging. Can be one of [svd, linear, cat]. When using the cat combination_type, be aware that the rank of the resulting adapter is equal to the sum of all adapters' ranks, so the mixed adapter may become too big and result in OOM errors.

  • svd_rank (int, optional) — Rank of the output adapter for svd. If None, the maximum rank of the merged adapters is used.

  • svd_clamp (float, optional) — A quantile threshold for clamping SVD decomposition output. If None is provided, do not perform clamping. Defaults to None.

  • svd_full_matrices (bool, optional) — Controls whether to compute the full or reduced SVD, and consequently, the shape of the returned tensors U and Vh. Defaults to True.

  • svd_driver (str, optional) — Name of the cuSOLVER method to be used. This keyword argument only works when merging on CUDA. Can be one of [None, gesvd, gesvdj, gesvda]. For more info please refer to torch.linalg.svd documentation. Defaults to None.

This method adds a new adapter by merging the given adapters with the given weights.

When using the cat combination_type, be aware that the rank of the resulting adapter is equal to the sum of all adapters' ranks, so the mixed adapter may become too big and result in OOM errors.
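
A minimal sketch of merging two LoRA adapters into a new one; the adapter names, weights, and target module are illustrative, and the call is forwarded from the PeftModel to the underlying LoraModel.

>>> from transformers import AutoModelForCausalLM
>>> from peft import LoraConfig, get_peft_model

>>> base = AutoModelForCausalLM.from_pretrained("gpt2")
>>> model = get_peft_model(base, LoraConfig(r=8, target_modules=["c_attn"], task_type="CAUSAL_LM"), adapter_name="adapter_a")
>>> model.add_adapter("adapter_b", LoraConfig(r=8, target_modules=["c_attn"], task_type="CAUSAL_LM"))

>>> # Combine both adapters into a new "merged" adapter with equal weights using SVD
>>> model.add_weighted_adapter(
...     adapters=["adapter_a", "adapter_b"],
...     weights=[0.5, 0.5],
...     adapter_name="merged",
...     combination_type="svd",
... )
>>> model.set_adapter("merged")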

delete_adapter

( adapter_name: str )

Parameters

  • adapter_name (str) — Name of the adapter to be deleted.

Deletes an existing adapter.

merge_and_unload

( progressbar: bool = False, safe_merge: bool = False )

Parameters

  • progressbar (bool) — Whether to show a progress bar indicating the unload and merge process.

  • safe_merge (bool) — Whether to activate the safe merging check, which looks for potential NaN values in the adapter weights before merging.

This method merges the LoRA layers into the base model. This is needed if someone wants to use the base model as a standalone model.

Example:


>>> from transformers import AutoModelForCausalLM
>>> from peft import PeftModel

>>> base_model = AutoModelForCausalLM.from_pretrained("tiiuae/falcon-40b")
>>> peft_model_id = "smangrul/falcon-40B-int4-peft-lora-sfttrainer-sample"
>>> model = PeftModel.from_pretrained(base_model, peft_model_id)
>>> merged_model = model.merge_and_unload()

unload

( )

Removes all the LoRA modules without merging and returns the original base model.
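
For example, assuming lora_model is the LoraModel created earlier (a minimal sketch):

>>> base_model = lora_model.unload()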

class peft.tuners.lora.LoraLayer

( in_features: int, out_features: int, **kwargs )

class peft.tuners.lora.Linear

( adapter_name: str, in_features: int, out_features: int, r: int = 0, lora_alpha: int = 1, lora_dropout: float = 0.0, fan_in_fan_out: bool = False, is_target_conv_1d_layer: bool = False, **kwargs )

get_delta_weight

( adapter )

Parameters

  • adapter (str) — The name of the adapter for which the delta weight should be computed.

Compute the delta weight for the given adapter.

merge

( safe_merge: bool = False )

Parameters

  • safe_merge (bool, optional) — If True, the merge operation will be performed in a copy of the original weights and check for NaNs before merging the weights. This is useful if you want to check if the merge operation will produce NaNs. Defaults to False.

Merge the active adapter weights into the base weights.

P-tuning

class peft.PromptEncoderConfig

( peft_type: typing.Union[str, peft.utils.peft_types.PeftType] = None, auto_mapping: typing.Optional[dict] = None, base_model_name_or_path: str = None, revision: str = None, task_type: typing.Union[str, peft.utils.peft_types.TaskType] = None, inference_mode: bool = False, num_virtual_tokens: int = None, token_dim: int = None, num_transformer_submodules: typing.Optional[int] = None, num_attention_heads: typing.Optional[int] = None, num_layers: typing.Optional[int] = None, encoder_reparameterization_type: typing.Union[str, peft.tuners.p_tuning.config.PromptEncoderReparameterizationType] = <PromptEncoderReparameterizationType.MLP: 'MLP'>, encoder_hidden_size: int = None, encoder_num_layers: int = 2, encoder_dropout: float = 0.0 )

Parameters

  • encoder_reparameterization_type (Union[PromptEncoderReparameterizationType, str]) — The type of reparameterization to use.

  • encoder_hidden_size (int) — The hidden size of the prompt encoder.

  • encoder_num_layers (int) — The number of layers of the prompt encoder.

  • encoder_dropout (float) — The dropout probability of the prompt encoder.

This is the configuration class to store the configuration of a PromptEncoder.

class peft.PromptEncoder

( config )

Parameters

  • config (PromptEncoderConfig) — The configuration of the prompt encoder.

The prompt encoder network that is used to generate the virtual token embeddings for p-tuning.

Example:


>>> from peft import PromptEncoder, PromptEncoderConfig

>>> config = PromptEncoderConfig(
...     peft_type="P_TUNING",
...     task_type="SEQ_2_SEQ_LM",
...     num_virtual_tokens=20,
...     token_dim=768,
...     num_transformer_submodules=1,
...     num_attention_heads=12,
...     num_layers=12,
...     encoder_reparameterization_type="MLP",
...     encoder_hidden_size=768,
... )

>>> prompt_encoder = PromptEncoder(config)

Attributes:

  • embedding (torch.nn.Embedding) — The embedding layer of the prompt encoder.

  • mlp_head (torch.nn.Sequential) — The MLP head of the prompt encoder if inference_mode=False.

  • lstm_head (torch.nn.LSTM) — The LSTM head of the prompt encoder if inference_mode=False and encoder_reparameterization_type="LSTM".

  • token_dim (int) — The hidden embedding dimension of the base transformer model.

  • input_size (int) — The input size of the prompt encoder.

  • output_size (int) — The output size of the prompt encoder.

  • hidden_size (int) — The hidden size of the prompt encoder.

  • total_virtual_tokens (int): The total number of virtual tokens of the prompt encoder.

  • encoder_type (Union[PromptEncoderReparameterizationType, str]): The encoder type of the prompt encoder.

Input shape: (batch_size, total_virtual_tokens)

Output shape: (batch_size, total_virtual_tokens, token_dim)
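
Continuing the PromptEncoder example above, a hedged sketch of these shapes (with num_virtual_tokens=20, num_transformer_submodules=1, and token_dim=768):

>>> import torch

>>> indices = torch.arange(config.num_virtual_tokens).unsqueeze(0)  # (batch_size=1, total_virtual_tokens)
>>> prompt_embeddings = prompt_encoder(indices)
>>> prompt_embeddings.shape  # torch.Size([1, 20, 768]) == (batch_size, total_virtual_tokens, token_dim)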

Prefix tuning

class peft.PrefixTuningConfig

( peft_type: typing.Union[str, peft.utils.peft_types.PeftType] = None, auto_mapping: typing.Optional[dict] = None, base_model_name_or_path: str = None, revision: str = None, task_type: typing.Union[str, peft.utils.peft_types.TaskType] = None, inference_mode: bool = False, num_virtual_tokens: int = None, token_dim: int = None, num_transformer_submodules: typing.Optional[int] = None, num_attention_heads: typing.Optional[int] = None, num_layers: typing.Optional[int] = None, encoder_hidden_size: int = None, prefix_projection: bool = False )

Parameters

  • encoder_hidden_size (int) — The hidden size of the prompt encoder.

  • prefix_projection (bool) — Whether to project the prefix embeddings.

This is the configuration class to store the configuration of a PrefixEncoder.

class peft.PrefixEncoder

( config )

Parameters

  • config (PrefixTuningConfig) — The configuration of the prefix encoder.

The torch.nn model to encode the prefix.

Example:


>>> from peft import PrefixEncoder, PrefixTuningConfig

>>> config = PrefixTuningConfig(
...     peft_type="PREFIX_TUNING",
...     task_type="SEQ_2_SEQ_LM",
...     num_virtual_tokens=20,
...     token_dim=768,
...     num_transformer_submodules=1,
...     num_attention_heads=12,
...     num_layers=12,
...     encoder_hidden_size=768,
... )
>>> prefix_encoder = PrefixEncoder(config)

Attributes:

  • embedding (torch.nn.Embedding) — The embedding layer of the prefix encoder.

  • transform (torch.nn.Sequential) — The two-layer MLP to transform the prefix embeddings if prefix_projection is True.

  • prefix_projection (bool) — Whether to project the prefix embeddings.

Input shape: (batch_size, num_virtual_tokens)

Output shape: (batch_size, num_virtual_tokens, 2*layers*hidden)
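
Continuing the PrefixEncoder example above (prefix_projection defaults to False, so the encoder is a single embedding layer), a hedged sketch of these shapes:

>>> import torch

>>> prefix = torch.arange(config.num_virtual_tokens).unsqueeze(0)  # (batch_size=1, num_virtual_tokens)
>>> past_key_values = prefix_encoder(prefix)
>>> past_key_values.shape  # torch.Size([1, 20, 18432]) == (batch_size, num_virtual_tokens, 2 * num_layers * token_dim)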

Prompt tuning

class peft.PromptTuningConfig

( peft_type: typing.Union[str, peft.utils.peft_types.PeftType] = None, auto_mapping: typing.Optional[dict] = None, base_model_name_or_path: str = None, revision: str = None, task_type: typing.Union[str, peft.utils.peft_types.TaskType] = None, inference_mode: bool = False, num_virtual_tokens: int = None, token_dim: int = None, num_transformer_submodules: typing.Optional[int] = None, num_attention_heads: typing.Optional[int] = None, num_layers: typing.Optional[int] = None, prompt_tuning_init: typing.Union[peft.tuners.prompt_tuning.config.PromptTuningInit, str] = <PromptTuningInit.RANDOM: 'RANDOM'>, prompt_tuning_init_text: typing.Optional[str] = None, tokenizer_name_or_path: typing.Optional[str] = None )

Parameters

  • prompt_tuning_init (Union[PromptTuningInit, str]) — The initialization of the prompt embedding.

  • prompt_tuning_init_text (str, optional) — The text to initialize the prompt embedding. Only used if prompt_tuning_init is TEXT.

  • tokenizer_name_or_path (str, optional) — The name or path of the tokenizer. Only used if prompt_tuning_init is TEXT.

This is the configuration class to store the configuration of a PromptEmbedding.

class peft.PromptEmbedding

( config, word_embeddings )

Parameters

  • config (PromptTuningConfig) — The configuration of the prompt embedding.

  • word_embeddings (torch.nn.Module) — The word embeddings of the base transformer model.

The model to encode virtual tokens into prompt embeddings.

Attributes:

  • embedding (torch.nn.Embedding) — The embedding layer of the prompt embedding.

Example:


>>> from peft import PromptEmbedding, PromptTuningConfig

>>> config = PromptTuningConfig(
...     peft_type="PROMPT_TUNING",
...     task_type="SEQ_2_SEQ_LM",
...     num_virtual_tokens=20,
...     token_dim=768,
...     num_transformer_submodules=1,
...     num_attention_heads=12,
...     num_layers=12,
...     prompt_tuning_init="TEXT",
...     prompt_tuning_init_text="Predict if sentiment of this review is positive, negative or neutral",
...     tokenizer_name_or_path="t5-base",
... )

>>> # t5_model.shared is the word embeddings of the base model
>>> prompt_embedding = PromptEmbedding(config, t5_model.shared)

Input Shape: (batch_size, total_virtual_tokens)

Output Shape: (batch_size, total_virtual_tokens, token_dim)
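
Continuing the PromptEmbedding example above, a hedged sketch of these shapes:

>>> import torch

>>> indices = torch.arange(config.num_virtual_tokens).unsqueeze(0)  # (batch_size=1, total_virtual_tokens)
>>> prompt_embeddings = prompt_embedding(indices)
>>> prompt_embeddings.shape  # torch.Size([1, 20, 768]) == (batch_size, total_virtual_tokens, token_dim)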

IA3

class peft.IA3Config

( peft_type: typing.Union[str, peft.utils.peft_types.PeftType] = None, auto_mapping: typing.Optional[dict] = None, base_model_name_or_path: str = None, revision: str = None, task_type: typing.Union[str, peft.utils.peft_types.TaskType] = None, inference_mode: bool = False, target_modules: typing.Union[str, typing.List[str], NoneType] = None, feedforward_modules: typing.Union[str, typing.List[str], NoneType] = None, fan_in_fan_out: bool = False, modules_to_save: typing.Optional[typing.List[str]] = None, init_ia3_weights: bool = True )

Parameters

  • target_modules (Union[List[str],str]) — The names of the modules to apply (IA)^3 to.

  • feedforward_modules (Union[List[str],str]) — The names of the modules to be treated as feedforward modules, as in the original paper.

  • fan_in_fan_out (bool) — Set this to True if the layer to replace stores weight like (fan_in, fan_out). For example, gpt-2 uses Conv1D which stores weights like (fan_in, fan_out) and hence this should be set to True.

  • modules_to_save (List[str]) — List of modules apart from (IA)^3 layers to be set as trainable and saved in the final checkpoint.

  • init_ia3_weights (bool) — Whether to initialize the vectors in the (IA)^3 layers, defaults to True.

This is the configuration class to store the configuration of an IA3Model.

class peft.IA3Model

( model, config, adapter_name ) → torch.nn.Module

Parameters

  • model (PreTrainedModel) — The model to be adapted.

  • config (IA3Config) — The configuration of the (IA)^3 model.

  • adapter_name (str) — The name of the adapter, defaults to "default".

Returns

torch.nn.Module

The (IA)^3 model.

Creates an Infused Adapter by Inhibiting and Amplifying Inner Activations ((IA)^3) model from a pretrained transformers model. The method is described in detail in https://arxiv.org/abs/2205.05638.

Example:


>>> from transformers import AutoModelForSeq2SeqLM
>>> from peft import IA3Model, IA3Config

>>> config = IA3Config(
...     peft_type="IA3",
...     task_type="SEQ_2_SEQ_LM",
...     target_modules=["k", "v", "w0"],
...     feedforward_modules=["w0"],
... )

>>> model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")
>>> ia3_model = IA3Model(model, config, "default")

Attributes:

  • peft_config (IA3Config) — The configuration of the (IA)^3 model.

merge_and_unload

( safe_merge: bool = False )

Parameters

  • safe_merge (bool, optional, defaults to False) — If True, the merge operation will be performed in a copy of the original weights and check for NaNs before merging the weights. This is useful if you want to check if the merge operation will produce NaNs. Defaults to False.

This method merges the (IA)^3 layers into the base model. This is needed if someone wants to use the base model as a standalone model.
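
Example (a hedged sketch mirroring the LoRA example above; the adapter repository id is a placeholder, not a real checkpoint):

>>> from transformers import AutoModelForSeq2SeqLM
>>> from peft import PeftModel

>>> base_model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")
>>> model = PeftModel.from_pretrained(base_model, "your-namespace/your-ia3-adapter")  # placeholder adapter id
>>> merged_model = model.merge_and_unload()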
