Attention Processor
An attention processor is a class for applying different types of attention mechanisms.
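The idea of a swappable processor can be sketched in plain Python. This is only a toy illustration of the pattern, not the library's implementation: `ToyAttention`, `IdentityProcessor`, and `MeanProcessor` are hypothetical names invented for this example, and the "attention" they compute is deliberately trivial.

```python
class ToyAttention:
    """Hypothetical stand-in for an attention layer that delegates to a processor."""

    def __init__(self, processor):
        self.processor = processor

    def set_processor(self, processor):
        # Swap the attention strategy without modifying the layer itself.
        self.processor = processor

    def __call__(self, hidden_states):
        # The layer passes itself along so a processor could read its settings.
        return self.processor(self, hidden_states)


class IdentityProcessor:
    """Toy processor: returns the input unchanged."""

    def __call__(self, attn, hidden_states):
        return list(hidden_states)


class MeanProcessor:
    """Toy 'uniform attention': every position becomes the mean of all positions."""

    def __call__(self, attn, hidden_states):
        mean = sum(hidden_states) / len(hidden_states)
        return [mean for _ in hidden_states]


layer = ToyAttention(IdentityProcessor())
before = layer([1.0, 2.0, 3.0])

layer.set_processor(MeanProcessor())
after = layer([1.0, 2.0, 3.0])
```

The layer's calling code stays the same while the computation behind it changes, which is the role the processor classes listed below play.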
AttnProcessor
class diffusers.models.attention_processor.AttnProcessor
( )
Default processor for performing attention-related computations.
AttnProcessor2_0
class diffusers.models.attention_processor.AttnProcessor2_0
( )
Processor for implementing scaled dot-product attention (enabled by default if you're using PyTorch 2.0).
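For readers unfamiliar with the term, scaled dot-product attention computes softmax(QK^T / sqrt(d)) V. The following is a minimal pure-Python sketch of that formula for a single head, written with nested lists for clarity. It is not how the processor is implemented (the real one relies on PyTorch's fused kernel), and the helper names are assumptions made for this example.

```python
import math


def softmax(row):
    # Subtract the maximum for numerical stability before exponentiating.
    top = max(row)
    exps = [math.exp(x - top) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(query, key, value):
    """query: [n_q][d], key: [n_k][d], value: [n_k][d_v]; returns [n_q][d_v]."""
    scale = 1.0 / math.sqrt(len(query[0]))
    output = []
    for q in query:
        # Similarity of this query to every key, scaled by 1/sqrt(d).
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in key]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        output.append([
            sum(w * v[j] for w, v in zip(weights, value))
            for j in range(len(value[0]))
        ])
    return output


q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[10.0, 0.0], [0.0, 10.0]]
out = scaled_dot_product_attention(q, k, v)
```

Each output row is a convex combination of the value rows, weighted toward the value whose key best matches the query.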
LoRAAttnProcessor
class diffusers.models.attention_processor.LoRAAttnProcessor
( hidden_size, cross_attention_dim = None, rank = 4, network_alpha = None, **kwargs )
Parameters
hidden_size (int, optional): The hidden size of the attention layer.
cross_attention_dim (int, optional): The number of channels in the encoder_hidden_states.
rank (int, defaults to 4): The dimension of the LoRA update matrices.
network_alpha (int, optional): Equivalent to alpha, but its usage is specific to Kohya (A1111) style LoRAs.
Processor for implementing the LoRA attention mechanism.
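To make the roles of `rank` and `network_alpha` concrete, here is a small pure-Python sketch of a LoRA-style linear update: the frozen projection is augmented by a low-rank product `up @ down`. The function names are hypothetical, and the scaling rule `network_alpha / rank` is an assumption based on the common LoRA convention rather than something stated on this page.

```python
def matvec(matrix, vector):
    # Multiply a [rows][cols] matrix by a length-cols vector.
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def lora_linear(x, weight, down, up, rank, network_alpha=None):
    """Frozen projection plus a low-rank update.

    weight: [out][in], down: [rank][in], up: [out][rank].
    """
    base = matvec(weight, x)
    update = matvec(up, matvec(down, x))
    # Assumed convention: scale the update by alpha / rank when alpha is given.
    scale = 1.0 if network_alpha is None else network_alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


weight = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
down = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # rank 2, projects 3 -> 2
up = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]   # projects 2 -> 3
x = [1.0, 2.0, 3.0]

unscaled = lora_linear(x, weight, down, up, rank=2)
scaled = lora_linear(x, weight, down, up, rank=2, network_alpha=1.0)

# With an all-zero `up` matrix the layer reproduces the frozen projection.
zero_up = [[0.0, 0.0] for _ in range(3)]
untouched = lora_linear(x, weight, down, zero_up, rank=2)
```

The update matrices hold `rank * (in + out)` values instead of `in * out`, which is why a small `rank` keeps the number of trained parameters low.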
LoRAAttnProcessor2_0
class diffusers.models.attention_processor.LoRAAttnProcessor2_0
( hidden_size, cross_attention_dim = None, rank = 4, network_alpha = None, **kwargs )
Parameters
hidden_size (int): The hidden size of the attention layer.
cross_attention_dim (int, optional): The number of channels in the encoder_hidden_states.
rank (int, defaults to 4): The dimension of the LoRA update matrices.
network_alpha (int, optional): Equivalent to alpha, but its usage is specific to Kohya (A1111) style LoRAs.
Processor for implementing the LoRA attention mechanism using PyTorch 2.0's memory-efficient scaled dot-product attention.
CustomDiffusionAttnProcessor
class diffusers.models.attention_processor.CustomDiffusionAttnProcessor
( train_kv = True, train_q_out = True, hidden_size = None, cross_attention_dim = None, out_bias = True, dropout = 0.0 )
Parameters
train_kv (bool, defaults to True): Whether to newly train the key and value matrices corresponding to the text features.
train_q_out (bool, defaults to True): Whether to newly train query matrices corresponding to the latent image features.
hidden_size (int, optional, defaults to None): The hidden size of the attention layer.
cross_attention_dim (int, optional, defaults to None): The number of channels in the encoder_hidden_states.
out_bias (bool, defaults to True): Whether to include the bias parameter in train_q_out.
dropout (float, optional, defaults to 0.0): The dropout probability to use.
Processor for implementing attention for the Custom Diffusion method.
AttnAddedKVProcessor
class diffusers.models.attention_processor.AttnAddedKVProcessor
( )
Processor for performing attention-related computations with extra learnable key and value matrices for the text encoder.
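One way to picture "extra learnable key and value matrices for the text encoder" is that the text states get their own key/value projections, and the results are concatenated with the keys and values computed from the image tokens before attention runs. The sketch below illustrates that reading in pure Python. It is an assumption about the mechanism, and `add_k_proj`, `add_v_proj`, `project`, and `attend` are illustrative names for this example only.

```python
import math


def project(matrix, vectors):
    # Apply a [out][in] projection to each vector in a list.
    return [[sum(m * x for m, x in zip(row, vec)) for row in matrix] for vec in vectors]


def attend(query, keys, values):
    """Single-query softmax attention; returns (weights, mixed value)."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    mixed = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
    return weights, mixed


# Keys and values already computed from two image tokens.
image_keys = [[1.0, 0.0], [0.0, 1.0]]
image_values = [[1.0, 0.0], [0.0, 1.0]]

# One text token with its own (hypothetical) learnable projections.
text_states = [[2.0, 2.0]]
add_k_proj = [[0.5, 0.0], [0.0, 0.5]]
add_v_proj = [[1.0, 0.0], [0.0, 1.0]]
text_keys = project(add_k_proj, text_states)
text_values = project(add_v_proj, text_states)

# The query now attends over text and image positions together.
keys = text_keys + image_keys
values = text_values + image_values
weights, mixed = attend([1.0, 0.0], keys, values)
```

The query ends up with one attention weight per text position plus one per image position, so text information can flow into the image features in the same attention call.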
AttnAddedKVProcessor2_0
class diffusers.models.attention_processor.AttnAddedKVProcessor2_0
( )
Processor for performing scaled dot-product attention (enabled by default if you're using PyTorch 2.0), with extra learnable key and value matrices for the text encoder.
LoRAAttnAddedKVProcessor
class diffusers.models.attention_processor.LoRAAttnAddedKVProcessor
( hidden_size, cross_attention_dim = None, rank = 4, network_alpha = None )
Parameters
hidden_size (int, optional): The hidden size of the attention layer.
cross_attention_dim (int, optional, defaults to None): The number of channels in the encoder_hidden_states.
rank (int, defaults to 4): The dimension of the LoRA update matrices.
Processor for implementing the LoRA attention mechanism with extra learnable key and value matrices for the text encoder.
XFormersAttnProcessor
class diffusers.models.attention_processor.XFormersAttnProcessor
( attention_op: typing.Optional[typing.Callable] = None )
Parameters
attention_op (Callable, optional, defaults to None): The base operator to use as the attention operator. It is recommended to set this to None and let xFormers choose the best operator.
Processor for implementing memory-efficient attention using xFormers.
LoRAXFormersAttnProcessor
class diffusers.models.attention_processor.LoRAXFormersAttnProcessor
( hidden_size, cross_attention_dim, rank = 4, attention_op: typing.Optional[typing.Callable] = None, network_alpha = None, **kwargs )
Parameters
hidden_size (int, optional): The hidden size of the attention layer.
cross_attention_dim (int, optional): The number of channels in the encoder_hidden_states.
rank (int, defaults to 4): The dimension of the LoRA update matrices.
attention_op (Callable, optional, defaults to None): The base operator to use as the attention operator. It is recommended to set this to None and let xFormers choose the best operator.
network_alpha (int, optional): Equivalent to alpha, but its usage is specific to Kohya (A1111) style LoRAs.
Processor for implementing the LoRA attention mechanism with memory-efficient attention using xFormers.
CustomDiffusionXFormersAttnProcessor
class diffusers.models.attention_processor.CustomDiffusionXFormersAttnProcessor
( train_kv = True, train_q_out = False, hidden_size = None, cross_attention_dim = None, out_bias = True, dropout = 0.0, attention_op: typing.Optional[typing.Callable] = None )
Parameters
train_kv (bool, defaults to True): Whether to newly train the key and value matrices corresponding to the text features.
train_q_out (bool, defaults to False): Whether to newly train query matrices corresponding to the latent image features.
hidden_size (int, optional, defaults to None): The hidden size of the attention layer.
cross_attention_dim (int, optional, defaults to None): The number of channels in the encoder_hidden_states.
out_bias (bool, defaults to True): Whether to include the bias parameter in train_q_out.
dropout (float, optional, defaults to 0.0): The dropout probability to use.
attention_op (Callable, optional, defaults to None): The base operator to use as the attention operator. It is recommended to set this to None and let xFormers choose the best operator.
Processor for implementing memory-efficient attention using xFormers for the Custom Diffusion method.
SlicedAttnProcessor
class diffusers.models.attention_processor.SlicedAttnProcessor
( slice_size )
Parameters
slice_size (int, optional): The number of steps to compute attention. Uses as many slices as attention_head_dim // slice_size, and attention_head_dim must be a multiple of the slice_size.
Processor for implementing sliced attention.
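The trade-off behind slicing can be illustrated with a small pure-Python sketch: instead of computing attention for every item at once, the work is split into chunks of `slice_size` items that are processed one after another, giving the same result in more steps. Which dimension the library slices over, and the helper names used here, are assumptions for illustration.

```python
import math


def attention(query, key, value):
    """Plain softmax attention for one item: query [n_q][d], key/value [n_k][*]."""
    scale = 1.0 / math.sqrt(len(query[0]))
    out = []
    for q in query:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in key]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        out.append([
            sum(e / total * v[j] for e, v in zip(exps, value))
            for j in range(len(value[0]))
        ])
    return out


def sliced_attention(batch, slice_size):
    """Process (query, key, value) items in chunks of slice_size at a time."""
    if len(batch) % slice_size != 0:
        raise ValueError("batch length must be a multiple of slice_size")
    outputs, num_slices = [], 0
    for start in range(0, len(batch), slice_size):
        chunk = batch[start:start + slice_size]
        # In a tensor implementation this chunk would be one batched operation,
        # so peak memory would grow with slice_size rather than the full batch.
        outputs.extend(attention(q, k, v) for q, k, v in chunk)
        num_slices += 1
    return outputs, num_slices


batch = [
    ([[float(h), 1.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 2.0], [3.0, 4.0]])
    for h in range(4)
]
full, _ = sliced_attention(batch, slice_size=4)
sliced, num_slices = sliced_attention(batch, slice_size=2)
```

A smaller `slice_size` means more sequential steps and less work held in memory at once; the output is unchanged.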
SlicedAttnAddedKVProcessor
class diffusers.models.attention_processor.SlicedAttnAddedKVProcessor
( slice_size )
Parameters
slice_size (int, optional): The number of steps to compute attention. Uses as many slices as attention_head_dim // slice_size, and attention_head_dim must be a multiple of the slice_size.
Processor for implementing sliced attention with extra learnable key and value matrices for the text encoder.