Attention Processor
An attention processor is a class for applying different types of attention mechanisms.
AttnProcessor
class diffusers.models.attention_processor.AttnProcessor
( )
Default processor for performing attention-related computations.
AttnProcessor2_0
class diffusers.models.attention_processor.AttnProcessor2_0
( )
Processor for implementing scaled dot-product attention (enabled by default if you’re using PyTorch 2.0).
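For example, any diffusers model exposes set_attn_processor() and attn_processors for swapping and inspecting these processors. The following is a minimal sketch; the checkpoint name is only an example:

```python
import torch
from diffusers import UNet2DConditionModel
from diffusers.models.attention_processor import AttnProcessor, AttnProcessor2_0

# Example checkpoint; any UNet2DConditionModel works the same way.
unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet", torch_dtype=torch.float16
)

# On PyTorch 2.0 this is already the default, so setting it is a no-op.
unet.set_attn_processor(AttnProcessor2_0())

# Fall back to the vanilla implementation on every attention layer.
unet.set_attn_processor(AttnProcessor())

# Inspect which processor each attention layer currently uses.
print({type(p).__name__ for p in unet.attn_processors.values()})
```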
LoRAAttnProcessor
class diffusers.models.attention_processor.LoRAAttnProcessor
( hidden_size, cross_attention_dim = None, rank = 4, network_alpha = None, **kwargs )
Parameters
hidden_size (int, optional) — The hidden size of the attention layer.
cross_attention_dim (int, optional) — The number of channels in the encoder_hidden_states.
rank (int, defaults to 4) — The dimension of the LoRA update matrices.
network_alpha (int, optional) — Equivalent to alpha, but its usage is specific to Kohya (A1111) style LoRAs.
Processor for implementing the LoRA attention mechanism.
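Older diffusers LoRA training scripts build one LoRAAttnProcessor per attention layer by hand. The sketch below assumes the standard UNet2DConditionModel layer naming (down_blocks/mid_block/up_blocks) and uses an example checkpoint:

```python
from diffusers import UNet2DConditionModel
from diffusers.models.attention_processor import LoRAAttnProcessor

unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"  # example checkpoint
)

lora_attn_procs = {}
for name in unet.attn_processors.keys():
    # Self-attention (attn1) layers have no cross-attention conditioning.
    cross_attention_dim = (
        None if name.endswith("attn1.processor") else unet.config.cross_attention_dim
    )
    if name.startswith("mid_block"):
        hidden_size = unet.config.block_out_channels[-1]
    elif name.startswith("up_blocks"):
        block_id = int(name[len("up_blocks.")])
        hidden_size = list(reversed(unet.config.block_out_channels))[block_id]
    else:  # down_blocks
        block_id = int(name[len("down_blocks.")])
        hidden_size = unet.config.block_out_channels[block_id]

    lora_attn_procs[name] = LoRAAttnProcessor(
        hidden_size=hidden_size, cross_attention_dim=cross_attention_dim, rank=4
    )

unet.set_attn_processor(lora_attn_procs)
```

On PyTorch 2.0, the same loop works with LoRAAttnProcessor2_0 as a drop-in replacement.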
LoRAAttnProcessor2_0
class diffusers.models.attention_processor.LoRAAttnProcessor2_0
( hidden_size, cross_attention_dim = None, rank = 4, network_alpha = None, **kwargs )
Parameters
hidden_size (int) — The hidden size of the attention layer.
cross_attention_dim (int, optional) — The number of channels in the encoder_hidden_states.
rank (int, defaults to 4) — The dimension of the LoRA update matrices.
network_alpha (int, optional) — Equivalent to alpha, but its usage is specific to Kohya (A1111) style LoRAs.
Processor for implementing the LoRA attention mechanism using PyTorch 2.0’s memory-efficient scaled dot-product attention.
CustomDiffusionAttnProcessor
class diffusers.models.attention_processor.CustomDiffusionAttnProcessor
( train_kv = True, train_q_out = True, hidden_size = None, cross_attention_dim = None, out_bias = True, dropout = 0.0 )
Parameters
train_kv (bool, defaults to True) — Whether to newly train the key and value matrices corresponding to the text features.
train_q_out (bool, defaults to True) — Whether to newly train query matrices corresponding to the latent image features.
hidden_size (int, optional, defaults to None) — The hidden size of the attention layer.
cross_attention_dim (int, optional, defaults to None) — The number of channels in the encoder_hidden_states.
out_bias (bool, defaults to True) — Whether to include the bias parameter in train_q_out.
dropout (float, optional, defaults to 0.0) — The dropout probability to use.
Processor for implementing attention for the Custom Diffusion method.
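A sketch of the Custom Diffusion setup, in which only the cross-attention key/value projections are trained; the layer_hidden_size helper is a hypothetical convenience that mirrors the bookkeeping from the LoRA example above:

```python
from diffusers import UNet2DConditionModel
from diffusers.models.attention_processor import CustomDiffusionAttnProcessor

unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"  # example checkpoint
)


def layer_hidden_size(name, config):
    # Hypothetical helper; same layer-naming convention as in the LoRA example.
    if name.startswith("mid_block"):
        return config.block_out_channels[-1]
    if name.startswith("up_blocks"):
        return list(reversed(config.block_out_channels))[int(name[len("up_blocks.")])]
    return config.block_out_channels[int(name[len("down_blocks.")])]


custom_diffusion_attn_procs = {}
for name in unet.attn_processors.keys():
    is_cross = not name.endswith("attn1.processor")
    custom_diffusion_attn_procs[name] = CustomDiffusionAttnProcessor(
        train_kv=is_cross,   # only train K/V on cross-attention layers
        train_q_out=False,   # query/output projections stay frozen
        hidden_size=layer_hidden_size(name, unet.config),
        cross_attention_dim=unet.config.cross_attention_dim if is_cross else None,
    )

unet.set_attn_processor(custom_diffusion_attn_procs)
```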
AttnAddedKVProcessor
class diffusers.models.attention_processor.AttnAddedKVProcessor
( )
Processor for performing attention-related computations with extra learnable key and value matrices for the text encoder.
AttnAddedKVProcessor2_0
class diffusers.models.attention_processor.AttnAddedKVProcessor2_0
( )
Processor for performing scaled dot-product attention (enabled by default if you’re using PyTorch 2.0), with extra learnable key and value matrices for the text encoder.
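These added-KV processors are only meaningful for models whose attention blocks define the extra text-encoder projections (added_kv_proj_dim), such as the UnCLIP-style UNet used by Kandinsky 2.1. A minimal sketch under that assumption; the checkpoint name is only an example:

```python
from diffusers import UNet2DConditionModel
from diffusers.models.attention_processor import AttnAddedKVProcessor2_0

# Example checkpoint whose UNet uses the extra text-encoder K/V projections.
unet = UNet2DConditionModel.from_pretrained(
    "kandinsky-community/kandinsky-2-1", subfolder="unet"
)

# Use the PyTorch 2.0 scaled dot-product variant on every attention layer.
unet.set_attn_processor(AttnAddedKVProcessor2_0())
```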
LoRAAttnAddedKVProcessor
class diffusers.models.attention_processor.LoRAAttnAddedKVProcessor
( hidden_size, cross_attention_dim = None, rank = 4, network_alpha = None )
Parameters
hidden_size (int, optional) — The hidden size of the attention layer.
cross_attention_dim (int, optional, defaults to None) — The number of channels in the encoder_hidden_states.
rank (int, defaults to 4) — The dimension of the LoRA update matrices.
network_alpha (int, optional) — Equivalent to alpha, but its usage is specific to Kohya (A1111) style LoRAs.
Processor for implementing the LoRA attention mechanism with extra learnable key and value matrices for the text encoder.
XFormersAttnProcessor
class diffusers.models.attention_processor.XFormersAttnProcessor
( attention_op: typing.Optional[typing.Callable] = None )
Parameters
attention_op (Callable, optional, defaults to None) — The base operator to use as the attention operator. It is recommended to set this to None and allow xFormers to choose the best operator.
Processor for implementing memory efficient attention using xFormers.
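A sketch of enabling xFormers attention, assuming the xformers package is installed and using an example checkpoint; the convenience method enable_xformers_memory_efficient_attention() is the more common entry point:

```python
from diffusers import UNet2DConditionModel
from diffusers.models.attention_processor import XFormersAttnProcessor

unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"  # example checkpoint
)

# Direct assignment; attention_op=None lets xFormers pick the best kernel.
unet.set_attn_processor(XFormersAttnProcessor(attention_op=None))

# Convenience method on the model (also available on pipelines) with the same effect.
unet.enable_xformers_memory_efficient_attention()
```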
LoRAXFormersAttnProcessor
class diffusers.models.attention_processor.LoRAXFormersAttnProcessor
( hidden_size, cross_attention_dim, rank = 4, attention_op: typing.Optional[typing.Callable] = None, network_alpha = None, **kwargs )
Parameters
hidden_size (int, optional) — The hidden size of the attention layer.
cross_attention_dim (int, optional) — The number of channels in the encoder_hidden_states.
rank (int, defaults to 4) — The dimension of the LoRA update matrices.
attention_op (Callable, optional, defaults to None) — The base operator to use as the attention operator. It is recommended to set this to None and allow xFormers to choose the best operator.
network_alpha (int, optional) — Equivalent to alpha, but its usage is specific to Kohya (A1111) style LoRAs.
Processor for implementing the LoRA attention mechanism with memory efficient attention using xFormers.
CustomDiffusionXFormersAttnProcessor
class diffusers.models.attention_processor.CustomDiffusionXFormersAttnProcessor
( train_kv = True, train_q_out = False, hidden_size = None, cross_attention_dim = None, out_bias = True, dropout = 0.0, attention_op: typing.Optional[typing.Callable] = None )
Parameters
train_kv (bool, defaults to True) — Whether to newly train the key and value matrices corresponding to the text features.
train_q_out (bool, defaults to False) — Whether to newly train query matrices corresponding to the latent image features.
hidden_size (int, optional, defaults to None) — The hidden size of the attention layer.
cross_attention_dim (int, optional, defaults to None) — The number of channels in the encoder_hidden_states.
out_bias (bool, defaults to True) — Whether to include the bias parameter in train_q_out.
dropout (float, optional, defaults to 0.0) — The dropout probability to use.
attention_op (Callable, optional, defaults to None) — The base operator to use as the attention operator. It is recommended to set this to None and allow xFormers to choose the best operator.
Processor for implementing memory efficient attention using xFormers for the Custom Diffusion method.
SlicedAttnProcessor
class diffusers.models.attention_processor.SlicedAttnProcessor
( slice_size )
Parameters
slice_size (int, optional) — The number of steps to compute attention. Uses as many slices as attention_head_dim // slice_size, and attention_head_dim must be a multiple of slice_size.
Processor for implementing sliced attention.
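A sketch of installing sliced attention manually (slice_size=2 is an arbitrary example that must divide each layer's attention_head_dim); in practice most users reach this through pipeline or model helpers such as enable_attention_slicing():

```python
from diffusers import UNet2DConditionModel
from diffusers.models.attention_processor import SlicedAttnProcessor

unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"  # example checkpoint
)

# Install sliced attention on every attention layer.
unet.set_attn_processor(SlicedAttnProcessor(slice_size=2))

# Higher-level equivalents: pipe.enable_attention_slicing() on a pipeline,
# or unet.set_attention_slice("auto") on the model itself.
```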
SlicedAttnAddedKVProcessor
class diffusers.models.attention_processor.SlicedAttnAddedKVProcessor
( slice_size )
Parameters
slice_size (int, optional) — The number of steps to compute attention. Uses as many slices as attention_head_dim // slice_size, and attention_head_dim must be a multiple of slice_size.
Processor for implementing sliced attention with extra learnable key and value matrices for the text encoder.