Processor for performing scaled dot-product attention (enabled by default if you're using PyTorch 2.0), with extra learnable key and value matrices for the text encoder.
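The 2.0 processors depend on PyTorch's fused scaled dot-product attention kernel; a minimal sketch of checking for it (plain PyTorch, not a diffusers API):

import torch.nn.functional as F

# The 2.0-style processors dispatch to PyTorch's fused scaled dot-product attention,
# which only exists in PyTorch 2.0 and later.
if hasattr(F, "scaled_dot_product_attention"):
    print("PyTorch 2.0 scaled dot-product attention is available.")
else:
    print("Install PyTorch 2.0 or later to use the scaled dot-product attention processors.")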
LoRAAttnAddedKVProcessor
class diffusers.models.attention_processor.LoRAAttnAddedKVProcessor
Processor for implementing the LoRA attention mechanism with extra learnable key and value matrices for the text encoder.
XFormersAttnProcessor
class diffusers.models.attention_processor.XFormersAttnProcessor
attention_op (Callable, optional, defaults to None) – The base operator to use as the attention operator. It is recommended to set this to None and allow xFormers to choose the best operator.
Processor for implementing memory efficient attention using xFormers.
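A usage sketch, assuming xformers is installed and the runwayml/stable-diffusion-v1-5 checkpoint is accessible; memory efficient attention is normally switched on through the pipeline helper rather than by instantiating the processor directly:

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Leaving attention_op=None lets xFormers choose the best operator for the hardware.
pipe.enable_xformers_memory_efficient_attention(attention_op=None)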
LoRAXFormersAttnProcessor
class diffusers.models.attention_processor.LoRAXFormersAttnProcessor
hidden_size (int, optional) – The hidden size of the attention layer.
cross_attention_dim (int, optional) – The number of channels in the encoder_hidden_states.
rank (int, defaults to 4) – The dimension of the LoRA update matrices.
attention_op (Callable, optional, defaults to None) – The base operator to use as the attention operator. It is recommended to set this to None and allow xFormers to choose the best operator.
network_alpha (int, optional) – Equivalent to alpha, but its usage is specific to Kohya (A1111) style LoRAs.
Processor for implementing the LoRA attention mechanism with memory efficient attention using xFormers.
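A hedged sketch of wiring these processors into a UNet; the checkpoint name and the hidden-size bookkeeping assume the standard Stable Diffusion 1.x UNet layout:

from diffusers import UNet2DConditionModel
from diffusers.models.attention_processor import LoRAXFormersAttnProcessor

unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"
)

lora_procs = {}
for name in unet.attn_processors:
    # "attn1" layers are self-attention; only "attn2" attends over encoder_hidden_states.
    cross_attention_dim = (
        None if name.endswith("attn1.processor") else unet.config.cross_attention_dim
    )
    if name.startswith("mid_block"):
        hidden_size = unet.config.block_out_channels[-1]
    elif name.startswith("up_blocks"):
        block_id = int(name[len("up_blocks.")])
        hidden_size = list(reversed(unet.config.block_out_channels))[block_id]
    else:  # down_blocks
        block_id = int(name[len("down_blocks.")])
        hidden_size = unet.config.block_out_channels[block_id]
    lora_procs[name] = LoRAXFormersAttnProcessor(
        hidden_size=hidden_size,
        cross_attention_dim=cross_attention_dim,
        rank=4,             # dimension of the LoRA update matrices
        attention_op=None,  # let xFormers pick the best operator
    )

unet.set_attn_processor(lora_procs)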
CustomDiffusionXFormersAttnProcessor
class diffusers.models.attention_processor.CustomDiffusionXFormersAttnProcessor
train_kv (bool, defaults to True) – Whether to newly train the key and value matrices corresponding to the text features.
train_q_out (bool, defaults to True) – Whether to newly train the query matrices corresponding to the latent image features.
hidden_size (int, optional, defaults to None) – The hidden size of the attention layer.
cross_attention_dim (int, optional, defaults to None) – The number of channels in the encoder_hidden_states.
out_bias (bool, defaults to True) – Whether to include a bias parameter in the output projection trained with train_q_out.
dropout (float, optional, defaults to 0.0) – The dropout probability to use.
attention_op (Callable, optional, defaults to None) – The base operator to use as the attention operator. It is recommended to set this to None and allow xFormers to choose the best operator.
Processor for implementing memory efficient attention using xFormers for the Custom Diffusion method.
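A minimal, hedged construction example; the hidden_size and cross_attention_dim values below (320 and 768) are placeholders matching a Stable Diffusion 1.x-style cross-attention layer:

from diffusers.models.attention_processor import CustomDiffusionXFormersAttnProcessor

proc = CustomDiffusionXFormersAttnProcessor(
    train_kv=True,      # learn new key/value projections for the text features
    train_q_out=False,  # keep the original query/output projections frozen
    hidden_size=320,
    cross_attention_dim=768,
    out_bias=True,
    dropout=0.0,
    attention_op=None,  # let xFormers choose the operator
)

In practice one such processor is built per attention layer and assigned with unet.set_attn_processor, as in the LoRA sketch above.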
SlicedAttnProcessor
class diffusers.models.attention_processor.SlicedAttnProcessor
slice_size (int, optional) – The number of steps to compute attention. Uses as many slices as attention_head_dim // slice_size; attention_head_dim must be a multiple of slice_size.
Processor for implementing sliced attention.
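Sliced attention is usually enabled through the pipeline helper, which installs this processor (or its added-KV variant) on the model's attention layers; a minimal sketch, assuming the runwayml/stable-diffusion-v1-5 checkpoint:

from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# Compute attention in slices to reduce peak memory; "auto" uses a reasonable default slice size.
pipe.enable_attention_slicing("auto")
image = pipe("an astronaut riding a horse").images[0]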
SlicedAttnAddedKVProcessor
class diffusers.models.attention_processor.SlicedAttnAddedKVProcessor
slice_size (int, optional) – The number of steps to compute attention. Uses as many slices as attention_head_dim // slice_size; attention_head_dim must be a multiple of slice_size.
Processor for implementing sliced attention with extra learnable key and value matrices for the text encoder.
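A minimal sketch of constructing this processor directly; it only applies to models whose attention blocks carry the extra text-encoder key and value projections, and the assignment line is an assumption shown as a comment:

from diffusers.models.attention_processor import SlicedAttnAddedKVProcessor

# Compute the added-KV attention in slices of 2 to lower peak memory usage.
processor = SlicedAttnAddedKVProcessor(slice_size=2)

# Typically assigned to every attention layer of a compatible model, e.g.:
# unet.set_attn_processor(processor)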