ControlNet
Adding Conditional Control to Text-to-Image Diffusion Models (ControlNet) by Lvmin Zhang and Maneesh Agrawala.
This example is based on the training example in the original ControlNet repository. It trains a ControlNet to fill circles using a small synthetic dataset.
Before running the scripts, make sure to install the library’s training dependencies.
To successfully run the latest versions of the example scripts, we highly recommend installing from source and keeping the installation up to date. We update the example scripts frequently and install example-specific requirements.
To do this, execute the following steps in a new virtual environment:
Then navigate into the example folder:
Now run:
Or for a default 🌍 Accelerate configuration without answering questions about your environment:
Or if your environment doesn’t support an interactive shell like a notebook:
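For a notebook, a minimal sketch using Accelerate's write_basic_config helper could look like the following. The wrapper name init_default_accelerate_config is hypothetical; only write_basic_config comes from Accelerate, and it is imported inside the function so that a missing installation surfaces as an ImportError when the function is called.

```python
def init_default_accelerate_config():
    """Write a default Accelerate config file without interactive prompts."""
    # Imported lazily so a missing install surfaces as an ImportError at call time.
    from accelerate.utils import write_basic_config

    # With no arguments, the config is written to Accelerate's default location.
    return write_basic_config()
```

Calling init_default_accelerate_config() once in a notebook cell should be enough before launching training.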
Download the following images to condition our training with:
The training script creates and saves a diffusion_pytorch_model.bin file in your repository.
This default configuration requires ~38GB VRAM.
By default, the training script logs outputs to tensorboard. Pass --report_to wandb to use Weights & Biases.
Gradient accumulation with a smaller batch size can be used to reduce training requirements to ~20 GB VRAM.
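Gradient accumulation keeps the number of samples per optimizer step constant while lowering the number of samples held in memory at once. The arithmetic can be sketched as follows; the function name and the specific values are assumptions chosen for illustration, not the script's defaults.

```python
def effective_batch_size(per_device_batch_size, gradient_accumulation_steps, num_processes=1):
    """Number of samples contributing to each optimizer step."""
    return per_device_batch_size * gradient_accumulation_steps * num_processes


# One sample per forward pass, accumulated over 4 steps, updates the weights
# with the same number of samples as a single batch of 4.
assert effective_batch_size(1, 4) == effective_batch_size(4, 1)
```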
After 300 steps with batch size 8:
red circle with blue background
cyan circle with brown floral background
After 6000 steps with batch size 8:
red circle with blue background
cyan circle with brown floral background
Enable the following optimizations to train on a 16GB GPU:
Gradient checkpointing
Now you can launch the training script:
Enable the following optimizations to train on a 12GB GPU:
Gradient checkpointing
set gradients to None
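As a rough sketch, the optimizations for a 12GB GPU map onto flags of the training script. The helper name memory_saving_flags is hypothetical, and the flag names are assumed to match the installed train_controlnet.py; confirm them with python train_controlnet.py --help.

```python
import shlex


def memory_saving_flags(use_8bit_adam=True, use_xformers=True):
    """Return training-script flags for common memory optimizations."""
    flags = [
        "--gradient_checkpointing",  # recompute activations instead of storing them
        "--set_grads_to_none",       # free gradient tensors instead of zeroing them
    ]
    if use_8bit_adam:
        flags.append("--use_8bit_adam")  # requires bitsandbytes
    if use_xformers:
        flags.append("--enable_xformers_memory_efficient_attention")  # requires xformers
    return flags


# Print the flags as a shell-safe string to append to the launch command.
print(shlex.join(memory_saving_flags()))
```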
When using enable_xformers_memory_efficient_attention, please make sure to install xformers with pip install xformers.
We have not exhaustively tested DeepSpeed support for ControlNet. While the configuration does save memory, we have not confirmed whether the configuration trains successfully. You will very likely have to make changes to the config to have a successful training run.
Enable the following optimizations to train on an 8GB GPU:
Gradient checkpointing
set gradients to None
DeepSpeed stage 2 with parameter and optimizer offloading
fp16 mixed precision
You’ll have to configure your environment with accelerate config to enable DeepSpeed stage 2.
The configuration file should look like this:
Changing the default Adam optimizer to DeepSpeed’s Adam, deepspeed.ops.adam.DeepSpeedCPUAdam, gives a substantial speedup, but it requires a CUDA toolchain with the same version as PyTorch. The 8-bit optimizer does not seem to be compatible with DeepSpeed at the moment.
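One way to make this swap inside the training script is sketched below. The function name and the use_deepspeed_cpu_adam switch are hypothetical, and torch.optim.AdamW is assumed to be the script's default optimizer. Both libraries are imported lazily so that DeepSpeed is only needed when the switch is on.

```python
def select_optimizer_class(use_deepspeed_cpu_adam=False):
    """Pick the optimizer class to instantiate in the training script."""
    if use_deepspeed_cpu_adam:
        # Needs a CUDA toolchain whose version matches the PyTorch build.
        from deepspeed.ops.adam import DeepSpeedCPUAdam

        return DeepSpeedCPUAdam

    # Fallback: PyTorch's AdamW (assumed here to be the default).
    import torch

    return torch.optim.AdamW
```

The returned class would then be instantiated with the ControlNet parameters and learning rate in place of the script's existing optimizer construction.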
And initialize an 🌍 Accelerate environment with accelerate config.
The original dataset is hosted in the ControlNet repository, but we re-uploaded it to be compatible with 🌍 Datasets so that it can handle the data loading within the training script.
Our training examples use Stable Diffusion 1.5 because that is what the original set of ControlNet models was trained on. However, ControlNet can be trained to augment any compatible Stable Diffusion model.
To use your own dataset, take a look at the guide.
Specify the MODEL_NAME environment variable (either a Hub model repository id or a path to the directory containing the model weights) and pass it to the --pretrained_model_name_or_path argument.
accelerate allows for seamless multi-GPU training. Follow the instructions for running distributed training with accelerate. Here is an example command:
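A minimal sketch of assembling such a command in Python is shown below. The helper name build_launch_command, the OUTPUT_DIR variable, the placeholder paths, and the batch size are assumptions for illustration; the dataset id fusing/fill50k and the flag names are assumed to match the example script, so check them against your installed version.

```python
import os
import shlex


def build_launch_command(model_name, output_dir, multi_gpu=True):
    """Assemble an accelerate launch command for train_controlnet.py."""
    command = ["accelerate", "launch", "--mixed_precision=fp16"]
    if multi_gpu:
        command.append("--multi_gpu")  # let Accelerate spread training across GPUs
    command += [
        "train_controlnet.py",
        f"--pretrained_model_name_or_path={model_name}",
        f"--output_dir={output_dir}",
        "--dataset_name=fusing/fill50k",  # assumed dataset id
        "--train_batch_size=4",           # illustrative value
    ]
    return command


# MODEL_NAME is a Hub repository id or a local path; the fallbacks are placeholders.
model_name = os.environ.get("MODEL_NAME", "path/to/base-model")
output_dir = os.environ.get("OUTPUT_DIR", "path/to/save/model")

# Print a shell-safe command line that can be pasted into a terminal.
print(shlex.join(build_launch_command(model_name, output_dir)))
```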
bitsandbytes 8-bit optimizer (take a look at the installation instructions if you don’t already have it installed)
xFormers (take a look at the installation instructions if you don’t already have it installed)
DeepSpeed can offload tensors from VRAM to either CPU or NVMe. This requires significantly more RAM (about 25 GB).
See the DeepSpeed documentation for more configuration options.
The trained model can be run with the StableDiffusionControlNetPipeline. Set base_model_path and controlnet_path to the values that --pretrained_model_name_or_path and --output_dir were set to, respectively, in the training script.
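A minimal inference sketch under these assumptions follows. The function name run_controlnet, the fixed seed, the step count, and the output filename are illustrative choices. The code also assumes a CUDA GPU and that torch and diffusers are installed, which is why they are imported inside the function.

```python
def run_controlnet(base_model_path, controlnet_path, control_image_path, prompt, output_path="output.png"):
    """Generate one image with a trained ControlNet and save it to output_path."""
    # Imported lazily: these require torch and diffusers to be installed.
    import torch
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
    from diffusers.utils import load_image

    # controlnet_path is the --output_dir used during training.
    controlnet = ControlNetModel.from_pretrained(controlnet_path, torch_dtype=torch.float16)
    # base_model_path is the --pretrained_model_name_or_path used during training.
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        base_model_path, controlnet=controlnet, torch_dtype=torch.float16
    )
    pipe.to("cuda")  # assumes a CUDA GPU is available

    control_image = load_image(control_image_path)
    generator = torch.manual_seed(0)  # fixed seed for reproducible output
    image = pipe(prompt, image=control_image, num_inference_steps=20, generator=generator).images[0]
    image.save(output_path)
    return image
```

For example, run_controlnet(base_model_path, controlnet_path, "./conditioning_image_1.png", "red circle with blue background") would condition generation on a local conditioning image, assuming that file exists.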
Training with Stable Diffusion XL is also supported via the train_controlnet_sdxl.py script. Please refer to its docs.