Load different Stable Diffusion formats

Stable Diffusion models are available in different formats depending on the framework they're trained and saved with, and where you download them from. Converting these formats for use in 🌍 Diffusers allows you to use all the features supported by the library, such as using different schedulers for inference, building your custom pipeline, and a variety of techniques and methods for optimizing inference speed.

We highly recommend using the .safetensors format because it is more secure than traditional pickled files, which are vulnerable and can be exploited to execute arbitrary code on your machine (learn more in the Load safetensors guide).

This guide will show you how to convert other Stable Diffusion formats to be compatible with 🌍 Diffusers.

PyTorch .ckpt

The checkpoint - or .ckpt - format is commonly used to store and save models. A .ckpt file contains the entire model and is typically several GBs in size. While you can load and use a .ckpt file directly with the from_single_file() method, it is generally better to convert the .ckpt file to 🌍 Diffusers so both formats are available.
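
For quick experiments, you can load a single-file checkpoint without converting it first. Below is a minimal sketch, assuming a diffusers release that provides from_single_file() and a hypothetical local file my_model.ckpt:

from diffusers import StableDiffusionPipeline

# my_model.ckpt is a hypothetical local checkpoint; from_single_file()
# also accepts a URL to a checkpoint hosted on the Hub
pipeline = StableDiffusionPipeline.from_single_file("./my_model.ckpt")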

There are two options for converting a .ckpt file: use a Space to convert the checkpoint, or convert the .ckpt file with a script.

Convert with a Space

The easiest and most convenient way to convert a .ckpt file is to use the SD to Diffusers Space. You can follow the instructions on the Space to convert the .ckpt file.

This approach works well for basic models, but it may struggle with more customized models. You'll know the Space failed if it returns an empty pull request or error. In this case, you can try converting the .ckpt file with a script.

Convert with a script

๐ŸŒ Diffusers provides a for converting .ckpt files. This approach is more reliable than the Space above.

Before you start, make sure you have a local clone of 🌍 Diffusers to run the script, and log in to your BOINC AI account so you can open pull requests and push your converted model to the Hub.

huggingface-cli login

To use the script:

  1. Git clone the repository containing the .ckpt file you want to convert. For this example, let's convert the TemporalNet .ckpt file:

git lfs install
git clone https://huggingface.co/CiaraRowles/TemporalNet

  2. Open a pull request on the repository where you're converting the checkpoint from:

cd TemporalNet && git fetch origin refs/pr/13:pr/13
git checkout pr/13

  3. There are several input arguments to configure in the conversion script, but the most important ones are:

    • checkpoint_path: the path to the .ckpt file to convert.

    • original_config_file: a YAML file defining the configuration of the original architecture. If you can't find this file, try searching for the YAML file in the GitHub repository where you found the .ckpt file. For example, you can take the cldm_v15.yaml file from the ControlNet repository because the TemporalNet model is a Stable Diffusion v1.5 and ControlNet model.

    • dump_path: the path to the converted model.

  4. Now you can run the script to convert the .ckpt file:

python ../diffusers/scripts/convert_original_stable_diffusion_to_diffusers.py --checkpoint_path temporalnetv3.ckpt --original_config_file cldm_v15.yaml --dump_path ./ --controlnet

  5. Once the conversion is done, upload your converted model and test out the resulting pull request!

git push origin pr/13:refs/pr/13
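
Before pushing, you may want to sanity-check the converted weights locally. This is a sketch under the assumption that the checkpoint was converted without the --controlnet flag, so the hypothetical dump_path ./converted contains a complete pipeline:

from diffusers import DiffusionPipeline

# ./converted is a hypothetical dump_path from the conversion step
pipeline = DiffusionPipeline.from_pretrained("./converted")
image = pipeline("a test prompt", num_inference_steps=25).images[0]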

Keras .pb or .h5

🧪 This is an experimental feature. Only Stable Diffusion v1 checkpoints are supported by the Convert KerasCV Space at the moment.

KerasCV supports training for Stable Diffusion v1 and v2. However, it offers limited support for experimenting with Stable Diffusion models for inference and deployment, whereas 🌍 Diffusers has a more complete set of features for this purpose, such as different noise schedulers, flash attention, and other optimization techniques.

The Convert KerasCV Space converts .pb or .h5 files to PyTorch, and then wraps them in a StableDiffusionPipeline so it is ready for inference. The converted checkpoint is stored in a repository on the BOINC AI Hub.

For this example, let's convert the sayakpaul/textual-inversion-kerasio checkpoint, which was trained with Textual Inversion. It uses the special token <my-funny-cat> to personalize images with cats.

The Convert KerasCV Space allows you to input the following:

  • Your BOINC AI token.

  • Paths to download the UNet and text encoder weights from. Depending on how the model was trained, you don't necessarily need to provide the paths to both the UNet and text encoder. For example, Textual Inversion only requires the embeddings from the text encoder, and a text-to-image model only requires the UNet weights.

  • The placeholder token, which is only applicable for textual inversion models.

  • The output_repo_prefix is the name of the repository where the converted model is stored.

Click the Submit button to automatically convert the KerasCV checkpoint! Once the checkpoint is successfully converted, you'll see a link to the new repository containing the converted checkpoint. Follow the link to the new repository, and you'll see the Convert KerasCV Space generated a model card with an inference widget to try out the converted model.

If you prefer to run inference with code, click on the Use in Diffusers button in the upper right corner of the model card to copy and paste the code snippet:

from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained(
    "sayakpaul/textual-inversion-cat-kerascv_sd_diffusers_pipeline", use_safetensors=True
)

Then you can generate an image like:

from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained(
    "sayakpaul/textual-inversion-cat-kerascv_sd_diffusers_pipeline", use_safetensors=True
)
pipeline.to("cuda")

placeholder_token = "<my-funny-cat-token>"
prompt = f"two {placeholder_token} getting married, photorealistic, high quality"
image = pipeline(prompt, num_inference_steps=50).images[0]
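
The pipeline returns standard PIL images, so you can keep the result by saving it; the filename here is arbitrary:

# save the generated PIL image to disk
image.save("cat_wedding.png")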

A1111 LoRA files

Automatic1111 (A1111) is a popular web UI for Stable Diffusion that supports model sharing platforms like Civitai. Models trained with the Low-Rank Adaptation (LoRA) technique are especially popular because they're fast to train and have a much smaller file size than a fully finetuned model. 🌍 Diffusers supports loading A1111 LoRA checkpoints with load_lora_weights():

from diffusers import DiffusionPipeline, UniPCMultistepScheduler
import torch

pipeline = DiffusionPipeline.from_pretrained(
    "andite/anything-v4.0", torch_dtype=torch.float16, safety_checker=None
).to("cuda")
pipeline.scheduler = UniPCMultistepScheduler.from_config(pipeline.scheduler.config)

Download a LoRA checkpoint from Civitai; this example uses the Howls Moving Castle, Interior/Scenery LoRA (Ghibli Style) checkpoint, but feel free to try out any LoRA checkpoint!

# uncomment to download the safetensor weights
#!wget https://civitai.com/api/download/models/19998 -O howls_moving_castle.safetensors

Load the LoRA checkpoint into the pipeline with the load_lora_weights() method:

pipeline.load_lora_weights(".", weight_name="howls_moving_castle.safetensors")
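
If the LoRA effect is too strong or too subtle, you can usually scale its contribution at inference time. A hedged sketch, assuming a diffusers version whose LoRA attention layers honor the cross_attention_kwargs scale:

# 0.0 disables the LoRA entirely, 1.0 applies it at full strength
image = pipeline(
    "a castle in the mountains", cross_attention_kwargs={"scale": 0.5}
).images[0]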

Now you can use the pipeline to generate images:

prompt = "masterpiece, illustration, ultra-detailed, cityscape, san francisco, golden gate bridge, california, bay area, in the snow, beautiful detailed starry sky"
negative_prompt = "lowres, cropped, worst quality, low quality, normal quality, artifacts, signature, watermark, username, blurry, more than one bridge, bad architecture"

images = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=512,
    height=512,
    num_inference_steps=25,
    num_images_per_prompt=4,
    generator=torch.manual_seed(0),
).images

Display the images:

from diffusers.utils import make_image_grid

make_image_grid(images, 2, 2)
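
Since make_image_grid returns a single PIL image, the grid itself can be saved like any other image; the filename is arbitrary:

# persist the 2x2 grid to disk
grid = make_image_grid(images, 2, 2)
grid.save("lora_grid.png")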
