Stable Diffusion 2

Stable Diffusion 2

Stable Diffusion 2 is a text-to-image latent diffusion model built upon the work of the original Stable Diffusion, and it was led by Robin Rombach and Katherine Crowson from Stability AI and LAION.

The Stable Diffusion 2.0 release includes robust text-to-image models trained using a brand new text encoder (OpenCLIP), developed by LAION with support from Stability AI, which greatly improves the quality of the generated images compared to earlier V1 releases. The text-to-image models in this release can generate images with default resolutions of both 512x512 pixels and 768x768 pixels. These models are trained on an aesthetic subset of the LAION-5B dataset created by the DeepFloyd team at Stability AI, which is then further filtered to remove adult content using LAION’s NSFW filter.

For more details about how Stable Diffusion 2 works and how it differs from the original Stable Diffusion, please refer to the official announcement post.

The architecture of Stable Diffusion 2 is more or less identical to the original Stable Diffusion model so check out it’s API documentation for how to use Stable Diffusion 2. We recommend using the DPMSolverMultistepScheduler as it’s currently the fastest scheduler.

Stable Diffusion 2 is available for tasks like text-to-image, inpainting, super-resolution, and depth-to-image:

Here are some examples for how to use Stable Diffusion 2 for each task:

Make sure to check out the Stable Diffusion Tips section to learn how to explore the tradeoff between scheduler speed and quality, and how to reuse pipeline components efficiently!

If you’re interested in using one of the official checkpoints for a task, explore the CompVis, Runway, and Stability AI Hub organizations!

Text-to-image

Copied

Inpainting

Copied

Super-resolution

Copied

Depth-to-image

Copied

Last updated