CMStochasticIterativeScheduler


Consistency Models by Yang Song, Prafulla Dhariwal, Mark Chen, and Ilya Sutskever introduced a multistep and one-step scheduler (Algorithm 1) that is capable of generating good samples in one or a small number of steps.

The abstract from the paper is:

Diffusion models have made significant breakthroughs in image, audio, and video generation, but they depend on an iterative generation process that causes slow sampling speed and caps their potential for real-time applications. To overcome this limitation, we propose consistency models, a new family of generative models that achieve high sample quality without adversarial training. They support fast one-step generation by design, while still allowing for few-step sampling to trade compute for sample quality. They also support zero-shot data editing, like image inpainting, colorization, and super-resolution, without requiring explicit training on these tasks. Consistency models can be trained either as a way to distill pre-trained diffusion models, or as standalone generative models. Through extensive experiments, we demonstrate that they outperform existing distillation techniques for diffusion models in one- and few-step generation. For example, we achieve the new state-of-the-art FID of 3.55 on CIFAR-10 and 6.20 on ImageNet 64x64 for one-step generation. When trained as standalone generative models, consistency models also outperform single-step, non-adversarial generative models on standard benchmarks like CIFAR-10, ImageNet 64x64 and LSUN 256x256.

The original codebase can be found at openai/consistency_models.
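As a quick orientation before the API reference, here is a minimal sketch of running this scheduler through ConsistencyModelPipeline; the checkpoint name openai/diffusers-cd_imagenet64_l2 follows the consistency models release and is an assumption if your setup differs:

```python
import torch
from diffusers import ConsistencyModelPipeline

# Checkpoint name taken from the consistency models release; adjust if needed.
pipe = ConsistencyModelPipeline.from_pretrained(
    "openai/diffusers-cd_imagenet64_l2", torch_dtype=torch.float16
).to("cuda")

# One-step generation (Algorithm 1 with a single step).
image = pipe(num_inference_steps=1).images[0]

# Few-step generation trades extra compute for sample quality.
image = pipe(num_inference_steps=2).images[0]
image.save("consistency_model_sample.png")
```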

CMStochasticIterativeScheduler

class diffusers.CMStochasticIterativeScheduler

( num_train_timesteps: int = 40, sigma_min: float = 0.002, sigma_max: float = 80.0, sigma_data: float = 0.5, s_noise: float = 1.0, rho: float = 7.0, clip_denoised: bool = True )

Parameters

  • num_train_timesteps (int, defaults to 40) — The number of diffusion steps to train the model.

  • sigma_min (float, defaults to 0.002) — Minimum noise magnitude in the sigma schedule. Defaults to 0.002 from the original implementation.

  • sigma_max (float, defaults to 80.0) — Maximum noise magnitude in the sigma schedule. Defaults to 80.0 from the original implementation.

  • sigma_data (float, defaults to 0.5) — The standard deviation of the data distribution from the EDM paper. Defaults to 0.5 from the original implementation.

  • s_noise (float, defaults to 1.0) — The amount of additional noise to counteract loss of detail during sampling. A reasonable range is [1.000, 1.011]. Defaults to 1.0 from the original implementation.

  • rho (float, defaults to 7.0) — The parameter for calculating the Karras sigma schedule from the EDM paper. Defaults to 7.0 from the original implementation.

  • clip_denoised (bool, defaults to True) — Whether to clip the denoised outputs to (-1, 1).

  • timesteps (List or np.ndarray or torch.Tensor, optional) — An explicit timestep schedule that can be optionally specified. The timesteps are expected to be in increasing order.

Multistep and one-step sampling for consistency models.

This model inherits from SchedulerMixin and ConfigMixin. Check the superclass documentation for the generic methods the library implements for all schedulers such as loading and saving.
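The defaults mirror the original implementation, so a bare constructor call reproduces them; they are spelled out explicitly here for reference:

```python
from diffusers import CMStochasticIterativeScheduler

# All values below are the documented defaults from the original implementation.
scheduler = CMStochasticIterativeScheduler(
    num_train_timesteps=40,
    sigma_min=0.002,
    sigma_max=80.0,
    sigma_data=0.5,
    s_noise=1.0,
    rho=7.0,
    clip_denoised=True,
)
```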

get_scalings_for_boundary_condition

( sigma ) → tuple

Parameters

  • sigma (torch.FloatTensor) — The current sigma in the Karras sigma schedule.

Returns

tuple

A two-element tuple where c_skip (which weights the current sample) is the first element and c_out (which weights the consistency model output) is the second element.

Gets the scalings used in the consistency model parameterization (from Appendix C of the paper) to enforce boundary condition.

epsilon in the equations for c_skip and c_out is set to sigma_min.
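Written out, the Appendix C scalings with epsilon = sigma_min look as follows; this is an illustrative sketch of the formulas, not the library's internal code:

```python
def boundary_condition_scalings(sigma, sigma_min=0.002, sigma_data=0.5):
    # c_skip weights the current sample; c_out weights the consistency
    # model output. epsilon is set to sigma_min (Appendix C of the paper).
    c_skip = sigma_data**2 / ((sigma - sigma_min) ** 2 + sigma_data**2)
    c_out = (sigma - sigma_min) * sigma_data / (sigma**2 + sigma_data**2) ** 0.5
    return c_skip, c_out
```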

scale_model_input

( sample: FloatTensor, timestep: typing.Union[float, torch.FloatTensor] ) → torch.FloatTensor

Parameters

  • sample (torch.FloatTensor) — The input sample.

  • timestep (float or torch.FloatTensor) — The current timestep in the diffusion chain.

Returns

torch.FloatTensor

A scaled input sample.

Scales the consistency model input by (sigma**2 + sigma_data**2) ** 0.5.
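In a custom denoising loop this scaling is applied before every model call; a fragment of such a loop, assuming scheduler and sample have already been created:

```python
for t in scheduler.timesteps:
    # Scale the sample for the current timestep before the model forward pass.
    model_input = scheduler.scale_model_input(sample, t)
    # ... run the consistency model on model_input ...
```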

set_timesteps

( num_inference_steps: typing.Optional[int] = None, device: typing.Union[str, torch.device] = None, timesteps: typing.Optional[typing.List[int]] = None )

Parameters

  • num_inference_steps (int) — The number of diffusion steps used when generating samples with a pre-trained model.

  • device (str or torch.device, optional) — The device to which the timesteps should be moved. If None, the timesteps are not moved.

  • timesteps (List[int], optional) — Custom timesteps used to support arbitrary spacing between timesteps. If None, then the default timestep spacing strategy of equal spacing between timesteps is used. If timesteps is passed, num_inference_steps must be None.

Sets the timesteps used for the diffusion chain (to be run before inference).
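num_inference_steps and timesteps are mutually exclusive; a minimal sketch of both modes, where the [22, 0] schedule is the one used in the original repository's launch script:

```python
from diffusers import CMStochasticIterativeScheduler

# Default strategy: equally spaced timesteps.
scheduler = CMStochasticIterativeScheduler()
scheduler.set_timesteps(num_inference_steps=2)
print(scheduler.timesteps)

# Custom strategy: an explicit schedule (num_inference_steps must stay None).
scheduler = CMStochasticIterativeScheduler()
scheduler.set_timesteps(timesteps=[22, 0])
print(scheduler.timesteps)
```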

sigma_to_t

( sigmas: typing.Union[float, numpy.ndarray] ) → float or np.ndarray

Parameters

  • sigmas (float or np.ndarray) — A single Karras sigma or an array of Karras sigmas.

Returns

float or np.ndarray

A scaled input timestep or scaled input timestep array.

Gets scaled timesteps from the Karras sigmas for input to the consistency model.
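A short usage sketch; the sigma values here are arbitrary illustrations, not an actual Karras schedule:

```python
import numpy as np
from diffusers import CMStochasticIterativeScheduler

scheduler = CMStochasticIterativeScheduler()

# Accepts a single Karras sigma or an array of them.
t_single = scheduler.sigma_to_t(80.0)
t_array = scheduler.sigma_to_t(np.array([80.0, 5.0, 0.002]))
```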

step

( model_output: FloatTensor, timestep: typing.Union[float, torch.FloatTensor], sample: FloatTensor, generator: typing.Optional[torch._C.Generator] = None, return_dict: bool = True ) → CMStochasticIterativeSchedulerOutput or tuple

Parameters

  • model_output (torch.FloatTensor) — The direct output from the learned diffusion model.

  • timestep (float) — The current timestep in the diffusion chain.

  • sample (torch.FloatTensor) — A current instance of a sample created by the diffusion process.

  • generator (torch.Generator, optional) — A random number generator.

  • return_dict (bool, optional, defaults to True) — Whether or not to return a CMStochasticIterativeSchedulerOutput or tuple.

Returns

CMStochasticIterativeSchedulerOutput or tuple

If return_dict is True, a CMStochasticIterativeSchedulerOutput is returned, otherwise a tuple is returned where the first element is the sample tensor.

Predict the sample from the previous timestep by reversing the SDE. This function propagates the diffusion process from the learned model outputs (most often the predicted noise).
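Putting the methods together, a minimal manual denoising loop; the checkpoint name and the class_labels conditioning are assumptions based on the class-conditional ImageNet 64x64 release:

```python
import torch
from diffusers import CMStochasticIterativeScheduler, UNet2DModel

# Checkpoint name assumed from the consistency models release.
unet = UNet2DModel.from_pretrained("openai/diffusers-cd_imagenet64_l2", subfolder="unet")
scheduler = CMStochasticIterativeScheduler()
scheduler.set_timesteps(num_inference_steps=2)

generator = torch.manual_seed(0)
sample = torch.randn(1, 3, 64, 64, generator=generator) * scheduler.init_noise_sigma
class_labels = torch.tensor([0])  # this checkpoint is class-conditional

for t in scheduler.timesteps:
    model_input = scheduler.scale_model_input(sample, t)
    model_output = unet(model_input, t, class_labels=class_labels).sample
    # step() applies the boundary-condition scalings and injects fresh noise
    # between multistep iterations (Algorithm 1).
    sample = scheduler.step(model_output, t, sample, generator=generator).prev_sample
```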

CMStochasticIterativeSchedulerOutput

class diffusers.schedulers.scheduling_consistency_models.CMStochasticIterativeSchedulerOutput

( prev_sample: FloatTensor )

Parameters

  • prev_sample (torch.FloatTensor of shape (batch_size, num_channels, height, width) for images) — Computed sample (x_{t-1}) of the previous timestep. prev_sample should be used as the next model input in the denoising loop.

Output class for the scheduler's step function.
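Both return styles expose the same tensor; a short sketch assuming model_output, t, and sample from the loop above:

```python
# Dataclass-style access (return_dict=True, the default).
prev = scheduler.step(model_output, t, sample).prev_sample

# Tuple access (return_dict=False); the first element is the sample tensor.
(prev,) = scheduler.step(model_output, t, sample, return_dict=False)
```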
