DPMSolverSinglestepScheduler
DPMSolverSinglestepScheduler is a single-step scheduler from the DPMSolver and DPMSolver++ papers by Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu.
DPMSolver (and the improved version DPMSolver++) is a fast dedicated high-order solver for diffusion ODEs with a convergence order guarantee. Empirically, DPMSolver sampling with only 20 steps can generate high-quality samples, and it can still generate reasonably good samples in 10 steps.
The original implementation can be found at .
It is recommended to set solver_order to 2 for guided sampling, and solver_order=3 for unconditional sampling.
Dynamic thresholding from Imagen is supported, and for pixel-space diffusion models, you can set both algorithm_type="dpmsolver++" and thresholding=True to use dynamic thresholding. This thresholding method is unsuitable for latent-space diffusion models such as Stable Diffusion.
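The idea behind dynamic thresholding can be sketched in a few lines of plain Python. This is an illustration only, not the library's code: the function name is hypothetical, the sample is a flat list of floats rather than a tensor, and it assumes the threshold s is clamped to the range [1, sample_max_value] before clipping.

```python
def dynamic_threshold(x0, ratio=0.995, sample_max_value=1.0):
    """Clip one predicted sample (flat list of floats) to [-s, s], then divide by s."""
    abs_sorted = sorted(abs(v) for v in x0)
    # Linearly interpolated quantile of the absolute values at `ratio`.
    pos = ratio * (len(abs_sorted) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(abs_sorted) - 1)
    s = abs_sorted[lo] + (pos - lo) * (abs_sorted[hi] - abs_sorted[lo])
    # Assumption: s is kept in [1, sample_max_value]. With sample_max_value=1.0
    # this reduces to plain clipping to [-1, 1].
    s = min(max(s, 1.0), sample_max_value)
    return [max(-s, min(s, v)) / s for v in x0]


clipped = dynamic_threshold([0.5, -3.0, 2.0], ratio=1.0, sample_max_value=4.0)
```

With a larger sample_max_value, outliers are rescaled instead of being hard-clipped at 1, which is why the method targets pixel-space predictions that are expected to lie in [-1, 1].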
( num_train_timesteps: int = 1000, beta_start: float = 0.0001, beta_end: float = 0.02, beta_schedule: str = 'linear', trained_betas: typing.Optional[numpy.ndarray] = None, solver_order: int = 2, prediction_type: str = 'epsilon', thresholding: bool = False, dynamic_thresholding_ratio: float = 0.995, sample_max_value: float = 1.0, algorithm_type: str = 'dpmsolver++', solver_type: str = 'midpoint', lower_order_final: bool = True, use_karras_sigmas: typing.Optional[bool] = False, lambda_min_clipped: float = -inf, variance_type: typing.Optional[str] = None )
Parameters
num_train_timesteps (int, defaults to 1000): The number of diffusion steps to train the model.
beta_start (float, defaults to 0.0001): The starting beta value of inference.
beta_end (float, defaults to 0.02): The final beta value.
beta_schedule (str, defaults to "linear"): The beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from linear, scaled_linear, or squaredcos_cap_v2.
trained_betas (np.ndarray, optional): Pass an array of betas directly to the constructor to bypass beta_start and beta_end.
solver_order (int, defaults to 2): The DPMSolver order, which can be 1, 2, or 3. It is recommended to use solver_order=2 for guided sampling, and solver_order=3 for unconditional sampling.
prediction_type (str, defaults to epsilon, optional): Prediction type of the scheduler function; can be epsilon (predicts the noise of the diffusion process), sample (directly predicts the noisy sample), or v_prediction (see section 2.4 of the Imagen Video paper).
thresholding (bool, defaults to False): Whether to use the "dynamic thresholding" method. This is unsuitable for latent-space diffusion models such as Stable Diffusion.
dynamic_thresholding_ratio (float, defaults to 0.995): The ratio for the dynamic thresholding method. Valid only when thresholding=True.
sample_max_value (float, defaults to 1.0): The threshold value for dynamic thresholding. Valid only when thresholding=True and algorithm_type="dpmsolver++".
algorithm_type (str, defaults to dpmsolver++): Algorithm type for the solver; can be dpmsolver, dpmsolver++, sde-dpmsolver, or sde-dpmsolver++. The dpmsolver type implements the algorithms in the DPMSolver paper, and the dpmsolver++ type implements the algorithms in the DPMSolver++ paper. It is recommended to use dpmsolver++ or sde-dpmsolver++ with solver_order=2 for guided sampling like in Stable Diffusion.
solver_type (str, defaults to midpoint): Solver type for the second-order solver; can be midpoint or heun. The solver type slightly affects the sample quality, especially for a small number of steps. It is recommended to use midpoint solvers.
lower_order_final (bool, defaults to True): Whether to use lower-order solvers in the final steps. Only valid for < 15 inference steps. This can stabilize the sampling of DPMSolver for steps < 15, especially for steps <= 10.
use_karras_sigmas (bool, optional, defaults to False): Whether to use Karras sigmas for step sizes in the noise schedule during the sampling process. If True, the sigmas are determined according to a sequence of noise levels {σi}.
lambda_min_clipped (float, defaults to -inf): Clipping threshold for the minimum value of lambda(t) for numerical stability. This is critical for the cosine (squaredcos_cap_v2) noise schedule.
variance_type (str, optional): Set to "learned" or "learned_range" for diffusion models that predict variance. If set, the model's output contains the predicted Gaussian variance.
DPMSolverSinglestepScheduler is a fast dedicated high-order solver for diffusion ODEs.
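To make the beta_start, beta_end, beta_schedule, and lambda(t) parameters above more concrete, here is a minimal pure-Python sketch of how a beta schedule leads to the quantity lambda(t) = log(alpha_t) - log(sigma_t) that the solver works with. The function names are hypothetical and the code uses plain lists; it assumes the usual variance-preserving definitions alpha_t = sqrt(prod(1 - beta)) and sigma_t = sqrt(1 - alpha_t^2), and it only covers the linear and scaled_linear schedules.

```python
import math


def make_betas(num_train_timesteps=1000, beta_start=0.0001, beta_end=0.02, schedule="linear"):
    """Return one beta per training timestep for two of the documented schedules."""
    n = num_train_timesteps
    if schedule == "linear":
        return [beta_start + (beta_end - beta_start) * i / (n - 1) for i in range(n)]
    if schedule == "scaled_linear":
        # Linear in sqrt(beta), then squared.
        a, b = math.sqrt(beta_start), math.sqrt(beta_end)
        return [(a + (b - a) * i / (n - 1)) ** 2 for i in range(n)]
    raise ValueError(f"unsupported schedule in this sketch: {schedule}")


def half_log_snr(betas):
    """lambda(t) = log(alpha_t) - log(sigma_t) for every training timestep."""
    lambdas, alphas_cumprod = [], 1.0
    for beta in betas:
        alphas_cumprod *= 1.0 - beta
        alpha_t = math.sqrt(alphas_cumprod)
        sigma_t = math.sqrt(1.0 - alphas_cumprod)
        lambdas.append(math.log(alpha_t) - math.log(sigma_t))
    return lambdas


betas = make_betas()
lambdas = half_log_snr(betas)
```

lambda(t) decreases monotonically as noise is added. For schedules where it becomes very negative near the last timestep, a finite lambda_min_clipped is the knob that bounds it.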
convert_model_output
( model_output: FloatTensor, timestep: int, sample: FloatTensor ) → torch.FloatTensor
Parameters
model_output (torch.FloatTensor): The direct output from the learned diffusion model.
timestep (int): The current discrete timestep in the diffusion chain.
sample (torch.FloatTensor): A current instance of a sample created by the diffusion process.
Returns
torch.FloatTensor
The converted model output.
Convert the model output to the corresponding type the DPMSolver/DPMSolver++ algorithm needs. DPM-Solver is designed to discretize an integral of the noise prediction model, and DPM-Solver++ is designed to discretize an integral of the data prediction model.
The algorithm and model type are decoupled. You can use either DPMSolver or DPMSolver++ for both noise prediction and data prediction models.
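The decoupling described above rests on simple algebra between the three prediction types. The following sketch shows the conversions on flat lists of floats; the function names are hypothetical, and it assumes the variance-preserving parameterization x_t = alpha_t * x0 + sigma_t * eps with alpha_t^2 + sigma_t^2 = 1 and v = alpha_t * eps - sigma_t * x0.

```python
def to_data_prediction(model_output, sample, alpha_t, sigma_t, prediction_type="epsilon"):
    """Convert a model output into a prediction of x0 (what DPMSolver++ integrates)."""
    if prediction_type == "epsilon":
        return [(x - sigma_t * e) / alpha_t for x, e in zip(sample, model_output)]
    if prediction_type == "sample":
        return list(model_output)
    if prediction_type == "v_prediction":
        return [alpha_t * x - sigma_t * v for x, v in zip(sample, model_output)]
    raise ValueError(f"unknown prediction_type: {prediction_type}")


def to_noise_prediction(model_output, sample, alpha_t, sigma_t, prediction_type="epsilon"):
    """Convert a model output into a prediction of the noise (what DPMSolver integrates)."""
    if prediction_type == "epsilon":
        return list(model_output)
    if prediction_type == "sample":
        return [(x - alpha_t * m) / sigma_t for x, m in zip(sample, model_output)]
    if prediction_type == "v_prediction":
        return [alpha_t * v + sigma_t * x for x, v in zip(sample, model_output)]
    raise ValueError(f"unknown prediction_type: {prediction_type}")


# Build a consistent toy example: pick x0 and eps, derive x_t and v from them.
alpha_t, sigma_t = 0.8, 0.6
x0, eps = [1.0, -2.0], [0.5, 0.25]
x_t = [alpha_t * a + sigma_t * e for a, e in zip(x0, eps)]
v = [alpha_t * e - sigma_t * a for a, e in zip(x0, eps)]
```

Whatever the model predicts, both solver families can be fed the quantity they need, which is why the choice of algorithm_type is independent of prediction_type.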
dpm_solver_first_order_update
( model_output: FloatTensor, timestep: int, prev_timestep: int, sample: FloatTensor ) → torch.FloatTensor
Parameters
model_output (torch.FloatTensor): The direct output from the learned diffusion model.
timestep (int): The current discrete timestep in the diffusion chain.
prev_timestep (int): The previous discrete timestep in the diffusion chain.
sample (torch.FloatTensor): A current instance of a sample created by the diffusion process.
Returns
torch.FloatTensor
The sample tensor at the previous timestep.
One step for the first-order DPMSolver (equivalent to DDIM).
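The DDIM equivalence can be checked numerically. The sketch below uses scalars instead of tensors and hypothetical function names. The arguments alpha_s and sigma_s belong to the current timestep, and alpha_t and sigma_t to prev_timestep. It writes out the first-order update in both the data-prediction (DPMSolver++) and noise-prediction (DPMSolver) forms and compares them with a deterministic DDIM step.

```python
import math


def first_order_data(x_s, x0_pred, alpha_s, sigma_s, alpha_t, sigma_t):
    """First-order DPMSolver++ update, driven by a data (x0) prediction."""
    h = (math.log(alpha_t) - math.log(sigma_t)) - (math.log(alpha_s) - math.log(sigma_s))
    return (sigma_t / sigma_s) * x_s - alpha_t * (math.exp(-h) - 1.0) * x0_pred


def first_order_noise(x_s, eps_pred, alpha_s, sigma_s, alpha_t, sigma_t):
    """First-order DPMSolver update, driven by a noise prediction."""
    h = (math.log(alpha_t) - math.log(sigma_t)) - (math.log(alpha_s) - math.log(sigma_s))
    return (alpha_t / alpha_s) * x_s - sigma_t * (math.exp(h) - 1.0) * eps_pred


def ddim_step(x_s, eps_pred, alpha_s, sigma_s, alpha_t, sigma_t):
    """Deterministic DDIM step: estimate x0, then re-noise it to the target level."""
    x0_pred = (x_s - sigma_s * eps_pred) / alpha_s
    return alpha_t * x0_pred + sigma_t * eps_pred


levels = (0.6, 0.8, 0.8, 0.6)  # alpha_s, sigma_s, alpha_t, sigma_t
x_s, eps_pred = 0.5, -0.3
x0_pred = (x_s - levels[1] * eps_pred) / levels[0]
reference = ddim_step(x_s, eps_pred, *levels)
```

All three functions return the same value when x0_pred and eps_pred describe the same model output, which is the sense in which the first-order solver coincides with DDIM.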
get_order_list
( num_inference_steps: int )
Parameters
num_inference_steps (int): The number of diffusion steps used when generating samples with a pre-trained model.
Computes the solver order at each time step.
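As a rough illustration of what a per-step order list looks like, the sketch below cycles through orders 1 up to solver_order, so that each higher-order update can reuse the model outputs gathered since the start of its group. The function name is hypothetical, and the rule is a simplifying assumption: in particular, it does not model how lower_order_final adjusts the last few steps.

```python
def cyclic_order_list(num_inference_steps, solver_order=2):
    """Assign a solver order to each step by cycling 1, 2, ..., solver_order."""
    if solver_order not in (1, 2, 3):
        raise ValueError("solver_order must be 1, 2, or 3")
    return [(i % solver_order) + 1 for i in range(num_inference_steps)]


orders_third = cyclic_order_list(5, solver_order=3)
orders_second = cyclic_order_list(4, solver_order=2)
```

Here orders_third is [1, 2, 3, 1, 2] and orders_second is [1, 2, 1, 2].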
scale_model_input
( sample: FloatTensor, *args, **kwargs ) → torch.FloatTensor
Parameters
sample (torch.FloatTensor): The input sample.
Returns
torch.FloatTensor
A scaled input sample.
Ensures interchangeability with schedulers that need to scale the denoising model input depending on the current timestep.
set_timesteps
( num_inference_steps: int, device: typing.Union[str, torch.device] = None )
Parameters
num_inference_steps (int): The number of diffusion steps used when generating samples with a pre-trained model.
device (str or torch.device, optional): The device to which the timesteps should be moved. If None, the timesteps are not moved.
Sets the discrete timesteps used for the diffusion chain (to be run before inference).
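Two common ways of choosing the inference grid can be sketched without any dependencies. Both functions below are hypothetical helpers, not the library's code. The first assumes evenly spaced integer timesteps counted down from num_train_timesteps - 1. The second uses the Karras et al. noise-level formula with rho = 7, which is the kind of sequence {σi} that use_karras_sigmas refers to; the sigma_min and sigma_max values in the example are arbitrary.

```python
def evenly_spaced_timesteps(num_inference_steps, num_train_timesteps=1000):
    """Descending integer timesteps, evenly spaced over the training range."""
    last = num_train_timesteps - 1
    n = num_inference_steps
    # n + 1 grid points on [0, last]; reverse them and drop the final point (0).
    grid = [round(last * i / n) for i in range(n + 1)]
    return grid[::-1][:-1]


def karras_sigmas(sigma_min, sigma_max, num_inference_steps, rho=7.0):
    """Noise levels interpolated linearly in sigma**(1/rho); needs at least 2 steps."""
    n = num_inference_steps
    lo, hi = sigma_min ** (1.0 / rho), sigma_max ** (1.0 / rho)
    return [(hi + (i / (n - 1)) * (lo - hi)) ** rho for i in range(n)]


timesteps = evenly_spaced_timesteps(10)
sigmas = karras_sigmas(0.03, 14.6, 10)
```

The Karras spacing concentrates steps at low noise levels, whereas the even spacing treats all timesteps alike.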
singlestep_dpm_solver_second_order_update
( model_output_list: typing.List[torch.FloatTensor], timestep_list: typing.List[int], prev_timestep: int, sample: FloatTensor ) → torch.FloatTensor
Parameters
model_output_list (List[torch.FloatTensor]): The direct outputs from the learned diffusion model at the current and later timesteps.
timestep_list (List[int]): The current and later discrete timesteps in the diffusion chain.
prev_timestep (int): The previous discrete timestep in the diffusion chain.
sample (torch.FloatTensor): A current instance of a sample created by the diffusion process.
Returns
torch.FloatTensor
The sample tensor at the previous timestep.
One step for the second-order singlestep DPMSolver that computes the solution at time prev_timestep from the time timestep_list[-2].
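A scalar sketch of the second-order singlestep update in its data-prediction (dpmsolver++) midpoint form is given below. The function names and the (alpha, sigma) pair arguments are hypothetical. Here s1 plays the role of timestep_list[-2], s0 of timestep_list[-1], and t of prev_timestep, while m1 and m0 are the data predictions at s1 and s0. The formula follows the DPMSolver++ paper's second-order singlestep scheme as understood here and should be read as an assumption rather than as the library's code.

```python
import math


def half_log_snr(alpha, sigma):
    """lambda = log(alpha) - log(sigma)."""
    return math.log(alpha) - math.log(sigma)


def second_order_singlestep(x_s1, m1, m0, level_s1, level_s0, level_t):
    """Advance x from s1 to t using data predictions m1 (at s1) and m0 (at s0)."""
    (alpha_s1, sigma_s1), (alpha_s0, sigma_s0), (alpha_t, sigma_t) = level_s1, level_s0, level_t
    h = half_log_snr(alpha_t, sigma_t) - half_log_snr(alpha_s1, sigma_s1)
    h_0 = half_log_snr(alpha_s0, sigma_s0) - half_log_snr(alpha_s1, sigma_s1)
    r0 = h_0 / h
    d0 = m1                # zeroth-order term: prediction at the start point
    d1 = (m0 - m1) / r0    # finite-difference estimate of the first derivative
    coeff = alpha_t * (math.exp(-h) - 1.0)
    return (sigma_t / sigma_s1) * x_s1 - coeff * d0 - 0.5 * coeff * d1


level_s1 = (0.6, 0.8)
level_s0 = (0.7, math.sqrt(1.0 - 0.7 ** 2))
level_t = (0.8, 0.6)
x_next = second_order_singlestep(0.5, 1.2, 1.4, level_s1, level_s0, level_t)
```

When m0 equals m1 the correction term vanishes and the update reduces to the first-order step from s1 to t, which is a quick sanity check for any implementation.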
singlestep_dpm_solver_third_order_update
( model_output_list: typing.List[torch.FloatTensor], timestep_list: typing.List[int], prev_timestep: int, sample: FloatTensor ) → torch.FloatTensor
Parameters
model_output_list (List[torch.FloatTensor]): The direct outputs from the learned diffusion model at the current and later timesteps.
timestep_list (List[int]): The current and later discrete timesteps in the diffusion chain.
prev_timestep (int): The previous discrete timestep in the diffusion chain.
sample (torch.FloatTensor): A current instance of a sample created by the diffusion process.
Returns
torch.FloatTensor
The sample tensor at the previous timestep.
One step for the third-order singlestep DPMSolver that computes the solution at time prev_timestep from the time timestep_list[-3].
singlestep_dpm_solver_update
( model_output_list: typing.List[torch.FloatTensor], timestep_list: typing.List[int], prev_timestep: int, sample: FloatTensor, order: int ) → torch.FloatTensor
Parameters
model_output_list (List[torch.FloatTensor]): The direct outputs from the learned diffusion model at the current and later timesteps.
timestep_list (List[int]): The current and later discrete timesteps in the diffusion chain.
prev_timestep (int): The previous discrete timestep in the diffusion chain.
sample (torch.FloatTensor): A current instance of a sample created by the diffusion process.
order (int): The solver order at this step.
Returns
torch.FloatTensor
The sample tensor at the previous timestep.
One step for the singlestep DPMSolver.
step
( model_output: FloatTensor, timestep: int, sample: FloatTensor, return_dict: bool = True ) → SchedulerOutput or tuple
Parameters
model_output (torch.FloatTensor): The direct output from the learned diffusion model.
timestep (int): The current discrete timestep in the diffusion chain.
sample (torch.FloatTensor): A current instance of a sample created by the diffusion process.
return_dict (bool): Whether or not to return a SchedulerOutput or tuple.
Returns
SchedulerOutput or tuple
If return_dict is True, SchedulerOutput is returned, otherwise a tuple is returned where the first element is the sample tensor.
Predict the sample from the previous timestep by reversing the SDE. This function propagates the sample with the singlestep DPMSolver.
SchedulerOutput
( prev_sample: FloatTensor )
Parameters
prev_sample (torch.FloatTensor of shape (batch_size, num_channels, height, width) for images): Computed sample (x_{t-1}) of the previous timestep. prev_sample should be used as the next model input in the denoising loop.
Base class for the output of a scheduler's step function.
DPMSolverSinglestepScheduler inherits from SchedulerMixin and ConfigMixin. Check the superclass documentation for the generic methods the library implements for all schedulers such as loading and saving.
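The return_dict convention of step, and the way prev_sample feeds the next iteration, can be illustrated with a toy stand-in. Everything in this sketch is hypothetical: ToySchedulerOutput and toy_step are placeholders, the update rule is not the DPMSolver update, and the "model" is a one-line expression rather than a network.

```python
from dataclasses import dataclass


@dataclass
class ToySchedulerOutput:
    """Hypothetical stand-in for a scheduler output object; holds only prev_sample."""
    prev_sample: list


def toy_step(model_output, timestep, sample, return_dict=True):
    """Placeholder step: a fixed-size move against the model output, NOT DPMSolver."""
    prev_sample = [x - 0.1 * e for x, e in zip(sample, model_output)]
    if not return_dict:
        return (prev_sample,)  # tuple form: the sample is the first element
    return ToySchedulerOutput(prev_sample=prev_sample)


sample = [1.0, -1.0]
for t in [999, 666, 333]:
    model_output = [0.5 * x for x in sample]  # placeholder for a model call
    # The returned prev_sample becomes the model input of the next iteration.
    sample = toy_step(model_output, t, sample).prev_sample

as_tuple = toy_step([0.0, 0.0], 0, sample, return_dict=False)
```

A real denoising loop has the same shape: call the model on the current sample, pass its output to step, and carry prev_sample forward.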