Score SDE VE
Score-Based Generative Modeling through Stochastic Differential Equations (Score SDE) is by Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. This pipeline implements the variance exploding (VE) variant of the stochastic differential equation (SDE) method.
The abstract from the paper is:
Creating noise from data is easy; creating data from noise is generative modeling. We present a stochastic differential equation (SDE) that smoothly transforms a complex data distribution to a known prior distribution by slowly injecting noise, and a corresponding reverse-time SDE that transforms the prior distribution back into the data distribution by slowly removing the noise. Crucially, the reverse-time SDE depends only on the time-dependent gradient field (a.k.a., score) of the perturbed data distribution. By leveraging advances in score-based generative modeling, we can accurately estimate these scores with neural networks, and use numerical SDE solvers to generate samples. We show that this framework encapsulates previous approaches in score-based generative modeling and diffusion probabilistic modeling, allowing for new sampling procedures and new modeling capabilities. In particular, we introduce a predictor-corrector framework to correct errors in the evolution of the discretized reverse-time SDE. We also derive an equivalent neural ODE that samples from the same distribution as the SDE, but additionally enables exact likelihood computation, and improved sampling efficiency. In addition, we provide a new way to solve inverse problems with score-based models, as demonstrated with experiments on class-conditional generation, image inpainting, and colorization. Combined with multiple architectural improvements, we achieve record-breaking performance for unconditional image generation on CIFAR-10 with an Inception score of 9.89 and FID of 2.20, a competitive likelihood of 2.99 bits/dim, and demonstrate high fidelity generation of 1024 x 1024 images for the first time from a score-based generative model.
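As a reference for the VE terminology above, the variance exploding formulation perturbs data with a noise scale σ(t) that grows ("explodes") with t. Sketched from the paper's notation, with w and w̄ forward and reverse-time Wiener processes and p_t the perturbed data distribution, the forward and reverse-time SDEs are:

$$\mathrm{d}\mathbf{x} = \sqrt{\frac{\mathrm{d}[\sigma^2(t)]}{\mathrm{d}t}}\,\mathrm{d}\mathbf{w}$$

$$\mathrm{d}\mathbf{x} = -\frac{\mathrm{d}[\sigma^2(t)]}{\mathrm{d}t}\,\nabla_{\mathbf{x}}\log p_t(\mathbf{x})\,\mathrm{d}t + \sqrt{\frac{\mathrm{d}[\sigma^2(t)]}{\mathrm{d}t}}\,\mathrm{d}\bar{\mathbf{w}}$$

Sampling runs the reverse-time SDE with the score ∇_x log p_t(x) replaced by the trained network's estimate.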
The original codebase can be found at yang-song/score_sde_pytorch.
Make sure to check out the Schedulers guide to learn how to explore the tradeoff between scheduler speed and quality, and see the reuse components across pipelines section to learn how to efficiently load the same components into multiple pipelines.
ScoreSdeVePipeline

class diffusers.ScoreSdeVePipeline

( unet: UNet2DModel, scheduler: ScoreSdeVeScheduler )
Parameters
unet (UNet2DModel) — A UNet2DModel to denoise the encoded image.
scheduler (ScoreSdeVeScheduler) — A ScoreSdeVeScheduler to be used in combination with unet to denoise the encoded image.
Pipeline for unconditional image generation.
This model inherits from DiffusionPipeline. Check the superclass documentation for the generic methods implemented for all pipelines (downloading, saving, running on a particular device, etc.).
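A minimal usage sketch. The checkpoint id google/ncsnpp-church-256 is assumed here to be one of the Score SDE VE (NCSN++) checkpoints published on the Hub; substitute any compatible checkpoint:

```python
import torch
from diffusers import ScoreSdeVePipeline

# Load a pretrained Score SDE VE pipeline (unet + scheduler) from the Hub.
# "google/ncsnpp-church-256" is assumed to be an available VE checkpoint.
pipe = ScoreSdeVePipeline.from_pretrained("google/ncsnpp-church-256")
pipe = pipe.to("cuda")

# Sampling is slow: the default runs 2000 predictor-corrector steps.
image = pipe(num_inference_steps=2000).images[0]
image.save("sde_ve_sample.png")
```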
__call__
( batch_size: int = 1, num_inference_steps: int = 2000, generator: Optional[Union[torch.Generator, List[torch.Generator]]] = None, output_type: Optional[str] = 'pil', return_dict: bool = True, **kwargs ) → ImagePipelineOutput or tuple

Parameters

batch_size (int, optional, defaults to 1) — The number of images to generate.
num_inference_steps (int, optional, defaults to 2000) — The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference.
generator (torch.Generator, optional) — A torch.Generator to make generation deterministic.
output_type (str, optional, defaults to "pil") — The output format of the generated image. Choose between PIL.Image or np.array.
return_dict (bool, optional, defaults to True) — Whether or not to return an ImagePipelineOutput instead of a plain tuple.

Returns

ImagePipelineOutput or tuple

If return_dict is True, ImagePipelineOutput is returned, otherwise a tuple is returned where the first element is a list with the generated images.

The call function to the pipeline for generation.

ImagePipelineOutput

class diffusers.ImagePipelineOutput

( images: Union[List[PIL.Image.Image], np.ndarray] )

Parameters

images (List[PIL.Image.Image] or np.ndarray) — List of denoised PIL images of length batch_size or NumPy array of shape (batch_size, height, width, num_channels).

Output class for image pipelines.
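To illustrate the generator and return_dict parameters above, a short sketch (again assuming the google/ncsnpp-church-256 checkpoint is available):

```python
import torch
from diffusers import ScoreSdeVePipeline

pipe = ScoreSdeVePipeline.from_pretrained("google/ncsnpp-church-256").to("cuda")

# A seeded generator makes sampling deterministic across runs.
generator = torch.Generator().manual_seed(0)

# With return_dict=False the pipeline returns a plain tuple whose first
# element is the list of generated PIL images (output_type defaults to "pil").
(images,) = pipe(batch_size=2, generator=generator, return_dict=False)
print(len(images))  # 2
```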