Sushant Kumar

DDPM: Denoising Diffusion Probabilistic Models

Diffusion models, such as Stable Diffusion and Midjourney, are taking the world by storm with their ability to generate high-quality images. For the curious, this blog post is a deep dive into the basic building blocks of Denoising Diffusion Probabilistic Models (DDPM), along with a code implementation using PyTorch.

To keep things simple, let's first understand diffusion through an everyday real-life example.

Diffusion Process

Diffusion is a process where particles spread out from a high concentration to a low concentration. It is a natural phenomenon that can be observed in our daily lives. For example, when you pour a drop of ink into a glass of water, the ink particles spread out from the point of origin to the entire glass of water. This is the diffusion process.

Figure: Diffusion Process

In the context of images, the diffusion process is used to model noise. A forward process adds a small amount of noise to an image at each time step until nothing but noise remains, and a learned reverse process removes that noise step by step to recover a clean image.

Forward Diffusion Process

As you can see in the figure above, the forward diffusion process starts with a clean image x_0 and adds noise at each time step t to generate a noisy image x_t, until we reach x_T, which is pure noise. The amount of noise added at each step is controlled by a fixed variance schedule β_t; nothing in the forward process is learned.

  • We slowly and iteratively add noise to (corrupt) the images in our training set such that they “move out or move away” from their existing subspace.

  • In doing so, we convert the unknown and complex distribution that our training set belongs to into one that is simple, well understood, and easy to sample a (data) point from.

  • At the end of the forward process, the images become entirely unrecognizable. The complex data distribution is wholly transformed into a chosen simple distribution (pure Gaussian noise). Each image gets mapped to a space outside the data subspace.
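
To make the forward process concrete, here is a minimal PyTorch sketch. It uses the linear variance schedule from the DDPM paper (β ranging from 1e-4 to 0.02 over T = 1000 steps) and the closed-form shortcut x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε, where ᾱ_t is the cumulative product of α_t = 1 − β_t. The q_sample name and the batch shapes are just for illustration.

```python
import torch

# Number of diffusion steps and linear beta schedule (values used in the DDPM paper).
T = 1000
betas = torch.linspace(1e-4, 0.02, T)        # beta_t: variance of the noise added at step t
alphas = 1.0 - betas                         # alpha_t = 1 - beta_t
alpha_bars = torch.cumprod(alphas, dim=0)    # alpha_bar_t = product of alpha_s for s <= t

def q_sample(x0, t, noise=None):
    """Sample x_t ~ q(x_t | x_0) directly:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise."""
    if noise is None:
        noise = torch.randn_like(x0)
    # Pick alpha_bar_t per sample and reshape so it broadcasts over (B, C, H, W).
    a_bar = alpha_bars[t].view(-1, 1, 1, 1)
    return torch.sqrt(a_bar) * x0 + torch.sqrt(1.0 - a_bar) * noise

# Example: corrupt a batch of (stand-in) images at random timesteps.
x0 = torch.randn(8, 3, 32, 32)               # placeholder for a batch of training images
t = torch.randint(0, T, (8,))                # one random timestep per image
x_t = q_sample(x0, t)
```

The nice property of this closed form is that x_t can be sampled in one shot for any t, without simulating all the intermediate steps.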

Reverse Diffusion Process

Conversely, the reverse diffusion process starts with pure noise x_T and removes noise at each time step t to generate a clean image x_0. This is the process we are actually interested in.

Figure: Reverse Diffusion Process

We want to model the reverse diffusion process so that we can start with complete noise (x_T) and remove it iteratively to generate high-quality images.

Now, every time we want to generate a new image, we can simply sample some random noise and run the reverse diffusion process to denoise it into a high-quality image. This iterative denoising is why generative models based on the diffusion process are called Denoising Diffusion Probabilistic Models (DDPM).
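
The sketch below shows what this sampling loop can look like in PyTorch, following Algorithm 2 of the DDPM paper. The model(x, t) interface, i.e. a network (typically a U-Net) that predicts the noise added at step t, is an assumption made for illustration and is not defined here.

```python
import torch

@torch.no_grad()
def p_sample_loop(model, shape, betas):
    """Ancestral sampling (Algorithm 2 in the DDPM paper).
    `model(x, t)` is assumed to predict the noise eps_theta(x_t, t)."""
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)
    T = betas.shape[0]

    x = torch.randn(shape)                   # start from pure Gaussian noise x_T
    for t in reversed(range(T)):
        t_batch = torch.full((shape[0],), t, dtype=torch.long)
        eps = model(x, t_batch)              # predicted noise at this step

        # Mean of p_theta(x_{t-1} | x_t):
        # mu = (x_t - beta_t / sqrt(1 - alpha_bar_t) * eps) / sqrt(alpha_t)
        mean = (x - betas[t] / torch.sqrt(1.0 - alpha_bars[t]) * eps) / torch.sqrt(alphas[t])

        if t > 0:
            # sigma_t^2 = beta_t is one of the two variance choices discussed in the paper.
            x = mean + torch.sqrt(betas[t]) * torch.randn_like(x)
        else:
            x = mean                         # no extra noise is added at the final step
    return x
```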

Denoising Diffusion Probabilistic Models (DDPM)

Denoising Diffusion Probabilistic Models (DDPM) are a class of generative models built on this two-part diffusion process. Because the forward (noising) process is fixed, the only thing that has to be learned is the reverse (denoising) process: a neural network is trained to predict the noise that was added to an image at a given time step, and that same network is then applied iteratively, starting from pure noise, to generate high-quality images.
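
In practice, the reverse process is learned with the simplified objective from the DDPM paper: sample a random time step and some Gaussian noise, corrupt a training image with the forward process, and regress the network's output onto that noise with a mean-squared error. Below is a minimal sketch of one such training loss, again assuming a noise-predicting model(x_t, t) and a precomputed alpha_bars schedule like the one above.

```python
import torch
import torch.nn.functional as F

def ddpm_loss(model, x0, alpha_bars):
    """Simplified DDPM training objective (Algorithm 1 in the paper):
    L_simple = || eps - eps_theta(x_t, t) ||^2."""
    B = x0.shape[0]
    T = alpha_bars.shape[0]

    t = torch.randint(0, T, (B,), device=x0.device)   # uniform random timestep per image
    noise = torch.randn_like(x0)                       # eps ~ N(0, I)

    # Forward-diffuse x0 to x_t in closed form.
    a_bar = alpha_bars.to(x0.device)[t].view(-1, 1, 1, 1)
    x_t = torch.sqrt(a_bar) * x0 + torch.sqrt(1.0 - a_bar) * noise

    eps_pred = model(x_t, t)                           # network's noise prediction
    return F.mse_loss(eps_pred, noise)
```

Minimizing this loss over many images and random timesteps is all that is needed to train the denoising network used in the sampling loop above.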

References

  1. Jonathan Ho, Ajay Jain, Pieter Abbeel. Denoising Diffusion Probabilistic Models. arXiv:2006.11239