One-Sentence Summary

A diffusion model is a generative model that learns how to create data by reversing a gradual noising process.

Why It Matters

Diffusion models are widely used for image, audio, video, and 3D generation because they produce high-quality samples and train with a stable objective: a regression loss, typically the mean squared error between the noise that was added and the model's prediction of it.

Core Ideas

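  • Forward process: gradually add noise (typically Gaussian) to clean data over many small steps until almost no signal remains.
  • Reverse process: train a model to undo the noising one step at a time, commonly by predicting the noise present in a noisy input.
  • Conditioning: steer generation with text, images, class labels, or other signals supplied to the model as extra input.
  • Sampling: start from pure noise and apply the trained denoiser repeatedly until a clean sample remains.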
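
The forward process and the training target can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: it assumes Gaussian noise, a linear noise schedule with illustrative values, and the common noise-prediction loss. The names make_schedule, add_noise, and noise_prediction_loss are hypothetical, and predict_noise stands in for whatever network would be trained. Conditioning is left out to keep the sketch short.

```python
import numpy as np

def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (assumed values, chosen for illustration)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    # alpha_bars[t] is the fraction of signal variance left after t+1 steps.
    alpha_bars = np.cumprod(1.0 - betas)
    return betas, alpha_bars

def add_noise(x0, t, alpha_bars, rng):
    """Forward process in closed form: jump from clean x0 to noisy x_t."""
    eps = rng.standard_normal(x0.shape)
    # One alpha_bar per batch element, shaped to broadcast over x0.
    a = np.atleast_1d(alpha_bars[t]).reshape(-1, *([1] * (x0.ndim - 1)))
    xt = np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps
    return xt, eps

def noise_prediction_loss(predict_noise, x0, alpha_bars, rng):
    """Mean squared error between the added noise and the predicted noise."""
    t = rng.integers(0, len(alpha_bars), size=x0.shape[0])
    xt, eps = add_noise(x0, t, alpha_bars, rng)
    return np.mean((predict_noise(xt, t) - eps) ** 2)

# Usage: noise a small batch of "clean data" at increasing step indices.
rng = np.random.default_rng(0)
betas, alpha_bars = make_schedule()
x0 = np.ones((3, 2))
xt, eps = add_noise(x0, np.array([0, 500, 999]), alpha_bars, rng)
print(xt)  # first row stays near 1.0, last row is almost pure noise
```

In this sketch, training would mean adjusting the parameters of predict_noise to lower noise_prediction_loss over many random batches and step indices.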

Placeholder Example

For image generation, the model starts from random noise and gradually turns it into a coherent image, guided by a prompt or other condition.
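
The same loop can be tried on a toy problem. The sketch below uses a DDPM-style update with an assumed linear noise schedule. In place of a trained image model it uses a hand-written noise predictor that is exact when the "data" is a 1D Gaussian with mean 2.0 and standard deviation 0.5. The function names, the schedule values, and the toy data distribution are all assumptions made for illustration; a real system would plug in a trained network and pass it the prompt or condition.

```python
import numpy as np

def make_toy_predictor(betas, mu=2.0, sigma=0.5):
    """Stand-in for a trained network (assumption): the exact expected
    noise given x_t when clean data is distributed as N(mu, sigma^2)."""
    alpha_bars = np.cumprod(1.0 - betas)

    def predict_noise(x, t):
        a = alpha_bars[t]
        var_t = a * sigma**2 + (1.0 - a)  # variance of x_t at step t
        return np.sqrt(1.0 - a) * (x - np.sqrt(a) * mu) / var_t

    return predict_noise

def sample(predict_noise, shape, betas, rng):
    """Start from pure noise and denoise step by step (DDPM-style update)."""
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x = rng.standard_normal(shape)
    for t in range(len(betas) - 1, -1, -1):
        eps_hat = predict_noise(x, t)
        # Remove the predicted noise for this step.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        # Re-inject a little fresh noise, except at the final step.
        noise = rng.standard_normal(shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x

# Usage: draw 5000 samples from the toy model.
rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)  # assumed linear schedule
samples = sample(make_toy_predictor(betas), (5000,), betas, rng)
print(samples.mean(), samples.std())
```

The printed mean and standard deviation should land near the toy values of 2.0 and 0.5, up to sampling and discretization error. For images, x would be an array of pixels or latents instead of scalars, while the loop itself would keep the same shape.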

Notes to Expand Later

  • Add a simple noise-to-image diagram.
  • Explain the difference between DDPM and latent diffusion.
  • Add a short section on why denoising is easier than direct generation.
