What Is a Diffusion Model?
One-Sentence Summary
A diffusion model is a generative model that learns how to create data by reversing a gradual noising process.
Why It Matters
Diffusion models are widely used for image, audio, video, and 3D generation because they produce high-quality samples and train against a stable objective, typically a simple regression loss on the noise that was added.
Core Ideas
- Forward process: add a small amount of noise to clean data at each step until almost nothing of the original signal remains.
- Reverse process: train a model to undo that noising one step at a time.
- Conditioning: guide generation with text, images, labels, or other signals.
- Sampling: start from pure noise and repeatedly denoise until a clean sample remains.
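As a rough illustration of the forward process and the training signal it creates, the sketch below assumes a DDPM-style setup: a linear noise schedule with T = 1000 steps, Gaussian noise, and a noise-prediction loss. The schedule values and the names `add_noise`, `denoising_loss`, and `predict_noise` are illustrative assumptions, not something these notes prescribe.

```python
import numpy as np

# Assumed DDPM-style linear schedule; the endpoints are common defaults, not requirements.
T = 1000
betas = np.linspace(1e-4, 0.02, T)      # noise variance added at each step
alpha_bars = np.cumprod(1.0 - betas)    # fraction of signal variance left after step t


def add_noise(x0, t, rng):
    """Forward process in closed form: jump from clean data straight to step t."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return x_t, eps


def denoising_loss(predict_noise, x0, rng):
    """One random-step estimate of the noise-prediction objective (mean squared error)."""
    t = int(rng.integers(0, T))
    x_t, eps = add_noise(x0, t, rng)
    return float(np.mean((predict_noise(x_t, t) - eps) ** 2))


rng = np.random.default_rng(0)
x0 = np.ones((8, 8))                    # stand-in for a clean image
early, _ = add_noise(x0, 10, rng)       # still very close to x0
late, _ = add_noise(x0, T - 1, rng)     # almost pure noise
print(alpha_bars[10], alpha_bars[-1])   # how much signal survives early vs. late
```

Here `predict_noise` is a placeholder for whatever network is being trained; any callable that takes `(x_t, t)` and returns an array shaped like `x_t` fits. Training would repeat `denoising_loss` over many samples and steps, and update the network to lower it.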
Placeholder Example
For image generation, the model starts with random noise and gradually turns it into a coherent image that matches the prompt or other conditioning signal.
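The noise-to-sample loop described above can be sketched on a toy problem. This is a minimal sketch, assuming DDPM-style ancestral sampling with a linear schedule. To stay runnable without training anything, the "dataset" is a one-dimensional Gaussian, for which the ideal noise prediction has a closed form, so `predict_noise` is an analytic stand-in for a trained network. All names and constants here are illustrative assumptions.

```python
import numpy as np

# Assumed DDPM-style linear schedule (same illustrative values as a typical setup).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

# Toy "dataset": clean samples are drawn from N(2.0, 0.5^2).
DATA_MEAN, DATA_STD = 2.0, 0.5


def predict_noise(x_t, t):
    """Analytic stand-in for a trained network: E[noise | x_t] when the data is Gaussian."""
    ab = alpha_bars[t]
    marginal_var = ab * DATA_STD**2 + (1.0 - ab)   # variance of x_t at step t
    return np.sqrt(1.0 - ab) * (x_t - np.sqrt(ab) * DATA_MEAN) / marginal_var


def sample(n, rng):
    """Start from pure noise and denoise step by step (ancestral sampling)."""
    x = rng.standard_normal(n)
    for t in range(T - 1, -1, -1):
        eps_hat = predict_noise(x, t)
        # Remove the predicted noise, then rescale to undo this step's shrinkage.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        # Re-inject a little fresh noise on every step except the last one.
        noise = rng.standard_normal(n) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x


samples = sample(5000, np.random.default_rng(0))
print(samples.mean(), samples.std())   # should land close to DATA_MEAN and DATA_STD
```

In a real image model, `predict_noise` would be a neural network that also receives the prompt or other condition, and `x` would be an image-shaped array rather than a vector of scalars. The loop itself keeps the same shape.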
Notes to Expand Later
- Add a simple noise-to-image diagram.
- Explain the difference between DDPM and latent diffusion.
- Add a short section on why denoising is easier to model than generating data directly.