Diffusion Models in Vision: A Survey

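Computer science, Artificial intelligence, Diffusion, Diffusion process, Machine learning, Noise (video), Noise reduction, Algorithm, Knowledge management, Physics, Diffusion of innovations, Image (mathematics), Thermodynamics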
Authors
Florinel-Alin Croitoru, Vlad Hondru, Radu Tudor Ionescu, Mubarak Shah
Source
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence [Institute of Electrical and Electronics Engineers]
Volume (issue): 45 (9): 10850-10869 Citations: 87
Identifier
DOI:10.1109/tpami.2023.3261988
Abstract

Denoising diffusion models represent a recently emerging topic in computer vision, demonstrating remarkable results in the area of generative modeling. A diffusion model is a deep generative model that is based on two stages, a forward diffusion stage and a reverse diffusion stage. In the forward diffusion stage, the input data is gradually perturbed over several steps by adding Gaussian noise. In the reverse stage, a model is tasked with recovering the original input data by learning to gradually reverse the diffusion process, step by step. Diffusion models are widely appreciated for the quality and diversity of the generated samples, despite their known computational burdens, i.e., low speeds due to the high number of steps involved during sampling. In this survey, we provide a comprehensive review of articles on denoising diffusion models applied in vision, comprising both theoretical and practical contributions in the field. First, we identify and present three generic diffusion modeling frameworks, which are based on denoising diffusion probabilistic models, noise conditioned score networks, and stochastic differential equations. We further discuss the relations between diffusion models and other deep generative models, including variational auto-encoders, generative adversarial networks, energy-based models, autoregressive models and normalizing flows. Then, we introduce a multi-perspective categorization of diffusion models applied in computer vision. Finally, we illustrate the current limitations of diffusion models and envision some interesting directions for future research.
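
The forward diffusion stage described in the abstract can be illustrated with a minimal sketch. It assumes the step commonly used in denoising diffusion probabilistic models, x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * eps with eps drawn from a standard Gaussian. The function names, the linear noise schedule, the 1000-step count, and the toy 64-value input are illustrative assumptions and do not come from the abstract. The reverse stage, which requires a trained model, is not shown.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Hypothetical schedule: noise variances rising linearly over the steps."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def forward_diffusion_step(x_prev, beta_t, rng):
    """Perturb one sample: shrink the signal slightly, then add Gaussian noise."""
    keep = math.sqrt(1.0 - beta_t)   # fraction of the previous signal retained
    spread = math.sqrt(beta_t)       # standard deviation of the added noise
    return [keep * v + spread * rng.gauss(0.0, 1.0) for v in x_prev]


def forward_diffusion(x0, betas, rng):
    """Return the whole trajectory [x_0, x_1, ..., x_T] of noisy samples."""
    trajectory = [list(x0)]
    for beta_t in betas:
        trajectory.append(forward_diffusion_step(trajectory[-1], beta_t, rng))
    return trajectory


# Toy input: a flat "image" of 64 pixels, all equal to 1.0 (an assumption).
rng = random.Random(0)
x0 = [1.0] * 64
betas = linear_beta_schedule(1000)
trajectory = forward_diffusion(x0, betas, rng)
```

With this many steps the last entry of `trajectory` retains almost none of the original signal and is close to pure Gaussian noise. The same loop also makes the cost noted in the abstract visible: a sampler that undoes the process one step at a time needs a comparable number of model evaluations.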