Diffusion Models: Unlocking the “4 secrets” of High-quality Image Generation

Tao Zhou
Zhe Zhang
Mingzhe Zhang
Wenwen Chai
Yong Xia
Fuyuan Hu

Abstract

Diffusion models (DMs) are an active topic in deep generative modeling and are widely applied to image generation. Four main “secrets” determine whether a diffusion model generates high-quality images: constructing the diffusion model, improving the sampling speed, designing the diffusion process, and guiding the diffusion model. How should the diffusion model be constructed? How can sampling be accelerated? How should the diffusion process be designed? How can the model be guided? These questions are critical to enhancing the performance of diffusion models. So far, however, most review papers are organized by application, and few summarize these 4 key technologies. To address this gap, this paper summarizes 4 key technologies and 6 applications. The main contributions are as follows. First, on constructing diffusion models, the basic principles are summarized from 3 aspects: denoising diffusion probabilistic models, noise-conditioned score networks, and stochastic differential equations. Second, on improving the sampling speed, key techniques are summarized from 3 aspects: non-Markovian sampling, knowledge distillation, and discrete optimization. Third, on designing the diffusion process, the paper covers 3 aspects: latent space, transformer-based diffusion processes, and non-Euclidean space. Fourth, on guiding diffusion models, the paper covers 3 aspects: classifier guidance, classifier-free guidance, and multimodal guidance. Fifth, applications of diffusion models are discussed from 6 aspects: image fusion, medical image segmentation, image restoration, text-to-image generation, image super-resolution, and text-to-video generation. Finally, this paper discusses the challenges diffusion models face in image generation, compares diffusion models with other generative models, and outlines future development directions. By systematically identifying the “4 secrets” of diffusion models in image generation, this paper provides a reference for further research in this field.
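Two of the ideas named in the abstract can be made concrete with a small sketch: the closed-form forward noising step used by denoising diffusion probabilistic models, and the blending rule commonly used for classifier-free guidance. The code below is an illustration only and is not taken from the paper. The function names, the linear noise schedule, and its default endpoints (1e-4 to 0.02) are assumptions chosen for the example, and the guidance rule is written in the form often seen in implementations, where a scale of 1 returns the conditional prediction unchanged.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out

def forward_noise(x0, t, alpha_bars, rng):
    """Draw x_t from q(x_t | x_0) in closed form.

    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps,
    with eps drawn from a standard normal. Returns (x_t, eps).
    """
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return xt, eps

def classifier_free_guidance(eps_uncond, eps_cond, scale):
    """Blend unconditional and conditional noise predictions.

    scale = 0 gives the unconditional prediction, scale = 1 the
    conditional one, and scale > 1 pushes further toward the condition.
    """
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]

if __name__ == "__main__":
    rng = random.Random(0)
    alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
    x0 = [1.0, -1.0, 0.5]  # toy "image" with three pixels
    x_t, eps = forward_noise(x0, 999, alpha_bars, rng)
    print("alpha_bar at final step:", alpha_bars[-1])
    print("noised sample:", x_t)
```

In a real system the two noise predictions passed to `classifier_free_guidance` would come from a trained network evaluated with and without the conditioning signal; here they are plain lists so the arithmetic can be inspected directly.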

Version published to 10.21203/rs.3.rs-5455299/v1 on Research Square
Dec 13, 2024

Related articles

Edge-Aware Diffusion for Mobile Photo Enhancement: A Systematic Review and Comparative Latency Analysis

This article has 2 authors:
1. Vinodya Athukorala
2. R.G.N. Meegama
This article has no evaluations. Latest version: Mar 12, 2026

Enhancing Quantum Diffusion Models for Complex Image Generation

This article has 5 authors:
1. Jeongbin Jo
2. Santanam Wishal
3. Shah Md Khalil Ul
4. Shan Zeng
5. Dikshant Dulal
This article has no evaluations. Latest version: Feb 24, 2026

Ro-FusionGAN: An Adversarial Framework for High-Quality Multi-focus image fusion

This article has 4 authors:
1. Yongli Xian
2. Heng Zhou
3. Zhijie Gong
4. Congzheng Wang
This article has no evaluations. Latest version: Mar 19, 2026
