Diffusion Models: Unlocking the “4 secrets” of High-quality Image Generation
Abstract
Diffusion models (DMs) are a hot topic in deep generative modeling and are widely applied in image generation. Four main “secrets” affect high-quality image generation with diffusion models: constructing the diffusion model, improving the sampling speed, designing the diffusion process, and guiding the diffusion model. How should a diffusion model be constructed? How can the sampling speed be improved? How should the diffusion process be designed? How can diffusion models be guided? These questions are critical to enhancing the performance of diffusion models. So far, however, most review papers are organized around applications, and few summarize these 4 key technologies. In response, this paper summarizes 4 key technologies and 6 applications. The main contributions are as follows. Firstly, how to construct diffusion models? The basic principles of diffusion models are summarized from 3 aspects: denoising diffusion probabilistic models, noise-conditioned score networks, and stochastic differential equations. Secondly, how to improve the sampling speed? Key techniques for improving the sampling speed are summarized from 3 aspects: non-Markovian sampling, knowledge distillation, and discrete optimization. Thirdly, how to design the diffusion process? The design of the diffusion process is summarized from 3 aspects: latent space, transformer-based diffusion processes, and non-Euclidean space. Fourthly, how to guide diffusion models? Guidance of diffusion models is summarized from 3 aspects: classifier guidance, classifier-free guidance, and multimodal guidance. Fifthly, the applications of diffusion models are discussed from 6 aspects: image fusion, medical image segmentation, image restoration, text-to-image generation, image super-resolution, and text-to-video generation.
Finally, this paper discusses the challenges faced by diffusion models in image generation, compares diffusion models with other generative models, and looks ahead to future development directions for diffusion models. By systematically setting out the “4 secrets” of diffusion models in image generation, it provides a reference for research in this field.