Novel Nesting of Deep Learning Domain Transfer and Hybrid Video Coding for Video Compression
Abstract
Efficient video compression is crucial for addressing the exponential growth of video content, which now constitutes a significant portion of global internet traffic. Traditional compression standards mainly include H.264 and H.265, while the current research trend is to partially or completely replace the architectures of these traditional methods with deep learning techniques. However, the two approaches are not mutually exclusive. Based on this idea, this paper proposes a new direction that combines traditional video compression methods with deep learning techniques to achieve higher compression efficiency and improved reconstruction quality. We adopt a two-stage compression framework in which video frames are first reduced in resolution via bicubic downsampling and then encoded with traditional codecs such as H.264 or H.265. Subsequently, a deep learning-based Video Super-Resolution model restores the compressed video frames. A further challenge is to construct structured temporal priors at different semantic levels so as to implicitly model the abstraction process from local to global representations. To address this, our Video Super-Resolution model includes a specially designed domain-transfer module that adaptively processes structured temporal priors at different semantic levels. Moreover, unlike traditional compression methods, deep learning-based compression algorithms place high demands on computational resources; most existing methods cannot perform 2160P video compression on a single RTX 4090. We therefore design a Hierarchical Simplified Attention-Net that reduces model complexity and can perform compression at resolutions up to 2160P on a single RTX 4090 GPU. Finally, our model achieves strong results on benchmark datasets including UVG, MCL-JCV, and the HEVC Class B, C, D, and E sequences.
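To make the two-stage framework concrete, the sketch below illustrates the overall flow: bicubic downsampling of the input frames, compression of the low-resolution video with a standard H.265 encoder invoked through ffmpeg, and restoration of the decoded frames with a learned Video Super-Resolution network. This is only a minimal illustration; the `vsr_model` object, its call signature, and the specific ffmpeg parameters are hypothetical placeholders and do not reflect the paper's actual implementation.

```python
import subprocess
import torch
import torch.nn.functional as F


def downsample_frames(frames, scale=0.5):
    # frames: (T, C, H, W) float tensor in [0, 1].
    # Bicubic downsampling before handing the video to the traditional codec.
    return F.interpolate(frames, scale_factor=scale,
                         mode="bicubic", align_corners=False)


def encode_with_hevc(raw_yuv_path, out_path, width, height, fps=30, crf=28):
    # Stage 1: compress the downsampled video with a standard H.265
    # encoder (libx265) via ffmpeg. CRF and fps are illustrative values.
    subprocess.run([
        "ffmpeg", "-y",
        "-f", "rawvideo", "-pix_fmt", "yuv420p",
        "-s", f"{width}x{height}", "-r", str(fps),
        "-i", raw_yuv_path,
        "-c:v", "libx265", "-crf", str(crf),
        out_path,
    ], check=True)


def restore_with_vsr(decoded_frames, vsr_model, scale=2):
    # Stage 2: a learned Video Super-Resolution model (hypothetical
    # interface) maps decoded low-resolution frames back to the
    # original resolution.
    with torch.no_grad():
        return vsr_model(decoded_frames, scale=scale)
```

The design choice behind this split is that the traditional codec handles the bulk of the bit-rate reduction at low resolution, while the learned model recovers the detail lost to downsampling and quantization.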