GPU-Accelerated Parallel Intra Prediction for High-Efficiency Video Coding (HEVC)

Abstract

The increasing demand for high-resolution, real-time video applications has made efficient, accelerated compression methods crucial. While the High Efficiency Video Coding (HEVC) standard significantly reduces the bit rate compared to its predecessors, it incurs high computational costs, particularly in the intra prediction and transformation stages. This study presents a GPU-accelerated parallel implementation of intra prediction and the discrete cosine transform (DCT) for the HEVC standard. The proposed method improves processing efficiency through CUDA-based thread-level parallelism and shared memory optimization. All 35 intra prediction modes are computed simultaneously on GPU threads; in addition, two GPU-based DCT variants are proposed to reduce memory access latency. Experiments on the NVIDIA Jetson Xavier NX platform show up to a 162x speedup over CPU implementations while preserving PSNR and visual quality. Integrating parallel intra prediction and DCT into a single GPU-optimized pipeline highlights the approach's potential for real-time visual computing, embedded imaging systems, and high-performance image processing applications. The source code and dataset used in the study are openly available on the Zenodo platform to ensure transparency and reproducibility (DOI: 10.5281/zenodo.17286247).
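To make the mode-level parallelism concrete, the sketch below shows one way the 35-mode search could map onto CUDA threads: one mode per thread within a block, with the reference and original samples staged in shared memory. This is a minimal illustration, not the authors' released code (see the Zenodo record for that); the kernel name, buffer layout, and the simplified stand-in predictor are all hypothetical, and the real HEVC planar/DC/angular prediction formulas are omitted for brevity.

// Minimal sketch: evaluate 35 intra modes in parallel, one mode per thread.
// All names and the placeholder predictor are illustrative assumptions.
#include <cstdio>
#include <cuda_runtime.h>

#define NUM_MODES 35
#define BLK 4                    // 4x4 prediction unit for brevity
#define NUM_REFS (2 * BLK + 1)   // left + corner + top reference samples

__global__ void intraModeSearch(const unsigned char* orig,  // BLK*BLK original samples
                                const unsigned char* refs,  // NUM_REFS reference samples
                                int* sad)                    // one SAD cost per mode
{
    __shared__ unsigned char sRefs[NUM_REFS];
    __shared__ unsigned char sOrig[BLK * BLK];

    // Cooperatively stage the samples into shared memory so every mode's
    // thread reads them without repeated global-memory accesses.
    for (int i = threadIdx.x; i < NUM_REFS; i += blockDim.x)  sRefs[i] = refs[i];
    for (int i = threadIdx.x; i < BLK * BLK; i += blockDim.x) sOrig[i] = orig[i];
    __syncthreads();

    int mode = threadIdx.x;
    if (mode >= NUM_MODES) return;

    // Placeholder predictor: a mode-dependent scaling of the DC mean.
    // A real encoder would apply the HEVC planar/DC/angular rules here.
    int dc = 0;
    for (int i = 0; i < NUM_REFS; ++i) dc += sRefs[i];
    dc /= NUM_REFS;

    int cost = 0;
    for (int i = 0; i < BLK * BLK; ++i) {
        int pred = (dc * (mode + 1)) / NUM_MODES;  // stand-in formula
        cost += abs((int)sOrig[i] - pred);
    }
    sad[mode] = cost;  // host picks the minimum-cost mode afterwards
}

int main() {
    unsigned char hOrig[BLK * BLK], hRefs[NUM_REFS];
    for (int i = 0; i < BLK * BLK; ++i) hOrig[i] = (unsigned char)(100 + i);
    for (int i = 0; i < NUM_REFS; ++i)  hRefs[i] = (unsigned char)(90 + i);

    unsigned char *dOrig, *dRefs; int *dSad, hSad[NUM_MODES];
    cudaMalloc(&dOrig, sizeof hOrig);
    cudaMalloc(&dRefs, sizeof hRefs);
    cudaMalloc(&dSad, sizeof hSad);
    cudaMemcpy(dOrig, hOrig, sizeof hOrig, cudaMemcpyHostToDevice);
    cudaMemcpy(dRefs, hRefs, sizeof hRefs, cudaMemcpyHostToDevice);

    // One thread block, one thread per intra mode.
    intraModeSearch<<<1, NUM_MODES>>>(dOrig, dRefs, dSad);
    cudaMemcpy(hSad, dSad, sizeof hSad, cudaMemcpyDeviceToHost);

    int best = 0;
    for (int m = 1; m < NUM_MODES; ++m) if (hSad[m] < hSad[best]) best = m;
    printf("best mode %d, SAD %d\n", best, hSad[best]);

    cudaFree(dOrig); cudaFree(dRefs); cudaFree(dSad);
    return 0;
}

In a full encoder, many such blocks would run concurrently (one per prediction unit), which is where the thread-level parallelism described in the abstract pays off; the shared-memory staging shown here reflects the memory-access optimization the study emphasizes.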
