Low-Complexity CU Partitioning for 3D-HEVC Depth Maps via SE Attention and Ensemble Learning

Erlin Tian
JiaBao Zhang
Qiuwen Zhang


Abstract

Efficient Coding Unit (CU) partitioning is critical to reducing the computational complexity of 3D High Efficiency Video Coding (3D-HEVC), especially for depth maps. Existing methods rely on either handcrafted features or deep learning, and they often suffer from limited feature expressiveness and insufficient focus on structurally significant regions. To address these limitations, we propose a symmetry-aware, two-stage CU partitioning framework. First, a Convolutional Neural Network (CNN) equipped with a channel-wise Squeeze-and-Excitation (SE) attention mechanism extracts multi-scale texture features and highlights symmetry-relevant patterns. These deep features are then unified via Spatial Pyramid Pooling (SPP), combined with handcrafted descriptors (such as neighborhood rate-distortion (RD) cost, directional gradients, and variance), and fed into a Bagged Tree classifier for final prediction. Additionally, a weighted voting strategy replaces conventional majority voting in the ensemble, enhancing robustness near decision boundaries. Experimental results show that the proposed method reduces encoding time by 52.49% on average with only a 0.39% increase in Bjøntegaard delta bitrate (BDBR), a favorable trade-off between complexity and coding performance.
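To illustrate why weighted voting can behave differently from majority voting near a decision boundary, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the weights are derived, so this sketch assumes each tree carries a single reliability weight (for example, its validation accuracy). The function names, the label convention (1 = split the CU, 0 = do not split), and all numbers are hypothetical.

```python
from collections import defaultdict


def majority_vote(votes):
    """Return the label chosen by the most trees (ties go to the smaller label)."""
    counts = defaultdict(int)
    for label in votes:
        counts[label] += 1
    return min(counts, key=lambda label: (-counts[label], label))


def weighted_vote(votes, weights):
    """Return the label with the largest summed weight (ties go to the smaller label)."""
    if len(votes) != len(weights):
        raise ValueError("votes and weights must have the same length")
    totals = defaultdict(float)
    for label, weight in zip(votes, weights):
        totals[label] += weight
    return min(totals, key=lambda label: (-totals[label], label))


# Hypothetical ensemble of five trees deciding one CU: 1 = split, 0 = do not split.
votes = [1, 1, 1, 0, 0]
# Assumed per-tree reliability weights (e.g. validation accuracy); not from the paper.
weights = [0.55, 0.52, 0.51, 0.90, 0.88]

print("majority:", majority_vote(votes))
print("weighted:", weighted_vote(votes, weights))
```

In this made-up case three weak trees outvote two reliable ones under majority voting (result 1), while the weighted sum favors the reliable trees (1.78 against 1.58, result 0). That is the kind of borderline decision where the two schemes diverge.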

Version published to 10.21203/rs.3.rs-7021250/v1 on Research Square
Oct 27, 2025

Related articles

Enhancing ConvNeXt for efficient small-size image classification

This article has 4 authors:
1. Jianwei Feng
2. Jinguo Mo
3. Hengliang Tan
4. Shuo Yang
This article has no evaluations
Latest version Nov 17, 2025

Real-Time Deepfake Detection via Frame-Level EfficientNet Ensemble and Client-Server Deployment

This article has 7 authors:
1. Vishwakalyan Patil
2. Akshay Sarapure
3. Harshal Poriwade
4. Jyoti Kamble
5. Sonam Bhandurge
6. Dhanashree Kulkarni
7. Anand Deshpande
This article has no evaluations
Latest version Nov 4, 2025

FiT: Feature Integration Transformer with Universal Language Interface for Multi-Task Vision

This article has 6 authors:
1. Sana Cheema
2. Ghulam Gilanie
3. Tariq Alsahfi
4. Sami Alesawi
5. Raed Alsini
6. Ali Daud
This article has no evaluations
Latest version Oct 9, 2025
