OG-HFYOLO: Orientation Gradient guidance and Heterogeneous feature fusion for deformation table cell instance segmentation
Abstract
Table structure recognition is a key task in document analysis. However, geometric deformations in deformed tables weaken the correlation between content and structural information, which hinders downstream tasks from extracting accurate content. To address this challenge, we propose the OG-HFYOLO model for fine-grained cell coordinate localization. The model integrates a Gradient-Orientation-Aware Extractor to enhance edge detection and introduces a Heterogeneous Kernel Cross Fusion module to boost multi-scale feature learning, improving feature expression accuracy. Combined with a Scale-aware Loss function for better scale feature adaptation during training, and with mask-driven non-maximum suppression replacing traditional bounding-box suppression in post-processing, the model achieves refined feature representation and superior localization performance. We further propose a data generator to address dataset limitations for fine-grained deformation table cell localization and construct the large-scale Deformation Wired Table (DWTAL) dataset. Experiments demonstrate that OG-HFYOLO achieves superior segmentation accuracy compared to all mainstream instance segmentation models on the DWTAL dataset. The dataset and source code are open source: https://github.com/justliulong/OGHFYOLO.
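To illustrate why mask-driven suppression can matter for deformed cells, the following is a minimal sketch of greedy non-maximum suppression that compares instance masks instead of bounding boxes. It is not taken from the OG-HFYOLO source code: the function names `mask_iou` and `mask_nms`, the threshold of 0.5, and the toy triangular masks are assumptions chosen for illustration only.

```python
import numpy as np


def mask_iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection over union of two boolean masks of equal shape."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return float(inter) / float(union) if union > 0 else 0.0


def mask_nms(masks, scores, iou_threshold=0.5):
    """Greedy NMS on masks (hypothetical helper, not the authors' code).

    Visits candidates in descending score order and keeps one only if its
    mask IoU with every already-kept mask does not exceed the threshold.
    Returns the indices of the kept candidates.
    """
    order = np.argsort(-np.asarray(scores, dtype=float))
    keep = []
    for idx in order:
        if all(mask_iou(masks[idx], masks[k]) <= iou_threshold for k in keep):
            keep.append(int(idx))
    return keep


# Toy example: two slanted "cells" whose bounding boxes overlap heavily
# although their masks are disjoint, plus a near-duplicate of the first.
n = 8
cell_a = np.triu(np.ones((n, n), dtype=bool), k=1)    # above the diagonal
cell_b = np.tril(np.ones((n, n), dtype=bool), k=-1)   # below the diagonal
cell_a_dup = cell_a.copy()
cell_a_dup[0, 1] = False                               # differs by one pixel

masks = [cell_a, cell_b, cell_a_dup]
scores = [0.9, 0.8, 0.7]
kept = mask_nms(masks, scores)
```

In this toy case the two triangular masks have disjoint pixels, so both survive, while the near-duplicate is suppressed. A box-based criterion would see the two triangles as heavily overlapping, because their bounding boxes cover almost the same region.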