Tablecert: YOLO and TATR Enhanced Models to Boost Table Detection and Recognition in Legacy Documents
Abstract
The digital transformation of legacy documents remains challenging: these documents are often unstructured and contain complex table layouts (e.g., watermarks, spaced headers, closely spaced tables, nested structures, and double borders) that degrade the performance of conventional table detection and recognition systems. We propose a modular, plug-and-play adaptation framework for YOLO-based table detection and Table Transformer (TATR)-based structure recognition, combining parameter-efficient LoRA fine-tuning with lightweight architectural modules (e.g., frequency-domain filtering and structural refinements). We evaluate the framework on a dataset of calibration certificates under a controlled training and evaluation protocol with standard detection and structure metrics. The adapted models outperform their respective baselines, mitigating layout-related challenges and achieving F1-scores of 0.9999 (YOLO) and 0.9640 (TATR), alongside reduced validation loss. The best YOLO adaptation improves robustness to challenging visual artifacts in table detection, whereas TATR-V6 yields stronger structural recognition. Finally, we show that the proposed FreqFilter2D module is a promising drop-in component for other computer vision architectures.
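The abstract describes FreqFilter2D only at a high level. As a hedged illustration of what a frequency-domain filtering step might look like, the NumPy sketch below applies a fixed circular low-pass mask to a 2D feature map in the Fourier domain, suppressing high-frequency content such as watermark texture or double-border ringing. The function name, the low-pass design, and the `cutoff` parameter are assumptions for illustration; the paper's actual module may use a learned or differently shaped filter.

```python
import numpy as np

def freq_filter_2d(feature_map: np.ndarray, cutoff: float = 0.25) -> np.ndarray:
    """Hypothetical sketch of a frequency-domain filter.

    Keeps only spatial frequencies whose normalized radius is below
    `cutoff`, attenuating high-frequency artifacts (e.g., watermarks,
    double borders) while preserving the coarse table layout.
    """
    h, w = feature_map.shape
    # Forward FFT, shifted so the zero-frequency component sits at the center.
    spectrum = np.fft.fftshift(np.fft.fft2(feature_map))
    # Circular low-pass mask centered on the zero-frequency component.
    yy, xx = np.ogrid[:h, :w]
    radius = np.sqrt(((yy - h / 2) / h) ** 2 + ((xx - w / 2) / w) ** 2)
    mask = radius <= cutoff
    # Inverse transform; the input is real, so discard the tiny imaginary part.
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))
    return np.real(filtered)

# Toy usage: a flat image corrupted by high-frequency noise.
img = np.ones((32, 32)) + 0.1 * np.random.default_rng(0).standard_normal((32, 32))
smoothed = freq_filter_2d(img, cutoff=0.2)
```

Because the mask always retains the zero-frequency (DC) component, the filter preserves the mean intensity of the input while reducing its high-frequency variance, which is the behavior a drop-in denoising module needs to avoid shifting downstream feature statistics.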