SemiTabDETR: End-to-End Semi-Supervised Table Detection with Transformer-based Enhanced Query Approach

Abstract

Table detection locates tables within document images, a task that involves both classifying table elements and localizing them precisely. Conventional table detection methods usually depend on large amounts of labeled data, and generating high-quality labels for training is a challenge. To address this, many semi-supervised approaches have been proposed. These methods use either CNN-based networks, which rely on anchor generation and Non-Maximum Suppression (NMS), or transformer-based models, whose performance is tied to the quality of their object queries. In this paper, we propose a transformer-based semi-supervised approach that improves the quality of object queries. An enhanced query selection network takes high-level query features from unlabeled images and measures their similarity to the decoder's original queries. This yields high-quality, refined queries, allowing the model to make precise predictions or classifications with minimal labeled data. Results on the PubLayNet, DocBank, PubTables, and ICDAR-19 benchmarks show that this approach significantly outperforms traditional supervised and semi-supervised methods. With just 1% labeled data, our approach obtains 98.1% mAP, 96.4% mAP, and 82.8% mAP on the PubTables, PubLayNet, and DocBank datasets, respectively. These state-of-the-art results demonstrate the effectiveness of our approach.
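
The abstract does not give the internals of the enhanced query selection network, so the following is only a minimal sketch of the general idea of matching decoder queries to image-derived features by similarity. The cosine measure, the blending weight `alpha`, and the function names `cosine` and `refine_queries` are illustrative assumptions, not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def refine_queries(decoder_queries, image_features, alpha=0.5):
    """Blend each decoder query with its most similar image feature.

    Assumption: both the nearest-feature rule and the linear blend with
    weight `alpha` are hypothetical stand-ins for the paper's network.
    """
    refined = []
    for q in decoder_queries:
        # Pick the image feature with the highest cosine similarity to q.
        best = max(image_features, key=lambda f: cosine(q, f))
        refined.append([(1 - alpha) * a + alpha * b for a, b in zip(q, best)])
    return refined


# Toy 2-D example: two decoder queries, three image-derived features.
queries = [[1.0, 0.0], [0.0, 1.0]]
features = [[0.9, 0.1], [0.2, 0.8], [-1.0, 0.0]]
refined = refine_queries(queries, features)
```

In this toy setup each query is pulled halfway toward the feature it most resembles, while the dissimilar feature `[-1.0, 0.0]` is never selected. A real model would operate on learned high-dimensional embeddings and would likely learn the selection rather than use a fixed rule.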
