SemiTabDETR: End-to-End Semi-Supervised Table Detection with Transformer-based Enhanced Query Approach

Tahira Shehzadi
Didier Stricker
Muhammad Zeshan Afzal

Abstract

Table detection identifies tables in document images and determines their positions accurately. The task involves both classifying and precisely localizing table elements. Conventional table detection methods usually depend on large amounts of labeled data, and generating high-quality labels for training is a challenge. To address this, many semi-supervised approaches have been proposed. These methods use either CNN-based networks, which rely on anchor generation and NMS (Non-Maximum Suppression), or transformer-based models, whose performance is linked to the quality of the object queries. In this paper, we propose a transformer-based semi-supervised approach that improves the quality of object queries. An enhanced query selection network takes high-level query features from unlabeled images and measures their similarity to the decoder's original queries. The network provides high-quality, refined queries, allowing the model to make precise predictions with minimal labeled data. Results on the PubLayNet, DocBank, PubTables, and ICDAR-19 benchmarks demonstrate that this approach significantly outperforms traditional supervised and semi-supervised methods. With just 1% labeled data, our approach obtains 98.1% mAP, 96.4% mAP, and 82.8% mAP on the PubTables, PubLayNet, and DocBank datasets, respectively. These state-of-the-art results show the effectiveness of our approach.
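The abstract describes selecting queries by their similarity to the decoder's original queries but gives no further detail. As a rough illustration of similarity-based selection in general, the sketch below ranks candidate feature vectors by their best cosine similarity to a set of decoder queries and keeps the top k. The function names `cosine` and `select_queries`, the use of cosine similarity, the top-k rule, and the toy vectors are all assumptions made for illustration; this is not the authors' enhanced query selection network.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_queries(candidates, decoder_queries, k):
    """Return the indices of the k candidates most similar to any decoder query.

    Hypothetical helper: assumes decoder_queries is non-empty and that all
    vectors share the same dimensionality.
    """
    scores = []
    for i, cand in enumerate(candidates):
        # Score each candidate by its best match among the decoder queries.
        best = max(cosine(cand, q) for q in decoder_queries)
        scores.append((best, i))
    # Highest similarity first; ties are broken by the lower index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scores[:k]]


# Toy 2-D features standing in for high-level query features (made-up values).
candidates = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
decoder_queries = [[1.0, 0.1]]
selected = select_queries(candidates, decoder_queries, k=2)
print(selected)  # [0, 2]
```

In this toy case the two candidates pointing in roughly the same direction as the single decoder query are kept, and the orthogonal and opposite ones are dropped. A learned selection network would replace the fixed similarity measure with trained components.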

Version published to 10.21203/rs.3.rs-5305546/v1 on Research Square
Oct 25, 2024

Related articles

BERT-FRIDE: An Efficient Approach for Front-End Issue Detection and Extraction from User Reviews

This article has 1 author:
1. Muhammad Sohaib
This article has no evaluations
Latest version Mar 22, 2026

StyleMamba: Efficient Image Style Transfer with Bidirectional Selective Scan Vision Mamba

This article has 5 authors:
1. Jian Liu
2. Jun Yang
3. DiWei Wu
4. Hewen Liu
5. Jun Liu
This article has no evaluations
Latest version Feb 27, 2026

Visual Question Answering Based on Visual Content and Query Enhancement

This article has 5 authors:
1. Longbao Wang
2. Yuxin Shao
3. Jinhao Zhang
4. Meng Ding
5. Hongmin Gao
This article has no evaluations
Latest version Feb 10, 2026
