Transformed DETR: Leveraging Dense Prior and Focal Attention for Enhanced Visual Recognition
Abstract
Object detection is a pivotal task in computer vision, with applications in numerous areas including surveillance, autonomous driving, and image analysis. Traditional methods for object detection relied on handcrafted feature extraction and classifiers to label objects within images. With the advent of deep learning, these processes have been largely automated, leading to advancements in both the accuracy and efficiency of object detection systems. In this work, we introduce a novel architecture that incorporates a Dense Prior module and a Focal Self-attention mechanism into the DEtection TRansformer (DETR) framework. This Transformed DETR model is designed to direct greater focus to regions of interest in images, thereby improving detection performance. We demonstrate the effectiveness of our approach on the COCO dataset, achieving a Mean Average Precision (mAP) of 53.8%, a considerable improvement over existing methods. Our architecture captures and focuses on the key regions within an image, resulting in a more powerful object detection model.
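The abstract does not give implementation details of the Focal Self-attention mechanism. As a rough illustration of the general idea behind focal attention (attending at fine granularity to nearby tokens and at coarse, pooled granularity to the rest of the sequence), the following is a minimal sketch on a 1D token sequence. It is a simplified assumption of how such a mechanism could work, not the paper's actual implementation: identity query/key/value projections, a single head, and mean-pooling for the coarse level are all assumed for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def focal_self_attention_1d(x, window=2, pool=2):
    """Simplified sketch of focal self-attention on a 1D sequence.

    Each query attends at fine granularity to tokens within `window`
    positions of itself, and at coarse granularity to mean-pooled
    summaries of the whole sequence (pooled in groups of `pool`).
    Projections are identity maps here; a real implementation would
    use learned Q/K/V projections and 2D windows over feature maps.
    """
    n, d = x.shape
    # Coarse keys: mean-pool the sequence in non-overlapping groups.
    n_pool = n // pool
    coarse = x[: n_pool * pool].reshape(n_pool, pool, d).mean(axis=1)
    out = np.zeros_like(x)
    for i in range(n):
        # Fine-grained keys: a local window centered on the query.
        lo, hi = max(0, i - window), min(n, i + window + 1)
        keys = np.concatenate([x[lo:hi], coarse], axis=0)
        # Scaled dot-product attention over fine + coarse keys.
        scores = keys @ x[i] / np.sqrt(d)
        weights = softmax(scores)
        out[i] = weights @ keys
    return out
```

This keeps the cost of global context low (a handful of pooled tokens) while preserving fine-grained attention where it matters most, which matches the stated goal of directing greater focus to regions of interest.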