AYOLO: Development of a Real-Time Object Detection Model for the Detection of Secretly Cultivated Plants

Abstract

AYOLO introduces a fusion architecture that combines unsupervised learning techniques with Vision Transformers, built on the YOLO series of models. This design makes effective use of abundant unlabeled data and establishes a new pretraining methodology tailored to YOLO architectures. On a custom dataset of 80 images of poppy plants, AYOLO achieved an Average Precision (AP) of 38.7% at an inference speed of 239 FPS (frames per second) on a Tesla K80 GPU, which demonstrates real-time performance. Its feature fusion combines spatial and semantic information across scales. This result surpasses the previous state-of-the-art YOLO v6-3.0 by 2.2% AP at comparable speed. AYOLO illustrates the potential of combining advanced information fusion techniques with unsupervised pretraining to improve precision and efficiency in object detection models optimized for small, specialized datasets.
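To put the reported figures in perspective, the short sketch below converts the 239 FPS throughput into an average per-frame latency and derives the baseline AP implied by the stated 2.2% gain. It is only illustrative arithmetic: the function names are hypothetical, and it assumes the gain is an absolute difference in AP percentage points, which the abstract does not state explicitly.

```python
def frame_latency_ms(fps: float) -> float:
    """Average time per frame, in milliseconds, for a given throughput."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def implied_baseline_ap(ap: float, gain_points: float) -> float:
    """Baseline AP implied by a reported AP and an absolute gain.

    Assumption: the gain is expressed in percentage points, not as a
    relative improvement.
    """
    return ap - gain_points


# Figures taken from the abstract.
ayolo_fps = 239.0
ayolo_ap = 38.7
reported_gain = 2.2

print(f"{frame_latency_ms(ayolo_fps):.2f} ms per frame")
print(f"{implied_baseline_ap(ayolo_ap, reported_gain):.1f}% implied baseline AP")
```

Under these assumptions, 239 FPS corresponds to roughly 4.2 ms per frame, and the implied YOLO v6-3.0 baseline would be about 36.5% AP.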
