SAF-YOLO: Super-Resolution Augmented Detection Model with Visual State Space Enhancement for Safflower Filament Picking
Abstract
Cluttered backgrounds, variable shooting angles, and fluctuating lighting in safflower fields frequently cause picking robots to miss or falsely detect filaments, especially when targets are small and imbalanced. To address these challenges, we propose SAF-YOLO, a novel detector tailored for safflower filament detection. It incorporates three complementary innovations: (1) a causal Visual State Space Model (VSSM)-based VSS-SPPF module integrated into the backbone, which enhances spatial context modeling to separate filaments from noisy backgrounds; (2) an Asymptotic Feature Pyramid Network (AFPN) structure in the neck, which optimizes adaptive feature aggregation to improve sensitivity to multi-scale targets; and (3) an auxiliary Super-Resolution Self-Supervised (SRSS) branch, which addresses the small and imbalanced target distribution by guiding fine-grained feature learning through high-resolution reconstruction during training; the branch is discarded at inference, adding no computational overhead. Experimental results demonstrate that SAF-YOLO achieves 90.1% Precision, 85.9% Recall, and 93.3% mAP, outperforming popular YOLO variants including YOLOv5/v7/v8/v11 (mAP +4.1% to +8.0%) as well as mainstream small-object detectors such as SSD, Faster R-CNN, CFINet, CFPT, and InSPyReNet (mAP +7.9% to +26.2%). SAF-YOLO thus effectively addresses safflower filament detection in complex field conditions, supporting robotic precision picking.
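To illustrate the train-only auxiliary-branch pattern described for the SRSS component, the sketch below shows one common way such a design is wired up in PyTorch: a shared backbone feeds both a detection head and a sub-pixel super-resolution head, with the SR head active only in training mode. This is a minimal sketch under stated assumptions, not the authors' implementation; all class and parameter names here (`SRHead`, `DetectorWithSRSS`, `feat_ch`, `lambda_sr`) are hypothetical.

```python
# Hypothetical sketch of a train-only super-resolution auxiliary branch,
# mirroring the SRSS idea from the abstract. Not the paper's actual code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SRHead(nn.Module):
    """Auxiliary head reconstructing a high-resolution image from
    backbone features; used only during training."""

    def __init__(self, in_ch: int, scale: int = 4):
        super().__init__()
        # Project features to 3 * scale^2 channels, then rearrange
        # channels into spatial resolution via sub-pixel convolution.
        self.proj = nn.Conv2d(in_ch, 3 * scale * scale, kernel_size=3, padding=1)
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        return self.shuffle(self.proj(feat))


class DetectorWithSRSS(nn.Module):
    """Detector with an auxiliary SR branch that shares the backbone."""

    def __init__(self, backbone: nn.Module, det_head: nn.Module, feat_ch: int):
        super().__init__()
        self.backbone = backbone
        self.det_head = det_head
        self.sr_head = SRHead(feat_ch)  # auxiliary, training-time only

    def forward(self, x: torch.Tensor):
        feat = self.backbone(x)
        det_out = self.det_head(feat)
        if self.training:
            # High-resolution reconstruction pressures the shared
            # backbone to retain fine-grained detail, which benefits
            # small-target (filament) detection.
            sr_out = self.sr_head(feat)
            return det_out, sr_out
        # At inference the SR branch is never executed, so it adds no
        # computational overhead to deployment.
        return det_out


# During training, the total loss would combine the detection loss with
# a pixel-reconstruction term against the high-resolution image, e.g.:
#   loss = det_loss + lambda_sr * F.l1_loss(sr_out, hr_image)
```

The key design choice this pattern captures is that the auxiliary objective only shapes the shared backbone's representations; because the SR head sits on a side branch guarded by `self.training`, it can be dropped entirely at export or inference time.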