The Evolution and Advancement of YOLO Algorithms in Object Detection: From Real-Time Breakthroughs to Modern Architectures

Mahmud Hasan
Md Khurram Monir Rabby
Israt Jahan
Md. Janibul Alam Soeb
Md. Fahad Jubayer


Abstract

Object detection is a foundational capability in Artificial Intelligence (AI), enabling machines to interpret visual environments through precise object localization and classification. This comprehensive review chronicles the revolutionary evolution of the You Only Look Once (YOLO) framework from its inception to the state-of-the-art YOLOv12. Beginning with the limitations of classical approaches based on handcrafted features, it documents YOLO’s paradigm-shifting transition to unified real-time detection via regression-based architectures. Each major version (v1-v12) is analyzed methodically, and key innovations are detailed, including multi-scale predictions (v2/v3), anchor-free designs (v8), programmable gradient information (v9), and attention-enhanced cross-scale fusion (v12). The review establishes how successive iterations systematically addressed critical challenges: reducing computational latency by 47× versus R-CNN variants, improving mAP by 32.7% on COCO benchmarks, and enabling deployment on edge devices. Beyond architectural analysis, comparative performance evaluations are presented across diverse applications, from autonomous driving to medical imaging, demonstrating YOLO’s unprecedented balance of speed (142 FPS) and accuracy (78.4% AP). The paper further examines emerging implementation trends, hardware optimizations, and domain-specific adaptations that cement YOLO’s position as the de facto framework for real-time vision systems. The review provides both technical and historical context for researchers and practitioners navigating the landscape of modern object detection.

Version published to 10.20944/preprints202510.2019.v1
Oct 27, 2025

Related articles

A Hybrid YOLOv5s-Faster R-CNN Architecture for Object Detection in Complex Road Scenes

This article has 3 authors:
1. Lenard Nkalubo Byenkya
2. Rose Nakibuule
3. Danison Taremwa
This article has no evaluations. Latest version: Jan 21, 2026

A Comprehensive Comparative Analysis of Convolutional Neural Network Architectures for Image Classification and Object Detection Tasks

This article has 3 authors:
1. Fahim Al Islam
2. Saif Hossain
3. Monir Hosen
This article has no evaluations. Latest version: Feb 3, 2026

RES-YOLO: A Real-Time Infrared Detection Framework for Intelligent Vehicle Traffic Monitoring

This article has 2 authors:
1. Junhao Dai
2. Kai Zhu
This article has no evaluations. Latest version: Jan 7, 2026
