AI-Enhanced Rescue Drone with Multi-Modal Vision and Cognitive Agentic Architecture

Abstract

In post-disaster search and rescue (SAR) operations, unmanned aerial vehicles (UAVs) are essential tools, yet the large volume of raw visual data they produce often overwhelms human operators with isolated, context-free information. This paper presents a system built on a novel cognitive–agentic architecture that transforms the UAV from an intelligent tool into a proactive reasoning partner. The core innovation lies in the LLM’s ability to perform high-level semantic reasoning, logical validation, and robust self-correction through internal feedback loops. A visual perception module based on a custom-trained YOLO11 model feeds the cognitive core, which performs contextual analysis and hazard assessment, enabling a complete perception–reasoning–action cycle. The system also incorporates a physical payload-delivery module for first-aid supplies, which acts on prioritized, actionable recommendations to accelerate victim assistance. This work therefore presents, to our knowledge, the first LLM-driven architecture of its kind, demonstrating a viable path toward reducing operator cognitive load in critical missions.
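To make the perception–reasoning–action cycle described above concrete, the sketch below wires a YOLO11 detector to an LLM-backed planner with a validation-and-retry feedback loop and a payload trigger. All function and class names (query_llm, validate_plan, PayloadController), the retry budget, and the JSON plan schema are illustrative placeholders rather than the authors' implementation; only the overall structure, a YOLO11 perception module feeding an LLM cognitive core that drives a payload action, is taken from the abstract.

```python
import json

from ultralytics import YOLO  # perception backbone named in the abstract

# "yolo11n.pt" is a stock checkpoint used here as a stand-in; the paper
# describes a custom-trained YOLO11 model for SAR-specific classes.
detector = YOLO("yolo11n.pt")


def query_llm(prompt: str) -> dict:
    """Placeholder for the LLM cognitive core; swap in any chat-completion
    client. Returns a canned plan so the sketch runs end to end."""
    return {"action": "deliver_first_aid", "priority": "high",
            "rationale": "victim detected, no visible hazards"}


def validate_plan(plan: dict) -> tuple[bool, str]:
    """Illustrative logical-validation step: reject malformed plans."""
    if plan.get("action") and plan.get("priority") in {"high", "medium", "low"}:
        return True, ""
    return False, "plan must contain an 'action' and a valid 'priority'"


def perceive(frame) -> list[dict]:
    """Visual perception module: raw frame in, structured detections out."""
    result = detector(frame)[0]
    return [{"label": result.names[int(box.cls)], "conf": float(box.conf)}
            for box in result.boxes]


def reason(detections: list[dict], max_retries: int = 3) -> dict:
    """Cognitive core with an internal feedback loop for self-correction:
    rejected plans are fed back to the LLM with the validation error."""
    prompt = ("You are the cognitive core of a SAR drone. Given these "
              "detections, assess hazards and return a prioritized action "
              "plan as JSON:\n" + json.dumps(detections))
    for _ in range(max_retries):
        plan = query_llm(prompt)
        ok, why = validate_plan(plan)
        if ok:
            return plan
        prompt += f"\nPrevious plan rejected: {why}. Revise and resend."
    return {"action": "escalate_to_operator", "priority": "high"}  # fail safe


class PayloadController:
    """Stand-in for the first-aid payload-delivery hardware interface."""
    def release(self) -> None:
        print("payload released")


def act(plan: dict, payload: PayloadController) -> None:
    """Action stage: trigger the payload module when the plan calls for it."""
    if plan["action"] == "deliver_first_aid":
        payload.release()


# One full cycle over a captured frame:
#   act(reason(perceive(frame)), PayloadController())
```

The retry loop in reason is one simple way to realize the abstract's "robust self-correction through internal feedback loops": validation failures are appended to the prompt so the LLM can revise its own output, with escalation to the human operator as the fallback.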
