Autonomous Rescue Drone with Multi-Modal Vision and Cognitive Agentic Architecture

Abstract

In post-disaster search and rescue (SAR) operations, unmanned aerial vehicles (UAVs) are essential tools, yet the large volume of raw visual data they produce often overwhelms human operators with isolated, context-free information. This paper presents a system with a cognitive-agentic architecture that transforms the UAV into an intelligent, proactive partner. The proposed modular architecture integrates specialized reasoning agents coordinated by a large language model (LLM) acting as an orchestrator, which handles high-level reasoning, logical validation, and self-correction feedback loops. A visual perception module based on a custom-trained YOLO11 model feeds the cognitive core, enabling a complete perception–reasoning–action cycle. The system also incorporates a physical payload-delivery module for first-aid supplies; together with prioritized, actionable recommendations, this reduces operator cognitive load and accelerates victim assistance. This work therefore presents the first LLM-driven architecture of its kind, transforming the drone from a mere data-gathering tool into a proactive reasoning partner and demonstrating a viable path toward reducing operator cognitive load in critical missions.
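The perception–reasoning–action cycle described above can be sketched in outline. This is a minimal illustration, not the paper's implementation: the function and class names (`Detection`, `perceive`, `triage_agent`, `orchestrate`) are hypothetical, the detections are mocked rather than produced by a YOLO11 model, and the LLM orchestrator with its self-correction loop is reduced to a simple validation check.

```python
from dataclasses import dataclass

# Hypothetical detection record; the paper's YOLO11 perception module
# would produce something analogous (class label, confidence, location).
@dataclass
class Detection:
    label: str
    confidence: float
    location: tuple

def perceive():
    # Stand-in for the visual perception module (mocked detections).
    return [Detection("person", 0.91, (120, 340)),
            Detection("debris", 0.55, (400, 210))]

def triage_agent(detections):
    # One specialized reasoning agent: prioritize likely victims by confidence.
    victims = [d for d in detections if d.label == "person" and d.confidence > 0.6]
    return sorted(victims, key=lambda d: -d.confidence)

def validate(plan):
    # Logical-validation step; in the described architecture, a failed
    # check would trigger the LLM's self-correction feedback loop.
    return all(d.confidence > 0.6 for d in plan)

def orchestrate():
    # One perception -> reasoning -> action cycle, coordinated centrally
    # (the role the LLM orchestrator plays in the described system).
    detections = perceive()
    plan = triage_agent(detections)
    if not validate(plan):
        plan = []  # self-correction would re-run the reasoning agents here
    return [f"deliver first-aid payload to {d.location}" for d in plan]
```

Running `orchestrate()` on the mocked detections yields a single prioritized action targeting the high-confidence person detection, while the low-confidence debris detection is filtered out by the triage agent.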
