From Prompts to Paths: Large Language Models for Zero-Shot Planning and Simulation

Abstract

This paper explores the capability of Large Language Models (LLMs) to perform zero-shot planning through multimodal reasoning, with a particular emphasis on applications to Autonomous Mobile Robots (AMRs) and unmanned systems. We present a modular system architecture that integrates a general-purpose LLM with visual and spatial inputs for adaptive planning, iteratively guiding robot behavior. To assess performance, we employ a continuous evaluation metric that jointly considers distance and orientation, offering a more informative and fine-grained alternative to binary success measures. We evaluate three foundational LLMs (GPT-4.1-nano, GPT-4o-mini, and Gemini 2.0 Flash) on a suite of zero-shot navigation and exploration tasks in simulated environments. Our findings show that LLMs exhibit encouraging signs of goal-directed spatial planning and partial task completion, even in a zero-shot setting. However, inconsistencies in plan generation across models highlight the need for task-specific adaptation or fine-tuning. The findings support the use of multimodal inputs as key enablers for advancing LLM-based autonomy in AMRs and unmanned systems.
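
The abstract describes a continuous evaluation metric that jointly considers distance and orientation but does not give its exact form here. The sketch below is one illustrative way such a score could be computed, assuming an exponential decay over positional error and a normalized heading-error term; the function name, weights, and decay scale are assumptions for illustration, not the paper's definition.

    import numpy as np

    def continuous_success_score(
        pos: np.ndarray,          # achieved (x, y) position of the robot
        goal_pos: np.ndarray,     # target (x, y) position
        yaw: float,               # achieved heading in radians
        goal_yaw: float,          # target heading in radians
        dist_scale: float = 1.0,  # meters over which the distance term decays (assumed)
        w_dist: float = 0.5,      # weight on the distance term (assumed)
        w_orient: float = 0.5,    # weight on the orientation term (assumed)
    ) -> float:
        """Score in [0, 1]: 1.0 at the goal pose, decreasing smoothly with
        position and heading error, rather than a binary success flag."""
        # Positional term: exponential decay with Euclidean distance to the goal.
        dist = float(np.linalg.norm(pos - goal_pos))
        dist_term = np.exp(-dist / dist_scale)

        # Orientation term: wrapped heading error in [0, pi], mapped to [1, 0].
        yaw_err = abs(np.arctan2(np.sin(yaw - goal_yaw), np.cos(yaw - goal_yaw)))
        orient_term = 1.0 - yaw_err / np.pi

        return w_dist * dist_term + w_orient * orient_term

    # Example: a run that stops 0.5 m from the goal, facing 30 degrees off,
    # still earns partial credit instead of counting as a flat failure.
    score = continuous_success_score(
        pos=np.array([1.5, 2.0]), goal_pos=np.array([2.0, 2.0]),
        yaw=np.pi / 6, goal_yaw=0.0,
    )

A graded score like this distinguishes near-misses from outright failures across models, which is what makes it more informative than a binary success rate when comparing zero-shot planners.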
