SUSHI: A Vision System for Reactive, Uninformed ASV Navigation via Multi-Field Path Planning and Visual Exploration
Abstract
Vision offers richer context than traditional marine sensors (e.g., LiDAR, DVL, sonar) but is harder to interpret on water due to reflections, glare, and dynamic surfaces. SUSHI is a vision-first navigation system for Autonomous Surface Vehicles (ASVs) that fuses object detection, water segmentation, and monocular depth estimation to produce camera-centric navigation grids for planning and control. The proposed perception methods raise water-segmentation accuracy from 60%, achieved by a previously tested method, to 90% with only 30 minutes of training; introduce a dataset on which a YOLO detector reaches 91% accuracy for trash and obstacle detection in simulation; and benchmark a monocular depth method that is robust to reflective surfaces and generalizes across environments. Path planning uses a Multi-Field Synthesis (MFS) approach: a locally reactive artificial-potential-field component is blended adaptively with a global wavefront flow field, mitigating local minima while preserving real-time responsiveness. A behavior layer prioritizes target seeking, and falls back to mask-based visual exploration when explicit goals are absent. Validation in the TOAST simulator and in a pool environment demonstrates robust goal targeting and exploration using cameras alone, with minimal side sensing reserved for emergency avoidance.
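To make the Multi-Field Synthesis blending concrete, the sketch below shows one plausible form of the adaptive combination described above. It is a minimal illustration, not the paper's implementation: the function name, the distance-based blending weight alpha, the d_blend parameter, and the wavefront_grid lookup are all assumptions introduced here.

    import numpy as np

    def mfs_direction(pos, goal, obstacles, wavefront_grid, d_blend=3.0):
        """Blend a reactive APF vector with a global wavefront flow vector.

        pos, goal: length-2 numpy arrays; obstacles: list of length-2 arrays;
        wavefront_grid: (W, H, 2) array of precomputed descent directions
        from a goal-seeded wavefront expansion. All names and the
        distance-based blend are illustrative assumptions.
        """
        # Locally reactive artificial potential field: attraction to the
        # goal plus short-range repulsion from each obstacle.
        attract = goal - pos
        repulse = np.zeros(2)
        for obs in obstacles:
            diff = pos - obs
            d = np.linalg.norm(diff) + 1e-6
            repulse += diff / d**3  # repulsion falls off quickly with distance
        apf = attract / (np.linalg.norm(attract) + 1e-6) + repulse

        # Global flow field: follow the wavefront gradient toward the goal.
        ix, iy = np.floor(pos).astype(int)
        flow = wavefront_grid[ix, iy]

        # Adaptive blend: lean on the reactive field near obstacles and on
        # the global field elsewhere, which is what mitigates local minima
        # while keeping the controller responsive.
        d_min = min(np.linalg.norm(pos - o) for o in obstacles) if obstacles else np.inf
        alpha = np.clip(d_min / d_blend, 0.0, 1.0)  # 0 near obstacles, 1 far away
        v = (1.0 - alpha) * apf + alpha * flow
        return v / (np.linalg.norm(v) + 1e-6)

In a control loop, a function like this would be called each cycle with the latest camera-centric navigation grid, and the returned unit vector would drive the heading setpoint.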