Laser-Prompted Real-Time Object Segmentation on Smartphones via Cloud Computing

Abstract

Prompt-based control is a powerful paradigm for directing computer vision systems to solve diverse and complex tasks. In single-object segmentation, a laser pointer can serve as an intuitive and precise prompt: the user points at an object, and the system generates the corresponding object mask. However, the computational demands of these systems have traditionally confined them to stationary hardware, limiting their portability and applicability. To address this challenge, this study introduces a real-time, laser-guided object segmentation system built on a mobile cloud architecture. The system uses a smartphone as a portable client while offloading computationally intensive tasks to a cloud-based server. The proposed architecture integrates a custom YOLOv5n-Laser-Point-Detection (YOLOv5n-LPD) model: a lightweight laser spot detector with only 65,000 parameters that achieves 2.5 ms inference time while maintaining high accuracy. For the segmentation task, the FastSAM model was selected for its efficiency and accuracy based on experimental evaluation. When tested with distant cloud servers, the system operated with an end-to-end latency of approximately 120 ms, while the client application maintained a frame rate of 10-20 FPS. This work demonstrates the viability of combining mobile devices with cloud-based, prompt-controlled computer vision, opening new possibilities for everyday, industrial, and academic applications.
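The two-stage flow described in the abstract (locate the laser spot, then use that point as the prompt for segmentation) can be illustrated with a toy sketch. This is not the authors' implementation: `detect_laser_point` is a brightest-pixel stand-in for YOLOv5n-LPD, `segment_from_point` is a flood-fill stand-in for FastSAM, and `process_frame`, `FRAME`, and both thresholds are hypothetical names and values chosen only for illustration. The network hop between smartphone and server is replaced by a plain function call.

```python
from collections import deque
from typing import List, Optional, Tuple

Frame = List[List[int]]   # toy grayscale image, values 0-255
Point = Tuple[int, int]   # (row, col)
Mask = List[List[bool]]


def detect_laser_point(frame: Frame, min_brightness: int = 250) -> Optional[Point]:
    """Stand-in for the laser spot detector (assumption: the spot is the
    brightest pixel and at least `min_brightness`)."""
    best_point, best_value = None, min_brightness - 1
    for r, row in enumerate(frame):
        for c, value in enumerate(row):
            if value > best_value:
                best_point, best_value = (r, c), value
    return best_point


def segment_from_point(frame: Frame, seed: Point, fg_threshold: int = 50) -> Mask:
    """Stand-in for point-prompted segmentation: 4-connected flood fill
    over pixels brighter than `fg_threshold`, starting at the prompt."""
    rows, cols = len(frame), len(frame[0])
    mask = [[False] * cols for _ in range(rows)]
    mask[seed[0]][seed[1]] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not mask[nr][nc] and frame[nr][nc] > fg_threshold:
                mask[nr][nc] = True
                queue.append((nr, nc))
    return mask


def process_frame(frame: Frame) -> Optional[Mask]:
    """Hypothetical server-side handler: detect the prompt, then segment.
    Returns None when no laser spot is visible in the frame."""
    point = detect_laser_point(frame)
    if point is None:
        return None
    return segment_from_point(frame, point)


# Toy frame: one object (value 100) with a laser spot (255) on it,
# plus a second, unprompted object (value 90) that must stay unmasked.
FRAME: Frame = [
    [0,   0,   0, 0,  0],
    [0, 100, 100, 0,  0],
    [0, 100, 255, 0,  0],
    [0,   0,   0, 0, 90],
    [0,   0,   0, 0, 90],
]
```

In this sketch, calling `process_frame(FRAME)` masks only the object under the laser spot and leaves the second object untouched, which is the behaviour a single-object, point-prompted segmenter is meant to provide. In a deployment like the one described, the client would transmit frames to the server and render the returned mask.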
