LLM-Powered GUI Agents in Phone Automation: Surveying Progress and Prospects

William Liu
Liang Liu
Yaxuan Guo
Han Xiao
Weifeng Lin
Yuxiang Chai
Shuai Ren
Xiaoyu Liang
Linghao Li
Wenhao Wang
Tianze Wu
Yong Liu
Hao Wang
Hongsheng Li
Guanjing Xiong

Abstract

With the rapid rise of large language models (LLMs), phone automation has undergone transformative changes. This paper systematically reviews LLM-driven phone GUI agents, highlighting their evolution from script-based automation to intelligent, adaptive systems. We first contextualize three key challenges, namely (i) limited generality, (ii) high maintenance overhead, and (iii) weak intent comprehension, and show how LLMs address these issues through advanced language understanding, multimodal perception, and robust decision-making. We then propose a taxonomy covering fundamental agent frameworks (single-agent, multi-agent, plan-then-act), modeling approaches (prompt engineering, training-based), and essential datasets and benchmarks. Furthermore, we detail task-specific architectures, supervised fine-tuning, and reinforcement learning strategies that bridge user intent and GUI operations. Finally, we discuss open challenges such as dataset diversity, on-device deployment efficiency, user-centric adaptation, and security concerns, offering forward-looking insights into this rapidly evolving field. By providing a structured overview and identifying pressing research gaps, this paper serves as a definitive reference for researchers and practitioners seeking to harness LLMs in designing scalable, user-friendly phone GUI agents. Project Homepage: github.com/PhoneLLM/Awesome-LLM-Powered-Phone-GUI-Agents

Version published to 10.20944/preprints202501.0413.v1
Jan 6, 2025

Related articles

Tool and Agent Selection for Large Language Model Agents in Production: A Survey

This article has 9 authors:
1. Elias Lumer
2. Anmol Gulati
3. Faheem Nizar
4. Dzmitry Hedroits
5. Atharva Mehta
6. Henry Hwangbo
7. Vamse Kumar Subbiah
8. Pradeep Honaganahalli Basavaraju
9. James A. Burke
This article has no evaluations
Latest version Dec 12, 2025

Large Language Models: A Survey of Architectures, Training Paradigms, and Alignment Methods

This article has 5 authors:
1. Deepshikha Bhati
2. Fnu Neha
3. Devi Sri Bandaru
4. Matthew Weber
5. Ishan Dilipbhai Gajera
This article has no evaluations
Latest version Jan 15, 2026

Evaluation and Benchmarking of Generative and Agentic AI Systems: A Comprehensive Survey

This article has 1 author:
1. Manish Shukla
This article has no evaluations
Latest version Dec 16, 2025
