A Student-Centric Evaluation Survey to Explore the Impact of LLMs on UML Modeling

Abstract

Unified Modeling Language (UML) diagrams serve as essential tools for visualizing system structure and behavior in software design. With the emergence of Large Language Models (LLMs) that automate various phases of software development, there is growing interest in leveraging these models for UML diagram generation. This study presents a comprehensive empirical investigation into the effectiveness of GPT-4-turbo in generating four fundamental UML diagram types: Class, Deployment, Use Case, and Sequence diagrams. We developed a novel rule-based prompt-engineering framework that transforms domain scenarios into optimized prompts for LLM processing. The generated diagrams were then rendered using PlantUML and evaluated through a rigorous survey involving 121 computer science and software engineering students across three U.S. universities. Participants assessed both the completeness and correctness of LLM-assisted and human-created diagrams by examining specific elements within each diagram type. Statistical analyses, including paired t-tests, Wilcoxon signed-rank tests, and effect size calculations, confirm the significance of our findings. The results reveal that while LLM-assisted diagrams achieve meaningful levels of completeness and correctness (ranging from 61.1% to 67.7%), they consistently underperform compared to human-created diagrams. The performance gap varies by diagram type, with Sequence diagrams showing the closest alignment to human quality and Use Case diagrams exhibiting the largest discrepancy. This research contributes a validated framework for evaluating LLM-generated UML diagrams and provides empirically grounded insights into the current capabilities and limitations of LLMs in software modeling education.
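The pipeline the abstract describes (rule-based prompt engineering, GPT-4-turbo generation, PlantUML rendering) can be pictured with a short sketch. This is a minimal illustration assuming the OpenAI chat-completions Python client; the RULES list and the build_prompt and generate_plantuml helpers are hypothetical stand-ins, since the paper's actual rule set and prompt templates are not reproduced here.

```python
# Hypothetical sketch of the prompt-to-PlantUML pipeline; names and the
# rule set below are illustrative assumptions, not the paper's code.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Toy stand-in for a rule-based prompt-engineering step: each rule
# rewrites the raw domain scenario into a more constrained prompt.
RULES = [
    lambda s: "You are a UML modeling assistant.\n" + s,
    lambda s: s + "\nRespond only with PlantUML source between @startuml and @enduml.",
]

def build_prompt(scenario: str, diagram_type: str) -> str:
    prompt = f"Generate a UML {diagram_type} diagram for this scenario:\n{scenario}"
    for rule in RULES:
        prompt = rule(prompt)
    return prompt

def generate_plantuml(scenario: str, diagram_type: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{"role": "user", "content": build_prompt(scenario, diagram_type)}],
    )
    # The returned PlantUML source can then be rendered to an image
    # with the standalone PlantUML tool.
    return response.choices[0].message.content

if __name__ == "__main__":
    print(generate_plantuml("A customer places an order online.", "Sequence"))
```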
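The statistical comparison named above (paired t-tests, Wilcoxon signed-rank tests, and effect sizes) maps onto standard SciPy calls. A minimal sketch follows; the score arrays are placeholder values for illustration, not the study's data.

```python
# Illustrative only: placeholder paired ratings of LLM-assisted vs.
# human-created diagrams for one diagram type, scored 0-1.
import numpy as np
from scipy import stats

llm_scores = np.array([0.62, 0.65, 0.58, 0.70, 0.61, 0.66, 0.63, 0.59])
human_scores = np.array([0.78, 0.81, 0.74, 0.85, 0.77, 0.80, 0.79, 0.76])

# Parametric and non-parametric paired comparisons.
t_stat, t_p = stats.ttest_rel(llm_scores, human_scores)
w_stat, w_p = stats.wilcoxon(llm_scores, human_scores)

# Cohen's d for paired samples: mean difference over the SD of differences.
diff = llm_scores - human_scores
cohens_d = diff.mean() / diff.std(ddof=1)

print(f"paired t-test: t={t_stat:.3f}, p={t_p:.4f}")
print(f"Wilcoxon signed-rank: W={w_stat:.1f}, p={w_p:.4f}")
print(f"Cohen's d (paired): {cohens_d:.2f}")
```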
