Integrating Large Language Models with Cloud-Native Observability for Automated Root Cause Analysis and Remediation

Chen Wang
Tingzhou Yuan
Cancan Hua
Lu Chang
Xiao Yang
Zhimin Qiu

Abstract

Cloud-native systems based on microservices, containers, and serverless architectures present unprecedented challenges for observability and incident management. Traditional rule-based monitoring and manual root cause analysis are increasingly inadequate for handling the complexity and scale of modern distributed systems. This paper presents a novel framework that leverages large language models (LLMs) to enhance cloud-native observability, enabling automated root cause analysis and self-healing capabilities. Our system integrates OpenTelemetry-based telemetry collection with a domain-adapted LLM capable of performing multimodal analysis over metrics, logs, and traces. Through fine-tuning on operational data and chain-of-thought reasoning, the LLM generates explainable root cause hypotheses and actionable remediation plans. Experimental evaluation on public microservice datasets demonstrates that our approach reduces mean time to resolution (MTTR) by 84.2% compared to rule-based methods, achieving 95% F1-score in anomaly detection while maintaining low computational overhead. The system successfully automated 91% of common incidents without human intervention, significantly improving service reliability and reducing operational burden.
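
For readers unfamiliar with the two headline metrics, the following is a minimal sketch of how a relative MTTR reduction and an F1-score are conventionally computed. It is not the authors' evaluation code: the function names and all input numbers are hypothetical, chosen only so that the arithmetic lands on figures of the same size as those quoted in the abstract.

```python
def mttr_reduction(baseline_minutes: float, new_minutes: float) -> float:
    """Relative MTTR reduction, as a percentage of the baseline MTTR."""
    if baseline_minutes <= 0:
        raise ValueError("baseline MTTR must be positive")
    return 100.0 * (baseline_minutes - new_minutes) / baseline_minutes


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical inputs (assumptions, not data from the paper).
print(round(mttr_reduction(baseline_minutes=100.0, new_minutes=15.8), 1))  # 84.2
print(round(f1_score(tp=95, fp=5, fn=5), 2))                               # 0.95
```

Under these definitions, an 84.2% reduction means the remaining resolution time is 15.8% of the rule-based baseline, and a 95% F1-score requires precision and recall to be high together.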

Version published to 10.20944/preprints202512.1926.v1
Dec 22, 2025

Related articles

SymExplainer: An Integrated Framework for Interpretable ERC Violation Detection in Smart Contracts

This article has 3 authors:
1. Daniel Tang
2. Gavin Alexander
3. Kenneth Walker
This article has no evaluations. Latest version Dec 23, 2025
Domain Knowledge-Infused Synthetic Data Generation for LLM-Based ICS Intrusion Detection: Mitigating Data Scarcity and Imbalance

This article has 6 authors:
1. Seokhyun Ann
2. Hongeun Kim
3. Suhyeon Park
4. Seong-je Cho
5. Joonmo Kim
6. Harksu Cho
This article has no evaluations. Latest version Jan 14, 2026
A Discovery Technique for Expressive Yet Sound Process Models

This article has 3 authors:
1. Humam Kourani
2. Gyunam Park
3. Wil M.P. van der Aalst
This article has no evaluations. Latest version Jan 12, 2026
