LLM as HPC Expert: Extending RAG Architecture for HPC Data

Yusuke Miyashita
Patrick Kin Man Tung
Johan Barthelemy

Abstract

High-Performance Computing (HPC) is crucial for performing advanced computational tasks, yet the complexity of HPC systems often challenges users, particularly those unfamiliar with HPC-specific commands and workflows. This paper introduces Hypothetical Command Embeddings (HyCE), a novel method that extends Retrieval-Augmented Generation (RAG) by integrating real-time, user-specific HPC data, making these systems more accessible. HyCE enriches large language models (LLMs) with this information, addressing the limitations of fine-tuned models on such data. We evaluate HyCE using an automated RAG evaluation framework in which the LLM itself creates synthetic questions from the HPC data and serves as a judge, assessing the efficacy of the extended RAG with evaluation metrics relevant to HPC tasks. Additionally, we address essential security concerns associated with deploying LLMs in HPC environments, including data privacy and command execution risks. This solution provides a scalable and adaptable approach for HPC clusters to leverage LLMs as HPC experts, bridging the gap between users and complex HPC systems.
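
The abstract does not spell out how HyCE works internally, so the following is only a minimal sketch of one plausible reading of "extending RAG with real-time, user-specific HPC data": each command is indexed by a plain-language description of its output, the user's question is matched against those descriptions, and the matched command's output is placed in the prompt. The command registry, the Slurm-style commands, the toy bag-of-words embedding, and the `run` callback are all assumptions made for illustration, not details taken from the paper.

```python
import math
import re
from collections import Counter

# Hypothetical registry (an assumption, not from the paper): each command is
# paired with a plain-language description of what its output contains.
COMMANDS = {
    "squeue -u $USER": "list my jobs that are currently queued or running",
    "sacct -u $USER": "show accounting history of my completed jobs",
    "quota -s": "report my disk usage and storage quota",
}

def embed(text):
    """Toy bag-of-words vector; a real system would use a neural encoder."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve_command(question):
    """Return the registered command whose description best matches the question."""
    q = embed(question)
    return max(COMMANDS, key=lambda cmd: cosine(q, embed(COMMANDS[cmd])))

def build_prompt(question, run=None):
    """Build an LLM prompt whose context is the output of the retrieved command.

    `run` is a caller-supplied function that executes a command and returns
    its text output. Nothing is executed here, and only commands from the
    fixed registry can ever be passed to `run`.
    """
    cmd = retrieve_command(question)
    output = run(cmd) if run else f"<output of {cmd}>"
    return f"Context (from `{cmd}`):\n{output}\n\nQuestion: {question}"
```

Restricting retrieval to a fixed registry and injecting `run` from outside is one simple way to keep the command execution risk mentioned in the abstract in view: the model never composes a shell command itself.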

Version published to 10.32388/imkkvr
Feb 14, 2025

Related articles

Efficient and Scalable Data Pipelines: The Core of Data Processing in Gig Economy Platforms

This article has 1 author:
1. Junjie Chen
This article has no evaluations
Latest version Feb 14, 2025

Fine-Tuning Small Language Models for Domain-Specific AI: An Edge AI Perspective

This article has 4 authors:
1. Rakshit Aralimatti
2. Syed Abdul Gaffar Shakhadri
3. Kruthika KR
4. Kartik Angadi
This article has no evaluations
Latest version Feb 27, 2025

Quantization of a Llama Language Model for improved Efficiency and Inference

This article has 4 authors:
1. S Madhanegha
2. V Vishnuvaradhan
3. R Arun
4. I Surenther
This article has no evaluations
Latest version Feb 17, 2025
