SAIA: A Seamless Slurm-Native Solution for HPC-Based Services
Abstract
Recent developments indicate a shift toward web services that employ ever larger AI models, e.g., Large Language Models (LLMs), which require powerful hardware for inference. High-Performance Computing (HPC) systems are commonly equipped with such hardware for large-scale computation tasks. However, HPC infrastructure is inherently unsuitable for hosting real-time web services due to network, security, and scheduling constraints. While various efforts exist to integrate external scheduling solutions, these often compromise security or usability for existing HPC users. In this paper, we present SAIA, a Slurm-native platform consisting of a scheduler and a proxy. The scheduler interacts with Slurm to ensure the availability and scalability of services, while the proxy provides external access, secured via confined SSH commands. We have demonstrated SAIA’s applicability by deploying a large-scale LLM web service that has served over 50,000 users.