Characterization of high-resolution AI data center training workloads on single and multiple GPU nodes

Abstract

The rapid advancement of Artificial Intelligence (AI) is driving unprecedented computational demands, posing significant challenges to data center infrastructure and threatening the stability and resilience of modern power grids. This study presents an open-access dataset of diverse AI training sessions recorded at sub-second resolution, designed to advance research on the energy consumption profiles of AI workloads and their interactions with power grid dynamics in data center environments. The dataset comprises 32 training sessions on high-performance NVIDIA H100 and B200 8-GPU nodes and 40 sessions on consumer-grade NVIDIA GeForce RTX 3060 GPUs, encompassing over 1.8 million samples. Each session records power demand, CPU and GPU utilization, per-GPU power, memory usage, and temperature across a range of AI tasks, including forecasting, classification, reinforcement learning, and text and image generation. Data quality was verified through detailed technical validation covering timing accuracy, conformance to hardware limits, and cross-metric correlation analysis. Measurements remained within manufacturer-specified thermal and power envelopes, and the observed correlations among power, utilization, temperature, and current were consistent with established processor and GPU behavior. The dataset provides a robust foundation for modeling the energy behavior of AI data centers, for system-level performance analysis, and for assessing the impact of data center loads on the power grids they connect to.
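To illustrate the kind of technical validation described above, the following is a minimal Python sketch of per-session checks on timing accuracy, hardware limit conformance, and cross-metric correlation. The column names (`timestamp`, `gpu_power_w`, `gpu_util_pct`, `gpu_temp_c`) and the file `session_01.csv` are hypothetical placeholders, not the dataset's actual schema, and the 700 W bound shown is the rated TDP of an H100 SXM board; adjust per device.

```python
# Minimal validation sketch; column names and file path are hypothetical,
# the published dataset's schema may differ.
import pandas as pd


def validate_session(csv_path: str) -> pd.DataFrame:
    df = pd.read_csv(csv_path, parse_dates=["timestamp"])

    # Timing accuracy: consecutive samples should stay at sub-second spacing.
    intervals = df["timestamp"].diff().dt.total_seconds().dropna()
    assert intervals.max() < 1.0, "sampling gap exceeds 1 s"

    # Hardware limit conformance: per-GPU power must remain inside the
    # manufacturer's envelope (700 W TDP for an H100 SXM, as an example).
    assert (df["gpu_power_w"] <= 700).all(), "GPU power above rated envelope"

    # Cross-metric correlation: on a busy training node, power, utilization,
    # and temperature are expected to be positively correlated.
    return df[["gpu_power_w", "gpu_util_pct", "gpu_temp_c"]].corr()


if __name__ == "__main__":
    print(validate_session("session_01.csv"))
```

A correlation matrix with clearly positive off-diagonal entries would be consistent with the expected coupling between load, power draw, and temperature; weak or negative correlations on an active node would flag a session for closer inspection.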
