Adaptive Pulsating Workload Scheduling for Cosmic Simulation: A Thermal-Aware Distributed Computing Architecture

Abstract

We propose an adaptive pulsating workload scheduling architecture for cosmic simulations that dynamically balances computational intensity against thermal and memory-bandwidth efficiency in distributed GPU clusters. The core innovation is a self-regulating algorithm inspired by pulsating heat pipe dynamics: workload intensity alternates between high and low phases to keep the hardware within its optimal operating conditions. The scheduler integrates real-time thermal feedback from infrared sensors and on-die probes, adjusting task parallelism and frequency scaling to prevent overheating while meeting computational deadlines. In addition, a memory bandwidth optimizer switches dynamically between strided prefetching and cache-aware data reorganization, which further improves efficiency. A transformer-based reinforcement learning agent predicts the pulsation cycles so as to minimize energy consumption, deadline misses, and thermal violations. Compared with conventional static schedulers, the method improves both performance and hardware longevity, particularly for magnetohydrodynamic cosmology and dark matter distribution simulations. Experimental results show that the proposed architecture reduces energy consumption by up to 27% while maintaining 98% deadline adherence under thermal constraints. This work bridges the gap between high-performance computing and sustainable resource utilization, offering a scalable solution for next-generation cosmic simulations.
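The alternation between high and low workload phases can be illustrated with a minimal sketch. This is not the authors' scheduler: the transformer-based reinforcement learning agent is replaced here by a plain two-threshold (hysteresis) rule, the thermal behavior is a toy first-order model, and every name and number (`PulseConfig`, `next_phase`, `simulate`, the thresholds, `heat_gain`, `cooling`) is a hypothetical choice made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PulseConfig:
    # All values are illustrative assumptions, not taken from the article.
    t_high: float = 80.0          # deg C: leave the high phase at or above this
    t_low: float = 65.0           # deg C: return to the high phase at or below this
    high_intensity: float = 1.0   # relative workload in the high phase
    low_intensity: float = 0.3    # relative workload in the low phase


def next_phase(phase: str, temp_c: float, cfg: PulseConfig) -> str:
    """Hysteresis rule: switch phase only when a threshold is crossed."""
    if phase == "high" and temp_c >= cfg.t_high:
        return "low"
    if phase == "low" and temp_c <= cfg.t_low:
        return "high"
    return phase


def simulate(steps: int, cfg: PulseConfig, t_ambient: float = 40.0,
             heat_gain: float = 6.0, cooling: float = 0.1,
             t0: float = 60.0) -> list[tuple[str, float]]:
    """Toy thermal model: T += heat_gain * intensity - cooling * (T - t_ambient)."""
    phase, temp = "high", t0
    trace = []
    for _ in range(steps):
        intensity = cfg.high_intensity if phase == "high" else cfg.low_intensity
        temp += heat_gain * intensity - cooling * (temp - t_ambient)
        phase = next_phase(phase, temp, cfg)
        trace.append((phase, temp))
    return trace


if __name__ == "__main__":
    for phase, temp in simulate(40, PulseConfig()):
        print(f"{phase:>4}  {temp:6.2f} C")
```

With these assumed parameters the high phase would settle well above `t_high` and the low phase well below `t_low`, so the workload keeps pulsating instead of converging to a single operating point. A learned policy, as described in the abstract, would choose the phase lengths instead of relying on fixed thresholds.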
