CloudPulse: A Computational Intelligence Method for Energy-Efficient Data Center Steering Using Multi-Armed Bandits
Abstract
Modern data centers, built around interconnected server clusters, underpin large-scale cloud computing but face challenges such as resource heterogeneity, workload imbalance, and energy consumption. To address these, this research introduces CloudPulse, a smart scheduling framework that integrates Reinforcement Learning (RL) and Federated Learning (FL) to optimize workload distribution and task migration in real time. CloudPulse monitors power states, operational requirements, and constraints to manage server utilization and costs. FL enables decentralized training, which reduces the energy spent on data transfer, while RL supports adaptive decisions that minimize environmental impact. CloudPulse also addresses consumers' over- or underestimation of resources by suggesting sustainable allocations that align demand with workload needs. The RL approach refines scheduling even for low-power or resource-intensive workloads, improving efficiency and performance. CloudPulse includes a Streamlit app for real-time monitoring, with secure access through ngrok. Experiments show that it outperforms traditional approaches by placing workloads holistically across racks and servers, using an FL model to maximize median or trimmed-mean efficiency. It minimizes inefficiency through computational steering that allocates cores per task based on effectiveness measures such as 1/(T/NC) and 1/(T/NC·NS), balancing server load and prolonging resource life without frequent data center changes.
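
To make the core-allocation idea concrete, the following is a minimal Python sketch of a multi-armed bandit that picks a core count per task and is rewarded with the effectiveness measure 1/(T/NC). It is an illustration under stated assumptions, not the CloudPulse implementation: the abstract does not define its symbols, so T is read here as task runtime and NC as the number of allocated cores. The epsilon-greedy policy, the `simulated_runtime` model, and every name in the code are hypothetical. The `trimmed_mean` helper shows one common way to compute the trimmed-mean efficiency the abstract mentions.

```python
import random


def effectiveness(t, nc):
    """Effectiveness 1/(T/NC). Assumption: t = task runtime, nc = cores allocated."""
    return 1.0 / (t / nc)


def trimmed_mean(values, trim=0.1):
    """Mean after dropping the lowest and highest `trim` fraction of the values."""
    xs = sorted(values)
    k = int(len(xs) * trim)
    kept = xs[k:len(xs) - k] if k > 0 else xs
    return sum(kept) / len(kept)


class EpsilonGreedyCoreBandit:
    """Hypothetical bandit: each arm is a candidate core count for a task."""

    def __init__(self, core_options, epsilon=0.1, seed=0):
        self.core_options = list(core_options)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {c: 0 for c in self.core_options}
        self.values = {c: 0.0 for c in self.core_options}

    def select(self):
        # Try every arm once before exploiting.
        for c in self.core_options:
            if self.counts[c] == 0:
                return c
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.core_options)
        return self.best()

    def update(self, cores, reward):
        # Incremental running mean of the rewards observed for this arm.
        self.counts[cores] += 1
        self.values[cores] += (reward - self.values[cores]) / self.counts[cores]

    def best(self):
        return max(self.core_options, key=lambda c: self.values[c])


def simulated_runtime(nc):
    """Made-up runtime model: parallel speedup plus a contention penalty."""
    return 120.0 / nc + 0.5 * nc ** 2


def run_demo(rounds=200):
    bandit = EpsilonGreedyCoreBandit([1, 2, 4, 8, 16])
    for _ in range(rounds):
        nc = bandit.select()
        bandit.update(nc, effectiveness(simulated_runtime(nc), nc))
    return bandit.best()
```

Calling `run_demo()` returns the core count with the highest estimated effectiveness under the made-up runtime model. A real scheduler would replace `simulated_runtime` with measured task runtimes, and could apply `trimmed_mean` to per-server efficiency scores so that a few outlier servers do not dominate the aggregate.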