Development of a Machine Learning-Based Clustering Framework for Energy Management on a University Campus

Salim Oyinlola


Abstract

The increasing demand for reliable and efficient energy distribution in educational institutions necessitates the adoption of intelligent energy management systems. This research develops a machine learning-based load management framework for a university campus, using the University of Lagos as a case study because of its metropolitan setting. The study seeks to estimate hourly power consumption for individual buildings and transformers, and to group buildings by their energy consumption patterns using clustering algorithms. A dataset comprising 3,648 hourly timestamps across 55 occupied structures was collected over a half-year period and analyzed. Building-level load estimation models were first developed to establish hourly consumption profiles. Several clustering techniques were then evaluated: K-Means, Hierarchical Clustering, Gaussian Mixture Models (GMM), Spectral Clustering, Mini-Batch K-Means, and Density-Based Spatial Clustering of Applications with Noise (DBSCAN). Among these, Mini-Batch K-Means performed best, segmenting buildings into three groups: high-, medium-, and low-demand clusters. It achieved a Silhouette Score of 0.461 (on a scale of -1 to 1, where higher values indicate more distinct clusters), a Davies–Bouldin Index of 0.767 (lower values indicate better clustering), and a Calinski–Harabasz Index of 42.0 (higher values indicate better-separated clusters). Given the duration of the dataset, short-term load forecasting (STLF) was performed using Meta’s Prophet, Seasonal Autoregressive Integrated Moving Average (SARIMA), Autoregressive Integrated Moving Average (ARIMA), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) models on both the whole-campus series and the cluster-specific series.
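The clustering step described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the 55 load profiles are synthetic, the feature choice (a 24-value mean daily profile per building) and the standardization step are assumptions, and the scores it produces are unrelated to the values reported in the abstract. It uses scikit-learn's `MiniBatchKMeans` and the three validity indices named in the text.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in (assumption): 55 buildings, each summarised by a
# 24-value mean daily load profile in kW. The real study data is not used.
hours = np.arange(24)
daily_shape = 1.0 + 0.5 * np.sin((hours - 8) / 24 * 2 * np.pi)
base_levels = np.repeat([20.0, 80.0, 200.0], [25, 20, 10])  # low, medium, high
profiles = base_levels[:, None] * daily_shape[None, :]
profiles += rng.normal(0.0, 5.0, size=profiles.shape)

# Standardise each hourly feature so no single hour dominates the distance.
features = StandardScaler().fit_transform(profiles)

# Three clusters, matching the high/medium/low grouping in the abstract.
model = MiniBatchKMeans(n_clusters=3, n_init=10, batch_size=64, random_state=0)
labels = model.fit_predict(features)

scores = {
    "silhouette": silhouette_score(features, labels),  # higher is better, in [-1, 1]
    "davies_bouldin": davies_bouldin_score(features, labels),  # lower is better
    "calinski_harabasz": calinski_harabasz_score(features, labels),  # higher is better
}

for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Comparing several algorithms, as the study does, would amount to repeating the fit and scoring steps for each candidate model on the same feature matrix and ranking them on the three indices.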
ARIMA produced the lowest point-forecast errors in all evaluations: whole-campus Mean Absolute Percentage Error (MAPE) = 7.2% and Root Mean Square Error (RMSE) = 118.7. In the cluster-specific metrics, cluster 0 had MAPE = 3.8%, cluster 1 had MAPE = 4.6% (RMSE = 44.7), and cluster 2 had MAPE = 5.4%, each below the whole-campus figure. This reduction pattern was consistent across all evaluated algorithms. These quantitative results indicate ARIMA as the preferred baseline for point forecasting on this dataset and confirm that consumption-based clustering is essential to achieve consistent, large reductions in both relative and absolute forecast error. Overall, this study demonstrates the feasibility of applying machine learning to institutional load management, offering a scalable and adaptable framework for other university campus environments.
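For readers unfamiliar with the two error metrics quoted above, the sketch below shows how MAPE and RMSE are conventionally computed. The function names `mape` and `rmse` and the three-point toy series are illustrative assumptions, not taken from the study; the abstract does not state the exact formula variants the authors used.

```python
import numpy as np


def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    if np.any(actual == 0):
        # MAPE divides by the actual value, so zero loads are undefined.
        raise ValueError("MAPE is undefined when an actual value is zero")
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100.0)


def rmse(actual, forecast):
    """Root Mean Square Error, in the units of the series."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))


# Toy hourly loads (hypothetical values).
actual_load = [100.0, 200.0, 400.0]
forecast_load = [110.0, 190.0, 400.0]

print(f"MAPE = {mape(actual_load, forecast_load):.1f}%")  # MAPE = 5.0%
print(f"RMSE = {rmse(actual_load, forecast_load):.2f}")   # RMSE = 8.16
```

MAPE is relative and RMSE is absolute, so RMSE values for a single cluster and for the whole campus are on different load scales and are not directly comparable without normalization.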

Version published to 10.31237/osf.io/qd6kg_v1 on OSF Preprints
Nov 14, 2025

Related articles

Modeling and Forecasting of Industrial Consumers’ Load Using Fuzzy Clustering and Advanced Neural Networks

This article has 3 authors:
1. SeyedHamed MirMohammadAli Roudaki
2. Amin Helmzadeh
3. Aliakbar Hajnoouzi
This article has no evaluations
Latest version Oct 17, 2025

Towards Prediction of Energy Use: A Generalized AI-Based Model for Non-Residential Buildings

This article has 6 authors:
1. Anna Romanska-Zapala
2. Marek Dudzik
3. Piotr Dudek
4. Mariusz Górny
5. Sabina Kuc
6. Mark Bomberg
This article has no evaluations
Latest version Nov 12, 2025

Optimizing Cloud Resources By Anomaly Detection And Machine Learning For Smarter Power Consumption And Execution Time Predictions

This article has 2 authors:
1. G. Prabhu
2. P. S. Ambili
This article has no evaluations
Latest version Oct 13, 2025
