Intelligent Hydraulic Flow Unit Mapping: Leveraging Unsupervised and Supervised Learning on Large-Scale Core Data

Mohammed Joobayear Hossain
Minhaz Chowdhury
Amad Hussen
Md Shofiqul Islam

Abstract

Accurate identification of hydraulic flow units (HFUs) is fundamental to reservoir characterization. Conventional approaches, such as histogram analysis, log-log plots of reservoir quality index (RQI) versus normalized porosity index (ϕz), and Z-score probability tests, often suffer from subjectivity, data overlap, and limited scalability to large datasets. To address these limitations, this study introduces a hybrid machine learning workflow that combines unsupervised clustering with supervised classification to automate the identification and prediction of HFUs. In the first phase, four unsupervised models, K-means, K-medoids, Fuzzy C-means (FCM), and Gaussian mixture models (GMM), were used to determine the optimal number of HFUs. The GMM gave the best clustering performance (R² = 0.9278, RMSE = 0.3365), ahead of K-means and K-medoids, whereas FCM underperformed. In the second phase, supervised models were trained to predict HFUs from laboratory-derived core features. The models evaluated were k-nearest neighbors (KNN), random forest classifier (RFC), support vector machine (SVM), gradient boosting (GB), extreme gradient boosting (XGBoost), adaptive boosting (AdaBoost), and a stacking hybrid. RFC outperformed the others and generalized well (training accuracy: 0.983; testing accuracy: 0.972), while SVM showed moderate success and KNN overfitted. The boosting models XGBoost and GB achieved high training accuracy but also overfitted, whereas AdaBoost was less accurate overall but generalized better. The stacking model, though highly accurate in training, likewise overfitted on the test set. A computational efficiency analysis highlighted the trade-off between training time and predictive performance: KNN and SVM were the fastest to train but the least reliable, whereas RFC offered the best balance of accuracy and training time. Overall, the proposed workflow establishes an effective and scalable methodology for HFU classification, offering greater consistency, objectivity, and applicability to large reservoir datasets in the Norwegian sector of the North Sea.
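
The two-phase workflow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several things in it are assumptions: the abstract does not give formulas, so the widely used definitions RQI = 0.0314·sqrt(k/ϕ) (k in mD, ϕ as a fraction), ϕz = ϕ/(1 − ϕ), and flow zone indicator FZI = RQI/ϕz are used; the core data are synthetic; the number of clusters is chosen by BIC (the abstract reports R² and RMSE instead); and the function name `flow_zone_indicator` and all hyperparameters are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import train_test_split


def flow_zone_indicator(porosity, permeability_md):
    """Return (RQI, phi_z, FZI) for fractional porosity and permeability in mD."""
    porosity = np.asarray(porosity, dtype=float)
    permeability_md = np.asarray(permeability_md, dtype=float)
    rqi = 0.0314 * np.sqrt(permeability_md / porosity)  # micrometres
    phi_z = porosity / (1.0 - porosity)                 # pore-to-grain volume ratio
    return rqi, phi_z, rqi / phi_z


# Synthetic stand-in for core plugs (NOT the paper's data): three flow units
# with FZI centred on 0.5, 2 and 8 micrometres, plus lognormal scatter.
rng = np.random.default_rng(0)
target_fzi = np.repeat([0.5, 2.0, 8.0], 200) * rng.lognormal(0.0, 0.1, 600)
porosity = rng.uniform(0.08, 0.30, 600)
# Invert the RQI definition to get a permeability consistent with each FZI.
phi_z_synth = porosity / (1.0 - porosity)
permeability = porosity * (target_fzi * phi_z_synth / 0.0314) ** 2

_, _, fzi = flow_zone_indicator(porosity, permeability)
log_fzi = np.log10(fzi).reshape(-1, 1)

# Phase 1 (unsupervised): fit GMMs with 1..6 components, keep the lowest BIC.
# BIC is an assumed selection criterion, not the one reported in the abstract.
candidates = {
    k: GaussianMixture(n_components=k, random_state=0).fit(log_fzi)
    for k in range(1, 7)
}
best_k = min(candidates, key=lambda k: candidates[k].bic(log_fzi))
hfu_labels = candidates[best_k].predict(log_fzi)

# Phase 2 (supervised): predict the HFU label from core measurements alone.
features = np.column_stack([porosity, np.log10(permeability)])
X_train, X_test, y_train, y_test = train_test_split(
    features, hfu_labels, test_size=0.25, random_state=0
)
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)

train_accuracy = forest.score(X_train, y_train)
test_accuracy = forest.score(X_test, y_test)
print(f"HFUs selected: {best_k}")
print(f"train accuracy: {train_accuracy:.3f}, test accuracy: {test_accuracy:.3f}")
```

Comparing `train_accuracy` with `test_accuracy` is the same check the abstract uses to flag overfitting: a large gap between the two suggests the classifier has memorized the training plugs rather than learned the flow-unit boundaries.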

Version published to 10.21203/rs.3.rs-9370518/v1 on Research Square, Apr 10, 2026

