Discovering Hidden Reservoir Physics Using Explainable Machine Learning for Permeability Prediction in Carbonate Reservoirs With Noisy Legacy Datasets


Abstract

Accurate permeability prediction is essential for reservoir characterization and production forecasting; however, traditional physics-based models often fail to capture the complex pore systems typical of carbonate reservoirs. This study demonstrates how explainable machine learning can uncover hidden reservoir physics while improving permeability prediction from noisy legacy datasets. The work integrates physics-based modeling with explainable AI to reinterpret legacy well logs and extract previously unresolved petrophysical relationships. A physics-guided machine learning framework was developed using legacy well data from a carbonate reservoir in southern Iraq. The original dataset contained more than 2,400 samples from eight wells and was heavily contaminated by incorrectly imputed or interpolated permeability values. A rigorous cleaning workflow combining statistical filtering and zonation-based quality control reduced the dataset to 1,132 physically consistent samples. A baseline permeability model, derived from well-log-estimated porosity and a porosity–permeability transform, served as the physical reference. Machine learning discrepancy models were trained on five physics-informed features: neutron porosity (NPHI), gamma ray (GR), bulk density (RHOB), sonic travel time (DTP), and depth. Six algorithms (Random Forest, Linear Regression, Polynomial Regression, Support Vector Machines, XGBoost, and CatBoost) were evaluated using five-fold cross-validation. The gradient boosting algorithms delivered the strongest predictive performance, with XGBoost producing the most accurate discrepancy corrections. When the XGBoost corrections were combined with the baseline physics model through a multiplicative boosting framework, the hybrid model reduced RMSE by approximately 80% and raised adjusted R² from a negative baseline value to greater than 0.95.
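The multiplicative discrepancy idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the transform coefficients in `baseline_perm`, the synthetic data, and the single-feature least-squares fit, which stands in for the XGBoost discrepancy model named in the abstract. The sketch assumes the correction is learned on the log ratio of measured to baseline permeability and applied as a multiplier; the abstract does not state the exact formulation.

```python
import math
import random

def baseline_perm(phi):
    # Hypothetical porosity-permeability transform (coefficients are assumed).
    return 10 ** (12.0 * phi - 1.0)

def fit_line(x, y):
    # Closed-form ordinary least squares for y ~ w0 + w1 * x.
    # Stand-in for a tree-based learner such as XGBoost.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    w1 = sxy / sxx
    return my - w1 * mx, w1

def log_rmse(k_pred, k_true):
    # RMSE in log10 space, the usual scale for permeability errors.
    sq = [(math.log10(p) - math.log10(t)) ** 2 for p, t in zip(k_pred, k_true)]
    return math.sqrt(sum(sq) / len(sq))

# Synthetic logs: porosity (fraction) and sonic travel time (us/ft).
random.seed(0)
phi = [random.uniform(0.05, 0.25) for _ in range(200)]
dtp = [random.uniform(50.0, 90.0) for _ in range(200)]

# Synthetic "measured" permeability: baseline times a sonic-dependent factor
# plus noise, so the baseline alone is systematically wrong.
k_meas = [
    baseline_perm(p) * math.exp(0.05 * (t - 70.0) + random.gauss(0.0, 0.1))
    for p, t in zip(phi, dtp)
]

# Step 1: physics baseline.
k_base = [baseline_perm(p) for p in phi]

# Step 2: discrepancy target is the log ratio of measured to baseline.
disc = [math.log(m / b) for m, b in zip(k_meas, k_base)]

# Step 3: fit the discrepancy model on a log feature.
w0, w1 = fit_line(dtp, disc)

# Step 4: hybrid prediction = baseline multiplied by the learned correction.
k_hybrid = [b * math.exp(w0 + w1 * t) for b, t in zip(k_base, dtp)]

rmse_base = log_rmse(k_base, k_meas)
rmse_hybrid = log_rmse(k_hybrid, k_meas)
print(f"baseline log10 RMSE: {rmse_base:.3f}, hybrid log10 RMSE: {rmse_hybrid:.3f}")
```

Working in log space keeps the hybrid prediction positive and turns the multiplicative correction into an additive regression target. The errors here are computed in-sample on synthetic data; a real evaluation would use held-out folds, as in the five-fold cross-validation the abstract reports.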
Beyond predictive improvement, explainable AI analysis using SHAP values and Partial Dependence Plots provided insights into previously unresolved reservoir physics. The algorithm identified dual-porosity behavior by applying strong positive permeability corrections in tight, low-porosity intervals associated with fracture networks while penalizing high-porosity intervals dominated by ineffective microporosity. Interaction analysis revealed strong coupling between neutron porosity, bulk density, and sonic travel time, enabling the model to implicitly synthesize a Secondary Porosity Index that captures deviations between acoustic porosity and effective flow capacity. Depth-dependent analysis further revealed a distinct permeability enhancement zone below approximately 4,000 m, indicating a likely diagenetically enhanced flow unit. In addition, the minimal influence of gamma ray logs confirmed a clean carbonate system where permeability is primarily controlled by secondary porosity rather than shale content. This study demonstrates that explainable machine learning can move beyond predictive modeling to reveal hidden reservoir physics in complex carbonate systems. By combining physics-guided discrepancy modeling with explainable AI, the workflow converts machine learning models into transparent diagnostic tools capable of identifying fracture-dominated flow regimes, ineffective microporosity, and stratigraphically controlled flow units. The approach provides a practical framework for reinterpreting legacy well logs, improving reservoir characterization, and identifying bypassed pay in datasets traditionally considered too noisy for reliable analysis.
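For readers unfamiliar with the Secondary Porosity Index mentioned above, the sketch below computes it in its conventional petrophysical form: total porosity from the neutron and density logs minus sonic porosity from the Wyllie time-average equation. This is an assumption about what the index looks like; the abstract says the model synthesizes such an index implicitly and does not give a formula. The matrix and fluid constants are textbook values for limestone and fresh water, and the function names and sample log readings are hypothetical.

```python
def density_porosity(rhob, rho_ma=2.71, rho_f=1.0):
    # Density porosity from bulk density (g/cc); limestone matrix assumed.
    return (rho_ma - rhob) / (rho_ma - rho_f)

def sonic_porosity(dt, dt_ma=47.5, dt_f=189.0):
    # Wyllie time-average porosity from sonic travel time (us/ft).
    return (dt - dt_ma) / (dt_f - dt_ma)

def secondary_porosity_index(nphi, rhob, dt):
    # Total porosity approximated as the mean of neutron and density porosity.
    phi_total = 0.5 * (nphi + density_porosity(rhob))
    # Sonic logs respond mainly to interparticle porosity, so the difference
    # is commonly read as vuggy or fracture (secondary) porosity.
    return phi_total - sonic_porosity(dt)

# Hypothetical log readings for a single depth sample.
spi = secondary_porosity_index(nphi=0.18, rhob=2.40, dt=62.0)
print(f"SPI = {spi:.3f}")
```

A positive value indicates more total porosity than the sonic log sees, which is consistent with the coupling among NPHI, RHOB, and DTP that the interaction analysis reports.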
