TOC Prediction from Well Logs Using Gradient Boosting and Neural Network in the Santos Basin, SE Brazil
Abstract
Accurate prediction of total organic carbon (TOC) in subsurface formations is crucial for evaluating source rock quality and optimizing exploration strategies in hydrocarbon-prolific basins. Traditional methods such as the ΔlogR technique often require local calibration and may fail to capture the non-linear relationships between well-log parameters and TOC, leading to inaccurate estimates. This study applies three machine learning (ML) models, Gradient Boosting Decision Trees (GBDT), Extreme Gradient Boosting (XGBoost), and Multi-Layer Perceptron (MLP), to predict TOC from well-log data in the Santos Basin, Brazil's largest offshore basin. We preprocessed the data with outlier detection using Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and feature reduction using Principal Component Analysis (PCA). Bayesian optimization was used to tune the hyperparameters of each model. The results indicate that all ML models outperformed the traditional ΔlogR method, with GBDT achieving the highest prediction accuracy. This study demonstrates the potential of ML models to capture complex, non-linear relationships in geophysical data and highlights the challenges of generalizing these models across diverse geological settings. The findings contribute to improved TOC estimation and can enhance exploration strategies in similar geological contexts.
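To make the ΔlogR baseline and its need for local calibration concrete, the following is a minimal sketch of the commonly cited sonic/resistivity form of the method (Passey et al., 1990). The abstract does not state which variant the authors used, so the equation form, the function names, and all numeric inputs below are illustrative assumptions rather than the study's implementation. The baseline values and the level of organic metamorphism (LOM) are the locally calibrated quantities.

```python
import math


def delta_log_r(resistivity, sonic, r_baseline, dt_baseline):
    """Separation between the resistivity and sonic curves.

    Assumed form (Passey et al., 1990):
        dlogR = log10(R / R_baseline) + 0.02 * (dt - dt_baseline)
    resistivity, r_baseline: ohm-m
    sonic, dt_baseline: transit time in microseconds per foot
    The baselines are picked in a non-source interval, which is
    the local calibration step.
    """
    return math.log10(resistivity / r_baseline) + 0.02 * (sonic - dt_baseline)


def toc_from_delta_log_r(dlr, lom):
    """Convert the curve separation to TOC (wt%).

    Assumed form: TOC = dlogR * 10 ** (2.297 - 0.1688 * LOM),
    where LOM is the level of organic metamorphism (maturity),
    another locally calibrated input.
    """
    return dlr * 10 ** (2.297 - 0.1688 * lom)


# Hypothetical log readings, for illustration only (not from the study).
separation = delta_log_r(resistivity=10.0, sonic=100.0,
                         r_baseline=1.0, dt_baseline=50.0)
toc_estimate = toc_from_delta_log_r(separation, lom=10.0)
print(separation, toc_estimate)
```

Because the relation is a fixed linear scaling of the curve separation for a given LOM, it cannot represent non-linear interactions among logs, which is the limitation the ML models are meant to address.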