Explainable AI Reveals a Critical Interplay Between Lignin and Moisture in Limiting Biochemical Methane Potential


Abstract

The Biochemical Methane Potential (BMP), a fundamental parameter for assessing feedstock viability in anaerobic digestion (AD), is conventionally determined through time-consuming experimental assays. To address this limitation, machine learning (ML) presents a promising alternative for rapid prediction; however, the "black-box" nature of many algorithms has hindered their adoption. This study demonstrates an explainable AI (XAI) framework that moves beyond prediction to function as a hypothesis-generation engine. A Random Forest (RF) model was developed and trained on a heterogeneous public dataset of 127 diverse feedstocks. While interpretation was prioritized over predictive power, the optimized model achieved a reasonable performance (5-fold cross-validated R² = 0.56, RMSE = 43.26 Nm³ CH₄/t DM), which is notable given the dataset's diversity. Using SHapley Additive exPlanations (SHAP), the model was first validated by confirming its alignment with established biochemical principles, such as the profound negative impact of lignin. The primary contribution, however, was the use of SHAP interaction analysis to uncover a critical and robust interaction between Dry Matter (DM) and lignin content. This finding revealed that the impact of moisture on BMP is highly dependent on the feedstock's lignification, leading to the formulation of a specific, testable hypothesis: optimizing the feedstock's moisture state (e.g., via conditioning) is a more effective strategy for enhancing methane yield from low-lignin biomass, whereas for high-lignin biomass, this factor is secondary to targeted delignification. This work serves as a case study for how XAI can bridge the gap between data science and experimental biotechnology to guide more efficient research.
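To make the idea of a SHAP interaction value concrete, the sketch below computes the pairwise Shapley interaction index exactly for a three-feature toy function. It is an illustration only and not the authors' pipeline: the response surface `toy_bmp`, the feature names, the baseline, and every number in it are hypothetical and chosen purely so that the DM penalty weakens as lignin rises. The study itself applied SHAP to a trained Random Forest; here "missing" features are simply set to a baseline value, which is an assumption of this sketch.

```python
from itertools import combinations
from math import factorial

FEATURES = ("dm", "lignin", "protein")  # hypothetical feature set


def toy_bmp(x):
    # Hypothetical response surface, NOT fitted to the study's data:
    # the dry-matter penalty shrinks as the lignin fraction rises.
    return (300.0
            - 200.0 * x["lignin"]
            - 100.0 * x["dm"] * (1.0 - x["lignin"])
            + 50.0 * x["protein"])


def value(subset, x, baseline):
    # Features in `subset` take the instance's value; the rest take the baseline.
    mixed = {f: (x[f] if f in subset else baseline[f]) for f in FEATURES}
    return toy_bmp(mixed)


def interaction(i, j, x, baseline):
    # Exact Shapley interaction index for the pair (i, j), i != j,
    # using the SHAP convention that splits the effect evenly between
    # the (i, j) and (j, i) entries (hence the factor 2 in the weight).
    others = [f for f in FEATURES if f not in (i, j)]
    m = len(FEATURES)
    total = 0.0
    for size in range(len(others) + 1):
        weight = factorial(size) * factorial(m - size - 2) / (2 * factorial(m - 1))
        for combo in combinations(others, size):
            s = set(combo)
            delta = (value(s | {i, j}, x, baseline)
                     - value(s | {i}, x, baseline)
                     - value(s | {j}, x, baseline)
                     + value(s, x, baseline))
            total += weight * delta
    return total


# Hypothetical instance and baseline (fractions of fresh or dry mass).
x = {"dm": 0.8, "lignin": 0.3, "protein": 0.2}
baseline = {"dm": 0.3, "lignin": 0.1, "protein": 0.1}

for a, b in combinations(FEATURES, 2):
    print(a, b, round(interaction(a, b, x, baseline), 6))
```

In this toy surface only the DM and lignin pair receives a non-zero interaction value, because protein enters additively. On a real model, exact enumeration over feature subsets becomes expensive as the number of features grows, which is why tree-specific SHAP algorithms are typically used instead.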
