Interpretable Machine Learning on Soybean Multi Omics Data Reveals Drought-Driven Shifts of Plant-Microbe Interactions

Abstract

Background

Plant-microbe interactions in the rhizosphere are central to plant growth, nutrient acquisition, and stress resilience. Multi-omics approaches enable comprehensive profiling of different biological layers, yet integrating these data to understand the mechanisms underlying plant-microbe symbiosis, particularly under drought stress, remains a major challenge.

Results

We integrated genomic, metabolomic, and microbiome data from 198 soybean accessions grown under both control and drought conditions to identify environment-specific predictive features of plant phenotypes. We compared best linear unbiased prediction (BLUP), genome-wide association study (GWAS), and a nonlinear machine learning model (Random Forest; RF) to evaluate their ability to detect informative features. RF models provided flexible variable selection and outperformed linear models in capturing nonlinear dependencies. Model interpretation using SHapley Additive exPlanations (SHAP) revealed that the isoflavone derivative daidzin and the drought-tolerant Candidatus Nitrosocosmicus are major contributors to phenotypic variation, specifically under drought stress. SHAP-based interaction networks revealed cross-omics links, such as connections between daidzin, gamma-aminobutyric acid (GABA), and Paenibacillus.
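
To make the attribution idea behind SHAP concrete, the following is a minimal sketch of exact Shapley values for a single sample, computed by enumerating feature coalitions. It is not the authors' pipeline: the toy model, its coefficients, and the feature names (metabolite_a, microbe_b, snp_c) are hypothetical placeholders, and a real analysis would more likely apply a tree-specific approximation to a fitted Random Forest than brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample `x` relative to `baseline`.

    Features outside a coalition are set to their baseline value.
    Cost grows as 2^n in the number of features, so this is only
    practical for a handful of features.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for target in names:
        others = [f for f in names if f != target]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k (target excluded)
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without = {f: (x[f] if f in subset else baseline[f]) for f in names}
                with_target = dict(without)
                with_target[target] = x[target]
                # Marginal contribution of `target` to this coalition
                total += weight * (predict(with_target) - predict(without))
        phi[target] = total
    return phi


def toy_phenotype(features):
    """Hypothetical model: two main effects plus one cross-omics interaction."""
    return (2.0 * features["metabolite_a"]
            + 1.0 * features["microbe_b"]
            + 3.0 * features["metabolite_a"] * features["microbe_b"]
            + 0.0 * features["snp_c"])  # snp_c is deliberately uninformative


baseline = {"metabolite_a": 0.0, "microbe_b": 0.0, "snp_c": 0.0}
sample = {"metabolite_a": 1.0, "microbe_b": 1.0, "snp_c": 1.0}

phi = shapley_values(toy_phenotype, sample, baseline)
for name, value in phi.items():
    print(f"{name}: {value:.2f}")
```

In this toy case the interaction term is split evenly between the two interacting features, the uninformative feature receives zero attribution, and the attributions sum to the difference between the sample prediction and the baseline prediction.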

Conclusion

Applying interpretable machine learning to plant phenotype prediction identifies multi-omics biomarkers and their interactions, providing insight into how plants adapt to drought stress through environment-dependent rhizosphere networks and symbiotic associations.
