An Explainable Machine Learning Framework for Predicting Hybrid Maize Performance Using Genomic and Phenotypic Data

Abstract

We present an application-oriented, explainable machine learning framework for predicting hybrid maize performance from integrated genomic and phenotypic data. Ensemble models combining tree-based learners and multilayer perceptron networks were developed to predict yield alongside drought tolerance and disease resistance. For interpretability, SHAP-based feature attributions were used to identify the major genomic regions contributing to trait variation. To support breeders' decision-making, hybrids were ranked on competing agronomic objectives using a Pareto-based multi-trait optimization strategy. The framework achieved moderate to high predictive performance across traits (R² ≈ 0.75) and gives a transparent view of model behavior and of the trade-offs between traits. While these results are encouraging evidence for the practical utility of explainable machine learning in hybrid selection, larger multi-environment, field-level datasets are needed for a robust appraisal under different growing conditions.
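
The abstract does not specify how the Pareto-based ranking is computed. As an illustration only, the sketch below uses non-dominated sorting, one common way to rank candidates on competing objectives: hybrids that no other hybrid beats on every trait form the first front, and the procedure repeats on the remainder. The hybrid names, the trait scores, and the assumption that higher values are better for all three traits are hypothetical and are not taken from the study.

```python
from typing import Dict, Tuple

Scores = Tuple[float, ...]


def dominates(a: Scores, b: Scores) -> bool:
    """Return True if a is at least as good as b on every trait and
    strictly better on at least one (assumes higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_ranks(scores: Dict[str, Scores]) -> Dict[str, int]:
    """Assign each hybrid the index of its Pareto front (1 = non-dominated)."""
    remaining = dict(scores)
    ranks: Dict[str, int] = {}
    rank = 1
    while remaining:
        # Hybrids not dominated by any other hybrid still in the pool.
        front = [
            name
            for name, s in remaining.items()
            if not any(
                dominates(t, s) for other, t in remaining.items() if other != name
            )
        ]
        for name in front:
            ranks[name] = rank
            del remaining[name]
        rank += 1
    return ranks


# Hypothetical predicted scores: (yield, drought tolerance, disease resistance).
example = {
    "H1": (9.0, 0.6, 0.7),
    "H2": (8.0, 0.8, 0.6),
    "H3": (7.5, 0.5, 0.5),
    "H4": (9.0, 0.6, 0.6),
}
ranks = pareto_ranks(example)
print(ranks)
```

In this toy example H1 and H2 share the first front because each beats the other on one trait, which is the kind of trade-off a breeder would then resolve using their own priorities.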
