Distributional Data Analysis Uncovers Hundreds of Novel and Heritable Phenomic Features from Temporal Cotton and Maize Drone Imagery
Abstract
Genomic and phenomic analyses suggest that additional heritable phenomic features can improve modeling of important end traits such as senescence or yield. Field phenotyping generally uses trait values averaged across individual experimental units (single plants, or many plants within a plot), ignoring the full distributional pattern of the collected measures. Images of plants or plots captured by drones (unoccupied aerial vehicles, UAVs) can be viewed as individual distribution functions that capture biological information. This study introduces and validates distributional data analysis in two crops and experiment types: single-plant vegetation index (VI) analysis in cotton (Gossypium hirsutum L.) and plot-level yield prediction in maize (Zea mays L.). In both crops, the concept of within-day variance decomposition was demonstrated. In cotton, genotypes exerted significant influences on temporal quantile functions of VIs. In maize, elastic-net regression on distributional data improved yield prediction by 12.7% to 21.6%, with quantiles other than the conventionally used median responsible for the added predictive power. A novel data visualization method for per-pixel heritability made distributional features explainable and interpretable. These results have implications for future plant phenomic studies: distributional data analysis applied across temporal imagery captures novel, heritable, and interpretable biological signal that is lost when working with conventional measures of central tendency, such as mean or median summary values of experimental units.
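To illustrate what treating an image as a distribution can mean in practice, the following minimal sketch summarizes the per-pixel VI values of one experimental unit by a set of quantiles instead of a single median. It is an illustration under stated assumptions, not the pipeline used in this study: the function name `quantile_features`, the 5% quantile step, and the two synthetic pixel lists are all hypothetical.

```python
import statistics


def quantile_features(pixel_values, step_percent=5):
    """Summarize one unit's per-pixel VI values by evenly spaced quantiles.

    Hypothetical helper: returns a dict mapping percent (5, 10, ..., 95 for
    the default step) to the corresponding quantile of the pixel values.
    """
    if 100 % step_percent != 0:
        raise ValueError("step_percent must divide 100 evenly")
    n = 100 // step_percent
    # statistics.quantiles with n groups returns n - 1 cut points.
    cuts = statistics.quantiles(pixel_values, n=n, method="inclusive")
    percents = range(step_percent, 100, step_percent)
    return dict(zip(percents, cuts))


# Two synthetic "plots" (assumed data): same median VI, different spread.
plot_a = [0.5 + 0.010 * k for k in range(-50, 51)]
plot_b = [0.5 + 0.003 * k for k in range(-50, 51)]

features_a = quantile_features(plot_a)
features_b = quantile_features(plot_b)

# The medians agree, so a median-only summary cannot separate the plots,
# while the tail quantiles differ.
for p in (5, 50, 95):
    print(p, round(features_a[p], 3), round(features_b[p], 3))
```

In a setting like the one described above, each quantile at each flight date would become one candidate feature, so a regularized model such as elastic-net regression can weight quantiles other than the median.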
Significance
Repeated aerial imaging of agricultural experiments produces image data sets that capture plant development at high spatial and temporal resolution. Frequently, images are summarized by measures of central tendency, such as mean or median values. Here, functional data distributional methods were applied to cotton (Gossypium hirsutum L.) and maize (Zea mays L.) image data, capturing more information than standard approaches. Cotton genotypes significantly impacted distributional spectral data, while in maize, distributional data enabled more accurate predictions of grain yield than models trained with median data alone. Distributional data were more explainable by genetics, and novel data visualization techniques highlighted specific parts of plant imagery with high and low genetic variance.