Predicting plant biomass accumulation from image-derived parameters

Article activity feed

  1. Image

    **Reviewer 2: Christian Fournier**

    Reviewer Comments to Author, version 1: The authors investigate whether plant biomass (both fresh and dry mass) can be derived from 2D image-based features acquired with visible, fluorescent and NIR multi-view imaging systems operating on an automated high-throughput phenotyping platform. In the first part, several multivariate statistical models are compared for their ability to predict biomass for two treatments within a single experiment, on three independent datasets, with detailed results presented for one experiment. One of the best models, the random forest, is then further investigated for its capacity to make predictions across experiments, being trained on one experiment at a time or on one treatment of one experiment at a time. Finally, the relative importance of individual image-based traits in the prediction of either fresh or dry weight is presented for two treatments of one dataset.

    Models and methods for model evaluation are clearly presented, and the overall quality of the text and figures makes the paper easy to follow. The inclusion of images other than visible-light images, the objective selection of image-based traits, the comparison of models and the use of three independent datasets clearly distinguish this paper from previous publications on the same subject. It provides the reader with very valuable information on the current prediction capacity of the approach, together with a consistent methodology for analyzing other related practices.

    However, I have two major concerns about the current version of this manuscript.

    First, I think that some conclusions highlighted in the abstract or in the text are not completely in line with (or at least not sufficiently tempered by) what is demonstrated in the text or shown in the figures. In the abstract (lines 19-20), it is highlighted that 'The results proved that plant biomass can be accurately predicted from image-based parameters using a random forest model'. To me this conclusion is clearly supported by the data in the case of within-experiment predictions, but not fully in the case of the cross-experiment test (i.e. quite the opposite of what is stressed on line 21). My impression, given the results presented in Figure 5, is that in one case out of two, a model trained on one experiment alone could not accurately (or at least not with the same accuracy) predict the biomass, despite a repeated protocol. This result is per se very interesting, as it demonstrates an important limitation of the approach. It cannot, however, be summarized by what is written on lines 19-21, 201-202, 209-210 or 253-257. On another occasion (lines 148 and 248), I found the conclusion ('the RF model largely outperformed other models') a bit exaggerated, as, in Figure 3, depending on the criterion, the RF model performs very similarly to the MARS model, for example.

    Second, I did not manage to test the models, nor to reproduce the analysis with the provided data and source code. Concerning the data, image traits are provided for all experiments, but the manual measurements of dry weight are missing. Concerning the code, the R script provided does not match the provided dataset, which makes it difficult to test. More importantly, the model code fails at runtime with 'not defined' errors. I also think, but this is only a suggestion, that, in addition to raw image files, providing binary masks of plants, which are of high importance for all traits analyzed here, could improve the re-use of this nice dataset.

    Other minor points or comments on specific parts of the text are provided below:

    Line 72-74: I think this sentence would be better placed in the Potential application section.

    Line 85: Do you mean that some image traits are more sensitive to physiological traits? I do not see why Fig 1B is illustrative of this point.

    Line 98: In the context of phenotyping, it might also be useful to add Spearman rank correlation to the assessment.

    Line 108: Fig 1B is only a heatmap image. Maybe a list of traits should be provided, or a reference to the supplementary data should be added here.

    Line 117: Figure 2B is poorly informative as traits are not identified. This figure is also not commented on in the text; I suggest removing it.

    Line 144: I would find it useful to make perfectly clear here that all the models were trained on the control + stress plants, to avoid any confusion with the 'cross treatment test' later on (Figure 6).

    Line 146-151: I found the analysis a bit confusing as, in the details, the ranking of the different methods varies, and I do not clearly see why RF 'largely outperforms' other methods (especially MARS).

    Line 152-155: The comparison with the widely used 'single feature' method is very interesting. Could you consider adding its score/line to the R2 and RMSRE plots?

    Line 178: Maybe it is also worth noting in the text that geometric + color traits take 13 out of 15 (FW) and 15 out of 15 (DW) of the first places, as these two types of data are widely available among phenotyping platforms and yet not so often used in biomass predictions.

    Line 201-211: The text seems to me a bit too optimistic regarding the cross-experiment predictions. Exp3 clearly shows a non-conservation of the relationship obtained in Exp1 or 2, and a clear loss of predictive power compared to within-experiment training.

    Line 281: typo: sophisticated

    Line 349: Could you give an idea of the number of such filled missing values?

    Line 400: The formulation is a bit strange as it already sounds like a conclusion.

    Line 426: DW data are missing.

    Line 535: The legend of Figure 5 does not really apply to these figures. A complete legend should be added.
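
The suggestion for line 98 (adding Spearman rank correlation to the assessment) can be illustrated with a minimal sketch. It is not taken from the authors' R pipeline: the function names and the `measured`/`predicted` values are hypothetical, and the data are chosen only to show a case where predictions rank plants perfectly while the linear (Pearson) agreement is weaker.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values: predictions are monotonic in the measurement
# but not linear, so ranking is preserved while Pearson r drops.
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.0, 1.2, 1.5, 2.5, 8.0]

pearson_r = pearson(measured, predicted)
spearman_rho = spearman(measured, predicted)
print(f"Pearson r = {pearson_r:.3f}, Spearman rho = {spearman_rho:.3f}")
```

Rank agreement matters in phenotyping when the goal is to order genotypes rather than to recover absolute biomass, which is why the two coefficients can usefully be reported together.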

    Re-review:

    I thank the authors for the work done on the new manuscript and on GitHub, which addresses most of the concerns I raised in my first review.

    The pipeline published on GitHub now works nicely and allows the different analyses to be reproduced. I only had to manually install two packages (earth and e1071). They could easily be added to the list of dependencies in the R script to fully automate the installation. The authors also clarified their analysis of the comparison of models, and the overstatement concerning the RF model has been corrected.

    However, I still think that the abstract should be amended to better match the conclusions of the cross-experiment test. The authors acknowledged, in their response and in the text (line 226), that one cross-experiment test leads to a loss of predictive accuracy.

    It also seems obvious from Figure 5, and this should probably be added to the text, that this loss of accuracy is not linked to a greater random dispersion of the points, but to a systematic model bias. I agree with the authors that this may be due to some changes in the experimental conditions. My point is that these changes are not completely captured by the model, even with the inclusion of non-structural traits. I therefore still think that there is some overstatement/ambiguity in the abstract, in particular in the sentence 'The high prediction accuracy based on this model, in particular the cross experiment performance, will contribute to relieve the phenotyping bottleneck in biomass measurement in breeding applications'. This may, however, be easily fixed.
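
The distinction between systematic bias and random dispersion can be made quantitative by splitting the mean squared error into a squared mean error and a residual variance. The sketch below is an illustration under stated assumptions, not the analysis behind Figure 5: the function name and all numbers are hypothetical, and only a constant offset is treated as bias.

```python
from math import sqrt


def error_decomposition(observed, predicted):
    """Split mean squared error into squared bias and residual variance.

    Uses the identity MSE = bias**2 + variance, where bias is the mean
    prediction error and variance is the population variance of errors.
    """
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    bias = sum(errors) / n
    variance = sum((e - bias) ** 2 for e in errors) / n
    mse = sum(e ** 2 for e in errors) / n
    return {
        "bias": bias,                  # systematic over- or under-prediction
        "spread": sqrt(variance),      # random dispersion around the bias
        "rmse": sqrt(mse),
        "bias_share": bias ** 2 / mse  # fraction of MSE explained by bias
    }


# Hypothetical biomass values (arbitrary units).
observed = [10.0, 20.0, 30.0, 40.0]
shifted = [13.0, 22.0, 34.0, 43.0]  # consistently too high
noisy = [13.0, 17.0, 34.0, 36.0]    # scattered, no net offset

print(error_decomposition(observed, shifted))
print(error_decomposition(observed, noisy))
```

A `bias_share` close to 1 indicates that most of the error comes from a consistent shift, whereas a value close to 0 indicates scatter. A bias that grows with plant size would not be captured by this constant-offset view and would instead call for regressing predicted on observed values and inspecting slope and intercept.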

  2. Abstract

    This paper has been published under an Open Access CC-BY 4.0 license in the journal GigaScience, which includes Open Peer Reviews published under the same license. These are as follows:

    Reviewer 1: Malia Gehan

    Reviewer Comments to Author, version 1: Image datasets are available and are a valuable community resource. The code is available, which is great. While I definitely appreciate the authors' work, I don't think the data support some of the statements throughout the paper, especially when it comes to the wording regarding MLR vs other models, unless further clarification can be provided (Figure 3). In some of the conditions (stress for example) MLR looks better than the other models. The inclusion of color, NIR, and Fluor traits into models is interesting.

    Lines 14-15: I think this statement needs to be qualified by saying that it is a challenge to find a predictive biomass model across experiments, not that it is a challenge to find a biomass model 'in the context of high-throughput phenotyping', which is vague and I don't think accurate without further clarification considering the number of previous papers that model biomass from images with high correlation to ground truth measurements.

    Lines 34 to 40: lacking in citations of literature. Introduction in general needs improvement in terms of the previous literature that it cites.

    The second paragraph of the intro is a very limited, short review of the literature, but there are a number of papers that model biomass using HT phenotyping that are not represented, including Yang et al. 2014 (Nature Communications), Montes et al. 2011 (Field Crops Research) and Fahlgren et al. 2015 (Molecular Plant), to name a few.

    Line 45: "On the other hand, to produce reliable assessments, suitable model types needs to be established and model construction requires integration of many components such as efficient mathematical analysis and representative data." Very vague.

    Line 58: Please clarify this statement: "Another concern is that the number of traits used in these studies were quite limited and perhaps not representative enough. Therefore, a more effective and powerful model is needed to overcome these limitations and to allow better utilization of the image-based plant features which are obtained from non-invasive phenotyping approaches." Not sure what this means exactly, very vague considering that the papers mentioned do have models of biomass that are not 'perfect' but do have high heritability and correlation with ground truth measurements.

    I think the authors need to adjust the justification of their research to stress that there need to be biomass models that can be used across experiments/environments/treatments, which they do say, but which needs to be stated more clearly. In general, many of the justification statements, which are pointed out in points 3 and 4 above, are obscure to the point that they lose meaning.

    Line 146: "Although the performance of these models was roughly similar, RF, SVR and MARS methods had better performance than the MLR method for prediction of both FW (Fig. 3B) and DW (Fig. 3D), implying a nonlinear relationship between image-based phenotypic profiles and biomass output." This doesn't seem accurate; it looks like MLR has just as good predictive power in many of the situations presented. I don't think you can say that MLR and the others are roughly similar and then say that this implies a nonlinear relationship. Can this conclusion be clarified? It seems like there are only small differences between the models.

    Regardless of whether or not random forest is the 'best' model, the data doesn't seem to support the statement that the RF model 'largely' outperformed the other models. This only seems accurate under the control condition, can this be clarified?
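
Whether one model 'largely' outperforms another depends on how far apart their scores are on each criterion. The sketch below computes two such criteria side by side for two sets of predictions. It rests on assumptions: R2 is taken here as the coefficient of determination (the manuscript may instead use a squared correlation), RMSRE is taken to mean root mean squared relative error, and the labels `model_a`/`model_b` and all values are hypothetical rather than results from the paper.

```python
from math import sqrt


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def rmsre(observed, predicted):
    """Root mean squared relative error (assumed meaning of RMSRE)."""
    rel_sq = [((p - o) / o) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(rel_sq) / len(rel_sq))


# Hypothetical measurements and predictions from two candidate models.
observed = [10.0, 20.0, 40.0, 80.0]
predictions = {
    "model_a": [11.0, 19.0, 42.0, 76.0],
    "model_b": [12.0, 18.0, 44.0, 72.0],
}

scores = {
    name: {"r2": r_squared(observed, pred), "rmsre": rmsre(observed, pred)}
    for name, pred in predictions.items()
}
for name, s in scores.items():
    print(f"{name}: R2 = {s['r2']:.4f}, RMSRE = {s['rmsre']:.4f}")
```

Reporting both criteria per model and per condition (control, stress) makes it easier to see whether a ranking is stable or whether it changes with the criterion, which is the ambiguity raised about Figure 3.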

    Line 238: "Although previous attempts have been made to estimate plant biomass from image data, most of these studies consider only a single image-based feature or very few features in their models which are often linear-based, ignoring the fact that the phenotypic components underlying biomass accumulation are presumably complex. Accurately predicting biomass from image data requires efficient mathematical models as well as representative image-derived features." I disagree with the authors on this point: if biomass can be modeled with a few features with high correlation, why does it matter if they presume that it is complex? Their more complex models still showed a decrease in R2 with environmental differences and between experiments, and without further clarification I don't find the data convincing in suggesting that the RF model outperforms other models (particularly MLR).

    Re-review: Chen et al. appear to have addressed each reviewer comment. Below are some minor language changes for the revised sections.

    Minor changes (language changes)

    1. Line 47: remove "some other traits"; it seems unnecessary
    2. Line 64: change "they" to "Busemeyer et al. 2013", and change "make it a question" to "question"
    3. Line 73: change "besides" to "Further"
    4. Line 75: change to "due to a lack of datasets for assessment"