1. Reviewed by eLife

    Reviewer #2:

    In this study, the authors perform an impressive field phenotyping experiment on three grafted grapevines, all with the common scion cultivar 'Chambourcin', alongside an ungrafted control to assess the associations between rootstock and leaf traits. The traits collected include ionomics, metabolomics, transcriptomics, leaf morphology and physiology. In addition, the authors collect these samples at three phenological stages to incorporate seasonal variation. The authors apply a combination of classification and machine learning methods to test whether features within each phenotypic measurement are predictive of genotype. In some cases, such as the ionomics data, certain ions are predictive of rootstock genotype but only at certain seasonal time points. The datasets presented here are extensive and will be of value to the horticulture field, since grafting is such a common technique in cultivating many crops. Considering the scale of this experiment, the manuscript is at times disconnected, in large part because each dataset is analyzed independently without any integration across phenotypes. The results presented highlight a stronger effect of phenology than of rootstock on the phenotypes measured.

    Major comments:

    1. It would be very helpful to have a diagram with the layout in the field and the sampling strategy or a more detailed explanation. This would help to associate which phenotypic data was collected at the same time and on the same plants. For example, it would expand on what is mentioned on line 348 "row 8 sampled early in the day". It would help to know what time of day the samples in each row were collected. Additionally, how do the different irrigation treatments factor into the sampling? A better introduction of the experimental design is needed at the start of the results section along with a description of the genotypes and why they were selected.

    2. I understand why running a PCA before the LDA can help reduce the dimensionality of the space to be able to invert the covariance matrix (if that was the motivation?) but is this because there were issues with running LDA alone? I wonder if you've lost important discriminating information between the classes by doing this. Was the LDA run on the datasets first prior to the PCA? This may uncover additional classification that was eliminated by the PCA.

    3. For the Random Forest analysis, the authors might consider using k-fold cross-validation rather than a single partition of the dataset. This is especially beneficial when working with smaller datasets and might improve the predictions. Could all the importance scores be reported, rather than just the couple mentioned in the text (line 296)?

    4. In reference to Figure 1B and C, it would be helpful to indicate on the plots which comparisons are significant based on their model tests. The full test results are presumably in the Excel spreadsheet referred to in the reporting form, although it was not found with the manuscript materials.

    5. Throughout the text there is very little mention of the various grafted genotypes and what is known about the lines. The authors should consider introducing these genotypes and why they were selected for the grafting experiment. What is different among these lines? There is very little discussion of the comparisons between genotypes and what phenotypes are significantly different between the lines and what the implications are for the plant as a whole.

    6. Line 287 refers to a post-hoc analysis of the ions. Do the ions showing significant variation explained by rootstock and phenology match the ions identified in the ML as important classifiers?

    7. For such a large metabolomic dataset, it is surprising that the authors do not present any identification of the metabolites highlighted. The identification of the metabolite features that were found to influence the rootstock main effect would be of interest and might reveal interesting biology. How did these metabolites differ between genotypes? On line 501 in the discussion there is mention of flavanols and stilbenes yet these weren't highlighted in the results section.

    8. What is the reasoning for not simply applying a linear modeling approach such as limma on the gene expression data first instead of only applying it to the PCs in order to identify differentially expressed genes between the genotypes? If phenological stage is the strongest effect, what if you run the analysis within each stage to look specifically at the differential responses between grafted lines at each stage? The analysis of the gene expression data, similar to the metabolomics data, seems to be missing an opportunity to uncover underlying biological mechanisms contributing to any genotype effects of grafting, a stated goal of the study. What genes are differentially expressed and do they relate to the metabolomic or ionomic data?

    9. In the methods, three irrigation treatments are described, yet they are not mentioned in the results section. While it seems as though rainfall mitigated much of the irrigation effect, there do appear to be differences in water availability to the vines, as described in the provided github page. Were the various irrigation treatment sets sampled for all phenotypes? Or were the ionomics, metabolomics and transcriptome analyses done on the same irrigation treatments? If not, was this effect considered in the analysis? This is yet another variable that would greatly influence the response and should be considered when assessing the effects of grafting. Further detail about the sampling and conditions is needed to clarify.

    10. In Figure 1 there is information about leaf age. For the metabolomics a mature leaf was sampled, for the transcriptomics the youngest leaf, and for the physiology it is not specified. Could you clarify which leaves were sampled and how they relate across phenotypes? This is an important point to mention given the differences observed for the ionomics data.

    11. In reference to the vine physiology, were these all collected from the same irrigation treatment? Was the sampling of each genotype spread out over the 3h window to account for time-of-day variation? It would be helpful to have the significant comparisons indicated in the figure. What do the letters accompanying the p-values on lines 402-403 refer to? This section would be greatly improved by additional clarity in the text.

    12. Given the focus on grafting, the analysis presented in Figure 6 does not seem to contribute to this objective. Could this be expanded on to look within and across genotypes to see if different phenotypes covary and to compare the dimensions of variation across genotypes rather than combining them all together? This would complement the previous analyses and hopefully reveal the differences that were highlighted in the earlier sections.

    13. The results section is very disjointed and the datasets are presented almost as completely separate studies. To improve clarity in the results section, the authors might consider expanding on the findings of the LDA and ML analysis for each phenotype and connecting them together.
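The k-fold cross-validation suggested in major comment 3 can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the nearest-centroid classifier is a deliberately simple stand-in for a Random Forest, and the class labels `rootstock_A` and `rootstock_B`, the toy feature values, and all function names are hypothetical. The point is only that every sample is held out exactly once, so no data are permanently set aside for testing.

```python
import random
from collections import defaultdict


def stratified_k_folds(labels, k, seed=0):
    """Assign every sample index to one of k folds, balancing classes across folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin into the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def nearest_centroid_predict(train_X, train_y, test_X):
    """Stand-in classifier (NOT a Random Forest): pick the class with the closest mean."""
    sums, counts = {}, defaultdict(int)
    for row, label in zip(train_X, train_y):
        previous = sums.get(label, [0.0] * len(row))
        sums[label] = [s + v for s, v in zip(previous, row)]
        counts[label] += 1
    centroids = {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}

    def squared_distance(row, centre):
        return sum((a - b) ** 2 for a, b in zip(row, centre))

    return [min(centroids, key=lambda lab: squared_distance(row, centroids[lab]))
            for row in test_X]


def cross_validated_accuracy(X, y, k=5, seed=0):
    """Hold out each fold once, train on the rest, and pool the correct predictions."""
    correct = 0
    for held_out in stratified_k_folds(y, k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(y)) if i not in held]
        predictions = nearest_centroid_predict(
            [X[i] for i in train_idx], [y[i] for i in train_idx],
            [X[i] for i in held_out])
        correct += sum(p == y[i] for p, i in zip(predictions, held_out))
    return correct / len(y)


# Hypothetical toy data: two well-separated classes with ten samples each.
X = [[0.1 * i, 0.0] for i in range(10)] + [[5.0 + 0.1 * i, 1.0] for i in range(10)]
y = ["rootstock_A"] * 10 + ["rootstock_B"] * 10
accuracy = cross_validated_accuracy(X, y, k=5)
```

With a real dataset, the stand-in classifier would be replaced by the Random Forest already used in the manuscript, and importance scores could be averaged over the folds.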

  2. Reviewed by eLife

    Reviewer #1:

    In this manuscript, the authors look at the influence of rootstock genotype on a single scion genotype in Vitis. This includes a lovely, highly replicated design with differential water availability. While the experimental design is very elegant, I'm less sure that using general PCs or ML is the best approach to grab the signal of interest.

    Is there evidence that the top 20 PCs of the metabolome or the top 100 PCs are the point beyond which no new information about the system is gained? For example, if the top 20 PCs are all different descriptions of the water availability, then PC 21 might start to grab more information about the root-scion relationship. In this dataset, PC2-10 were largely about temporal block (lines 314-316). Large genomic datasets like this have an immense amount of variation, such that r2 is not a meaningful way to capture what is in a PC. I can understand the desire to minimize the statistical analysis, but if the goal is to fully interrogate the dataset, the authors should provide an empirical reason for stopping at pre-ordained PCs. Possibly better would be to grab the lsmeans for the main factors in the model, to exclude the blocking factors, and then run the PCs, as that is the underlying interest in the experiment.

    The focus on PCs or on using ML on the full dataset also hinders the ability to get at the underlying root/scion and water availability connection. Given that phenology and blocking are the main sources of variance, using these approaches rather than a direct GLM or PC on lsmeans/BLUPs weakens the authors' ability to use the power in their experimental design. PC and ML can only capture the largest components of variance, while GLMs that account for these larger sources of variance can begin to dive into the underlying questions. There is a possibility that the authors did attempt these directed GLMs with no luck, but that was not stated.
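The concern above can be illustrated with a small sketch. It assumes a balanced design, in which subtracting block means is a simple stand-in for model-based adjustment (lsmeans/BLUPs); the data, factor levels (`A`, `B`, `r1`, `r2`), and function names are all hypothetical. It compares the share of variance lying between rootstock groups before and after the block effect is removed.

```python
from collections import defaultdict


def group_means(values, groups):
    """Mean of the values within each group label."""
    totals, counts = defaultdict(float), defaultdict(int)
    for value, group in zip(values, groups):
        totals[group] += value
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}


def remove_group_means(values, groups):
    """Subtract each sample's group mean (here: the block effect, balanced design assumed)."""
    means = group_means(values, groups)
    return [value - means[group] for value, group in zip(values, groups)]


def variance_share(values, groups):
    """Fraction of the total sum of squares that lies between group means (eta-squared)."""
    grand_mean = sum(values) / len(values)
    total_ss = sum((value - grand_mean) ** 2 for value in values)
    means = group_means(values, groups)
    between_ss = sum((means[group] - grand_mean) ** 2 for group in groups)
    return between_ss / total_ss


# Hypothetical toy trait: a large block effect (10 units) and a small rootstock effect (1 unit).
block = ["A"] * 4 + ["B"] * 4
rootstock = ["r1", "r1", "r2", "r2"] * 2
trait = [0.1, -0.1, 1.1, 0.9, 10.1, 9.9, 11.1, 10.9]

share_raw = variance_share(trait, rootstock)
share_adjusted = variance_share(remove_group_means(trait, block), rootstock)
print(f"rootstock share of variance: raw={share_raw:.3f}, block-adjusted={share_adjusted:.3f}")
```

In this toy case the rootstock accounts for about 1% of the raw variance but about 96% once block means are removed, which is why a PC computed on the raw values would mostly describe the block. With unbalanced data, the adjustment would need a fitted model rather than simple mean subtraction.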

    I think the use of PCs is my biggest reservation about the manuscript: the section on lines 409-430, which is the capstone of the paper, ends up being correlations of faceless PCs. Unfortunately, this leaves the reader with the idea that phenology is simply too strong to obtain any information about the root/scion connection or the water availability connection.

  3. Reviewed by eLife

    Summary: Experimentally, this is a very solid and nicely replicated experimental design that provides a strong ability to interrogate the questions at hand. Both reviewers had a concern that the use of PCs was underpowering the analysis to test the key questions that were the goal of the experiment. The manuscript could also be improved by working to interleave the different omics datasets to develop a deeper insight.
