Polygenic prediction of school performance in children with and without psychiatric disorders

Curation statements for this article:
  • Curated by eLife

    Summary: This is an interesting study researching how educational achievement (EA) can be predicted using genomic data when the sample is stratified to those without and those with diagnoses of common psychiatric disorders. The study is well powered using an impressive and representative sample and offers insights into the etiology of associations between psychiatric traits and educational achievement. The authors find evidence that the influence of common variants on EA is attenuated in individuals with a diagnosis of autism spectrum disorder (ASD) or ADHD.

This article has been Reviewed by the following groups


Abstract

Suboptimal school performance is often seen in children with psychiatric disorders and is influenced by both genetics and the environment. The educational attainment polygenic score (EA-PGS) has been shown to significantly predict school performance in the general population. Here we analyze the association of the EA-PGS with school performance in 18,495 children with, and 12,487 without, one or more of six psychiatric disorders and show that the variance in school performance explained by the EA-PGS is substantially lower in children with attention deficit hyperactivity disorder (ADHD) and autism spectrum disorder (ASD). Accounting for parents’ socioeconomic status obliterated the variance difference between ADHD, but not ASD, and controls. Given that a large proportion of the prediction performance of the EA-PGS originates from the family environment, our findings hint that family environmental influences on school performance might differ between ADHD and controls; studying this further will open new avenues to improve the school performance of children with ADHD.

Article activity feed

  1. Reviewer #2:

    The manuscript addresses an interesting question: whether genetic effects of common variants on educational attainment (EA) differ between individuals with and without psychiatric diagnoses. The dataset they use is ideally suited for such an analysis. The authors find evidence that the influence of common variants on EA is attenuated in individuals with a diagnosis of autism spectrum disorder (ASD) or ADHD.

    My main concern with the paper is the statistical analyses used to support the authors' conclusions. The authors draw conclusions from dividing individuals into subgroups and comparing the R^2 of the EA PGS between those subgroups. This analysis is liable to bias due to range restriction: if the subgroups have been selected based on low/high education, then the R^2 of a predictor will tend to be lower in the subgroups than in the overall sample. Furthermore, selection into a subgroup (here, a diagnosis of ASD or ADHD) is itself related to both education and the EA PGS, which could be contributing to the differences in R^2 the authors observe between subgroups.
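    The range-restriction concern can be illustrated with a small simulation. This is only a sketch on synthetic data: the effect size (0.3), the 10% case prevalence, and the way diagnosis liability depends on EA are all assumptions chosen for illustration, not values from the manuscript. The PGS has the same effect on EA for everyone, yet R^2 differs between the subgroups purely because of how they were selected.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Assumed data-generating model: one common PGS effect for everyone.
pgs = rng.standard_normal(n)
ea = 0.3 * pgs + np.sqrt(1 - 0.3**2) * rng.standard_normal(n)

# Hypothetical diagnosis liability: lower EA makes a diagnosis more likely.
liability = -0.8 * ea + rng.standard_normal(n)
case = liability > np.quantile(liability, 0.9)  # top 10% become "cases"

def r2(x, y):
    """Squared Pearson correlation, i.e. R^2 of a one-predictor regression."""
    return np.corrcoef(x, y)[0, 1] ** 2

r2_full = r2(pgs, ea)
r2_cases = r2(pgs[case], ea[case])
r2_controls = r2(pgs[~case], ea[~case])

print(f"R^2 full sample: {r2_full:.3f}")
print(f"R^2 controls:    {r2_controls:.3f}")
print(f"R^2 cases:       {r2_cases:.3f}")
```

    In this setup both subgroup R^2 values fall below the full-sample value, and the cases show the largest drop, even though no true difference in the PGS effect was simulated.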

    A more powerful and robust analysis would be to fit an interaction model in the full sample. The authors could regress each individual's EA jointly onto their EA PGS, their diagnoses coded as binary variables, and the interactions between the EA PGS and the diagnosis codings. The authors could do this jointly for all diagnoses in the full sample, which would account for comorbidities between psychiatric disorders. If the influence of the EA PGS is truly weaker in ASD and ADHD cases, there should be a negative interaction effect between the EA PGS and the ASD and ADHD diagnoses, which can be tested with a simple statistical test for a non-zero interaction effect.
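    A minimal sketch of such an interaction model is given below, fitted by ordinary least squares on simulated data. The sample size, prevalences, and effect sizes are assumptions made up for the example (a PGS slope of 0.30 that is reduced by 0.10 in ADHD cases and unchanged in ASD cases); a real analysis would also include the usual covariates such as age, sex, and principal components.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Simulated inputs (all values are illustrative assumptions).
pgs = rng.standard_normal(n)
adhd = rng.random(n) < 0.10
asd = rng.random(n) < 0.05
ea = (0.30 * pgs - 0.5 * adhd - 0.3 * asd
      - 0.10 * pgs * adhd              # weaker PGS effect in ADHD cases
      + rng.standard_normal(n))

# Design matrix: intercept, PGS, diagnoses, and PGS-by-diagnosis interactions.
names = ["intercept", "pgs", "adhd", "asd", "pgs:adhd", "pgs:asd"]
X = np.column_stack([np.ones(n), pgs, adhd, asd, pgs * adhd, pgs * asd]).astype(float)

# OLS estimates with classical standard errors.
beta, *_ = np.linalg.lstsq(X, ea, rcond=None)
resid = ea - X @ beta
sigma2 = resid @ resid / (n - X.shape[1])
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
z = beta / se  # test statistic for each coefficient being non-zero

for name, b, s, stat in zip(names, beta, se, z):
    print(f"{name:10s} beta={b:+.3f} se={s:.3f} z={stat:+.2f}")
```

    The coefficients on `pgs:adhd` and `pgs:asd` are the quantities of interest: a clearly negative value indicates an attenuated PGS effect in that diagnostic group.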

    It could also be worth first regressing the EA PGS onto the psychiatric diagnoses, and taking the residuals before assessing whether there are interactions between the EA PGS and ADHD/ASD diagnosis. It is possible that correlation between the EA PGS and ADHD/ASD diagnosis could generate a spurious interaction effect in the above analysis.
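    The residualization step could look like the following sketch. The helper `residualize` and the simulated shift in PGS among cases are hypothetical and used only to show the mechanics: after regressing the PGS on the diagnosis indicators, the residual PGS no longer differs on average between cases and non-cases.

```python
import numpy as np

def residualize(y, covariates):
    """Return residuals of y after OLS regression on covariates plus an intercept."""
    X = np.column_stack([np.ones(len(y)), covariates]).astype(float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

rng = np.random.default_rng(2)
n = 20_000
adhd = rng.random(n) < 0.10
asd = rng.random(n) < 0.05

# Assumed for illustration: cases have a lower average PGS.
pgs = rng.standard_normal(n) - 0.3 * adhd - 0.1 * asd

pgs_resid = residualize(pgs, np.column_stack([adhd, asd]))

raw_gap = pgs[adhd].mean() - pgs[~adhd].mean()
resid_mean_cases = pgs_resid[adhd].mean()
print(f"raw PGS gap (ADHD vs. rest):   {raw_gap:+.3f}")
print(f"mean residual PGS, ADHD cases: {resid_mean_cases:+.1e}")
```

    The residual PGS can then replace the raw PGS in the interaction model.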

    It is interesting that controlling for SES appears to mediate the (potential) interaction between EA PGS and ADHD diagnosis. However, I worry again that this could be a function of SES influencing ADHD diagnosis. SES and its interaction with both EA PGS and ADHD diagnosis could also be included in a full interaction model that could help interpret this finding.

    The authors construct the PGS by using a pruning and thresholding approach. This is known to be suboptimal, which may explain why their R^2 is lower than in other studies. The authors could use LD-pred or other methods that account for linkage disequilibrium and non-infinitesimal genetic architectures. In the EA GWAS from which the score was constructed, the best R^2 was found by applying LD-pred to all variants without p-value thresholding.

    The hypothesis that indirect genetic effects differ between psychiatric cases and controls is interesting. Do the authors have sufficient sibling data within their samples to test this?

    Line 581: Closely related individuals were removed from the analysis. Why? How many were removed? Could inclusion of these help assess the hypothesis about indirect genetic effects and improve power? The authors could use a mixed model regression to control for relatedness without having to throw individuals out of their sample.

    The grammar in the writing of the paper is a little odd at times. Often, definite or indefinite articles are omitted preceding nouns, such as in 'association of EA-PGS' in the abstract, which should be 'association of the EA-PGS'.

    Line 54: 'strongly influences'. I think this is a little overconfident in its assignment of causality to highest level of education; perhaps 'is strongly associated with' would be better.

    Paragraph 3 of the introduction: the authors should mention population stratification and assortative mating as possible contributors to the association between EA PGS and EA, especially when referencing the drop in association strength in within-family designs.

    I found the decile based analyses a bit pointless. By arbitrarily dividing a continuous outcome into discrete subgroups, the authors are losing power and not gaining much compared to simply performing linear regression, which they already do. I would relegate these to supplementary figures.

    Line 452: I think that the stated equivalence between low EA PGS and learning difficulties goes a bit too far here. I understand the point the authors are trying to make, but I think it should be phrased more carefully.

    The authors used an MAF threshold of 5% for construction of the score. Typically, a threshold of 1% is used for construction of PGS from summary statistics by software such as LD-pred.

    Line 580: the authors state that an EA PGS based on summary statistics from European samples cannot be used to predict EA in non-European samples. This is not true. It is true that the prediction accuracy is attenuated, but it is not zero.

  2. Reviewer #1:

    This is overall a well written and methodologically sound study researching how educational achievement can be predicted using genomic data when the sample is stratified to those without and those with diagnoses of common psychiatric disorders. I think that it is a very important study area, the study is well powered using a fantastic representative sample and offers some insights into aetiology of associations between psychiatric traits and educational achievement.

    I suggest some minor adjustments for the authors to consider, mainly addressing the conclusions and implications of the findings. I also recommend some clarifications in the methods and the results sections; these suggestions might require some very modest additional analyses and rethinking/rewording some of the conclusions.

    • The major issue I have is that you discuss family SES as a purely environmental factor throughout the manuscript. However, we know that this is not the case and that there is substantial heritability for SES. This follows from what the SES composite is made of, in your case parental education and occupation, both of which are highly heritable (as you rightly note in the manuscript yourself). This needs to be addressed and discussed throughout the manuscript.

    • The major conclusion in the manuscript, even if you acknowledge that this is speculation, is that the attenuation of the association between EA-PGS and school grades after correcting for SES can be explained by genetic nurture. I agree that this can be one of the explanations. However, here you also control (partially) for transmitted genes, that is, education-related genetic variants present in both generations, so without genotyped trios you cannot distinguish between direct and indirect genetic effects. This attenuation can also be explained by gene-environment correlation (rGE): not only passive rGE, which is addressed by the genetic nurture hypothesis, but also active and evocative rGE. Finally, in your design, you need to consider assortative mating. I suggest directly addressing this in the manuscript.

    • I also think that you should address that you are dealing with diagnosed disorders only. It is a great strength of the paper, and you are using a fantastic resource, but we know that these disorders are quantitative traits and your study does not allow you to take that into account, so individuals with high ADHD symptoms are possibly included in the control group; similarly, you cannot take symptom severity into account. In terms of symptom-level data, I see you have referenced the Selzam et al., 2019 paper that, among other things, related EA-PGS to ADHD symptoms and vice versa, and also controlled for SES.

    • In the introduction, you rightly state that individual differences are explained by genetic and environmental factors and the interplay between them. However, I suggest rephrasing it, because "much of the variance can be explained" is incorrect: all of the individual differences can be explained by the combination of these factors.

    • You report a low rG between schizophrenia and E1. Can you specify how this was calculated?

    • You state that your prediction in the control sample is lower than in other studies and offer the inclusion or exclusion of 23andMe data in the summary statistics as a possible explanation. Please note that other studies have not used 23andMe statistics either (for example, TEDS publications). You also discuss genetic heterogeneity; I think that the difference can be explained by both genetic and environmental heterogeneity. What is the rG between EA in your sample and the GWAS sample?

    • I think that the conclusion that the impact of low EA-PGS is comparable to the impact of ADHD is too strong; your data do not support it. I suggest rephrasing it, especially as we're not aware of the associated mechanisms. Note that people with ADHD in your sample also have lower EA-PGS compared to controls. In addition, symptom severity of ADHD varies greatly.

    • I also do not agree with the statement that having wealthy parents does not boost the performance as much for children with ADHD as compared to children without for the reasons mentioned above.

    • I think that you have fantastic data, and you have data available about how many of your participants have multiple diagnoses. I suggest adding groups stratified by number of diagnoses to the analyses, that is, adding groups with 2, 3, or 4 and more psychiatric diagnoses and checking the polygenic score prediction of EA in each.

    • I suggest making it clearer which covariates were used in each analysis. You first say that you added psychiatric diagnoses as a covariate among the usual covariates, but later only that covariates were included 'as before'; I assume you did not include diagnoses in the later analyses, but this is not clear. In addition, it is not clear to me why you control for psychiatric diagnoses in the first set of analyses; I would have wanted to see full results without this covariate.

    Overall, this is a beautiful study and it was a pleasure to read/review it.
