Common host variation drives malaria parasite fitness in healthy human red cells

Curation statements for this article:
  • Curated by eLife

    Evaluation Summary:

    This paper investigates the role of common human genetic variation in explaining the relationship between host genetics, red blood cell physiology, and susceptibility to Plasmodium falciparum (the parasite responsible for malaria). It finds evidence that common variants in a small set of red blood cell proteins predict parasite invasion and growth rates. Contrary to hypotheses about ancestry-associated malaria selection, these variants are not more common in African ancestry populations. The approach used to select host factors that impact parasite fitness is pragmatic, especially in the context of a small sample size, but the high predictive accuracy (despite moderate within-subject assay replicability) and the uncertain influence of including closely related family members in the analysis raise some concerns about generalizability beyond the study sample.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. The reviewers remained anonymous to the authors.)

Abstract

The replication of Plasmodium falciparum parasites within red blood cells (RBCs) causes severe disease in humans, especially in Africa. Deleterious alleles like hemoglobin S are well-known to confer strong resistance to malaria, but the effects of common RBC variation are largely undetermined. Here, we collected fresh blood samples from 121 healthy donors, most with African ancestry, and performed exome sequencing, detailed RBC phenotyping, and parasite fitness assays. Over one-third of healthy donors unknowingly carried alleles for G6PD deficiency or hemoglobinopathies, which were associated with characteristic RBC phenotypes. Among non-carriers alone, variation in RBC hydration, membrane deformability, and volume was strongly associated with P. falciparum growth rate. Common genetic variants in PIEZO1, SPTA1/SPTB, and several P. falciparum invasion receptors were also associated with parasite growth rate. Interestingly, we observed little or negative evidence for divergent selection on non-pathogenic RBC variation between Africans and Europeans. These findings suggest a model in which globally widespread variation in a moderate number of genes and phenotypes modulates P. falciparum fitness in RBCs.

Article activity feed

  1. Author Response:

    Reviewer #1 (Public Review):

    [...] However, I also have some concerns about the main predictive model result. Although the parasite invasion/growth phenotypes are arguably simpler than an overall in vivo malaria disease phenotype, the reported 40-80% variance explained by the LASSO models strikes me as concerningly optimistic. Notably, the correlation in the growth phenotype for repeated samples from the same individuals (sampled weeks apart) is only rho = 0.34 (and for invasion, it is only 0.05). Given that a trait's repeatability is the upper limit to its heritability, and genetic prediction is based on a trait's heritable component, I do not understand how the trait prediction can be as strong as currently reported. Because the result is so striking, it will be crucial to perform true out-of-sample prediction to evaluate predictive accuracy and generalization error.

    We agree with the reviewer that the high values of variance explained in an earlier version of this work may have reflected overfitting of the LASSO models, even in randomized data. We have now reanalyzed the data in a k-fold cross-validation framework, as described in Essential Revisions. As expected, we observe lower predictive accuracy in smaller test datasets than in larger train datasets. Nonetheless, real data and malaria-associated genes produce models that are significantly more predictive of P. falciparum fitness in test data than expected from permutation or random RBC genes. We note that noise across repeated measurements from the same individuals, taken weeks or months apart, is likely to reflect variation from technical inconsistencies as well as environment-dependent biology.
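
    The general shape of such a check can be illustrated with a minimal sketch. It assumes simulated data and a one-predictor least-squares fit as a stand-in for the authors' LASSO models. The function names (`cv_r2`, `permutation_p`), the fold count, and all numbers are illustrative and are not taken from the manuscript.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return slope, my - slope * mx


def kfold_indices(n, k, rng):
    """Shuffle 0..n-1 and split the indices into k nearly equal test folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_r2(xs, ys, k, rng):
    """Out-of-sample R^2: each donor is predicted by a model that never saw it."""
    preds = [0.0] * len(ys)
    for test in kfold_indices(len(ys), k, rng):
        held_out = set(test)
        train = [i for i in range(len(ys)) if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in test:
            preds[i] = slope * xs[i] + intercept
    my = statistics.fmean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def permutation_p(xs, ys, k=5, n_perm=200, seed=0):
    """Compare cross-validated R^2 with a null built by shuffling the outcome."""
    rng = random.Random(seed)
    observed = cv_r2(xs, ys, k, rng)
    exceed = 0
    for _ in range(n_perm):
        shuffled = ys[:]
        rng.shuffle(shuffled)  # breaks any real predictor-outcome link
        if cv_r2(xs, shuffled, k, rng) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


# Simulated stand-ins (hypothetical): one RBC phenotype and a growth rate.
sim = random.Random(1)
phenotype = [sim.gauss(0, 1) for _ in range(100)]
growth = [0.8 * x + sim.gauss(0, 0.6) for x in phenotype]
r2, p = permutation_p(phenotype, growth)
print(f"cross-validated R^2 = {r2:.2f}, permutation p = {p:.3f}")
```

    Scoring only held-out donors is what separates this from in-sample variance explained, and the permutation null shows how much apparent accuracy the procedure yields when no real signal is present.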

    Assuming out-of-sample prediction holds up, it is interesting that the genotype data add substantially to predictive accuracy even after directly considering RBC phenotypes themselves. As the authors note, this result suggests that the mechanisms through which the genetic effects act are independent of the measured phenotypes. This prediction should be further evaluated (e.g., by assessing genotype-RBC phenotype correlations).

    We agree with the reviewer that some of the observed genetic effects must be mediated through phenotypes that we did not measure, which is quite interesting given the large number of phenotypes that we did measure. Additional phenotypes of interest include quantitative proteomics, transcriptomics, and metabolomics, among others, as addressed in the revised discussion. We plan to evaluate correlations between RBC genotypes and such phenotypes in future work, as this is outside of the scope of the current manuscript.

    Finally, although the results suggest no polarization of allele frequencies by European versus African ancestry, this result should be interpreted with caution throughout the manuscript, since it's unlikely that the predictive variants identified by LASSO are in fact causal.

    We agree with the reviewer that, given that our SNPs are likely to be imperfectly linked to the causal SNPs, some marginal signal of ancestry polarization of the causal SNPs could be lost. In the discussion, we agree that the predictive variants identified by LASSO may merely be linked to the true causal variants. However, since linked alleles have correlated frequencies within populations, we think this is unlikely to substantially impact our conclusions about African and European ancestry with regard to small-effect alleles. We discuss how the lack of enrichment for most protective alleles in Africans is also supported by recent GWAS for severe malaria (MalariaGEN, 2019) and patterns of RBC trait variation observed here and in other studies. We provide several possible explanations for this consistent observation, including extensive pleiotropy of small-effect alleles (see Boyle, Li, and Pritchard 2017 and correlations with other phenotypes in Figure 5-Source Data 3).

    Reviewer #2 (Public Review):

    [...] 1. The authors note that the members of one family (a mother and five children) are not carriers of known genetic loci. Figure 5-figure supplement 4 shows that they have significantly different distributions than other non-carriers with regard to principal components and parasite invasion and growth rate. My concern is that many of the tests in the manuscript assume independent observations and related individuals violate this assumption. The children should be removed from all analyses to test for the sensitivity of results to this structure in the data.

    We have revised the analysis after excluding the five siblings and verifying that the remaining donors are unrelated.

    2. This is also related to the increase in % variance explained in their lasso models when including genetics. It would be useful to know how much of the outcome variation was from the inclusion of the principal components specifically (capturing the family) versus the variants of interest.

    In the prior analysis, the PCs specific to the family explained up to 24% of the variation in invasion and 3% in growth in non-carriers. In the current analysis with the children excluded, PCs no longer have predictive power for growth or invasion. This change reflects the genetic uniqueness of the family, which directly produced the prior associations.

    3. It would be helpful to know some more about the variants that were included from exome sequencing. This would include their allele and genotype frequencies, as well as the comparison with reference population frequencies.

    We have added Figure 1-source data 1, which contains this information for ~160,000 exome variants that passed our quality filters.

    4. Are the frequencies of known RBC disease alleles consistent with population estimates? It would be useful to assess the representativeness of the sample.

    This information is now provided in Figure 1-source data 1. The frequencies of RBC disease alleles in our sample of African and admixed individuals are consistent with estimates from African populations.

    5. I would appreciate knowing a bit more about the difference between the two strains, one lab adapted and one clinical. Is it known how the lab strain was adapted or how representative it is of circulating strains? If so, it may be worth describing in the discussion to explain the differences in results between the strains.

    We have added more details on the two divergent strains to the results and methods. We also discuss the strong correlations between the strains, including for specific phenotypes and genotypes, which suggest that our results may be generalizable. Finally, we note the interesting differences between the strains for African ancestry and HbAC carriers.

  2. Evaluation Summary:

    This paper investigates the role of common human genetic variation in explaining the relationship between host genetics, red blood cell physiology, and susceptibility to Plasmodium falciparum (the parasite responsible for malaria). It finds evidence that common variants in a small set of red blood cell proteins predict parasite invasion and growth rates. Contrary to hypotheses about ancestry-associated malaria selection, these variants are not more common in African ancestry populations. The approach used to select host factors that impact parasite fitness is pragmatic, especially in the context of a small sample size, but the high predictive accuracy (despite moderate within-subject assay replicability) and the uncertain influence of including closely related family members in the analysis raise some concerns about generalizability beyond the study sample.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. The reviewers remained anonymous to the authors.)

  3. Reviewer #1 (Public Review):

    Understanding the genetic basis of infectious disease is a challenge, and even large-scale GWAS for important public health burdens like malaria explain only a small percentage of overall trait variability. An alternative approach is to isolate aspects of disease progression that can be studied using in vitro assays. Here, Ebel et al. adopt such an approach to investigate the contribution of common genetic variants (beyond well-established malaria-associated/RBC disease alleles) to variance in Plasmodium falciparum invasion and growth rates. In a panel of 121 donors, they show that RBC phenotypes and parasite invasion/growth rates in non-disease allele carriers largely overlap with the range of phenotypes measured in carriers. Most remarkably, their results suggest that RBC phenotype data on traits like fragility and deformability, combined with genotype data for genetic variants in a small set of 23 RBC proteins, can predict up to 83% of variance in Plasmodium growth rates.

    This manuscript has several strengths. First, it highlights the importance of the normal range of phenotypic variation in non-disease allele carriers for explaining variation in malaria infection and growth, and shows that this variation is often not well-correlated with ancestry. Second, it suggests how in vitro studies of malaria infection might be useful in identifying genetic variants that are not captured in GWAS (e.g., because the disease trait modeled in GWAS reflects a complex mixture of many component processes that may be better isolated in experimental models). Third, it suggests that the expectation of higher malaria resistance allele frequencies in African ancestry populations relative to European ancestry populations may be overly simplistic, which serves as an important reminder of the complexity of human biological variation.

    However, I also have some concerns about the main predictive model result. Although the parasite invasion/growth phenotypes are arguably simpler than an overall in vivo malaria disease phenotype, the reported 40-80% variance explained by the LASSO models strikes me as concerningly optimistic. Notably, the correlation in the growth phenotype for repeated samples from the same individuals (sampled weeks apart) is only rho = 0.34 (and for invasion, it is only 0.05). Given that a trait's repeatability is the upper limit to its heritability, and genetic prediction is based on a trait's heritable component, I do not understand how the trait prediction can be as strong as currently reported. Because the result is so striking, it will be crucial to perform true out-of-sample prediction to evaluate predictive accuracy and generalization error.
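
    The repeatability argument above can be written out in standard quantitative-genetics notation. The symbols are assumptions introduced here for illustration and do not appear in the manuscript: V_P is the total variance of a single measurement, V_G the genetic variance, and V_PE the stable non-genetic variance between individuals.

```latex
\[
\begin{aligned}
R &= \frac{V_G + V_{PE}}{V_P}, \qquad H^2 = \frac{V_G}{V_P}, \\
H^2 &\le R \quad \text{because } V_{PE} \ge 0, \\
R^2_{\text{genotype-only}} &\le H^2 \le R .
\end{aligned}
\]
```

    If the reported between-sample correlations are taken as rough estimates of R, a predictor built from genotype alone would be capped near 0.34 for growth and 0.05 for invasion. Predictors that also use RBC phenotypes measured on the same sample are not bound in the same way, since those phenotypes may track sample-specific variation.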

    Assuming out-of-sample prediction holds up, it is interesting that the genotype data add substantially to predictive accuracy even after directly considering RBC phenotypes themselves. As the authors note, this result suggests that the mechanisms through which the genetic effects act are independent of the measured phenotypes. This prediction should be further evaluated (e.g., by assessing genotype-RBC phenotype correlations).

    Finally, although the results suggest no polarization of allele frequencies by European versus African ancestry, this result should be interpreted with caution throughout the manuscript, since it's unlikely that the predictive variants identified by LASSO are in fact causal.

  4. Reviewer #2 (Public Review):

    In this comprehensive study, the authors investigate the association of P. falciparum's (pathogen responsible for malaria) ability to invade and grow in red blood cells with red blood cell traits and host genetics. By directly measuring the RBC phenotypes, exome sequencing, and parasite growth rate on these samples, they can better understand how genetic variation directly influences a parasite's ability to invade and replicate within RBCs, as well as investigate the role of African ancestry in this relationship. I found the paper well-written with many of the conclusions supported by the evidence presented in the main text. However, I had a few concerns that should be addressed.

    1. The authors note that the members of one family (a mother and five children) are not carriers of known genetic loci. Figure 5-figure supplement 4 shows that they have significantly different distributions than other non-carriers with regard to principal components and parasite invasion and growth rate. My concern is that many of the tests in the manuscript assume independent observations and related individuals violate this assumption. The children should be removed from all analyses to test for the sensitivity of results to this structure in the data.

    2. This is also related to the increase in % variance explained in their lasso models when including genetics. It would be useful to know how much of the outcome variation was from the inclusion of the principal components specifically (capturing the family) versus the variants of interest.

    3. It would be helpful to know some more about the variants that were included from exome sequencing. This would include their allele and genotype frequencies, as well as the comparison with reference population frequencies.

    4. Are the frequencies of known RBC disease alleles consistent with population estimates? It would be useful to assess the representativeness of the sample.

    5. I would appreciate knowing a bit more about the difference between the two strains, one lab adapted and one clinical. Is it known how the lab strain was adapted or how representative it is of circulating strains? If so, it may be worth describing in the discussion to explain the differences in results between the strains.

  5. Reviewer #3 (Public Review):

    The aim of this study by Ebel et al. was to investigate host genetic and phenotypic factors underlying variation in Plasmodium falciparum fitness in the red blood cell (RBC). They found that variation in common RBC phenotypes such as MCV, deformability, and hydration status contributes substantially to parasite fitness. RBC phenotypic variation, together with genetic variants in genes encoding important RBC proteins, explained 71-83% of variation in parasite growth. Importantly, they did not identify large ancestry-specific effects on parasite fitness, highlighting that the majority of common human genetic variation is shared among all human populations.

    The conclusions of this paper are well supported by their data and analytical approach, but some aspects of the experimental design need to be clarified and expanded.

    Strengths:

    The main strength of this study is the pragmatic approach used to identify genetic and phenotypic factors underlying variation in parasite fitness in a small sample size: utilising the Lasso approach to guide their variable selection. Their hypothesis-led approach also ensures that their significant results are reliable.

    Weaknesses:

    A limitation of the targeted variable selection approach, which the authors have acknowledged in their discussion, is that it relies heavily on prior knowledge of genetic variants and RBC phenotypes and might therefore miss additional host factors that influence parasite fitness.

    The authors used one laboratory parasite strain and one field isolate in the study, which might limit their conclusions on parasite fitness to strains that follow a similar invasion and growth profile.

    The population studied (admixed African ancestry individuals in a non-malaria endemic setting) might also limit their conclusions on population divergence in genetic variants and RBC phenotypes with substantial effects on parasite fitness.