Improving statistical power in severe malaria genetic association studies by augmenting phenotypic precision

Curation statements for this article:
  • Curated by eLife

Abstract

Severe falciparum malaria has substantially affected human evolution. Genetic association studies of patients with clinically defined severe malaria and matched population controls have helped characterise human genetic susceptibility to severe malaria, but phenotypic imprecision compromises discovered associations. In areas of high malaria transmission, the diagnosis of severe malaria in young children and, in particular, the distinction from bacterial sepsis are imprecise. We developed a probabilistic diagnostic model of severe malaria using platelet and white count data. Under this model, we re-analysed clinical and genetic data from 2220 Kenyan children with clinically defined severe malaria and 3940 population controls, adjusting for phenotype mis-labelling. Our model, validated by the distribution of sickle trait, estimated that approximately one-third of cases did not have severe malaria. We propose a data-tilting approach for case-control studies with phenotype mis-labelling and show that this reduces false discovery rates and improves statistical power in genome-wide association studies.
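
To make the idea of a probabilistic diagnostic model concrete, the following is a minimal sketch of a two-component mixture that turns a platelet count and a white count into a probability of severe malaria. It is not the authors' model: the independent log-normal components, every numeric parameter, and the function name `prob_severe_malaria` are assumptions chosen for illustration. The only figure taken from the abstract is the estimate that roughly one-third of cases did not have severe malaria, used here as the prior.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


# HYPOTHETICAL parameters: (mean, SD) of log10 counts in each component.
# Counts are assumed to be in units of 1000 cells per microlitre.
SEVERE_MALARIA = {"platelet": (math.log10(70), 0.35), "white": (math.log10(9), 0.25)}
NOT_SEVERE_MALARIA = {"platelet": (math.log10(250), 0.35), "white": (math.log10(14), 0.30)}


def prob_severe_malaria(platelet_count, white_count, prior=2 / 3):
    """Posterior probability of severe malaria under the toy mixture.

    `prior` is the assumed proportion of clinically defined cases that
    truly have severe malaria (about two-thirds, following the abstract).
    """
    observed = {"platelet": math.log10(platelet_count), "white": math.log10(white_count)}
    lik_sm = 1.0
    lik_other = 1.0
    for marker, value in observed.items():
        # Assumption: the two markers are independent within each component.
        lik_sm *= normal_pdf(value, *SEVERE_MALARIA[marker])
        lik_other *= normal_pdf(value, *NOT_SEVERE_MALARIA[marker])
    numerator = prior * lik_sm
    return numerator / (numerator + (1.0 - prior) * lik_other)


# A low platelet count with an unremarkable white count favours severe malaria;
# a normal platelet count with a raised white count favours another diagnosis.
likely_case = prob_severe_malaria(platelet_count=60, white_count=9)
unlikely_case = prob_severe_malaria(platelet_count=300, white_count=20)
```

Per-patient probabilities of this kind are what a weighting scheme for the association analysis would take as input.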

Article activity feed

  1. Evaluation Summary:

    The fundamental premise of genome-wide association studies for severe malaria is to take a population with confirmed severe malaria and compare it with a control group who do not have severe malaria. This paper presents a novel and valuable method for improving power for severe malaria genetic association studies. The method would also be useful for studies of other diseases where there is a clinical definition that sometimes includes people who do not truly have the disease.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #2 agreed to share their name with the authors.)

  2. Reviewer #1 (Public Review):

    When an outcome is sometimes misclassified, the association between the treatment and the outcome is blurred and the power of a study to detect the effect of the treatment is reduced. This is a problem in studies of the effect of genotypes on severe malaria that use the standard clinical definition, because that definition prioritizes sensitivity over specificity (the loss from failing to treat a child for severe malaria is much greater than the loss from treating a child who does not have severe malaria). In this study, the authors use routinely available clinical data (platelet count and white blood cell count) to increase the specificity of the definition of severe malaria in such studies. They then use a data-tilting approach to put more weight on clinically defined severe malaria cases that meet this more specific case definition. The authors show that their approach reduces false discovery rates in an empirical study. They also report the interesting finding that approximately one-third of clinically defined severe malaria cases in a study of Kenyan children did not have severe malaria.

    This paper presents a novel and valuable method for improving power for severe malaria genetic association studies that would also be useful for studies of other diseases where there is a clinical definition that lacks high specificity.
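
The effect of putting more weight on high-probability cases can be illustrated with a toy weighted odds ratio. This is a simplified stand-in, not the authors' data-tilting estimator: it weights each case's contribution by an assumed probability of truly having severe malaria. The function name `odds_ratio`, the counts, and the probabilities 0.9 and 0.2 are all hypothetical.

```python
def odds_ratio(case_carriers, case_weights, control_carriers):
    """Odds ratio for carrying a variant, with each case weighted.

    case_carriers    : list of bool, True if the case carries the variant
    case_weights     : list of float, one weight per case
    control_carriers : list of bool, True if the control carries the variant
    """
    weighted_carriers = sum(w for c, w in zip(case_carriers, case_weights) if c)
    weighted_non_carriers = sum(w for c, w in zip(case_carriers, case_weights) if not c)
    control_carrier_count = sum(control_carriers)
    control_non_carrier_count = len(control_carriers) - control_carrier_count
    case_odds = weighted_carriers / weighted_non_carriers
    control_odds = control_carrier_count / control_non_carrier_count
    return case_odds / control_odds


# HYPOTHETICAL toy data.
# 60 cases that look like true severe malaria (probability 0.9), 3 carriers.
# 30 cases that look mislabelled (probability 0.2), 6 carriers, i.e. the same
# 20% carrier frequency as the controls.
case_carriers = [True] * 3 + [False] * 57 + [True] * 6 + [False] * 24
case_probs = [0.9] * 60 + [0.2] * 30
control_carriers = [True] * 20 + [False] * 80

# Every clinically defined case counts equally.
unweighted = odds_ratio(case_carriers, [1.0] * len(case_carriers), control_carriers)
# Cases count in proportion to their assumed probability of severe malaria.
weighted = odds_ratio(case_carriers, case_probs, control_carriers)
```

In this toy data the weighted estimate lies further from 1 than the unweighted one, because the likely mislabelled cases, whose carrier frequency matches the controls, contribute less.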

  3. Reviewer #2 (Public Review):

    The fundamental premise of genome-wide association studies for severe malaria is to take a population with confirmed severe malaria and compare it with a control group who do not have severe malaria. The authors' hypothesis is that, in areas with high levels of malaria transmission, the severe malaria group is diluted by patients who have been misclassified as having severe malaria (but are ill with something else). This dilution of the severe malaria group in turn attenuates the estimated effect size for differences between cases and controls.
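
The dilution argument can be worked through numerically. The sketch below assumes that mislabelled cases carry a protective variant at the same frequency as population controls; the 15% control frequency, the true odds ratio of 0.2, and the function names are hypothetical. Only the mislabelled fraction of one-third echoes the paper's estimate.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


def prob_from_odds(o):
    """Convert odds back to a probability."""
    return o / (1.0 + o)


def diluted_odds_ratio(control_freq, true_or, mislabelled_fraction):
    """Odds ratio observed when a fraction of 'cases' are really non-cases.

    Assumption: mislabelled cases have the control carrier frequency.
    """
    # Carrier frequency among true cases implied by the true odds ratio.
    true_case_freq = prob_from_odds(true_or * odds(control_freq))
    # Observed case group is a mixture of true and mislabelled cases.
    observed_case_freq = (
        (1.0 - mislabelled_fraction) * true_case_freq
        + mislabelled_fraction * control_freq
    )
    return odds(observed_case_freq) / odds(control_freq)


# HYPOTHETICAL inputs: 15% carriers among controls, true odds ratio 0.2,
# one-third of clinically defined cases mislabelled.
observed_or = diluted_odds_ratio(control_freq=0.15, true_or=0.2, mislabelled_fraction=1 / 3)
```

With these inputs the observed odds ratio is 4/9, roughly 0.44, so a strongly protective effect (0.2) appears considerably weaker once a third of the case group is mislabelled.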

    The authors propose a statistical method, based on data tilting, for correcting for the diluted severe malaria group. The consequences of this adjustment are then followed through to a logical and sensible conclusion, namely that correcting for this dilution can lead to more hits in GWAS and greater effect sizes. I'm not an expert in genetic association studies, but to my untrained eye, this portion of the analysis checks out (roughly speaking, Figures 4-6). Instead, I will focus my attention on the probabilistic diagnostic model (roughly speaking, Figures 1-3).

    Something I struggled with was keeping track of the different datasets. To this end, a table summarizing the cohorts with summary statistics such as geographic location, age, symptom severity, and other relevant epidemiological information would be very useful.

    My primary concern is the comparability of the training data (Asian adults, Asian children, African children with high PfHRP2) and the testing data (Kenyan children). It is crucial that the model trained on the Asian adult data (highly specific) is valid when applied to African children. What I would like to see is a more explicit demonstration that what we observe about severe malaria in Asian adults also applies to Asian children and, in turn, to African children. There is evidence for this in Figure 1B and Figure S2, but there are so many different datasets that my tired mind found it difficult to follow.

    Figure 1B. For the grey line fitted to the FEAST data, does this also include the PfHRP2 = 1 data? As this was non-detectable, is this a valid thing to do?

    Figure 3. Can you check the panel labels? What's the horizontal dashed line?

    Were there significant associations between parasite density and the probability of severe malaria?