Associations of ABO and Rhesus D blood groups with phenome-wide disease incidence: A 41-year retrospective cohort study of 482,914 patients

Abstract

Whether natural selection may have contributed to the observed blood group frequency differences between populations remains debatable. The ABO system has been associated with several diseases and recently also with susceptibility to COVID-19 infection. Associative studies of the RhD system and diseases are sparser. A large disease-wide risk analysis may further elucidate the relationship between the ABO/RhD blood groups and disease incidence.

Methods:

We performed a systematic log-linear quasi-Poisson regression analysis of the ABO/RhD blood groups across 1,312 phecode diagnoses. Unlike prior studies, we determined the incidence rate ratio for each individual ABO blood group relative to all other ABO blood groups as opposed to using blood group O as the reference. Moreover, we used up to 41 years of nationwide Danish follow-up data, and a disease categorization scheme specifically developed for diagnosis-wide analysis. Further, we determined associations between the ABO/RhD blood groups and the age at the first diagnosis. Estimates were adjusted for multiple testing.
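The difference between the one-versus-rest comparison and the conventional O-reference comparison can be sketched with crude rates. This is only an illustration and not the authors' adjusted quasi-Poisson model: the event counts and person-years are made up, and the helper names (`rate`, `irr_vs_rest`, `irr_vs_reference`) are hypothetical.

```python
# Hypothetical event counts and person-years per ABO group (made-up numbers,
# not taken from the study).
counts = {
    "A":  {"events": 420, "person_years": 100_000.0},
    "B":  {"events": 110, "person_years": 25_000.0},
    "AB": {"events": 60,  "person_years": 12_000.0},
    "O":  {"events": 380, "person_years": 100_000.0},
}


def rate(groups):
    """Crude incidence rate (events per person-year) pooled over the given groups."""
    events = sum(counts[g]["events"] for g in groups)
    person_years = sum(counts[g]["person_years"] for g in groups)
    return events / person_years


def irr_vs_rest(group):
    """Crude IRR of one blood group relative to all other groups pooled together."""
    rest = [g for g in counts if g != group]
    return rate([group]) / rate(rest)


def irr_vs_reference(group, reference="O"):
    """Crude IRR of one blood group relative to a single reference group."""
    return rate([group]) / rate([reference])


for g in counts:
    print(g, round(irr_vs_rest(g), 3), round(irr_vs_reference(g), 3))
```

In the one-versus-rest design the comparison group changes with every blood group, so the four IRRs are not expressed relative to a common baseline.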

Results:

The retrospective cohort included 482,914 Danish patients (60.4% females). The incidence rate ratios (IRRs) of 101 phecodes were found to be statistically significant for the ABO blood groups, while the IRRs of 28 phecodes were found to be statistically significant for the RhD blood group. The associations included cancers and musculoskeletal, genitourinary, endocrine, infectious, cardiovascular, and gastrointestinal diseases.

Conclusions:

We found disease-wide differences in susceptibility between the blood groups of the ABO and RhD systems, including for cancer of the tongue, monocytic leukemia, cervical cancer, osteoarthrosis, asthma, and HIV and hepatitis B infection. We found marginal evidence of associations between the blood groups and the age at first diagnosis.

Funding:

Novo Nordisk Foundation and the Innovation Fund Denmark

Article activity feed

  1. eLife assessment

    This important analysis helps to shed light on the relationship between blood type and the occurrence of ICD-based phenotypes in a hospital setting. A particularly compelling strength is the analysis' reliance on a population-based patient registry. The results would be further strengthened by an exploration as to whether these phenotypes are driven by patient characteristics (e.g. ethnicity, SES) and not just blood type. Additionally, differences across blood types are driven, in part, by differences in prevalence, somewhat limiting the scope of the analytical findings.

  2. Reviewer #1 (Public Review):

    This study analyses associations between different blood groups and 1,312 hospital diagnosis codes among >480,000 Danish patients who had their blood type determined in hospitals. While biological relationships between blood types and disease are of substantial interest, the analyses unfortunately do not adjust for ethnicity (which is correlated with both blood types and many diseases). Thus, it is unclear to what extent the disease associations represent relationships with the blood types, as opposed to possible differences in disease incidence or severity between people with different ethnic backgrounds (which could also be due to socioeconomic differences or any other factors correlated with ethnicity).

  3. Reviewer #2 (Public Review):

    The authors are building on previous work by Dahlén et al. testing for phenome-wide associations with ABO/RhD blood groups. This is important for identifying potential disease mechanisms related to the blood groups, and for identifying blood groups that may be at higher risk of certain diseases. As we begin to create predictive models across diseases for precision medicine approaches in clinical care, this type of information informs the inclusion of blood groups as predictors in these models.

    Notably, this study looks at each subset of A, B, AB, and O versus the remaining groups as compared to other studies which focus on comparing O and non-O blood groups. This paper successfully estimates the incidence rate ratios for 1,312 phecodes for A, AB, B, O, and RhD blood groups. The authors also tested for associations between the age of diagnosis and blood groups. The study's conclusions largely summarize these associations, which are important for the community to browse and interpret. However, the conclusion that ABO/RhD groups are the result of selective pressure driven partially by robustness to disease is not well founded simply from the significant association statistics within the paper.

    As in all studies, there are inherent limitations in the data. The Danish National Patient Registry (DNPR) is a population-level cohort, so findings may be generalizable to Denmark or European countries. However, ascertainment biases may exist from what subset of the DNPR also had blood group determination (patients who may need blood transfusions during their hospital stay) and from the use of diagnoses from a hospital setting (most severe diseases) rather than the primary care setting.

    The statistical model used to identify these associations is sound, although additional sensitivity analyses and rationale descriptions would add clarity to the appropriateness of this model and variable selection. The authors carefully note that, based on the study design, any associations here are not to be causally interpreted. The study is well powered with nearly 500,000 patients and a median follow-up time of 40.8 years. Multiple testing burden is accounted for using FDR-adjusted p-values. The established method of phecode mapping is used for this phenome-wide approach.

  4. Reviewer #3 (Public Review):

    This article analyzes retrospective follow-up data from 482,914 patients in the Danish National Patient Registry, with the goal of characterizing the association between blood type, as measured by the ABO and RhD blood group systems, and the incidence of ICD-based phenotypes ('phecodes'). The primary statistical tool employed is a log-linear model, fit separately for each phecode, with the outcome being the number of recorded phecodes per person over the follow-up period. Because the ABO blood group system contains four subgroups, the authors choose to compare each subgroup, one at a time, against all others. The primary findings are described in Manhattan plots (one for each subgroup), which visually identify statistically significant associations between that blood group and the phecode.

    This study has a number of strengths. By using the Danish National Patient Registry, the study population is better characterizable than most phenome-wide association studies. The statistical models employed are appropriate. And the findings are clearly and concisely communicated.

    A weakness of the underlying approach is that, by separately modeling each ABO blood subgroup one at a time and collapsing the remaining subgroups, the interpretation of the resulting estimated rate ratio is based upon an assumption that the remaining subgroups have a common incidence. But this assumption cannot hold simultaneously for all four comparisons unless all four subgroups have a common incidence, i.e. unless the null scenario holds everywhere. The number of statistically significant phecodes in each of the ABO subgroups reflects the underlying prevalence of each subgroup (more cases allow for greater precision in estimation and therefore smaller p-values) but does not necessarily reflect actual differences in the incidence.
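
The precision argument in Reviewer #3's last paragraph can be illustrated with a small calculation. This is a simplified sketch, not the model used in the study: it assumes plain Poisson counts, for which the standard error of log(IRR) is approximately sqrt(1/a + 1/b) with a and b the event counts in the index group and in the rest, and it ignores covariates and the quasi-Poisson dispersion factor. The true IRR, the total number of events, and the group shares are made-up values.

```python
import math


def log_irr_se(events_group, events_rest):
    """Approximate standard error of log(IRR) for two independent Poisson counts."""
    return math.sqrt(1.0 / events_group + 1.0 / events_rest)


def wald_z(irr, events_group, events_rest):
    """Wald z-statistic for testing IRR = 1, given an IRR and the two event counts."""
    return math.log(irr) / log_irr_se(events_group, events_rest)


TRUE_IRR = 1.10        # hypothetical: same underlying effect in both scenarios
TOTAL_EVENTS = 10_000  # hypothetical total number of events for one phecode

# Hypothetical share of events that falls in the index blood group.
for label, share in [("common group", 0.40), ("rare group", 0.05)]:
    a = round(TOTAL_EVENTS * share)  # events in the index group
    b = TOTAL_EVENTS - a             # events in all other groups pooled
    print(label, round(log_irr_se(a, b), 4), round(wald_z(TRUE_IRR, a, b), 2))
```

With the same rate ratio, the rarer group has a larger standard error and a smaller z-statistic, so fewer of its associations would pass a fixed significance threshold even if the underlying effects were identical.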