The relationship between BMI and COVID-19: exploring misclassification and selection bias in a two-sample Mendelian randomisation study

Gemma L Clayton
Ana Gonçalves Soares
Neil Goulding
Maria Carolina Borges
Michael V Holmes
George Davey Smith
Kate Tilling
Deborah A Lawlor
Alice R Carter


Abstract

Objective

To use the example of the effect of body mass index (BMI) on COVID-19 susceptibility and severity to illustrate methods to explore potential selection and misclassification bias in Mendelian randomisation (MR) of COVID-19 determinants.

Design

Two-sample MR analysis.

Setting

Summary statistics from the Genetic Investigation of ANthropometric Traits (GIANT) and COVID-19 Host Genetics Initiative (HGI) consortia.

Participants

681,275 participants in GIANT and more than 2.5 million people from the COVID-19 HGI consortia.

Exposure

Genetically instrumented BMI.

Main outcome measures

Seven case/control definitions for SARS-CoV-2 infection and COVID-19 severity: very severe respiratory confirmed COVID-19 vs not hospitalised COVID-19 (A1) and vs population (those who were never tested, tested negative or had unknown testing status (A2)); hospitalised COVID-19 vs not hospitalised COVID-19 (B1) and vs population (B2); COVID-19 vs lab/self-reported negative (C1) and vs population (C2); and predicted COVID-19 from self-reported symptoms vs predicted or self-reported non-COVID-19 (D1).

Results

With the exception of the A1 comparison, genetically higher BMI was associated with higher odds of COVID-19 in all comparison groups, with odds ratios (OR) ranging from 1.11 (95% CI: 0.94, 1.32) for D1 to 1.57 (95% CI: 1.39, 1.78) for A2. As a method to assess selection bias, we ran a ‘no-relevance’ analysis in which COVID-19 was treated as the exposure, although it was measured after BMI; this found no strong evidence of an effect of COVID-19 on BMI. We found evidence of genetic correlation between COVID-19 outcomes and potential predictors of selection determined a priori (smoking, education, and income), which could indicate either selection bias or a causal pathway to infection. Results from multivariable MR adjusting for these predictors of selection were similar to the main analysis, suggesting the latter.
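To make the reported quantities concrete, the following is a minimal sketch of how a two-sample MR estimate can be turned into an odds ratio with a 95% confidence interval from summary statistics. It assumes a fixed-effect inverse-variance weighted (IVW) estimator, which the abstract does not name explicitly; the function names and the toy SNP effects are hypothetical and are not taken from GIANT or the COVID-19 HGI.

```python
import math

def ivw_estimate(beta_exp, beta_out, se_out):
    """Fixed-effect IVW estimate (assumed estimator, not confirmed by the abstract).

    beta_exp: SNP-exposure effects (e.g. BMI units per allele)
    beta_out: SNP-outcome effects (log odds per allele)
    se_out:   standard errors of beta_out
    Returns (log odds ratio per unit of exposure, standard error).
    """
    # Weighted regression of beta_out on beta_exp through the origin,
    # with weights 1 / se_out^2.
    num = sum(bx * by / s ** 2 for bx, by, s in zip(beta_exp, beta_out, se_out))
    den = sum(bx ** 2 / s ** 2 for bx, s in zip(beta_exp, se_out))
    return num / den, math.sqrt(1.0 / den)

def to_odds_ratio(log_or, se, z=1.96):
    """Convert a log odds ratio and its SE to (OR, lower, upper) for a 95% CI."""
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)

# Toy, made-up summary statistics for three SNPs (illustration only).
toy_beta_bmi = [0.02, 0.03, 0.05]
toy_beta_covid = [0.008, 0.012, 0.020]
toy_se_covid = [0.004, 0.005, 0.006]

log_or, se = ivw_estimate(toy_beta_bmi, toy_beta_covid, toy_se_covid)
odds_ratio, ci_low, ci_high = to_odds_ratio(log_or, se)
print(f"OR {odds_ratio:.2f} (95% CI: {ci_low:.2f}, {ci_high:.2f})")
```

The same two functions could in principle be reused for a ‘no-relevance’ analysis by swapping the roles of the two traits, so that the COVID-19 SNP effects are passed as the exposure and the BMI SNP effects as the outcome.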

Conclusions

We have proposed a set of analyses for exploring potential selection and misclassification bias in MR studies of risk factors for SARS-CoV-2 infection and COVID-19 and demonstrated this with an illustrative example. Although selection by socioeconomic position and related traits is present, MR results are not substantially affected by selection/misclassification bias in our example. We recommend that the methods we demonstrate, for which we provide detailed analytic code, are used in MR studies assessing risk factors for COVID-19 and in other MR studies where such biases are likely in the available data.

Summary

What is already known on this topic

Mendelian randomisation (MR) studies have been conducted to investigate the potential causal relationship between body mass index (BMI) and COVID-19 susceptibility and severity.

There are several sources of selection (e.g. when only subgroups with specific characteristics are tested or respond to study questionnaires) and misclassification (e.g. those not tested are assumed not to have COVID-19) that could bias MR studies of risk factors for COVID-19.

Previous MR studies have not explored how selection and misclassification bias in the underlying genome-wide association studies could bias MR results.

What this study adds

Using the most recent release of the COVID-19 Host Genetics Initiative data (with data up to June 2021), we demonstrate a potential causal effect of BMI on susceptibility to detected SARS-CoV-2 infection and on severe COVID-19 disease, and that these results are unlikely to be substantially biased due to selection and misclassification.

This conclusion is based on no evidence of an effect of COVID-19 on BMI (a ‘no-relevance control’ study, as BMI was measured before the COVID-19 pandemic) and finding genetic correlation between predictors of selection (e.g. socioeconomic position) and COVID-19 for which multivariable MR supported a role in causing susceptibility to infection.

We recommend that future MR studies of COVID-19 risk factors, and MR studies in other settings where selection bias is likely, use the set of analyses demonstrated here.
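As an illustration of the multivariable MR step referred to above, the sketch below fits SNP-outcome effects on SNP effects for several exposures at once (for example BMI and one predictor of selection such as education) by weighted least squares without an intercept. This is an assumed, simplified form of multivariable IVW and not necessarily the authors' implementation; `mvmr_ivw`, `solve_linear` and all numbers are hypothetical.

```python
def solve_linear(a, b):
    """Solve a small linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-15:
            raise ValueError("exposure effects are collinear; system is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def mvmr_ivw(beta_exposures, beta_out, se_out):
    """Multivariable IVW sketch: one direct-effect estimate per exposure.

    beta_exposures: one row per SNP, holding that SNP's effect on each exposure
    beta_out, se_out: SNP-outcome effects (log odds) and their standard errors
    """
    k = len(beta_exposures[0])
    w = [1.0 / s ** 2 for s in se_out]
    # Normal equations (X' W X) b = X' W y for regression through the origin.
    xtwx = [[sum(wi * row[i] * row[j] for wi, row in zip(w, beta_exposures))
             for j in range(k)] for i in range(k)]
    xtwy = [sum(wi * row[i] * y for wi, row, y in zip(w, beta_exposures, beta_out))
            for i in range(k)]
    return solve_linear(xtwx, xtwy)

# Toy, made-up SNP effects on two exposures (illustration only).
toy_exposures = [[0.02, 0.01], [0.03, -0.01], [0.05, 0.02], [0.01, 0.04]]
toy_outcome = [0.3 * x1 - 0.2 * x2 for x1, x2 in toy_exposures]
toy_se = [0.004, 0.005, 0.006, 0.005]

direct_effects = mvmr_ivw(toy_exposures, toy_outcome, toy_se)
print(direct_effects)
```

If the estimate for the first exposure stays close to the univariable estimate after the second exposure is added, that pattern is in line with the reasoning described above, where similar multivariable and main results were taken to argue against substantial selection bias.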

ScreenIT
Mar 8, 2022
SciScore for 10.1101/2022.03.03.22271836:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
Ethics not detected.
Sex as a biological variable not detected.
Randomization not detected.
Blinding not detected.
Power Analysis not detected.
Table 2: Resources
No key resources detected.
Results from OddPub: Thank you for sharing your code.
Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
Strengths and limitations of the study: To our knowledge this is the first MR study of COVID-19 outcomes to take a systematic approach to explore selection and misclassification bias in this area, including a novel application of ‘no-relevance’ analyses and exploiting time-varying associations. In our illustrative example and in the ‘no-relevance’ analyses we have conducted appropriate sensitivity analyses to assess the plausibility of the MR assumptions, including comparing estimates from a BMI GWAS without UK Biobank participants included and a negative control design to explore possible bias due to population stratification (7). The summary statistics used for COVID-19 are from the largest available GWAS, incorporating many global studies with differing selection processes (31). The ‘no-relevance’ analysis results may have been biased by weak instruments, as there were far fewer SNPs for any COVID-19 instruments than for BMI (the exposure in our ‘real’ illustrative example). Weak instrument bias in two-sample MR is expected to bias estimates towards the null. However, the F-statistics for the ‘no-relevance’ COVID-19 instruments varied from 22 to 134, suggesting weak instrument bias is unlikely. For some of the COVID-19 exposures in the ‘no-relevance’ analyses we had to relax the p-value threshold in order to have genetic instruments, and this could result in increased likelihood of horizontal pleiotropy. Consistent results from MR-Egger, weighted median analyses, and the ‘...
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
No protocol registration statement was detected.
Results from scite Reference Check: We found no unreliable references.
About SciScore
SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.
Version published to 10.1101/2022.03.03.22271836 on medRxiv
Mar 5, 2022

