Pseudo-p-Value-Based Clumping Enhanced Proteome-wide Mendelian Randomization for Identifying Coronary Heart Disease-Associated Plasma Proteins
Abstract
Mendelian randomization (MR) is a powerful tool for causal inference in epidemiology. However, the presence of weak instrumental variables (IVs) and pleiotropy can lead to biased causal effect estimates. To address these issues, we develop MR-GMM, a novel MR method based on a Gaussian Mixture Model. MR-GMM classifies IVs into four categories (invalid, valid, invalid&null, and null IVs) and models their effects using a two-dimensional spike-and-slab distribution. Simulation studies demonstrate the high efficiency and robustness of MR-GMM compared to existing methods. More importantly, we propose a pseudo-p-value-based linkage disequilibrium (LD) clumping procedure to address selection bias. This refined procedure is capable of enhancing the performance of MR-GMM as well as many existing MR methods in real-world scenarios. Applying MR-GMM in a large-scale proteome-wide MR study, we identify 45 coronary heart disease-associated plasma proteins. Subsequent network and enrichment analyses highlight the potential of these proteins as biomarkers for disease diagnosis and therapeutic development.
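One way to read the four IV categories is as the four on/off combinations of a two-dimensional spike-and-slab prior. The sketch below is an illustrative assumption, not the paper's stated specification: the symbols $\gamma_j$ (effect of variant $j$ on the exposure), $\alpha_j$ (direct, pleiotropic effect of variant $j$ on the outcome), $\Gamma_j$ (total effect on the outcome), $\beta$ (causal effect), the mixing weights $\pi_k$, the slab variances $\sigma_\gamma^2$ and $\sigma_\alpha^2$, and the independence of the two slabs are all introduced here for illustration. $\delta_0$ denotes a point mass at zero (the spike).

```latex
\begin{aligned}
\Gamma_j &= \beta\,\gamma_j + \alpha_j, \\
(\gamma_j,\alpha_j) &\sim
  \pi_1\,\mathcal{N}(0,\sigma_\gamma^2)\otimes\delta_0
  \;+\; \pi_2\,\mathcal{N}(0,\sigma_\gamma^2)\otimes\mathcal{N}(0,\sigma_\alpha^2)
  \;+\; \pi_3\,\delta_0\otimes\mathcal{N}(0,\sigma_\alpha^2)
  \;+\; \pi_4\,\delta_0\otimes\delta_0,
\qquad \sum_{k=1}^{4}\pi_k = 1 .
\end{aligned}
```

Under this reading, the four components correspond, in order, to valid IVs ($\gamma_j \neq 0$, $\alpha_j = 0$), invalid IVs (both nonzero), invalid&null IVs ($\gamma_j = 0$, $\alpha_j \neq 0$), and null IVs (both zero). If the estimated effects carry independent Gaussian sampling error, each component implies a zero-mean bivariate Gaussian for the pair of estimates, which is one plausible sense in which the model is a Gaussian mixture.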