Improving GWAS performance in underrepresented groups by appropriate modeling of genetics, environment, and sociocultural factors

Chelsea C. Cataldo-Ramirez
Meng Lin
Aislinn Mcmahon
Christopher R. Gignoux
Timothy D. Weaver
Brenna M. Henn


Abstract

Genome-wide association studies (GWAS) and polygenic score (PGS) development are typically constrained by the data available in biobank repositories, in which European cohorts are vastly overrepresented. Here, we increase the utility of non-European participant data within the UK Biobank (UKB) by characterizing the genetic affinities of UKB participants who self-identify as Bangladeshi, Indian, Pakistani, “White and Asian” (WA), and “Any Other Asian” (AOA), with the aim of creating a more robust South Asian sample size for future genetic analyses. We assess the relationships between genetic structure and self-selected ethnic identities, which yield consistent patterns of clustering that are used to train a support vector machine (SVM). The SVM model was used to reassign n = 1,853 AOA and WA participants at the subcontinental level, increasing the sample size of the UKB South Asian group by 1,381 participants. We then leverage these samples to assess GWAS performance and PGS development. We further include environmental covariates in the height GWAS by implementing a rigorous covariate selection procedure, and compare the outputs of two GWAS models: GWAS_null and GWAS_env. We show that PGS derived from the environmentally adjusted GWAS achieve prediction comparable to PGS models developed with a training dataset an order of magnitude larger (R² = 0.021 vs. 0.026). Models with 7-8 environmental covariates double the variance explained by PGS alone. In summary, we demonstrate how GWAS performance can be improved by leveraging ambiguous ethnicity codes, ancestry-matched imputation panels, and environmental covariates.
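
The comparison between variance explained by a PGS alone and by a PGS plus environmental covariates can be illustrated with a small sketch. This is not the authors' pipeline: the data are simulated, the effect sizes and the three covariates are arbitrary assumptions, and the helper names `solve_linear_system` and `r_squared` are hypothetical. It only shows how R² from two nested ordinary least-squares models might be compared.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(predictors, y):
    """R^2 of an OLS fit of y on the predictors plus an intercept."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Simulated example (assumed values, not taken from the study).
rng = random.Random(0)
n = 2000
pgs = [rng.gauss(0, 1) for _ in range(n)]
env = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n)]
height = [0.15 * g + 0.1 * sum(e) + rng.gauss(0, 1) for g, e in zip(pgs, env)]

r2_pgs = r_squared([[g] for g in pgs], height)                 # PGS alone
r2_full = r_squared([[g] + e for g, e in zip(pgs, env)], height)  # PGS + covariates
print(f"PGS only: {r2_pgs:.3f}; PGS + env covariates: {r2_full:.3f}")
```

Because the second model nests the first, its R² cannot be lower; how much it gains depends entirely on how much trait variance the chosen covariates capture.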

Version published to 10.1101/2024.10.28.620716v1 on bioRxiv
Oct 29, 2024

Population-Specific Polygenic Risk Scores Developed for the Han Chinese

This article has 128 authors:
1. Hung-Hsin Chen
2. Chien-Hsiun Chen
3. Ming-Chih Hou
4. Yun-Ching Fu
5. Ling-Hui Li
6. Che-Yu Chou
7. Erh-Chan Yeh
8. Ming-Fang Tsai
9. Chun-houh Chen
10. Hsin-Chou Yang
11. Yen-Tsung Huang
12. Yi-Min Liu
13. Chun-yu Wei
14. Jen-Ping Su
15. Wan-Jia Lin
16. Elin H.F. Wang
17. Chi-Lu Chiang
18. Jeng-Kai Jiang
19. I-Hui Lee
20. Kung-Hao Liang
21. Wei-Sheng Chen
22. Hung-Cheng Tsai
23. Shih-Yao Lin
24. Fu-Pang Chang
25. Hsiang-Ling Ho
26. Yi-Chen Yeh
27. Wei-Cheng Tseng
28. Ming-Hwai Lin
29. Hsiao-Ting Chang
30. Ling-Ming Tseng
31. Wen-Yih Liang
32. Paul Chih-Hsueh Chen
33. Yu-Cheng Hsieh
34. Yi-Ming Chen
35. Tzu-Hung Hsiao
36. Ching-Heng Lin
37. Yen-Ju Chen
38. I-Chieh Chen
39. Chien-Lin Mao
40. Shu-Jung Chang
41. Yen-Lin Chang
42. Yi-Ju Liao
43. Chih-Hung Lai
44. Wei-Ju Lee
45. Hsin Tung
46. Ting-Ting Yen
47. Hsin-Chien Yen
48. Ming-Yao Chen
49. Ying-Chin Lin
50. Yung-Ta Kao
51. Bi-Zhen Kao
52. Jing-Er Lee
53. Chi-Li Chung
54. Ju-Chi Liu
55. Paul Chan
56. Chang-Hsien Lin
57. Chen Chia-Hsin
58. I-Chen Wu
59. Lung-Chang Lin
60. Jiunn-Wei Wang
61. Shen-liang Shih
62. Sun-Wung Hsieh
63. Chih-Hsing Hung
64. Wei-Ming Li
65. Chih-Jen Yang
66. Cheng-Shin Yang
67. Ru-Hui Weng
68. Yu-Chi Chen
69. Chun-Ping Chang
70. Tai-Hsun Wu
71. Yu-Chang Lin
72. Yi-Jing Sheen
73. Shi-Heng Wang
74. Sye-Pu Chen
75. Timothy Raben
76. Erik Widen
77. Stephen Hsu
78. Feng-Jen Hsieh
79. Dong-Ru Ho
80. Yu-Huei Huang
81. Chung-Han Yang
82. Yu-Shu Huang
83. Yen-Fu Chen
84. Hsien-Ming Wu
85. Ping-Han Tsai
86. Kuan-Gen Huang
87. Chih-Yen Chien
88. Yi-Lwun Ho
89. Ming-Shiang Wu
90. Jia-Horng Kao
91. Yen-Bin Liu
92. Jyh-Ming Jimmy Juang
93. Mao-Hsin Lin
94. Yen-Hung Lin
95. Ji-Yuh Lee
96. Hsueh-Ju Lu
97. Chieh-Hua Lu
98. An-Chieh Feng
99. Jhih-Syuan Liu
100. Chien-Ping Chiang
101. Nain-Feng Chu
102. Jung-Chun Lin
103. Yi-Wei Yeh
104. En Meng
105. Chih-Yang Huang
106. Chi-Cheng Li
107. Tso-Fu Wang
108. Kuei-Ying Su
109. Jia-Kang Wang
110. Mei-Hsiu Chen
111. Hua-Fen Chen
112. Gwo-Chin Ma
113. Ting-Yu Chang
114. Fu-Tien Chiang
115. Hsing-Jung Chang
116. Kuo-Jang Kao
117. Chen-Fang Hung
118. Ching-Yao Tsai
119. Po-Yueh Chen
120. Kochung Tsui
121. Pui-Yan Kwok
122. Wayne Huey-Herng Sheu
123. Shun-Fa Yang
124. Jyh-Ming Liou
125. Jaw-Yuan Wang
126. Jeng-Fong Chiou
127. Jer-Yuarn Wu
128. Cathy S.-J. Fann
This article has no evaluations. Latest version: Oct 15, 2024

Cross-ancestry analysis identifies genes associated with obesity risk and protection

This article has 2 authors:
1. Deepro Banerjee
2. Santhosh Girirajan
This article has no evaluations. Latest version: Oct 16, 2024

Genome-wide association study for circulating metabolites in 619,372 individuals

This article has 12 authors:
1. Ralf Tambets
2. Jaanika Kronberg
3. Erik Abner
4. Urmo Võsa
5. Ida Rahu
6. Nele Taba
7. Anastassia Kolde
8. Estonian Biobank Research Team
9. Krista Fischer
10. Tõnu Esko
11. Kaur Alasoo
12. Priit Palta
This article has no evaluations. Latest version: Oct 31, 2024
