Clinical Applications of AI in Sexually Transmitted Infection and Anogenital Dermatoses in Sexual Health: A Systematic Review and Meta-Analysis
Abstract
Background Artificial intelligence (AI) performs well in dermatology, but its application to sexually transmitted infections (STIs) remains unclear. We assessed the performance of AI algorithms and their applications in detecting STIs and anogenital dermatoses in sexual health.
Methods We followed the PRISMA guidelines and searched six databases from January 1, 2010, to April 12, 2024, for studies using AI to identify STIs and anogenital dermatoses. We used a modified QUADAS-2 tool and the CLEAR Derm checklist for quality assessment. We conducted a bivariate random-effects meta-analysis to estimate the pooled sensitivity and specificity of AI applications for conditions with sufficient data. For mpox studies, we conducted subgroup analysis and meta-regression to explore potential sources of heterogeneity.
Results Of 5,381 studies screened, 141 met the inclusion criteria. Most studies reported on mpox (111, 62.4%), while anogenital conditions, including syphilis, genital herpes, genital warts, scabies, psoriasis, lichenoid changes, and molluscum contagiosum, received less attention (each < 6.0% of the studies). Meta-analyses showed high performance of AI for identifying mpox (pooled sensitivity: 0.96 [95% CI: 0.93–0.97], pooled specificity: 0.98 [0.97–0.99]), herpes simplex (0.91 [0.71–0.98], 0.97 [0.94–0.98]), genital warts (0.87 [0.67–0.96], 0.98 [0.95–0.99]), psoriasis (0.90 [0.78–0.95], 0.98 [0.96–0.99]), and scabies (0.89 [0.84–0.93], 0.98 [0.95–0.99]). We could not pool sensitivity and specificity for other conditions because there were too few data points. Meta-regression for mpox studies showed higher pooled sensitivity in models with larger datasets (≥ 1000 images) and binary classification than in those with smaller datasets and multiclass prediction (p < 0.05). Study quality was variable: our assessment identified a high risk of bias in population selection (76.1%), reference standards (76.1%), and index tests (20.0%). Most studies relied on open-source datasets (87.8%), lacked external validation, and remained at the proof-of-concept stage without clinical implementation.
Conclusions While AI shows promising performance for identifying STIs and anogenital dermatoses, significant research gaps exist. Future work should prioritise understudied STIs and differential conditions, while improving data quality, conducting external validation, and validating in clinical settings. Clear policy guidance and standards are needed to determine how best to implement AI tools for diagnostic purposes and to provide clear performance criteria and frameworks for AI developers, healthcare providers, and clients.
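To illustrate what "pooled sensitivity" means in the Methods, the following is a minimal sketch of random-effects pooling of a proportion on the logit scale. It is a deliberate simplification and not the authors' analysis: it pools sensitivity alone with a univariate DerSimonian-Laird estimator, whereas the review used a bivariate model that estimates sensitivity and specificity jointly. The function name `pool_proportion` and the study counts are hypothetical and are not data from the review.

```python
import math


def inv_logit(x):
    """Map a logit value back to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def pool_proportion(events, totals, z=1.96):
    """Pool per-study proportions (e.g. sensitivity = TP / (TP + FN)).

    Univariate DerSimonian-Laird random-effects pooling on the logit scale.
    Illustrative simplification only; not the bivariate model used in the review.
    Returns (pooled estimate, lower CI bound, upper CI bound, tau^2).
    """
    ys, vs = [], []
    for e, n in zip(events, totals):
        # 0.5 continuity correction keeps the logit finite when e == 0 or e == n
        a, b = e + 0.5, (n - e) + 0.5
        ys.append(math.log(a / b))      # logit of the corrected proportion
        vs.append(1.0 / a + 1.0 / b)    # approximate within-study variance

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q
    w = [1.0 / v for v in vs]
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, ys))

    # Between-study variance tau^2 (method of moments, truncated at zero)
    k = len(ys)
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance
    w_re = [1.0 / (v + tau2) for v in vs]
    y_re = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    return inv_logit(y_re), inv_logit(y_re - z * se), inv_logit(y_re + z * se), tau2


if __name__ == "__main__":
    # Hypothetical counts for three studies: true positives and total diseased cases
    true_positives = [45, 88, 130]
    diseased = [50, 95, 140]
    sens, lower, upper, tau2 = pool_proportion(true_positives, diseased)
    print(f"pooled sensitivity {sens:.2f} [95% CI: {lower:.2f}-{upper:.2f}], tau^2 = {tau2:.3f}")
```

Passing true negatives and the number of non-diseased cases instead would pool specificity in the same way. A bivariate model additionally accounts for the correlation between sensitivity and specificity across studies, which this sketch ignores.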