Machine Learning-Based Prediction of TPPA Confirmation Results in Blood Donor Syphilis Screening: A Large-Scale Multi-Algorithm Comparative Study

Abstract

BACKGROUND Syphilis screening in blood banks relies on enzyme immunoassays (EIAs) with Treponema pallidum particle agglutination (TPPA) confirmation. This study aimed to develop and compare machine learning models for predicting TPPA confirmation results to optimize screening workflows.

METHODS This retrospective cohort study analyzed 762,655 blood donor specimens collected from December 2020 to July 2025. Signal-to-cutoff (s/co) ratios and dual-reagent screening results were evaluated. Logistic regression, random forest, and gradient boosting models were compared using receiver operating characteristic curves and decision curve analysis. Hyperparameters were optimized using grid search with cross-validation. Feature importance was assessed using SHAP values.

RESULTS The overall positive rate was 0.157% (1,196/762,655). TPPA confirmation rates were 89.6% for dual-reagent positive samples versus 26.0% for single-reagent positive samples (relative risk, 3.45; p < 0.0001). The s/co ratio alone demonstrated excellent predictive value (area under the curve [AUC], 0.909); at the optimal threshold of 7.75, sensitivity was 81.9%, specificity was 96.7%, and positive predictive value was 98.2%. The gradient boosting model achieved the best performance (AUC, 0.946), outperforming logistic regression (AUC, 0.933) and random forest (AUC, 0.928). Decision curve analysis demonstrated higher net benefit for the gradient boosting model across clinically relevant threshold probabilities.

CONCLUSION Machine learning models, particularly gradient boosting, significantly improve prediction of TPPA confirmation results. Implementing s/co-stratified management and machine learning-assisted decision systems can make blood safety screening more efficient while reducing the cost of unnecessary confirmatory testing.
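Two of the quantities above can be illustrated with a short sketch. The code below is not the authors' analysis: `relative_risk` and `net_benefit` are hypothetical helper names, the toy labels and probabilities are invented for illustration, and the net benefit formula is the standard one from decision curve analysis (true positives per specimen minus false positives per specimen weighted by the odds of the threshold probability). Only the two confirmation rates are taken from the abstract.

```python
def relative_risk(rate_exposed: float, rate_unexposed: float) -> float:
    """Ratio of two proportions, each given as a fraction in (0, 1]."""
    return rate_exposed / rate_unexposed


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating predictions >= threshold as positive.

    Standard decision curve analysis definition:
    TP/n - FP/n * (threshold / (1 - threshold)).
    """
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # odds of the threshold probability
    return tp / n - (fp / n) * weight


# Confirmation rates reported in the abstract: 89.6% (dual-reagent positive)
# versus 26.0% (single-reagent positive).
rr = relative_risk(0.896, 0.260)
print(round(rr, 2))  # 3.45

# Hypothetical toy data, NOT from the study: 1 = TPPA confirmed, 0 = not confirmed.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
nb = net_benefit(y_true, y_prob, threshold=0.5)
print(nb)  # 0.25
```

Evaluating `net_benefit` over a range of thresholds for each model, and plotting the results against the "treat all" and "treat none" strategies, gives the kind of decision curve the abstract refers to.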
