Research on Multicenter Ovarian Cancer Diagnosis Based on Federated Learning


Abstract

Background: Ovarian cancer is difficult to diagnose early and is highly heterogeneous, so it urgently requires precise diagnostic methods that integrate multi-center data. This study establishes a cross-institutional collaboration framework based on federated learning (FL) to develop an auxiliary diagnostic model that distinguishes benign from malignant ovarian tumors.

Methods: A total of 1,449 patients (752 benign, 697 malignant) from five hospitals were included. Forty-four laboratory indicators were extracted, and federated learning based on the FedAvg algorithm was conducted on a privacy computing platform developed by Healink. Four models were evaluated and compared: logistic regression, Softmax regression, a neural network, and XGBoost.

Results: XGBoost performed best on the test set, with an area under the curve (AUC) of 0.881 (95% CI: 0.864–0.898), an optimal threshold point at FPR = 0.237 and TPR = 0.870, and a Youden index of 0.633, significantly outperforming the other models (P < 0.05). The neural network showed robust generalization, with the smallest AUC difference (0.002) between the training and test sets. Feature importance analysis identified lactate dehydrogenase (LDH, SHAP value +0.28 ± 0.12) and platelet count (PLT, SHAP value +0.25 ± 0.09) as the core predictive indicators. These reflect tumor metabolic activity and coagulation activation, respectively, and are highly consistent with the pathological mechanisms of ovarian cancer.

Conclusion: The federated learning framework effectively integrates multi-center data, and the XGBoost model provides a reliable tool for preoperative auxiliary diagnosis of ovarian cancer. More clinical features should be incorporated in future work to improve accuracy. In addition, an incremental cost-effectiveness ratio (ICER) analysis indicates that the AI diagnostic model improves health outcomes after treatment, benefiting both patients and hospitals. Establishing a more complete long-term disease progression model would allow a more comprehensive economic benefit analysis.
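
For readers unfamiliar with two quantities named in the abstract, the following is a minimal sketch and not the authors' implementation. It illustrates the aggregation step of FedAvg (a sample-size-weighted average of client parameters) and the Youden index (J = TPR − FPR). The function names, hospital sizes, and parameter vectors are hypothetical and chosen only for illustration; the abstract does not report the per-hospital split. Parameter averaging of this kind applies directly to parametric models such as logistic regression or a neural network, and how the platform federates XGBoost is not stated in the abstract.

```python
from typing import List, Sequence


def fedavg(client_weights: Sequence[Sequence[float]],
           client_sizes: Sequence[int]) -> List[float]:
    """Aggregate client parameter vectors, weighting each by its sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client parameter vector")
    total = sum(client_sizes)
    aggregated = [0.0] * len(client_weights[0])
    for weights, n in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            aggregated[i] += w * n / total
    return aggregated


def youden_index(tpr: float, fpr: float) -> float:
    """Youden's J = sensitivity + specificity - 1, which equals TPR - FPR."""
    return tpr - fpr


# Hypothetical toy data: two hospitals with 300 and 100 patients.
hospital_sizes = [300, 100]
hospital_weights = [[1.0, 2.0], [3.0, 6.0]]
global_weights = fedavg(hospital_weights, hospital_sizes)

# Operating point reported in the abstract for XGBoost.
j = youden_index(tpr=0.870, fpr=0.237)
print(global_weights, round(j, 3))
```

With the operating point reported above, TPR − FPR = 0.870 − 0.237 = 0.633, which matches the stated Youden index.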
