Interpretable machine-learning-derived diagnostic scoring panel for endometriosis identification: a study of serum amino acid profiling
Abstract
Delayed diagnosis of endometriosis (EM) can limit treatment options and worsen patient prognosis, adversely affecting fertility and increasing the risk of malignancy. Current diagnostic methods are invasive. In this study, ultra-high-performance liquid chromatography-tandem mass spectrometry-based serum amino acid profiling was used to compare amino acid abundance between 137 normal controls (NCs) and 167 patients with EM. Subsequently, five machine learning algorithms (random forest, neural network, extreme gradient boosting, support vector classification, and naïve Bayes) were used to evaluate the efficacy and stability of amino acids for diagnosing EM. Ultimately, seven amino acids (3-methyl-L-histidine, kynurenine, leucine, N6-acetyl-L-lysine, phenylalanine, theanine, and tyrosine) were identified as valuable variables using the SHapley Additive exPlanations (SHAP) method and were included in the diagnostic scoring panel. The areas under the receiver operating characteristic and precision-recall curves of the diagnostic scoring panel were both greater than 0.8. Additionally, we divided the EM group into several subclasses using K-Means clustering and analyzed the differences between the subclasses. In conclusion, this study developed a non-invasive diagnostic scoring panel with excellent predictive ability. Consequently, it enhances the accuracy of EM diagnosis, mitigates the risk of malignancy, and facilitates the formulation of subsequent personalized treatment and prevention strategies.
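To make the two reported metrics concrete, the following is a minimal sketch of how a point-based scoring panel could be evaluated by the area under the receiver operating characteristic curve and the area under the precision-recall curve (computed here as average precision). It is not the authors' implementation. The abstract does not describe the scoring rule, so the one-point-per-amino-acid rule, the `cutoffs` and `directions` arguments, and all function names are hypothetical and serve only as illustration.

```python
from typing import Dict, Sequence

# The seven amino acids named in the abstract.
PANEL = [
    "3-methyl-L-histidine", "kynurenine", "leucine", "N6-acetyl-L-lysine",
    "phenylalanine", "theanine", "tyrosine",
]


def panel_score(sample: Dict[str, float],
                cutoffs: Dict[str, float],
                directions: Dict[str, int]) -> int:
    """Hypothetical scoring rule: one point per amino acid on the EM-like side.

    directions[name] is +1 if higher abundance is assumed to point toward EM
    and -1 if lower abundance does. Both cutoffs and directions are
    assumptions for this sketch, not values from the study.
    """
    score = 0
    for name in PANEL:
        if directions[name] * (sample[name] - cutoffs[name]) > 0:
            score += 1
    return score


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a case outranks a control (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve.

    Sums precision at each distinct score threshold, weighted by the
    increase in recall obtained at that threshold.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one case")
    ap = 0.0
    prev_recall = 0.0
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 0)
        recall = tp / total_pos
        precision = tp / (tp + fp)  # tp + fp >= 1 because thr is an observed score
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap
```

Calling `panel_score` on each sample yields one integer score per participant. Passing those scores, together with 0/1 labels (NC = 0, EM = 1), to `roc_auc` and `average_precision` gives the two summary metrics. The pairwise AUC is quadratic in sample size, which is acceptable for a cohort of a few hundred participants.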