A Data-Driven Approach to Polycystic Ovary Syndrome Diagnosis: Evaluating Machine Learning Models

Payam Mohammadi
Najmeh Parvaz
Mohammad Masoud Eslam
Sara Zareei

Abstract

Background

Polycystic ovary syndrome (PCOS) is recognized as a major health concern affecting women around the world. Early detection and treatment of PCOS significantly reduce long-term complications, so diagnostic techniques that identify the condition early can lessen its severity and overall impact. Conventional diagnostic methods, however, are resource-intensive and may be prone to inaccuracies. Machine learning offers a promising approach to improving PCOS detection by analyzing clinical and demographic data efficiently.

Methods

This study used a dataset of 539 women, including 176 PCOS-positive cases, sourced from the Kaggle repository. Thirty-eight features, categorized into anthropometric, symptom-based, test-result, and demographic variables, were analyzed. Feature importance was assessed using the Mean Squared Error metric. Six machine learning models were employed to classify PCOS cases.
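The abstract does not say how the Mean Squared Error metric was turned into a feature ranking. One common reading is permutation importance: shuffle one feature at a time and measure how much the model's MSE increases. The sketch below illustrates that idea under this assumption. The function names, the toy data, and the toy predictor are all hypothetical and are not taken from the study.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance_mse(predict, X, y, n_repeats=20, seed=0):
    """Average increase in MSE when each feature column is shuffled.

    A larger increase suggests the model relies more on that feature.
    """
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving every other feature untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = mse(y, [predict(row) for row in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical toy data: feature 0 determines the label, feature 1 is noise.
toy_rng = random.Random(42)
X = [[toy_rng.random(), toy_rng.random()] for _ in range(200)]
y = [1.0 if row[0] > 0.5 else 0.0 for row in X]


def toy_predict(row):
    """Stand-in for a fitted classifier; it only looks at feature 0."""
    return 1.0 if row[0] > 0.5 else 0.0


importances = permutation_importance_mse(toy_predict, X, y)
print(importances)
```

In this toy setup the informative feature receives a positive score and the noise feature receives zero, because the stand-in predictor never reads it. With a real fitted model, the same loop would be run on held-out data rather than on the training set.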

Results

Significant differences were observed in multiple clinical and anthropometric variables between PCOS-positive and PCOS-negative cases, including BMI, waist-to-hip ratio, antral follicle count, AMH levels, and menstrual cycle length. The most predictive features were antral follicle count, hair growth, skin pigmentation, weight gain, and fast-food consumption. Random Forest was the highest-performing of the models, achieving 93% accuracy and 86% sensitivity and demonstrating the efficacy of machine learning in PCOS prediction.
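Accuracy and sensitivity are reported together because the cohort is imbalanced: 176 of 539 women are PCOS-positive. The short sketch below uses only those two counts from the abstract to show what a trivial classifier that labels everyone negative would score. The helper names are illustrative, and the four-count example at the end uses made-up numbers purely to show the formulas.

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of all cases classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def sensitivity(tp, fn):
    """Fraction of truly positive cases that were detected (recall)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Cohort sizes stated in the abstract.
n_total = 539
n_pos = 176
n_neg = n_total - n_pos

# Baseline: predict "PCOS-negative" for every woman in the cohort.
baseline_acc = accuracy(tp=0, fp=0, tn=n_neg, fn=n_pos)
baseline_sens = sensitivity(tp=0, fn=n_pos)
print(f"baseline accuracy:    {baseline_acc:.3f}")
print(f"baseline sensitivity: {baseline_sens:.3f}")

# Hypothetical confusion counts, not from the study, to exercise the helpers.
example_acc = accuracy(tp=8, fp=1, tn=9, fn=2)
example_sens = sensitivity(tp=8, fn=2)
```

The baseline reaches roughly 67% accuracy while detecting no PCOS cases at all, which is why the sensitivity figure matters alongside the headline accuracy.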

Conclusions

Machine learning can improve early and accurate PCOS detection, providing a cost-effective and efficient substitute for traditional methods of diagnosis. The integration of predictive models into clinical practice could facilitate timely interventions, improving patient outcomes and reducing the healthcare burden associated with PCOS.

Version published to 10.1101/2025.07.13.25331465 on medRxiv
Jul 14, 2025

Related articles

Diagnosis of PCOS in Adolescent Girls Using Traditional and Ensemble Machine Learning Methods

This article has 2 authors:
1. Priyanka Pariyawala
2. Pushpal Y Desai
This article has no evaluations. Latest version: Aug 8, 2025

PCOD Disease Detection Using Machine Learning

This article has 5 authors:
1. SULEKH KUMAR
2. Momita Kundu
3. Kamla Kumari
4. Prabhat Purushottam
5. MD. Shamsher Alam
This article has no evaluations. Latest version: Aug 4, 2025

Interpretable machine-learning-derived diagnostic scoring panel for endometriosis identification: a study of serum amino acid profiling

This article has 13 authors:
1. Moyuan Li
2. Sujuan Xu
3. Yiwei Cao
4. Feiyang Li
5. Nuo Ye
6. Yiran Xu
7. Jian Cao
8. Aiyuan Yue
9. Tiantian Fan
10. Yichen Guo
11. Zhen Gong
12. Dake Li
13. Pengfei Xu
This article has no evaluations. Latest version: Jul 29, 2025
