Enhancing Thyroid Disease Prediction Using Machine Learning: A Comparative Study of Ensemble Models and Class Balancing Techniques
Abstract
Thyroid disease is a significant medical condition affecting approximately 20 million Americans. The thyroid gland regulates metabolism through hormones such as triiodothyronine (T3) and thyroxine (T4), and its disorders typically manifest as hypothyroidism or hyperthyroidism. This study evaluates the performance of several machine learning models in predicting and diagnosing thyroid disease: Logistic Regression, Decision Trees, Random Forest, XGBoost, Support Vector Machines, Neural Networks, Bagging, and Stacking. The bagging model built from three decision trees achieved the highest F1 score, 0.9766, outperforming both Random Forest and XGBoost. In addition, class-balancing experiments using undersampling and regrouping significantly improved model performance, particularly for stacking models with XGBoost, which attained an F1 score of 0.9944.
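To make the two recurring ideas in the abstract concrete, the following is a minimal sketch of random undersampling and of the F1 score, written in plain Python. It is an illustration under stated assumptions, not the study's pipeline: the helper names `random_undersample` and `f1_score`, the toy records, and the class labels are all hypothetical. The abstract also does not say how F1 is averaged across classes, so the sketch computes F1 for a single positive class only.

```python
import random
from collections import Counter, defaultdict


def f1_score(y_true, y_pred, positive=1):
    """F1 for one class: the harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows from larger classes until all classes match the smallest."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    target = min(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        for row in rng.sample(members, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy, made-up data (assumption): 8 "negative" and 2 "hypothyroid" records.
rows = [[float(i)] for i in range(10)]
labels = ["negative"] * 8 + ["hypothyroid"] * 2

bal_rows, bal_labels = random_undersample(rows, labels)
print(Counter(bal_labels))  # each class now has 2 records

# One of two positives is found and there are no false positives:
# precision = 1.0, recall = 0.5, so F1 = 2/3.
print(f1_score([1, 1, 0, 0], [1, 0, 0, 0]))
```

Undersampling discards majority-class records, so it trades training data for balance. F1 is a natural metric in this setting because accuracy alone can look high on an imbalanced dataset even when the minority class is mostly missed.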