Machine Learning-Based Symptom-Disease Prediction: A Comprehensive Analysis of Multi-Class Classification Models in Healthcare Decision Support Systems

Shouvik Sharma

Abstract

Healthcare decision support systems require accurate and efficient methods for disease prediction based on patient symptoms. This study presents a comprehensive analysis of machine learning approaches for multi-class disease classification using both synthetic and real healthcare datasets. We evaluate three machine learning algorithms: Logistic Regression, Random Forest, and Gradient Boosting, achieving classification accuracies of 96.5%, 96.2%, and 96.0% respectively on real clinical data. Our analysis reveals significant symptom-disease relationship patterns, with loss of taste/smell, cough, and fatigue emerging as the most predictive features in real data. The Logistic Regression model demonstrated superior performance with an AUC of 0.999, indicating exceptional discriminative ability across multiple disease classes. We provide detailed feature importance analysis, symptom correlation matrices, and demographic insights that can inform clinical decision-making processes. The real dataset exhibits realistic disease prevalence patterns with 5,000 patients across 10 disease categories and 32 symptom features. Our findings demonstrate the feasibility of automated symptom-based disease prediction systems and provide a foundation for developing clinical decision support tools. This work contributes to the growing body of literature on AI-assisted healthcare diagnostics and establishes benchmarks for future research in symptom-disease prediction models using real clinical data. In addition, we introduce a novel Adaptive Hierarchical Ensemble (AHE) model that achieves substantial computational efficiency (76.5% feature reduction) while maintaining high accuracy (93.5

Version published to 10.21203/rs.3.rs-7153025/v1 on Research Square
Jul 28, 2025

Related articles

A Multi-Model Evaluation Framework for Accurate and Interpretable Heart Disease Prediction Using Ensemble Machine Learning and Low-Code Deployment Tools

This article has 1 author:
1. Mohammad Subhi Al-Batah
This article has no evaluations
Latest version Aug 7, 2025

Comparative Study of Machine Learning Techniques for Diabetes Forecasting

This article has 2 authors:
1. Abdul Aamir Khan
2. Bk Sharma
This article has no evaluations
Latest version Jul 22, 2025

Construction and Validation of an Interpretable Machine Learning Model for Predicting Diabetes Risk in COPD Patients

This article has 13 authors:
1. Lingpin Pang
2. Siyan Xu
3. Yingxin Wang
4. Tao Huang
5. Qian Xian
6. Wenjia Lin
7. Haowen Pang
8. Zhirui Chen
9. Bozhi Zhong
10. Hui Miao
11. Hui Chen
12. Xishi Sun
13. Jie Sun
This article has no evaluations
Latest version Aug 19, 2025
