Health Indicator Predictions from lifestyle and biometric data using Machine Learning Models

Abstract

This study investigates whether machine learning models, specifically neural networks and tree-based classifiers, can predict whether an individual is healthy or has a disease from lifestyle, biometric, behavioral, and demographic factors, using a dataset of 100,000 surveyed individuals. Despite feature scaling and other data preprocessing, the models could not classify individuals as healthy or diseased with high accuracy from the input features provided. The findings underscore the importance of rich, comprehensive input features and effective data integration for improving prediction accuracy. The results are also compared with existing studies, such as Kim et al. (2024) and Effiok et al. (2022), which link predictive models to real patient data; the comparison suggests that real-world prediction requires richer, more diverse, and more comprehensive input data.
