Machine Learning Based-Prediction of Health Application Effectiveness on Google Play Store

Nathan Andrie Ama

Abstract

Objectives: This study aims to evaluate the effectiveness of health applications on the Google Play Store by analyzing app metadata using machine learning classification models. It investigates which application features (such as AI classification, app category, update status, and version) are associated with higher user ratings.

Methods: A total of 305 health-related applications were selected from the Google Play Store using keyword filters for “Health & Fitness” and “Medical.” Key metadata were extracted and preprocessed, including Classification (AI vs. Non-AI), Category, Reviews, Developer Type, Version, Release Year, and Recent Update. To address class imbalance, the SMOTE technique was applied, and three machine learning models (Naïve Bayes, K-Nearest Neighbors (KNN), and Binomial Logistic Regression) were used to predict user ratings.

Results: The KNN model achieved the most balanced performance with 75.89% accuracy, 82.22% precision, and an AUC of 0.849, while Logistic Regression produced the highest precision (100%) and overall accuracy (76.32%) but lower recall (52.63%). Logistic regression analysis also showed that apps categorized under Health & Fitness, those recently updated, and AI-based apps were more likely to receive high user ratings.

Conclusion: Machine learning models, particularly KNN and Logistic Regression, can reliably predict app effectiveness based on metadata. Regular updates, AI integration, and fitness-focused design are key factors linked to higher user approval, providing useful insights for developers and digital health stakeholders. Future research should consider larger and more diverse datasets and explore additional features (e.g., user sentiment from reviews, app permissions) to further improve model performance.
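The Logistic Regression figures in the Results (100% precision, 76.32% accuracy, 52.63% recall) can look contradictory at first glance. The short sketch below shows how the three metrics are computed from a binary confusion matrix and why perfect precision can coexist with low recall. The abstract does not report confusion-matrix counts, so the counts used here are hypothetical: they are one small set of integers that happens to reproduce the reported percentages, not the study's actual data. The class and function names are likewise illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion matrix; the positive class is 'high user rating'."""
    tp: int  # high-rated apps predicted as high-rated
    fp: int  # low-rated apps predicted as high-rated
    fn: int  # high-rated apps predicted as low-rated
    tn: int  # low-rated apps predicted as low-rated


def accuracy(c: ConfusionCounts) -> float:
    """Share of all predictions that are correct."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)


def precision(c: ConfusionCounts) -> float:
    """Share of predicted positives that are truly positive."""
    predicted_positive = c.tp + c.fp
    return c.tp / predicted_positive if predicted_positive else 0.0


def recall(c: ConfusionCounts) -> float:
    """Share of true positives that the model actually finds."""
    actual_positive = c.tp + c.fn
    return c.tp / actual_positive if actual_positive else 0.0


# HYPOTHETICAL counts (an assumption, not taken from the article):
# no false positives gives 100% precision, while 9 missed positives
# out of 19 pull recall down to about 52.63%.
hypothetical = ConfusionCounts(tp=10, fp=0, fn=9, tn=19)

if __name__ == "__main__":
    print(f"accuracy:  {accuracy(hypothetical):.2%}")
    print(f"precision: {precision(hypothetical):.2%}")
    print(f"recall:    {recall(hypothetical):.2%}")
```

Under these assumed counts the model never labels a low-rated app as high-rated, so every positive prediction is correct, yet it misses almost half of the truly high-rated apps. This is the trade-off that leads the abstract to describe KNN, rather than Logistic Regression, as the most balanced model.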

Version published to 10.20944/preprints202507.1535.v1
Jul 21, 2025

Related articles

Comparative Study of Machine Learning Techniques for Diabetes Forecasting

This article has 2 authors:
1. Abdul Aamir Khan
2. Bk Sharma
This article has no evaluations
Latest version Jul 22, 2025

Development of a Hypertension Risk Prediction Model using Nationally Representative Survey Data: A Machine Learning Approach and Web Application Deployment

This article has 5 authors:
1. Sandip Pandey
2. Asmit Pandey
3. Aakash Neupane
4. Deepak Subedi
5. Aashish Guragain
This article has no evaluations
Latest version Sep 3, 2025

A Multi-Model Evaluation Framework for Accurate and Interpretable Heart Disease Prediction Using Ensemble Machine Learning and Low-Code Deployment Tools

This article has 1 author:
1. Mohammad Subhi Al-Batah
This article has no evaluations
Latest version Aug 7, 2025
