Exploring Novel Biomarkers for Early Detection of Osteoporosis
Abstract
Purpose/Introduction
Osteoporosis is characterized by reduced bone mineral density (BMD) and deteriorated bone microstructure, which significantly increase fracture susceptibility. This study uses machine learning to identify key predictors of osteoporosis, with the aim of improving non-invasive screening tools.
Methods
We used the “Bone Mineral Density” dataset from Harvard University, grouping its variables into demographics, bone loss prevention treatments, and lifestyle/associated pathologies. A Boolean variable “OP” was defined to indicate the presence of osteoporosis. Data preprocessing included imputation of missing values and normalization. Statistical analyses, including Welch’s t-test and multiple linear regression, were conducted to identify significant markers. In addition, unsupervised learning techniques such as Louvain clustering were used to uncover natural patterns in the data, while supervised learning models were trained to predict osteoporosis.
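The preprocessing and group comparison described above can be illustrated with a minimal sketch. It is not the study's actual pipeline: the abstract does not state which imputation or normalization method was used, so median imputation and min-max scaling are assumptions here, and the helper names and toy marker values are hypothetical. The sketch computes only the Welch t statistic and the Welch-Satterthwaite degrees of freedom for one marker, split by the Boolean OP flag.

```python
import math
from statistics import mean, median, variance


def impute_median(values):
    """Replace None entries with the median of the observed values (assumed method)."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values to the range [0, 1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical toy records: (marker value or None if missing, OP flag)
records = [
    (0.78, True), (None, True), (0.80, True), (0.75, True), (0.82, True),
    (0.88, False), (0.91, False), (0.85, False), (None, False), (0.90, False),
]

# Impute and normalize the whole column first, then split by the OP flag.
column = min_max_normalize(impute_median([value for value, _ in records]))
with_op = [v for v, (_, op) in zip(column, records) if op]
without_op = [v for v, (_, op) in zip(column, records) if not op]

t_stat, dof = welch_t(with_op, without_op)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

Turning the statistic into a p-value requires the t distribution; in practice a library routine such as `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the whole test in one call.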
Results
The study identified significant markers associated with osteoporosis, including magnesium levels, high- and low-density lipoproteins, chronic obstructive pulmonary disease, and chronic kidney disease. Multiple linear regression analysis revealed robust associations between osteoporosis and the biomarkers alanine aminotransferase (ALT), blood urea nitrogen (BUN), creatinine (CREA), low-density lipoprotein cholesterol (LDL-C), and magnesium (Mg), even after controlling for age, height, and weight. Among the supervised learning models, the Random Forest and Gradient Boosting algorithms showed the highest accuracy in predicting osteoporosis, with Random Forest identified as the preferred model.
Conclusions
We identified the main biochemical parameters and clinical history items that are potential predictors of osteoporosis. The findings suggest that integrating machine learning techniques with clinical data can enhance early detection and intervention strategies, ultimately improving patient outcomes. Further work is needed to validate these results across diverse populations and to refine the predictive models for broader application.